MediaTek-Research/TCEval-v2
Hugging Face | Updated 2024-04-02 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/MediaTek-Research/TCEval-v2
Resource description:
---
dataset_info:
- config_name: drcd
features:
- name: id
dtype: string
- name: paragraph
dtype: string
- name: question
dtype: string
- name: references
list: string
splits:
- name: test
num_bytes: 4899369
num_examples: 3493
- name: dev
num_bytes: 5845
num_examples: 5
download_size: 1168539
dataset_size: 4905214
- config_name: mt_bench_tw-coding
features:
- name: id
dtype: string
- name: turns
list: string
- name: reference
list: string
- name: category
dtype: string
splits:
- name: test
num_bytes: 11252
num_examples: 10
download_size: 10860
dataset_size: 11252
- config_name: mt_bench_tw-extraction
features:
- name: id
dtype: string
- name: turns
list: string
- name: reference
list: string
- name: category
dtype: string
splits:
- name: test
num_bytes: 10882
num_examples: 10
download_size: 17098
dataset_size: 10882
- config_name: mt_bench_tw-humanities
features:
- name: id
dtype: string
- name: turns
list: string
- name: reference
list: string
- name: category
dtype: string
splits:
- name: test
num_bytes: 2996
num_examples: 10
download_size: 5049
dataset_size: 2996
- config_name: mt_bench_tw-math
features:
- name: id
dtype: string
- name: turns
list: string
- name: reference
list: string
- name: category
dtype: string
splits:
- name: test
num_bytes: 3041
num_examples: 10
download_size: 5054
dataset_size: 3041
- config_name: mt_bench_tw-reasoning
features:
- name: id
dtype: string
- name: turns
list: string
- name: reference
list: string
- name: category
dtype: string
splits:
- name: test
num_bytes: 4492
num_examples: 10
download_size: 8402
dataset_size: 4492
- config_name: mt_bench_tw-roleplay
features:
- name: id
dtype: string
- name: turns
list: string
- name: reference
list: string
- name: category
dtype: string
splits:
- name: test
num_bytes: 4134
num_examples: 10
download_size: 6634
dataset_size: 4134
- config_name: mt_bench_tw-stem
features:
- name: id
dtype: string
- name: turns
list: string
- name: reference
list: string
- name: category
dtype: string
splits:
- name: test
num_bytes: 3103
num_examples: 10
download_size: 5430
dataset_size: 3103
- config_name: mt_bench_tw-writing
features:
- name: id
dtype: string
- name: turns
list: string
- name: reference
list: string
- name: category
dtype: string
splits:
- name: test
num_bytes: 3469
num_examples: 10
download_size: 6701
dataset_size: 3469
- config_name: penguin_table
features:
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: E
dtype: string
- name: answer
dtype: string
- name: id
dtype: string
splits:
- name: dev
num_bytes: 2588
num_examples: 5
- name: test
num_bytes: 74241
num_examples: 144
download_size: 21218
dataset_size: 76829
- config_name: tmmluplus-accounting
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 134876
num_examples: 191
- name: dev
num_bytes: 3764
num_examples: 5
download_size: 87921
dataset_size: 138640
- config_name: tmmluplus-administrative_law
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 169553
num_examples: 420
- name: dev
num_bytes: 2567
num_examples: 5
download_size: 107897
dataset_size: 172120
- config_name: tmmluplus-advance_chemistry
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 33891
num_examples: 123
- name: dev
num_bytes: 1581
num_examples: 5
download_size: 34210
dataset_size: 35472
- config_name: tmmluplus-agriculture
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 46502
num_examples: 151
- name: dev
num_bytes: 1715
num_examples: 5
download_size: 40849
dataset_size: 48217
- config_name: tmmluplus-anti_money_laundering
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 54293
num_examples: 134
- name: dev
num_bytes: 2552
num_examples: 5
download_size: 47614
dataset_size: 56845
- config_name: tmmluplus-auditing
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 272426
num_examples: 550
- name: dev
num_bytes: 1947
num_examples: 5
download_size: 147664
dataset_size: 274373
- config_name: tmmluplus-basic_medical_science
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 312503
num_examples: 954
- name: dev
num_bytes: 1599
num_examples: 5
download_size: 194337
dataset_size: 314102
- config_name: tmmluplus-business_management
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 45074
num_examples: 139
- name: dev
num_bytes: 1403
num_examples: 5
download_size: 39338
dataset_size: 46477
- config_name: tmmluplus-chinese_language_and_literature
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 127469
num_examples: 199
- name: dev
num_bytes: 2054
num_examples: 5
download_size: 103909
dataset_size: 129523
- config_name: tmmluplus-clinical_psychology
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 55748
num_examples: 125
- name: dev
num_bytes: 2029
num_examples: 5
download_size: 51770
dataset_size: 57777
- config_name: tmmluplus-computer_science
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 57883
num_examples: 174
- name: dev
num_bytes: 1894
num_examples: 5
download_size: 49090
dataset_size: 59777
- config_name: tmmluplus-culinary_skills
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 94564
num_examples: 292
- name: dev
num_bytes: 1540
num_examples: 5
download_size: 69998
dataset_size: 96104
- config_name: tmmluplus-dentistry
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 152113
num_examples: 399
- name: dev
num_bytes: 1684
num_examples: 5
download_size: 105595
dataset_size: 153797
- config_name: tmmluplus-economics
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 145972
num_examples: 393
- name: dev
num_bytes: 1946
num_examples: 5
download_size: 91284
dataset_size: 147918
- config_name: tmmluplus-education
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 44729
num_examples: 124
- name: dev
num_bytes: 1760
num_examples: 5
download_size: 41837
dataset_size: 46489
- config_name: tmmluplus-education_(profession_level)
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 208632
num_examples: 486
- name: dev
num_bytes: 3183
num_examples: 5
download_size: 136861
dataset_size: 211815
- config_name: tmmluplus-educational_psychology
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 71860
num_examples: 176
- name: dev
num_bytes: 2314
num_examples: 5
download_size: 56964
dataset_size: 74174
- config_name: tmmluplus-engineering_math
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 35214
num_examples: 103
- name: dev
num_bytes: 1954
num_examples: 5
download_size: 33378
dataset_size: 37168
- config_name: tmmluplus-finance_banking
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 59005
num_examples: 135
- name: dev
num_bytes: 2232
num_examples: 5
download_size: 47576
dataset_size: 61237
- config_name: tmmluplus-financial_analysis
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 128903
num_examples: 382
- name: dev
num_bytes: 1537
num_examples: 5
download_size: 68492
dataset_size: 130440
- config_name: tmmluplus-fire_science
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 37661
num_examples: 124
- name: dev
num_bytes: 1690
num_examples: 5
download_size: 33612
dataset_size: 39351
- config_name: tmmluplus-general_principles_of_law
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 47582
num_examples: 106
- name: dev
num_bytes: 1777
num_examples: 5
download_size: 40369
dataset_size: 49359
- config_name: tmmluplus-geography_of_taiwan
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 242009
num_examples: 768
- name: dev
num_bytes: 1689
num_examples: 5
download_size: 144499
dataset_size: 243698
- config_name: tmmluplus-human_behavior
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 132226
num_examples: 309
- name: dev
num_bytes: 2149
num_examples: 5
download_size: 93526
dataset_size: 134375
- config_name: tmmluplus-insurance_studies
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 349058
num_examples: 760
- name: dev
num_bytes: 2023
num_examples: 5
download_size: 174957
dataset_size: 351081
- config_name: tmmluplus-introduction_to_law
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 93914
num_examples: 237
- name: dev
num_bytes: 3868
num_examples: 5
download_size: 72390
dataset_size: 97782
- config_name: tmmluplus-jce_humanities
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 95795
num_examples: 90
- name: dev
num_bytes: 6230
num_examples: 5
download_size: 79879
dataset_size: 102025
- config_name: tmmluplus-junior_chemistry
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 56079
num_examples: 209
- name: dev
num_bytes: 1472
num_examples: 5
download_size: 44646
dataset_size: 57551
- config_name: tmmluplus-junior_chinese_exam
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 169271
num_examples: 175
- name: dev
num_bytes: 7581
num_examples: 5
download_size: 139825
dataset_size: 176852
- config_name: tmmluplus-junior_math_exam
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 51452
num_examples: 175
- name: dev
num_bytes: 1511
num_examples: 5
download_size: 38704
dataset_size: 52963
- config_name: tmmluplus-junior_science_exam
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 105830
num_examples: 213
- name: dev
num_bytes: 2473
num_examples: 5
download_size: 78758
dataset_size: 108303
- config_name: tmmluplus-junior_social_studies
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 92873
num_examples: 126
- name: dev
num_bytes: 4171
num_examples: 5
download_size: 76559
dataset_size: 97044
- config_name: tmmluplus-logic_reasoning
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 40639
num_examples: 139
- name: dev
num_bytes: 1591
num_examples: 5
download_size: 31931
dataset_size: 42230
- config_name: tmmluplus-macroeconomics
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 125238
num_examples: 411
- name: dev
num_bytes: 1510
num_examples: 5
download_size: 76559
dataset_size: 126748
- config_name: tmmluplus-management_accounting
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 105401
num_examples: 215
- name: dev
num_bytes: 2212
num_examples: 5
download_size: 63286
dataset_size: 107613
- config_name: tmmluplus-marketing_management
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 32431
num_examples: 93
- name: dev
num_bytes: 1802
num_examples: 5
download_size: 32600
dataset_size: 34233
- config_name: tmmluplus-mechanical
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 32709
num_examples: 118
- name: dev
num_bytes: 1112
num_examples: 5
download_size: 30409
dataset_size: 33821
- config_name: tmmluplus-music
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 91304
num_examples: 278
- name: dev
num_bytes: 1598
num_examples: 5
download_size: 68538
dataset_size: 92902
- config_name: tmmluplus-national_protection
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 55256
num_examples: 211
- name: dev
num_bytes: 1186
num_examples: 5
download_size: 42755
dataset_size: 56442
- config_name: tmmluplus-nautical_science
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 163848
num_examples: 551
- name: dev
num_bytes: 1131
num_examples: 5
download_size: 97058
dataset_size: 164979
- config_name: tmmluplus-occupational_therapy_for_psychological_disorders
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 268018
num_examples: 543
- name: dev
num_bytes: 2198
num_examples: 5
download_size: 152382
dataset_size: 270216
- config_name: tmmluplus-official_document_management
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 67868
num_examples: 222
- name: dev
num_bytes: 1752
num_examples: 5
download_size: 42263
dataset_size: 69620
- config_name: tmmluplus-optometry
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 367273
num_examples: 920
- name: dev
num_bytes: 1756
num_examples: 5
download_size: 197708
dataset_size: 369029
- config_name: tmmluplus-organic_chemistry
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 29720
num_examples: 109
- name: dev
num_bytes: 1316
num_examples: 5
download_size: 31856
dataset_size: 31036
- config_name: tmmluplus-pharmacology
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 164131
num_examples: 577
- name: dev
num_bytes: 1040
num_examples: 5
download_size: 94751
dataset_size: 165171
- config_name: tmmluplus-pharmacy
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 113563
num_examples: 391
- name: dev
num_bytes: 1252
num_examples: 5
download_size: 77275
dataset_size: 114815
- config_name: tmmluplus-physical_education
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 47469
num_examples: 179
- name: dev
num_bytes: 1202
num_examples: 5
download_size: 39538
dataset_size: 48671
- config_name: tmmluplus-physics
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 30030
num_examples: 97
- name: dev
num_bytes: 1191
num_examples: 5
download_size: 30370
dataset_size: 31221
- config_name: tmmluplus-politic_science
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 279612
num_examples: 995
- name: dev
num_bytes: 1444
num_examples: 5
download_size: 155705
dataset_size: 281056
- config_name: tmmluplus-real_estate
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 38600
num_examples: 92
- name: dev
num_bytes: 2599
num_examples: 5
download_size: 36955
dataset_size: 41199
- config_name: tmmluplus-secondary_physics
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 45698
num_examples: 112
- name: dev
num_bytes: 1686
num_examples: 5
download_size: 41917
dataset_size: 47384
- config_name: tmmluplus-statistics_and_machine_learning
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 83999
num_examples: 224
- name: dev
num_bytes: 2368
num_examples: 5
download_size: 64213
dataset_size: 86367
- config_name: tmmluplus-taiwanese_hokkien
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 40896
num_examples: 129
- name: dev
num_bytes: 2197
num_examples: 5
download_size: 40308
dataset_size: 43093
- config_name: tmmluplus-taxation
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 154730
num_examples: 375
- name: dev
num_bytes: 1924
num_examples: 5
download_size: 97906
dataset_size: 156654
- config_name: tmmluplus-technical
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 94384
num_examples: 402
- name: dev
num_bytes: 1084
num_examples: 5
download_size: 60659
dataset_size: 95468
- config_name: tmmluplus-three_principles_of_people
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 33261
num_examples: 139
- name: dev
num_bytes: 1234
num_examples: 5
download_size: 28540
dataset_size: 34495
- config_name: tmmluplus-trade
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 179952
num_examples: 502
- name: dev
num_bytes: 1679
num_examples: 5
download_size: 98998
dataset_size: 181631
- config_name: tmmluplus-traditional_chinese_medicine_clinical_medicine
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 115490
num_examples: 278
- name: dev
num_bytes: 1922
num_examples: 5
download_size: 76367
dataset_size: 117412
- config_name: tmmluplus-trust_practice
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 155403
num_examples: 401
- name: dev
num_bytes: 2556
num_examples: 5
download_size: 94795
dataset_size: 157959
- config_name: tmmluplus-ttqav2
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 41379
num_examples: 113
- name: dev
num_bytes: 2246
num_examples: 5
download_size: 40353
dataset_size: 43625
- config_name: tmmluplus-tve_chinese_language
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 539326
num_examples: 483
- name: dev
num_bytes: 5360
num_examples: 5
download_size: 401013
dataset_size: 544686
- config_name: tmmluplus-tve_design
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 182865
num_examples: 480
- name: dev
num_bytes: 2304
num_examples: 5
download_size: 119979
dataset_size: 185169
- config_name: tmmluplus-tve_mathematics
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 42519
num_examples: 150
- name: dev
num_bytes: 1290
num_examples: 5
download_size: 36304
dataset_size: 43809
- config_name: tmmluplus-tve_natural_sciences
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 139853
num_examples: 424
- name: dev
num_bytes: 2163
num_examples: 5
download_size: 100220
dataset_size: 142016
- config_name: tmmluplus-veterinary_pathology
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 91700
num_examples: 283
- name: dev
num_bytes: 1803
num_examples: 5
download_size: 59000
dataset_size: 93503
- config_name: tmmluplus-veterinary_pharmacology
features:
- name: id
dtype: string
- name: question
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: answer
dtype: string
- name: category
dtype: string
- name: subcategory
dtype: string
- name: subject
dtype: string
splits:
- name: test
num_bytes: 151825
num_examples: 540
- name: dev
num_bytes: 1419
num_examples: 5
download_size: 81980
dataset_size: 153244
configs:
- config_name: drcd
data_files:
- split: test
path: drcd/test-*
- split: dev
path: drcd/dev-*
- config_name: mt_bench_tw-coding
data_files:
- split: test
path: mt_bench_tw-coding/test-*
- config_name: mt_bench_tw-extraction
data_files:
- split: test
path: mt_bench_tw-extraction/test-*
- config_name: mt_bench_tw-humanities
data_files:
- split: test
path: mt_bench_tw-humanities/test-*
- config_name: mt_bench_tw-math
data_files:
- split: test
path: mt_bench_tw-math/test-*
- config_name: mt_bench_tw-reasoning
data_files:
- split: test
path: mt_bench_tw-reasoning/test-*
- config_name: mt_bench_tw-roleplay
data_files:
- split: test
path: mt_bench_tw-roleplay/test-*
- config_name: mt_bench_tw-stem
data_files:
- split: test
path: mt_bench_tw-stem/test-*
- config_name: mt_bench_tw-writing
data_files:
- split: test
path: mt_bench_tw-writing/test-*
- config_name: penguin_table
data_files:
- split: dev
path: penguin_table/dev-*
- split: test
path: penguin_table/test-*
- config_name: tmmluplus-accounting
data_files:
- split: test
path: tmmluplus-accounting/test-*
- split: dev
path: tmmluplus-accounting/dev-*
- config_name: tmmluplus-administrative_law
data_files:
- split: test
path: tmmluplus-administrative_law/test-*
- split: dev
path: tmmluplus-administrative_law/dev-*
- config_name: tmmluplus-advance_chemistry
data_files:
- split: test
path: tmmluplus-advance_chemistry/test-*
- split: dev
path: tmmluplus-advance_chemistry/dev-*
- config_name: tmmluplus-agriculture
data_files:
- split: test
path: tmmluplus-agriculture/test-*
- split: dev
path: tmmluplus-agriculture/dev-*
- config_name: tmmluplus-anti_money_laundering
data_files:
- split: test
path: tmmluplus-anti_money_laundering/test-*
- split: dev
path: tmmluplus-anti_money_laundering/dev-*
- config_name: tmmluplus-auditing
data_files:
- split: test
path: tmmluplus-auditing/test-*
- split: dev
path: tmmluplus-auditing/dev-*
- config_name: tmmluplus-basic_medical_science
data_files:
- split: test
path: tmmluplus-basic_medical_science/test-*
- split: dev
path: tmmluplus-basic_medical_science/dev-*
- config_name: tmmluplus-business_management
data_files:
- split: test
path: tmmluplus-business_management/test-*
- split: dev
path: tmmluplus-business_management/dev-*
- config_name: tmmluplus-chinese_language_and_literature
data_files:
- split: test
path: tmmluplus-chinese_language_and_literature/test-*
- split: dev
path: tmmluplus-chinese_language_and_literature/dev-*
- config_name: tmmluplus-clinical_psychology
data_files:
- split: test
path: tmmluplus-clinical_psychology/test-*
- split: dev
path: tmmluplus-clinical_psychology/dev-*
- config_name: tmmluplus-computer_science
data_files:
- split: test
path: tmmluplus-computer_science/test-*
- split: dev
path: tmmluplus-computer_science/dev-*
- config_name: tmmluplus-culinary_skills
data_files:
- split: test
path: tmmluplus-culinary_skills/test-*
- split: dev
path: tmmluplus-culinary_skills/dev-*
- config_name: tmmluplus-dentistry
data_files:
- split: test
path: tmmluplus-dentistry/test-*
- split: dev
path: tmmluplus-dentistry/dev-*
- config_name: tmmluplus-economics
data_files:
- split: test
path: tmmluplus-economics/test-*
- split: dev
path: tmmluplus-economics/dev-*
- config_name: tmmluplus-education
data_files:
- split: test
path: tmmluplus-education/test-*
- split: dev
path: tmmluplus-education/dev-*
- config_name: tmmluplus-education_(profession_level)
data_files:
- split: test
path: tmmluplus-education_(profession_level)/test-*
- split: dev
path: tmmluplus-education_(profession_level)/dev-*
- config_name: tmmluplus-educational_psychology
data_files:
- split: test
path: tmmluplus-educational_psychology/test-*
- split: dev
path: tmmluplus-educational_psychology/dev-*
- config_name: tmmluplus-engineering_math
data_files:
- split: test
path: tmmluplus-engineering_math/test-*
- split: dev
path: tmmluplus-engineering_math/dev-*
- config_name: tmmluplus-finance_banking
data_files:
- split: test
path: tmmluplus-finance_banking/test-*
- split: dev
path: tmmluplus-finance_banking/dev-*
- config_name: tmmluplus-financial_analysis
data_files:
- split: test
path: tmmluplus-financial_analysis/test-*
- split: dev
path: tmmluplus-financial_analysis/dev-*
- config_name: tmmluplus-fire_science
data_files:
- split: test
path: tmmluplus-fire_science/test-*
- split: dev
path: tmmluplus-fire_science/dev-*
- config_name: tmmluplus-general_principles_of_law
data_files:
- split: test
path: tmmluplus-general_principles_of_law/test-*
- split: dev
path: tmmluplus-general_principles_of_law/dev-*
- config_name: tmmluplus-geography_of_taiwan
data_files:
- split: test
path: tmmluplus-geography_of_taiwan/test-*
- split: dev
path: tmmluplus-geography_of_taiwan/dev-*
- config_name: tmmluplus-human_behavior
data_files:
- split: test
path: tmmluplus-human_behavior/test-*
- split: dev
path: tmmluplus-human_behavior/dev-*
- config_name: tmmluplus-insurance_studies
data_files:
- split: test
path: tmmluplus-insurance_studies/test-*
- split: dev
path: tmmluplus-insurance_studies/dev-*
- config_name: tmmluplus-introduction_to_law
data_files:
- split: test
path: tmmluplus-introduction_to_law/test-*
- split: dev
path: tmmluplus-introduction_to_law/dev-*
- config_name: tmmluplus-jce_humanities
data_files:
- split: test
path: tmmluplus-jce_humanities/test-*
- split: dev
path: tmmluplus-jce_humanities/dev-*
- config_name: tmmluplus-junior_chemistry
data_files:
- split: test
path: tmmluplus-junior_chemistry/test-*
- split: dev
path: tmmluplus-junior_chemistry/dev-*
- config_name: tmmluplus-junior_chinese_exam
data_files:
- split: test
path: tmmluplus-junior_chinese_exam/test-*
- split: dev
path: tmmluplus-junior_chinese_exam/dev-*
- config_name: tmmluplus-junior_math_exam
data_files:
- split: test
path: tmmluplus-junior_math_exam/test-*
- split: dev
path: tmmluplus-junior_math_exam/dev-*
- config_name: tmmluplus-junior_science_exam
data_files:
- split: test
path: tmmluplus-junior_science_exam/test-*
- split: dev
path: tmmluplus-junior_science_exam/dev-*
- config_name: tmmluplus-junior_social_studies
data_files:
- split: test
path: tmmluplus-junior_social_studies/test-*
- split: dev
path: tmmluplus-junior_social_studies/dev-*
- config_name: tmmluplus-logic_reasoning
data_files:
- split: test
path: tmmluplus-logic_reasoning/test-*
- split: dev
path: tmmluplus-logic_reasoning/dev-*
- config_name: tmmluplus-macroeconomics
data_files:
- split: test
path: tmmluplus-macroeconomics/test-*
- split: dev
path: tmmluplus-macroeconomics/dev-*
- config_name: tmmluplus-management_accounting
data_files:
- split: test
path: tmmluplus-management_accounting/test-*
- split: dev
path: tmmluplus-management_accounting/dev-*
- config_name: tmmluplus-marketing_management
data_files:
- split: test
path: tmmluplus-marketing_management/test-*
- split: dev
path: tmmluplus-marketing_management/dev-*
- config_name: tmmluplus-mechanical
data_files:
- split: test
path: tmmluplus-mechanical/test-*
- split: dev
path: tmmluplus-mechanical/dev-*
- config_name: tmmluplus-music
data_files:
- split: test
path: tmmluplus-music/test-*
- split: dev
path: tmmluplus-music/dev-*
- config_name: tmmluplus-national_protection
data_files:
- split: test
path: tmmluplus-national_protection/test-*
- split: dev
path: tmmluplus-national_protection/dev-*
- config_name: tmmluplus-nautical_science
data_files:
- split: test
path: tmmluplus-nautical_science/test-*
- split: dev
path: tmmluplus-nautical_science/dev-*
- config_name: tmmluplus-occupational_therapy_for_psychological_disorders
data_files:
- split: test
path: tmmluplus-occupational_therapy_for_psychological_disorders/test-*
- split: dev
path: tmmluplus-occupational_therapy_for_psychological_disorders/dev-*
- config_name: tmmluplus-official_document_management
data_files:
- split: test
path: tmmluplus-official_document_management/test-*
- split: dev
path: tmmluplus-official_document_management/dev-*
- config_name: tmmluplus-optometry
data_files:
- split: test
path: tmmluplus-optometry/test-*
- split: dev
path: tmmluplus-optometry/dev-*
- config_name: tmmluplus-organic_chemistry
data_files:
- split: test
path: tmmluplus-organic_chemistry/test-*
- split: dev
path: tmmluplus-organic_chemistry/dev-*
- config_name: tmmluplus-pharmacology
data_files:
- split: test
path: tmmluplus-pharmacology/test-*
- split: dev
path: tmmluplus-pharmacology/dev-*
- config_name: tmmluplus-pharmacy
data_files:
- split: test
path: tmmluplus-pharmacy/test-*
- split: dev
path: tmmluplus-pharmacy/dev-*
- config_name: tmmluplus-physical_education
data_files:
- split: test
path: tmmluplus-physical_education/test-*
- split: dev
path: tmmluplus-physical_education/dev-*
- config_name: tmmluplus-physics
data_files:
- split: test
path: tmmluplus-physics/test-*
- split: dev
path: tmmluplus-physics/dev-*
- config_name: tmmluplus-politic_science
data_files:
- split: test
path: tmmluplus-politic_science/test-*
- split: dev
path: tmmluplus-politic_science/dev-*
- config_name: tmmluplus-real_estate
data_files:
- split: test
path: tmmluplus-real_estate/test-*
- split: dev
path: tmmluplus-real_estate/dev-*
- config_name: tmmluplus-secondary_physics
data_files:
- split: test
path: tmmluplus-secondary_physics/test-*
- split: dev
path: tmmluplus-secondary_physics/dev-*
- config_name: tmmluplus-statistics_and_machine_learning
data_files:
- split: test
path: tmmluplus-statistics_and_machine_learning/test-*
- split: dev
path: tmmluplus-statistics_and_machine_learning/dev-*
- config_name: tmmluplus-taiwanese_hokkien
data_files:
- split: test
path: tmmluplus-taiwanese_hokkien/test-*
- split: dev
path: tmmluplus-taiwanese_hokkien/dev-*
- config_name: tmmluplus-taxation
data_files:
- split: test
path: tmmluplus-taxation/test-*
- split: dev
path: tmmluplus-taxation/dev-*
- config_name: tmmluplus-technical
data_files:
- split: test
path: tmmluplus-technical/test-*
- split: dev
path: tmmluplus-technical/dev-*
- config_name: tmmluplus-three_principles_of_people
data_files:
- split: test
path: tmmluplus-three_principles_of_people/test-*
- split: dev
path: tmmluplus-three_principles_of_people/dev-*
- config_name: tmmluplus-trade
data_files:
- split: test
path: tmmluplus-trade/test-*
- split: dev
path: tmmluplus-trade/dev-*
- config_name: tmmluplus-traditional_chinese_medicine_clinical_medicine
data_files:
- split: test
path: tmmluplus-traditional_chinese_medicine_clinical_medicine/test-*
- split: dev
path: tmmluplus-traditional_chinese_medicine_clinical_medicine/dev-*
- config_name: tmmluplus-trust_practice
data_files:
- split: test
path: tmmluplus-trust_practice/test-*
- split: dev
path: tmmluplus-trust_practice/dev-*
- config_name: tmmluplus-ttqav2
data_files:
- split: test
path: tmmluplus-ttqav2/test-*
- split: dev
path: tmmluplus-ttqav2/dev-*
- config_name: tmmluplus-tve_chinese_language
data_files:
- split: test
path: tmmluplus-tve_chinese_language/test-*
- split: dev
path: tmmluplus-tve_chinese_language/dev-*
- config_name: tmmluplus-tve_design
data_files:
- split: test
path: tmmluplus-tve_design/test-*
- split: dev
path: tmmluplus-tve_design/dev-*
- config_name: tmmluplus-tve_mathematics
data_files:
- split: test
path: tmmluplus-tve_mathematics/test-*
- split: dev
path: tmmluplus-tve_mathematics/dev-*
- config_name: tmmluplus-tve_natural_sciences
data_files:
- split: test
path: tmmluplus-tve_natural_sciences/test-*
- split: dev
path: tmmluplus-tve_natural_sciences/dev-*
- config_name: tmmluplus-veterinary_pathology
data_files:
- split: test
path: tmmluplus-veterinary_pathology/test-*
- split: dev
path: tmmluplus-veterinary_pathology/dev-*
- config_name: tmmluplus-veterinary_pharmacology
data_files:
- split: test
path: tmmluplus-veterinary_pharmacology/test-*
- split: dev
path: tmmluplus-veterinary_pharmacology/dev-*
---
# TCEval v2
TCEval-v2 is a Traditional Chinese evaluation suite for foundation models, derived from TCEval-v1. It covers 5 capabilities, including contextual QA, knowledge, classification, and table understanding.
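As a starting point, the card's metadata maps every config and split to a file pattern of the form `<config>/<split>-*` (for example `drcd/test-*`). The sketch below mirrors that convention and wraps `datasets.load_dataset`; the helper names `data_glob` and `load_split` are hypothetical, and loading assumes the `datasets` package is installed and the Hub is reachable.

```python
from typing import Any

# Repository id as shown on this dataset card.
REPO_ID = "MediaTek-Research/TCEval-v2"


def data_glob(config: str, split: str) -> str:
    """Return the file pattern the card's `configs` section assigns to a split.

    Hypothetical helper: it only reproduces the `<config>/<split>-*` naming
    visible in the metadata and does not touch the network.
    """
    return f"{config}/{split}-*"


def load_split(config: str, split: str = "test") -> Any:
    """Load one split of one config from the Hub (hypothetical wrapper).

    `datasets` is imported lazily so that `data_glob` stays usable offline.
    """
    from datasets import load_dataset

    return load_dataset(REPO_ID, config, split=split)


# Example (requires network access):
#   drcd_test = load_split("drcd")
#   few_shot = load_split("tmmluplus-accounting", split="dev")
```

Most configs ship a large `test` split plus a 5-example `dev` split, while the `mt_bench_tw-*` configs only list `test`, so a `dev` request for those is expected to fail.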
## Benchmark
- **Contextual QA**
  - **drcd**: DRCD is a Traditional Chinese machine reading comprehension dataset containing 10,014 paragraphs from 2,108 Wikipedia articles and over 30,000 questions.
- **Knowledge**
  - **tmmluplus** (provided by MediaTek Research and iKala): Taiwan Massive Multitask Language Understanding + (TMMLU+) is curated from examinations in Taiwan and consists of 67 subjects spanning multiple disciplines, from vocational to academic fields, and covering elementary to professional proficiency levels. It is designed to identify a model’s knowledge and problem-solving blind spots in a way similar to human evaluations. Subjects are grouped into STEM, humanities, social sciences, and other (similar to MMLU) to give a higher-level overview of model capabilities.
- **Table Understanding**
  - **penguin_table** (translated from a subset of [BIG-Bench](https://github.com/google/BIG-bench/tree/main/bigbench/benchmark_tasks/penguins_in_a_table)): The “penguins in a table” task in BIG-bench asks a language model to answer questions about the animals listed in one or more tables described in the context.
- **Chat and instruction following**
  - **mt_bench_tw** (translated from [MT Bench](https://huggingface.co/spaces/lmsys/mt-bench)): MT-Bench-TW is a Traditional Chinese version of MT-Bench, a series of open-ended questions that evaluate a chatbot’s multi-turn conversational and instruction-following ability. MT-Bench-TW inherits the categorization of MT-Bench, which covers a wide variety of core capabilities, such as reasoning and writing.
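The `tmmluplus-*` configs expose the fields `question`, `A`, `B`, `C`, `D`, and `answer`, so a multiple-choice evaluation can be sketched as below. This is a minimal illustration, not the authors' evaluation harness: the prompt template is hypothetical, and the code assumes `answer` holds a single option letter, which the card does not state explicitly.

```python
from typing import Dict, List

# Option columns listed in the tmmluplus feature schema.
CHOICES = ("A", "B", "C", "D")


def format_question(record: Dict[str, str]) -> str:
    """Render one record as a multiple-choice prompt (hypothetical template)."""
    lines = [record["question"]]
    lines += [f"({key}) {record[key]}" for key in CHOICES]
    lines.append("Answer:")
    return "\n".join(lines)


def accuracy(predictions: List[str], records: List[Dict[str, str]]) -> float:
    """Fraction of predictions matching `answer`, ignoring case and whitespace.

    Assumes `answer` is an option letter such as "B"; returns 0.0 for an
    empty evaluation set.
    """
    if len(predictions) != len(records):
        raise ValueError("predictions and records must have the same length")
    if not records:
        return 0.0
    correct = sum(
        pred.strip().upper() == rec["answer"].strip().upper()
        for pred, rec in zip(predictions, records)
    )
    return correct / len(records)
```

The 5-example `dev` split of each subject can be rendered with the same `format_question` helper and prepended as few-shot context. The `penguin_table` config adds an `E` option, so `CHOICES` would need to be extended for it.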
If you find the dataset useful in your work, please cite:
```
@misc{hsu2023advancing,
title={Advancing the Evaluation of Traditional Chinese Language Models: Towards a Comprehensive Benchmark Suite},
author={Chan-Jan Hsu and Chang-Le Liu and Feng-Ting Liao and Po-Chun Hsu and Yi-Chang Chen and Da-shan Shiu},
year={2023},
eprint={2309.08448},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
Provider:
MediaTek-Research
Summary of original information
Dataset overview
Dataset configuration list
1. drcd
   - Features:
     - id: string
     - paragraph: string
     - question: string
     - references: list[string]
   - Splits:
     - test: 4899369 bytes, 3493 examples
     - dev: 5845 bytes, 5 examples
   - Download size: 1168539 bytes
   - Dataset size: 4905214 bytes
2. mt_bench_tw-coding
   - Features:
     - id: string
     - turns: list[string]
     - reference: list[string]
     - category: string
   - Splits:
     - test: 11252 bytes, 10 examples
   - Download size: 10860 bytes
   - Dataset size: 11252 bytes
3. mt_bench_tw-extraction
   - Features:
     - id: string
     - turns: list[string]
     - reference: list[string]
     - category: string
   - Splits:
     - test: 10882 bytes, 10 examples
   - Download size: 17098 bytes
   - Dataset size: 10882 bytes
4. mt_bench_tw-humanities
   - Features:
     - id: string
     - turns: list[string]
     - reference: list[string]
     - category: string
   - Splits:
     - test: 2996 bytes, 10 examples
   - Download size: 5049 bytes
   - Dataset size: 2996 bytes
5. mt_bench_tw-math
   - Features:
     - id: string
     - turns: list[string]
     - reference: list[string]
     - category: string
   - Splits:
     - test: 3041 bytes, 10 examples
   - Download size: 5054 bytes
   - Dataset size: 3041 bytes
6. mt_bench_tw-reasoning
   - Features:
     - id: string
     - turns: list[string]
     - reference: list[string]
     - category: string
   - Splits:
     - test: 4492 bytes, 10 examples
   - Download size: 8402 bytes
   - Dataset size: 4492 bytes
7. mt_bench_tw-roleplay
   - Features:
     - id: string
     - turns: list[string]
     - reference: list[string]
     - category: string
   - Splits:
     - test: 4134 bytes, 10 examples
   - Download size: 6634 bytes
   - Dataset size: 4134 bytes
8. mt_bench_tw-stem
   - Features:
     - id: string
     - turns: list[string]
     - reference: list[string]
     - category: string
   - Splits:
     - test: 3103 bytes, 10 examples
   - Download size: 5430 bytes
   - Dataset size: 3103 bytes
9. mt_bench_tw-writing
   - Features:
     - id: string
     - turns: list[string]
     - reference: list[string]
     - category: string
   - Splits:
     - test: 3469 bytes, 10 examples
   - Download size: 6701 bytes
   - Dataset size: 3469 bytes
10. penguin_table
    - Features:
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - E: string
      - answer: string
      - id: string
    - Splits:
      - dev: 2588 bytes, 5 examples
      - test: 74241 bytes, 144 examples
    - Download size: 21218 bytes
    - Dataset size: 76829 bytes
11. tmmluplus-accounting
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 134876 bytes, 191 examples
      - dev: 3764 bytes, 5 examples
    - Download size: 87921 bytes
    - Dataset size: 138640 bytes
12. tmmluplus-administrative_law
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 169553 bytes, 420 examples
      - dev: 2567 bytes, 5 examples
    - Download size: 107897 bytes
    - Dataset size: 172120 bytes
13. tmmluplus-advance_chemistry
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 33891 bytes, 123 examples
      - dev: 1581 bytes, 5 examples
    - Download size: 34210 bytes
    - Dataset size: 35472 bytes
14. tmmluplus-agriculture
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 46502 bytes, 151 examples
      - dev: 1715 bytes, 5 examples
    - Download size: 40849 bytes
    - Dataset size: 48217 bytes
15. tmmluplus-anti_money_laundering
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 54293 bytes, 134 examples
      - dev: 2552 bytes, 5 examples
    - Download size: 47614 bytes
    - Dataset size: 56845 bytes
16. tmmluplus-auditing
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 272426 bytes, 550 examples
      - dev: 1947 bytes, 5 examples
    - Download size: 147664 bytes
    - Dataset size: 274373 bytes
17. tmmluplus-basic_medical_science
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 312503 bytes, 954 examples
      - dev: 1599 bytes, 5 examples
    - Download size: 194337 bytes
    - Dataset size: 314102 bytes
18. tmmluplus-business_management
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 45074 bytes, 139 examples
      - dev: 1403 bytes, 5 examples
    - Download size: 39338 bytes
    - Dataset size: 46477 bytes
19. tmmluplus-chinese_language_and_literature
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 127469 bytes, 199 examples
      - dev: 2054 bytes, 5 examples
    - Download size: 103909 bytes
    - Dataset size: 129523 bytes
20. tmmluplus-clinical_psychology
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 55748 bytes, 125 examples
      - dev: 2029 bytes, 5 examples
    - Download size: 51770 bytes
    - Dataset size: 57777 bytes
21. tmmluplus-computer_science
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 57883 bytes, 174 examples
      - dev: 1894 bytes, 5 examples
    - Download size: 49090 bytes
    - Dataset size: 59777 bytes
22. tmmluplus-culinary_skills
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 94564 bytes, 292 examples
      - dev: 1540 bytes, 5 examples
    - Download size: 69998 bytes
    - Dataset size: 96104 bytes
23. tmmluplus-dentistry
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 152113 bytes, 399 examples
      - dev: 1684 bytes, 5 examples
    - Download size: 105595 bytes
    - Dataset size: 153797 bytes
24. tmmluplus-economics
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 145972 bytes, 393 examples
      - dev: 1946 bytes, 5 examples
    - Download size: 91284 bytes
    - Dataset size: 147918 bytes
25. tmmluplus-education
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 44729 bytes, 124 examples
      - dev: 1760 bytes, 5 examples
    - Download size: 41837 bytes
    - Dataset size: 46489 bytes
26. tmmluplus-education_(profession_level)
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 208632 bytes, 486 examples
      - dev: 3183 bytes, 5 examples
    - Download size: 136861 bytes
    - Dataset size: 211815 bytes
27. tmmluplus-educational_psychology
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 71860 bytes, 176 examples
      - dev: 2314 bytes, 5 examples
    - Download size: 56964 bytes
    - Dataset size: 74174 bytes
28. tmmluplus-engineering_math
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 35214 bytes, 103 examples
      - dev: 1954 bytes, 5 examples
    - Download size: 33378 bytes
    - Dataset size: 37168 bytes
29. tmmluplus-finance_banking
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 59005 bytes, 135 examples
      - dev: 2232 bytes, 5 examples
    - Download size: 47576 bytes
    - Dataset size: 61237 bytes
30. tmmluplus-financial_analysis
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 128903 bytes, 382 examples
      - dev: 1537 bytes, 5 examples
    - Download size: 68492 bytes
    - Dataset size: 130440 bytes
31. tmmluplus-fire_science
    - Features:
      - id: string
      - question: string
      - A: string
      - B: string
      - C: string
      - D: string
      - answer: string
      - category: string
      - subcategory: string
      - subject: string
    - Splits:
      - test: 37661 bytes, 124 examples
      - dev: 1690 bytes, 5 examples
    - Download size: 33612 bytes
    - Dataset size:



