lukaemon/mmlu
Source: Hugging Face. Updated 2024-03-04; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/lukaemon/mmlu
Resource description:
---
dataset_info:
- config_name: abstract_algebra
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 18616
num_examples: 100
- name: validation
num_bytes: 1935
num_examples: 11
- name: train
num_bytes: 783
num_examples: 5
download_size: 166184960
dataset_size: 21334
- config_name: anatomy
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 32164
num_examples: 135
- name: validation
num_bytes: 3030
num_examples: 14
- name: train
num_bytes: 920
num_examples: 5
download_size: 166184960
dataset_size: 36114
- config_name: astronomy
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 45695
num_examples: 152
- name: validation
num_bytes: 4903
num_examples: 16
- name: train
num_bytes: 2029
num_examples: 5
download_size: 166184960
dataset_size: 52627
- config_name: business_ethics
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 32540
num_examples: 100
- name: validation
num_bytes: 2949
num_examples: 11
- name: train
num_bytes: 2143
num_examples: 5
download_size: 166184960
dataset_size: 37632
- config_name: clinical_knowledge
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 60887
num_examples: 265
- name: validation
num_bytes: 6449
num_examples: 29
- name: train
num_bytes: 1163
num_examples: 5
download_size: 166184960
dataset_size: 68499
- config_name: college_biology
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 47777
num_examples: 144
- name: validation
num_bytes: 4695
num_examples: 16
- name: train
num_bytes: 1485
num_examples: 5
download_size: 166184960
dataset_size: 53957
- config_name: college_chemistry
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 23996
num_examples: 100
- name: validation
num_bytes: 2260
num_examples: 8
- name: train
num_bytes: 1284
num_examples: 5
download_size: 166184960
dataset_size: 27540
- config_name: college_computer_science
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 41927
num_examples: 100
- name: validation
num_bytes: 4574
num_examples: 11
- name: train
num_bytes: 2718
num_examples: 5
download_size: 166184960
dataset_size: 49219
- config_name: college_mathematics
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 23996
num_examples: 100
- name: validation
num_bytes: 2579
num_examples: 11
- name: train
num_bytes: 1446
num_examples: 5
download_size: 166184960
dataset_size: 28021
- config_name: college_medicine
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 81174
num_examples: 173
- name: validation
num_bytes: 7743
num_examples: 22
- name: train
num_bytes: 1623
num_examples: 5
download_size: 166184960
dataset_size: 90540
- config_name: college_physics
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 29454
num_examples: 102
- name: validation
num_bytes: 3401
num_examples: 11
- name: train
num_bytes: 1365
num_examples: 5
download_size: 166184960
dataset_size: 34220
- config_name: computer_security
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 26412
num_examples: 100
- name: validation
num_bytes: 4460
num_examples: 11
- name: train
num_bytes: 1054
num_examples: 5
download_size: 166184960
dataset_size: 31926
- config_name: conceptual_physics
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 39052
num_examples: 235
- name: validation
num_bytes: 4279
num_examples: 26
- name: train
num_bytes: 887
num_examples: 5
download_size: 166184960
dataset_size: 44218
- config_name: econometrics
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 45737
num_examples: 114
- name: validation
num_bytes: 4871
num_examples: 12
- name: train
num_bytes: 1597
num_examples: 5
download_size: 166184960
dataset_size: 52205
- config_name: electrical_engineering
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 24111
num_examples: 145
- name: validation
num_bytes: 2778
num_examples: 16
- name: train
num_bytes: 925
num_examples: 5
download_size: 166184960
dataset_size: 27814
- config_name: elementary_mathematics
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 67450
num_examples: 378
- name: validation
num_bytes: 8689
num_examples: 41
- name: train
num_bytes: 1393
num_examples: 5
download_size: 166184960
dataset_size: 77532
- config_name: formal_logic
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 48891
num_examples: 126
- name: validation
num_bytes: 6142
num_examples: 14
- name: train
num_bytes: 1710
num_examples: 5
download_size: 166184960
dataset_size: 56743
- config_name: global_facts
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 17691
num_examples: 100
- name: validation
num_bytes: 1783
num_examples: 10
- name: train
num_bytes: 1182
num_examples: 5
download_size: 166184960
dataset_size: 20656
- config_name: high_school_biology
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 107550
num_examples: 310
- name: validation
num_bytes: 10786
num_examples: 32
- name: train
num_bytes: 1626
num_examples: 5
download_size: 166184960
dataset_size: 119962
- config_name: high_school_chemistry
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 57031
num_examples: 203
- name: validation
num_bytes: 6926
num_examples: 22
- name: train
num_bytes: 1173
num_examples: 5
download_size: 166184960
dataset_size: 65130
- config_name: high_school_computer_science
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 43764
num_examples: 100
- name: validation
num_bytes: 3268
num_examples: 9
- name: train
num_bytes: 2871
num_examples: 5
download_size: 166184960
dataset_size: 49903
- config_name: high_school_european_history
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 269133
num_examples: 165
- name: validation
num_bytes: 29494
num_examples: 18
- name: train
num_bytes: 11517
num_examples: 5
download_size: 166184960
dataset_size: 310144
- config_name: high_school_geography
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 40636
num_examples: 198
- name: validation
num_bytes: 4166
num_examples: 22
- name: train
num_bytes: 1356
num_examples: 5
download_size: 166184960
dataset_size: 46158
- config_name: high_school_government_and_politics
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 64711
num_examples: 193
- name: validation
num_bytes: 6904
num_examples: 21
- name: train
num_bytes: 1732
num_examples: 5
download_size: 166184960
dataset_size: 73347
- config_name: high_school_macroeconomics
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 114945
num_examples: 390
- name: validation
num_bytes: 12707
num_examples: 43
- name: train
num_bytes: 1281
num_examples: 5
download_size: 166184960
dataset_size: 128933
- config_name: high_school_mathematics
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 52952
num_examples: 270
- name: validation
num_bytes: 5550
num_examples: 29
- name: train
num_bytes: 1250
num_examples: 5
download_size: 166184960
dataset_size: 59752
- config_name: high_school_microeconomics
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 74025
num_examples: 238
- name: validation
num_bytes: 7359
num_examples: 26
- name: train
num_bytes: 1251
num_examples: 5
download_size: 166184960
dataset_size: 82635
- config_name: high_school_physics
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 58469
num_examples: 151
- name: validation
num_bytes: 6640
num_examples: 17
- name: train
num_bytes: 1442
num_examples: 5
download_size: 166184960
dataset_size: 66551
- config_name: high_school_psychology
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 155580
num_examples: 545
- name: validation
num_bytes: 16837
num_examples: 60
- name: train
num_bytes: 1858
num_examples: 5
download_size: 166184960
dataset_size: 174275
- config_name: high_school_statistics
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 109178
num_examples: 216
- name: validation
num_bytes: 9824
num_examples: 23
- name: train
num_bytes: 2481
num_examples: 5
download_size: 166184960
dataset_size: 121483
- config_name: high_school_us_history
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 295294
num_examples: 204
- name: validation
num_bytes: 31540
num_examples: 22
- name: train
num_bytes: 8817
num_examples: 5
download_size: 166184960
dataset_size: 335651
- config_name: high_school_world_history
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 376946
num_examples: 237
- name: validation
num_bytes: 45307
num_examples: 26
- name: train
num_bytes: 4835
num_examples: 5
download_size: 166184960
dataset_size: 427088
- config_name: human_aging
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 44525
num_examples: 223
- name: validation
num_bytes: 4534
num_examples: 23
- name: train
num_bytes: 961
num_examples: 5
download_size: 166184960
dataset_size: 50020
- config_name: human_sexuality
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 31181
num_examples: 131
- name: validation
num_bytes: 2325
num_examples: 12
- name: train
num_bytes: 1030
num_examples: 5
download_size: 166184960
dataset_size: 34536
- config_name: international_law
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 52672
num_examples: 121
- name: validation
num_bytes: 6370
num_examples: 13
- name: train
num_bytes: 2371
num_examples: 5
download_size: 166184960
dataset_size: 61413
- config_name: jurisprudence
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 33218
num_examples: 108
- name: validation
num_bytes: 3640
num_examples: 11
- name: train
num_bytes: 1256
num_examples: 5
download_size: 166184960
dataset_size: 38114
- config_name: logical_fallacies
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 48964
num_examples: 163
- name: validation
num_bytes: 4965
num_examples: 18
- name: train
num_bytes: 1526
num_examples: 5
download_size: 166184960
dataset_size: 55455
- config_name: machine_learning
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 33084
num_examples: 112
- name: validation
num_bytes: 3143
num_examples: 11
- name: train
num_bytes: 2276
num_examples: 5
download_size: 166184960
dataset_size: 38503
- config_name: management
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 19269
num_examples: 103
- name: validation
num_bytes: 1731
num_examples: 11
- name: train
num_bytes: 851
num_examples: 5
download_size: 166184960
dataset_size: 21851
- config_name: marketing
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 61375
num_examples: 234
- name: validation
num_bytes: 7207
num_examples: 25
- name: train
num_bytes: 1434
num_examples: 5
download_size: 166184960
dataset_size: 70016
- config_name: medical_genetics
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 20152
num_examples: 100
- name: validation
num_bytes: 2916
num_examples: 11
- name: train
num_bytes: 1042
num_examples: 5
download_size: 166184960
dataset_size: 24110
- config_name: miscellaneous
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 142211
num_examples: 783
- name: validation
num_bytes: 13716
num_examples: 86
- name: train
num_bytes: 652
num_examples: 5
download_size: 166184960
dataset_size: 156579
- config_name: moral_disputes
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 105384
num_examples: 346
- name: validation
num_bytes: 12142
num_examples: 38
- name: train
num_bytes: 1708
num_examples: 5
download_size: 166184960
dataset_size: 119234
- config_name: moral_scenarios
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 367749
num_examples: 895
- name: validation
num_bytes: 41626
num_examples: 100
- name: train
num_bytes: 2011
num_examples: 5
download_size: 166184960
dataset_size: 411386
- config_name: nutrition
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 90256
num_examples: 306
- name: validation
num_bytes: 8193
num_examples: 33
- name: train
num_bytes: 2038
num_examples: 5
download_size: 166184960
dataset_size: 100487
- config_name: philosophy
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 77884
num_examples: 311
- name: validation
num_bytes: 8934
num_examples: 34
- name: train
num_bytes: 941
num_examples: 5
download_size: 166184960
dataset_size: 87759
- config_name: prehistory
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 87314
num_examples: 324
- name: validation
num_bytes: 10028
num_examples: 35
- name: train
num_bytes: 1831
num_examples: 5
download_size: 166184960
dataset_size: 99173
- config_name: professional_accounting
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 122564
num_examples: 282
- name: validation
num_bytes: 14143
num_examples: 31
- name: train
num_bytes: 2101
num_examples: 5
download_size: 166184960
dataset_size: 138808
- config_name: professional_law
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 1881012
num_examples: 1534
- name: validation
num_bytes: 202317
num_examples: 170
- name: train
num_bytes: 6563
num_examples: 5
download_size: 166184960
dataset_size: 2089892
- config_name: professional_medicine
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 215645
num_examples: 272
- name: validation
num_bytes: 23618
num_examples: 31
- name: train
num_bytes: 3760
num_examples: 5
download_size: 166184960
dataset_size: 243023
- config_name: professional_psychology
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 221603
num_examples: 612
- name: validation
num_bytes: 28606
num_examples: 69
- name: train
num_bytes: 2220
num_examples: 5
download_size: 166184960
dataset_size: 252429
- config_name: public_relations
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 27978
num_examples: 110
- name: validation
num_bytes: 4470
num_examples: 12
- name: train
num_bytes: 1449
num_examples: 5
download_size: 166184960
dataset_size: 33897
- config_name: security_studies
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 203117
num_examples: 245
- name: validation
num_bytes: 22436
num_examples: 27
- name: train
num_bytes: 5288
num_examples: 5
download_size: 166184960
dataset_size: 230841
- config_name: sociology
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 64824
num_examples: 201
- name: validation
num_bytes: 7018
num_examples: 22
- name: train
num_bytes: 1566
num_examples: 5
download_size: 166184960
dataset_size: 73408
- config_name: us_foreign_policy
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 27731
num_examples: 100
- name: validation
num_bytes: 3175
num_examples: 11
- name: train
num_bytes: 1564
num_examples: 5
download_size: 166184960
dataset_size: 32470
- config_name: virology
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 37585
num_examples: 166
- name: validation
num_bytes: 5325
num_examples: 18
- name: train
num_bytes: 1049
num_examples: 5
download_size: 166184960
dataset_size: 43959
- config_name: world_religions
features:
- name: input
dtype: string
- name: A
dtype: string
- name: B
dtype: string
- name: C
dtype: string
- name: D
dtype: string
- name: target
dtype: string
splits:
- name: test
num_bytes: 24065
num_examples: 171
- name: validation
num_bytes: 2620
num_examples: 19
- name: train
num_bytes: 623
num_examples: 5
download_size: 166184960
dataset_size: 27308
---
# MMLU dataset
Measuring Massive Multitask Language Understanding: https://github.com/hendrycks/test
```python
task_list = [
    "high_school_european_history",
    "business_ethics",
    "clinical_knowledge",
    "medical_genetics",
    "high_school_us_history",
    "high_school_physics",
    "high_school_world_history",
    "virology",
    "high_school_microeconomics",
    "econometrics",
    "college_computer_science",
    "high_school_biology",
    "abstract_algebra",
    "professional_accounting",
    "philosophy",
    "professional_medicine",
    "nutrition",
    "global_facts",
    "machine_learning",
    "security_studies",
    "public_relations",
    "professional_psychology",
    "prehistory",
    "anatomy",
    "human_sexuality",
    "college_medicine",
    "high_school_government_and_politics",
    "college_chemistry",
    "logical_fallacies",
    "high_school_geography",
    "elementary_mathematics",
    "human_aging",
    "college_mathematics",
    "high_school_psychology",
    "formal_logic",
    "high_school_statistics",
    "international_law",
    "high_school_mathematics",
    "high_school_computer_science",
    "conceptual_physics",
    "miscellaneous",
    "high_school_chemistry",
    "marketing",
    "professional_law",
    "management",
    "college_physics",
    "jurisprudence",
    "world_religions",
    "sociology",
    "us_foreign_policy",
    "high_school_macroeconomics",
    "computer_security",
    "moral_scenarios",
    "moral_disputes",
    "electrical_engineering",
    "astronomy",
    "college_biology",
]
```
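Every config named in `task_list` shares the same six string fields listed in the metadata: `input`, `A`, `B`, `C`, `D`, and `target`. The sketch below shows one way a record with that shape might be turned into a multiple-choice prompt and scored. The helper names `format_question` and `is_correct`, the prompt layout, and the sample record are all hypothetical, and the code assumes `target` holds the answer letter, which the card does not state explicitly.

```python
from typing import Dict

CHOICES = ("A", "B", "C", "D")


def format_question(record: Dict[str, str]) -> str:
    """Render one record as a multiple-choice prompt ending in 'Answer:'."""
    lines = [record["input"]]
    lines += [f"{label}. {record[label]}" for label in CHOICES]
    lines.append("Answer:")
    return "\n".join(lines)


def is_correct(record: Dict[str, str], prediction: str) -> bool:
    """Compare a predicted letter with `target`, ignoring case and whitespace.

    Assumption: `target` stores the answer letter (A, B, C, or D).
    """
    return prediction.strip().upper() == record["target"].strip().upper()


# Hypothetical record, invented for illustration; it is not taken from the dataset.
example = {
    "input": "What is 2 + 2?",
    "A": "3",
    "B": "4",
    "C": "5",
    "D": "22",
    "target": "B",
}

print(format_question(example))
print(is_correct(example, "b"))
```

The same two helpers could be applied to each record of any config, since the schema does not vary between subjects.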
```
@article{hendryckstest2021,
title={Measuring Massive Multitask Language Understanding},
author={Dan Hendrycks and Collin Burns and Steven Basart and Andy Zou and Mantas Mazeika and Dawn Song and Jacob Steinhardt},
journal={Proceedings of the International Conference on Learning Representations (ICLR)},
year={2021}
}
```
Provider:
lukaemon
Summary of original information
Dataset overview
Every config below has the same six string features (input, A, B, C, D, target) and reports a download size of 166184960 bytes.
1. Abstract algebra (abstract_algebra): test 100 examples, 18616 bytes; validation 11 examples, 1935 bytes; train 5 examples, 783 bytes; dataset size 21334 bytes
2. Anatomy (anatomy): test 135 examples, 32164 bytes; validation 14 examples, 3030 bytes; train 5 examples, 920 bytes; dataset size 36114 bytes
3. Astronomy (astronomy): test 152 examples, 45695 bytes; validation 16 examples, 4903 bytes; train 5 examples, 2029 bytes; dataset size 52627 bytes
4. Business ethics (business_ethics): test 100 examples, 32540 bytes; validation 11 examples, 2949 bytes; train 5 examples, 2143 bytes; dataset size 37632 bytes
5. Clinical knowledge (clinical_knowledge): test 265 examples, 60887 bytes; validation 29 examples, 6449 bytes; train 5 examples, 1163 bytes; dataset size 68499 bytes
6. College biology (college_biology): test 144 examples, 47777 bytes; validation 16 examples, 4695 bytes; train 5 examples, 1485 bytes; dataset size 53957 bytes
7. College chemistry (college_chemistry): test 100 examples, 23996 bytes; validation 8 examples, 2260 bytes; train 5 examples, 1284 bytes; dataset size 27540 bytes
8. College computer science (college_computer_science): test 100 examples, 41927 bytes; validation 11 examples, 4574 bytes; train 5 examples, 2718 bytes; dataset size 49219 bytes
9. College mathematics (college_mathematics): test 100 examples, 23996 bytes; validation 11 examples, 2579 bytes; train 5 examples, 1446 bytes; dataset size 28021 bytes
10. College medicine (college_medicine): test 173 examples, 81174 bytes; validation 22 examples, 7743 bytes; train 5 examples, 1623 bytes; dataset size 90540 bytes
11. College physics (college_physics): test 102 examples, 29454 bytes; validation 11 examples, 3401 bytes; train 5 examples, 1365 bytes; dataset size 34220 bytes
12. Computer security (computer_security): test 100 examples, 26412 bytes; validation 11 examples, 4460 bytes; train 5 examples, 1054 bytes; dataset size 31926 bytes
13. Conceptual physics (conceptual_physics): test 235 examples, 39052 bytes; validation 26 examples, 4279 bytes; train 5 examples, 887 bytes; dataset size 44218 bytes
14. Econometrics (econometrics): test 114 examples, 45737 bytes; validation 12 examples, 4871 bytes; train 5 examples, 1597 bytes; dataset size 52205 bytes
15. Electrical engineering (electrical_engineering): test 145 examples, 24111 bytes; validation 16 examples, 2778 bytes; train 5 examples, 925 bytes; dataset size 27814 bytes
16. Elementary mathematics (elementary_mathematics): test 378 examples, 67450 bytes; validation 41 examples, 8689 bytes; train 5 examples, 1393 bytes; dataset size 77532 bytes
17. Formal logic (formal_logic): test 126 examples, 48891 bytes; validation 14 examples, 6142 bytes; train 5 examples, 1710 bytes; dataset size 56743 bytes
18. Global facts (global_facts): test 100 examples, 17691 bytes; validation 10 examples, 1783 bytes; train 5 examples, 1182 bytes; dataset size 20656 bytes
19. High school biology (high_school_biology): test 310 examples, 107550 bytes; validation 32 examples, 10786 bytes; train 5 examples, 1626 bytes; dataset size 119962 bytes
20. High school chemistry (high_school_chemistry): test 203 examples, 57031 bytes; validation 22 examples, 6926 bytes; train 5 examples, 1173 bytes; dataset size 65130 bytes
21. High school computer science (high_school_computer_science): test 100 examples, 43764 bytes; validation 9 examples, 3268 bytes; train 5 examples, 2871 bytes; dataset size 49903 bytes
22. High school European history (high_school_european_history): test 165 examples, 269133 bytes; validation 18 examples, 29494 bytes; train 5 examples, 11517 bytes; dataset size 310144 bytes
23. High school geography (high_school_geography): test 198 examples, 40636 bytes; validation 22 examples, 4166 bytes; train 5 examples, 1356 bytes; dataset size 46158 bytes
24. High school government and politics (high_school_government_and_politics): test 193 examples, 64711 bytes; validation 21 examples, 6904 bytes; train 5 examples, 1732 bytes; dataset size 73347 bytes
25. High school macroeconomics (high_school_macroeconomics): test 390 examples, 114945 bytes; validation 43 examples, 12707 bytes; train 5 examples, 1281 bytes; dataset size 128933 bytes
26. High school mathematics (high_school_mathematics): test 270 examples, 52952 bytes; validation 29 examples, 5550 bytes; train 5 examples, 1250 bytes; dataset size 59752 bytes
27. High school microeconomics (high_school_microeconomics): test 238 examples, 74025 bytes; validation 26 examples, 7359 bytes; train 5 examples, 1251 bytes; dataset size 82635 bytes
28. High school physics (high_school_physics): test 151 examples, 58469 bytes; validation 17 examples, 6640 bytes; train 5 examples, 1442 bytes; dataset size 66551 bytes
29. High school psychology (high_school_psychology): test 545 examples, 155580 bytes; validation 60 examples, 16837 bytes; train 5 examples, 1858 bytes; dataset size 174275 bytes
30. High school statistics (high_school_statistics): test 216 examples, 109178 bytes; validation 23 examples, 9824 bytes; train 5 examples, 2481 bytes; dataset size 121483 bytes
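In the figures reported for each config, the dataset size appears to equal the sum of the test, validation, and train split sizes. The card does not state this relationship, so the short check below is only a sketch; the byte counts are copied from the metadata for three configs, and the helper name `total_bytes` is hypothetical.

```python
# Split sizes in bytes, copied from the card's metadata for three configs.
SPLIT_BYTES = {
    "abstract_algebra": {"test": 18616, "validation": 1935, "train": 783},
    "anatomy": {"test": 32164, "validation": 3030, "train": 920},
    "astronomy": {"test": 45695, "validation": 4903, "train": 2029},
}

# Reported dataset_size values in bytes for the same configs.
DATASET_SIZE = {
    "abstract_algebra": 21334,
    "anatomy": 36114,
    "astronomy": 52627,
}


def total_bytes(splits: dict) -> int:
    """Sum the byte counts of all splits of one config."""
    return sum(splits.values())


for name, splits in SPLIT_BYTES.items():
    status = "matches" if total_bytes(splits) == DATASET_SIZE[name] else "differs from"
    print(f"{name}: split total {total_bytes(splits)} {status} reported size {DATASET_SIZE[name]}")
```

The download size (166184960 bytes) is the same for every config and is far larger than any single dataset size, so it presumably refers to one shared archive rather than to per-config data.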
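Collected summary
Dataset introduction

Background and challenges
Background overview
MMLU (Measuring Massive Multitask Language Understanding) is a benchmark for evaluating how well language models understand material across many subject areas. It covers 57 broad tasks, including history, science, medicine, and law, and is designed to test both the breadth of a model's knowledge and its reasoning ability. The dataset is widely used in academia and industry, several advanced models have been trained and evaluated on it, and it is an important tool for measuring language model performance.
The content above was collected and summarized by 遇见数据集.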



