---
language:
- tr
license: apache-2.0
task_categories:
- text-classification
- multiple-choice
- question-answering
task_ids:
- multiple-choice-qa
- open-domain-qa
- closed-domain-qa
tags:
- multi-task
- multitask
- mmlu
- hendrycks_test
dataset_info:
- config_name: abstract_algebra
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1213
num_examples: 4
- name: test
num_bytes: 30380
num_examples: 99
- name: validation
num_bytes: 2990
num_examples: 10
- config_name: anatomy
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1023
num_examples: 4
- name: test
num_bytes: 44968
num_examples: 134
- name: validation
num_bytes: 4074
num_examples: 13
- config_name: astronomy
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 2775
num_examples: 4
- name: test
num_bytes: 72243
num_examples: 151
- name: validation
num_bytes: 6884
num_examples: 15
- config_name: business_ethics
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 2754
num_examples: 4
- name: test
num_bytes: 47509
num_examples: 99
- name: validation
num_bytes: 4131
num_examples: 10
- config_name: clinical_knowledge
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1615
num_examples: 4
- name: test
num_bytes: 92165
num_examples: 264
- name: validation
num_bytes: 9846
num_examples: 28
- config_name: college_biology
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1514
num_examples: 4
- name: test
num_bytes: 70502
num_examples: 143
- name: validation
num_bytes: 7086
num_examples: 15
- config_name: college_chemistry
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1350
num_examples: 4
- name: test
num_bytes: 35099
num_examples: 99
- name: validation
num_bytes: 2807
num_examples: 7
- config_name: college_computer_science
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 3582
num_examples: 4
- name: test
num_bytes: 64366
num_examples: 99
- name: validation
num_bytes: 6475
num_examples: 10
- config_name: college_mathematics
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1696
num_examples: 4
- name: test
num_bytes: 35750
num_examples: 99
- name: validation
num_bytes: 3410
num_examples: 10
- config_name: college_medicine
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 2090
num_examples: 4
- name: test
num_bytes: 119254
num_examples: 172
- name: validation
num_bytes: 10820
num_examples: 21
- config_name: college_physics
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1504
num_examples: 4
- name: test
num_bytes: 41574
num_examples: 101
- name: validation
num_bytes: 4353
num_examples: 10
- config_name: computer_security
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1681
num_examples: 4
- name: test
num_bytes: 43455
num_examples: 99
- name: validation
num_bytes: 6697
num_examples: 10
- config_name: conceptual_physics
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1210
num_examples: 4
- name: test
num_bytes: 63735
num_examples: 234
- name: validation
num_bytes: 6752
num_examples: 25
- config_name: econometrics
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1997
num_examples: 4
- name: test
num_bytes: 65356
num_examples: 113
- name: validation
num_bytes: 6793
num_examples: 11
- config_name: electrical_engineering
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1207
num_examples: 4
- name: test
num_bytes: 38344
num_examples: 144
- name: validation
num_bytes: 4115
num_examples: 15
- config_name: elementary_mathematics
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1728
num_examples: 4
- name: test
num_bytes: 100660
num_examples: 377
- name: validation
num_bytes: 12903
num_examples: 40
- config_name: formal_logic
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 2365
num_examples: 4
- name: test
num_bytes: 73028
num_examples: 125
- name: validation
num_bytes: 8768
num_examples: 13
- config_name: global_facts
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1118
num_examples: 4
- name: test
num_bytes: 29486
num_examples: 99
- name: validation
num_bytes: 2736
num_examples: 9
- config_name: high_school_biology
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 2162
num_examples: 4
- name: test
num_bytes: 156715
num_examples: 309
- name: validation
num_bytes: 14527
num_examples: 31
- config_name: high_school_chemistry
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1656
num_examples: 4
- name: test
num_bytes: 82374
num_examples: 202
- name: validation
num_bytes: 9753
num_examples: 21
- config_name: high_school_computer_science
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 3770
num_examples: 4
- name: test
num_bytes: 67680
num_examples: 99
- name: validation
num_bytes: 4744
num_examples: 8
- config_name: high_school_european_history
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 13380
num_examples: 4
- name: test
num_bytes: 379904
num_examples: 164
- name: validation
num_bytes: 38640
num_examples: 17
- config_name: high_school_geography
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1903
num_examples: 4
- name: test
num_bytes: 64542
num_examples: 197
- name: validation
num_bytes: 6151
num_examples: 21
- config_name: high_school_government_and_politics
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1931
num_examples: 4
- name: test
num_bytes: 98507
num_examples: 192
- name: validation
num_bytes: 9710
num_examples: 20
- config_name: high_school_macroeconomics
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1568
num_examples: 4
- name: test
num_bytes: 175522
num_examples: 389
- name: validation
num_bytes: 18938
num_examples: 42
- config_name: high_school_mathematics
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1183
num_examples: 4
- name: test
num_bytes: 76921
num_examples: 269
- name: validation
num_bytes: 7961
num_examples: 28
- config_name: high_school_microeconomics
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1197
num_examples: 4
- name: test
num_bytes: 110403
num_examples: 237
- name: validation
num_bytes: 10736
num_examples: 25
- config_name: high_school_physics
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1789
num_examples: 4
- name: test
num_bytes: 84860
num_examples: 150
- name: validation
num_bytes: 8807
num_examples: 16
- config_name: high_school_psychology
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 2191
num_examples: 4
- name: test
num_bytes: 237454
num_examples: 544
- name: validation
num_bytes: 25261
num_examples: 59
- config_name: high_school_statistics
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 2829
num_examples: 4
- name: test
num_bytes: 160308
num_examples: 215
- name: validation
num_bytes: 14465
num_examples: 22
- config_name: high_school_us_history
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 11136
num_examples: 4
- name: test
num_bytes: 427246
num_examples: 203
- name: validation
num_bytes: 44180
num_examples: 21
- config_name: high_school_world_history
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 6339
num_examples: 4
- name: test
num_bytes: 544262
num_examples: 236
- name: validation
num_bytes: 63826
num_examples: 25
- config_name: human_aging
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1301
num_examples: 4
- name: test
num_bytes: 72894
num_examples: 222
- name: validation
num_bytes: 7047
num_examples: 22
- config_name: human_sexuality
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1286
num_examples: 4
- name: test
num_bytes: 46845
num_examples: 130
- name: validation
num_bytes: 3231
num_examples: 11
- config_name: international_law
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 2841
num_examples: 4
- name: test
num_bytes: 78414
num_examples: 120
- name: validation
num_bytes: 8742
num_examples: 12
- config_name: jurisprudence
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1336
num_examples: 4
- name: test
num_bytes: 49177
num_examples: 107
- name: validation
num_bytes: 5453
num_examples: 10
- config_name: logical_fallacies
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1958
num_examples: 4
- name: test
num_bytes: 76985
num_examples: 162
- name: validation
num_bytes: 7516
num_examples: 17
- config_name: machine_learning
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 3179
num_examples: 4
- name: test
num_bytes: 54414
num_examples: 111
- name: validation
num_bytes: 4357
num_examples: 10
- config_name: management
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 870
num_examples: 4
- name: test
num_bytes: 29869
num_examples: 102
- name: validation
num_bytes: 2530
num_examples: 10
- config_name: marketing
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1977
num_examples: 4
- name: test
num_bytes: 95368
num_examples: 233
- name: validation
num_bytes: 10670
num_examples: 24
- config_name: medical_genetics
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1297
num_examples: 4
- name: test
num_bytes: 29741
num_examples: 99
- name: validation
num_bytes: 3815
num_examples: 10
- config_name: miscellaneous
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 874
num_examples: 4
- name: test
num_bytes: 223389
num_examples: 782
- name: validation
num_bytes: 21001
num_examples: 85
- config_name: moral_disputes
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1842
num_examples: 4
- name: test
num_bytes: 165916
num_examples: 345
- name: validation
num_bytes: 18415
num_examples: 37
- config_name: moral_scenarios
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 2582
num_examples: 4
- name: test
num_bytes: 614251
num_examples: 894
- name: validation
num_bytes: 68302
num_examples: 99
- config_name: nutrition
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 2212
num_examples: 4
- name: test
num_bytes: 135605
num_examples: 305
- name: validation
num_bytes: 11919
num_examples: 32
- config_name: philosophy
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 997
num_examples: 4
- name: test
num_bytes: 121539
num_examples: 310
- name: validation
num_bytes: 12763
num_examples: 33
- config_name: prehistory
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 2269
num_examples: 4
- name: test
num_bytes: 132441
num_examples: 323
- name: validation
num_bytes: 15041
num_examples: 34
- config_name: professional_accounting
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 2409
num_examples: 4
- name: test
num_bytes: 178410
num_examples: 281
- name: validation
num_bytes: 20331
num_examples: 30
- config_name: professional_law
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 7449
num_examples: 4
- name: test
num_bytes: 2730513
num_examples: 1533
- name: validation
num_bytes: 294872
num_examples: 169
- config_name: professional_medicine
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 3669
num_examples: 4
- name: test
num_bytes: 298852
num_examples: 271
- name: validation
num_bytes: 31340
num_examples: 30
- config_name: professional_psychology
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1936
num_examples: 4
- name: test
num_bytes: 337821
num_examples: 611
- name: validation
num_bytes: 43121
num_examples: 68
- config_name: public_relations
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1592
num_examples: 4
- name: test
num_bytes: 42078
num_examples: 109
- name: validation
num_bytes: 6406
num_examples: 11
- config_name: security_studies
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 5725
num_examples: 4
- name: test
num_bytes: 307394
num_examples: 244
- name: validation
num_bytes: 32839
num_examples: 26
- config_name: sociology
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 2078
num_examples: 4
- name: test
num_bytes: 100739
num_examples: 200
- name: validation
num_bytes: 10419
num_examples: 21
- config_name: us_foreign_policy
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 2098
num_examples: 4
- name: test
num_bytes: 41654
num_examples: 99
- name: validation
num_bytes: 4116
num_examples: 10
- config_name: virology
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 1305
num_examples: 4
- name: test
num_bytes: 59351
num_examples: 165
- name: validation
num_bytes: 8059
num_examples: 17
- config_name: world_religions
features:
- name: question
dtype: string
- name: choices
sequence: string
- name: answer
dtype: int64
splits:
- name: dev
num_bytes: 737
num_examples: 4
- name: test
num_bytes: 35616
num_examples: 170
- name: validation
num_bytes: 3704
num_examples: 18
configs:
- config_name: abstract_algebra
data_files:
- split: dev
path: abstract_algebra/dev-*
- split: test
path: abstract_algebra/test-*
- split: validation
path: abstract_algebra/validation-*
- config_name: anatomy
data_files:
- split: dev
path: anatomy/dev-*
- split: test
path: anatomy/test-*
- split: validation
path: anatomy/validation-*
- config_name: astronomy
data_files:
- split: dev
path: astronomy/dev-*
- split: test
path: astronomy/test-*
- split: validation
path: astronomy/validation-*
- config_name: business_ethics
data_files:
- split: dev
path: business_ethics/dev-*
- split: test
path: business_ethics/test-*
- split: validation
path: business_ethics/validation-*
- config_name: clinical_knowledge
data_files:
- split: dev
path: clinical_knowledge/dev-*
- split: test
path: clinical_knowledge/test-*
- split: validation
path: clinical_knowledge/validation-*
- config_name: college_biology
data_files:
- split: dev
path: college_biology/dev-*
- split: test
path: college_biology/test-*
- split: validation
path: college_biology/validation-*
- config_name: college_chemistry
data_files:
- split: dev
path: college_chemistry/dev-*
- split: test
path: college_chemistry/test-*
- split: validation
path: college_chemistry/validation-*
- config_name: college_computer_science
data_files:
- split: dev
path: college_computer_science/dev-*
- split: test
path: college_computer_science/test-*
- split: validation
path: college_computer_science/validation-*
- config_name: college_mathematics
data_files:
- split: dev
path: college_mathematics/dev-*
- split: test
path: college_mathematics/test-*
- split: validation
path: college_mathematics/validation-*
- config_name: college_medicine
data_files:
- split: dev
path: college_medicine/dev-*
- split: test
path: college_medicine/test-*
- split: validation
path: college_medicine/validation-*
- config_name: college_physics
data_files:
- split: dev
path: college_physics/dev-*
- split: test
path: college_physics/test-*
- split: validation
path: college_physics/validation-*
- config_name: computer_security
data_files:
- split: dev
path: computer_security/dev-*
- split: test
path: computer_security/test-*
- split: validation
path: computer_security/validation-*
- config_name: conceptual_physics
data_files:
- split: dev
path: conceptual_physics/dev-*
- split: test
path: conceptual_physics/test-*
- split: validation
path: conceptual_physics/validation-*
- config_name: econometrics
data_files:
- split: dev
path: econometrics/dev-*
- split: test
path: econometrics/test-*
- split: validation
path: econometrics/validation-*
- config_name: electrical_engineering
data_files:
- split: dev
path: electrical_engineering/dev-*
- split: test
path: electrical_engineering/test-*
- split: validation
path: electrical_engineering/validation-*
- config_name: elementary_mathematics
data_files:
- split: dev
path: elementary_mathematics/dev-*
- split: test
path: elementary_mathematics/test-*
- split: validation
path: elementary_mathematics/validation-*
- config_name: formal_logic
data_files:
- split: dev
path: formal_logic/dev-*
- split: test
path: formal_logic/test-*
- split: validation
path: formal_logic/validation-*
- config_name: global_facts
data_files:
- split: dev
path: global_facts/dev-*
- split: test
path: global_facts/test-*
- split: validation
path: global_facts/validation-*
- config_name: high_school_biology
data_files:
- split: dev
path: high_school_biology/dev-*
- split: test
path: high_school_biology/test-*
- split: validation
path: high_school_biology/validation-*
- config_name: high_school_chemistry
data_files:
- split: dev
path: high_school_chemistry/dev-*
- split: test
path: high_school_chemistry/test-*
- split: validation
path: high_school_chemistry/validation-*
- config_name: high_school_computer_science
data_files:
- split: dev
path: high_school_computer_science/dev-*
- split: test
path: high_school_computer_science/test-*
- split: validation
path: high_school_computer_science/validation-*
- config_name: high_school_european_history
data_files:
- split: dev
path: high_school_european_history/dev-*
- split: test
path: high_school_european_history/test-*
- split: validation
path: high_school_european_history/validation-*
- config_name: high_school_geography
data_files:
- split: dev
path: high_school_geography/dev-*
- split: test
path: high_school_geography/test-*
- split: validation
path: high_school_geography/validation-*
- config_name: high_school_government_and_politics
data_files:
- split: dev
path: high_school_government_and_politics/dev-*
- split: test
path: high_school_government_and_politics/test-*
- split: validation
path: high_school_government_and_politics/validation-*
- config_name: high_school_macroeconomics
data_files:
- split: dev
path: high_school_macroeconomics/dev-*
- split: test
path: high_school_macroeconomics/test-*
- split: validation
path: high_school_macroeconomics/validation-*
- config_name: high_school_mathematics
data_files:
- split: dev
path: high_school_mathematics/dev-*
- split: test
path: high_school_mathematics/test-*
- split: validation
path: high_school_mathematics/validation-*
- config_name: high_school_microeconomics
data_files:
- split: dev
path: high_school_microeconomics/dev-*
- split: test
path: high_school_microeconomics/test-*
- split: validation
path: high_school_microeconomics/validation-*
- config_name: high_school_physics
data_files:
- split: dev
path: high_school_physics/dev-*
- split: test
path: high_school_physics/test-*
- split: validation
path: high_school_physics/validation-*
- config_name: high_school_psychology
data_files:
- split: dev
path: high_school_psychology/dev-*
- split: test
path: high_school_psychology/test-*
- split: validation
path: high_school_psychology/validation-*
- config_name: high_school_statistics
data_files:
- split: dev
path: high_school_statistics/dev-*
- split: test
path: high_school_statistics/test-*
- split: validation
path: high_school_statistics/validation-*
- config_name: high_school_us_history
data_files:
- split: dev
path: high_school_us_history/dev-*
- split: test
path: high_school_us_history/test-*
- split: validation
path: high_school_us_history/validation-*
- config_name: high_school_world_history
data_files:
- split: dev
path: high_school_world_history/dev-*
- split: test
path: high_school_world_history/test-*
- split: validation
path: high_school_world_history/validation-*
- config_name: human_aging
data_files:
- split: dev
path: human_aging/dev-*
- split: test
path: human_aging/test-*
- split: validation
path: human_aging/validation-*
- config_name: human_sexuality
data_files:
- split: dev
path: human_sexuality/dev-*
- split: test
path: human_sexuality/test-*
- split: validation
path: human_sexuality/validation-*
- config_name: international_law
data_files:
- split: dev
path: international_law/dev-*
- split: test
path: international_law/test-*
- split: validation
path: international_law/validation-*
- config_name: jurisprudence
data_files:
- split: dev
path: jurisprudence/dev-*
- split: test
path: jurisprudence/test-*
- split: validation
path: jurisprudence/validation-*
- config_name: logical_fallacies
data_files:
- split: dev
path: logical_fallacies/dev-*
- split: test
path: logical_fallacies/test-*
- split: validation
path: logical_fallacies/validation-*
- config_name: machine_learning
data_files:
- split: dev
path: machine_learning/dev-*
- split: test
path: machine_learning/test-*
- split: validation
path: machine_learning/validation-*
- config_name: management
data_files:
- split: dev
path: management/dev-*
- split: test
path: management/test-*
- split: validation
path: management/validation-*
- config_name: marketing
data_files:
- split: dev
path: marketing/dev-*
- split: test
path: marketing/test-*
- split: validation
path: marketing/validation-*
- config_name: medical_genetics
data_files:
- split: dev
path: medical_genetics/dev-*
- split: test
path: medical_genetics/test-*
- split: validation
path: medical_genetics/validation-*
- config_name: miscellaneous
data_files:
- split: dev
path: miscellaneous/dev-*
- split: test
path: miscellaneous/test-*
- split: validation
path: miscellaneous/validation-*
- config_name: moral_disputes
data_files:
- split: dev
path: moral_disputes/dev-*
- split: test
path: moral_disputes/test-*
- split: validation
path: moral_disputes/validation-*
- config_name: moral_scenarios
data_files:
- split: dev
path: moral_scenarios/dev-*
- split: test
path: moral_scenarios/test-*
- split: validation
path: moral_scenarios/validation-*
- config_name: nutrition
data_files:
- split: dev
path: nutrition/dev-*
- split: test
path: nutrition/test-*
- split: validation
path: nutrition/validation-*
- config_name: philosophy
data_files:
- split: dev
path: philosophy/dev-*
- split: test
path: philosophy/test-*
- split: validation
path: philosophy/validation-*
- config_name: prehistory
data_files:
- split: dev
path: prehistory/dev-*
- split: test
path: prehistory/test-*
- split: validation
path: prehistory/validation-*
- config_name: professional_accounting
data_files:
- split: dev
path: professional_accounting/dev-*
- split: test
path: professional_accounting/test-*
- split: validation
path: professional_accounting/validation-*
- config_name: professional_law
data_files:
- split: dev
path: professional_law/dev-*
- split: test
path: professional_law/test-*
- split: validation
path: professional_law/validation-*
- config_name: professional_medicine
data_files:
- split: dev
path: professional_medicine/dev-*
- split: test
path: professional_medicine/test-*
- split: validation
path: professional_medicine/validation-*
- config_name: professional_psychology
data_files:
- split: dev
path: professional_psychology/dev-*
- split: test
path: professional_psychology/test-*
- split: validation
path: professional_psychology/validation-*
- config_name: public_relations
data_files:
- split: dev
path: public_relations/dev-*
- split: test
path: public_relations/test-*
- split: validation
path: public_relations/validation-*
- config_name: security_studies
data_files:
- split: dev
path: security_studies/dev-*
- split: test
path: security_studies/test-*
- split: validation
path: security_studies/validation-*
- config_name: sociology
data_files:
- split: dev
path: sociology/dev-*
- split: test
path: sociology/test-*
- split: validation
path: sociology/validation-*
- config_name: us_foreign_policy
data_files:
- split: dev
path: us_foreign_policy/dev-*
- split: test
path: us_foreign_policy/test-*
- split: validation
path: us_foreign_policy/validation-*
- config_name: virology
data_files:
- split: dev
path: virology/dev-*
- split: test
path: virology/test-*
- split: validation
path: virology/validation-*
- config_name: world_religions
data_files:
- split: dev
path: world_religions/dev-*
- split: test
path: world_religions/test-*
- split: validation
path: world_religions/validation-*
---
This dataset is part of a series of datasets aimed at advancing Turkish LLM development by establishing rigorous Turkish benchmarks for evaluating the performance of LLMs produced for the Turkish language.
# Dataset Card for mmlu-tr
malhajar/mmlu-tr is a Turkish translation of [`mmlu`](https://huggingface.co/datasets/tasksource/mmlu), intended specifically for use in the [`OpenLLMTurkishLeaderboard`](https://huggingface.co/spaces/malhajar/OpenLLMTurkishLeaderboard).
The source dataset is MMLU (`hendrycks_test` on Hugging Face) without the auxiliary train split. It is much lighter (7 MB vs 162 MB) and faster to load than the original implementation, which loads (and duplicates) the auxiliary train split by default for every config, making it quite heavy.
Reference to original dataset:
Measuring Massive Multitask Language Understanding - https://github.com/hendrycks/test
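As a rough illustration of how a single config might be consumed, the sketch below loads one subject and renders a record as a lettered prompt. It relies on the `question`, `choices`, and `answer` fields listed in the metadata above. Treating `answer` as a zero-based index into `choices` is an assumption carried over from the usual MMLU convention, since this card only declares it as `int64`. The helper names `format_example`, `answer_letter`, and `load_config` are hypothetical and not part of any library.

```python
import string
from typing import Any, Dict


def format_example(record: Dict[str, Any]) -> str:
    """Render one record as a question followed by lettered choices."""
    lines = [record["question"]]
    for letter, choice in zip(string.ascii_uppercase, record["choices"]):
        lines.append(f"{letter}. {choice}")
    return "\n".join(lines)


def answer_letter(record: Dict[str, Any]) -> str:
    """Map the integer answer to a letter.

    Assumption: `answer` is a zero-based index into `choices`.
    """
    return string.ascii_uppercase[record["answer"]]


def load_config(config_name: str = "abstract_algebra", split: str = "test"):
    """Load one subject config of malhajar/mmlu-tr from the Hugging Face Hub."""
    # Imported lazily so the helpers above work without the `datasets` package.
    from datasets import load_dataset

    return load_dataset("malhajar/mmlu-tr", config_name, split=split)
```

With network access and the `datasets` package installed, `ds = load_config()` followed by `format_example(ds[0])` should produce a prompt for the first test question. The `dev` split of the same config could serve as few-shot examples, though how the leaderboard actually builds its prompts is not described here.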
## Dataset Description
- **Paper:** [Measuring Massive Multitask Language Understanding](https://arxiv.org/abs/2009.03300)
- **Leaderboard:** [OpenLLMTurkishLeaderboard](https://huggingface.co/spaces/malhajar/OpenLLMTurkishLeaderboard)
### Supported Tasks and Leaderboards
This dataset is intended specifically for use in the [`OpenLLMTurkishLeaderboard`](https://huggingface.co/spaces/malhajar/OpenLLMTurkishLeaderboard).
### Languages
The text in the dataset is in Turkish.
### Contributions
This dataset was translated by [`Mohamad Alhajar`](https://www.linkedin.com/in/muhammet-alhajar/).
```
@article{hendryckstest2021,
title={Measuring Massive Multitask Language Understanding},
author={Dan Hendrycks and Collin Burns and Steven Basart and Andy Zou and Mantas Mazeika and Dawn Song and Jacob Steinhardt},
journal={Proceedings of the International Conference on Learning Representations (ICLR)},
year={2021}
}
```