swap-uniba/mmlu_ita
Source: Hugging Face. Updated: 2024-01-19. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/swap-uniba/mmlu_ita
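Resource summary:
This dataset, titled Measuring Massive Multitask Language Understanding (MMLU), is designed to evaluate a model's multitask language understanding. It comprises multiple configurations, each with question, choices, and answer features, and each divided into auxiliary training, test, validation, and dev splits. The dataset is monolingual (English), contains between 10K and 100K examples, and is intended for multiple-choice question answering.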
Provider:
swap-uniba
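Summary of original information
Dataset overview
Basic information
- Dataset name: Measuring Massive Multitask Language Understanding
- Language: English
- License: MIT
- Dataset size: 10K<n<100K
- Task type: multiple-choice question answering (multiple-choice-qa)
- Data source: original data
Dataset configurations
The dataset contains multiple configurations, each representing one subject area, as follows:
Configuration list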
- abstract_algebra
- anatomy
- astronomy
- business_ethics
- clinical_knowledge
- college_biology
- college_chemistry
- college_computer_science
- college_mathematics
- college_medicine
- college_physics
- computer_security
- conceptual_physics
- econometrics
- electrical_engineering
- elementary_mathematics
- formal_logic
- global_facts
- high_school_biology
- high_school_chemistry
- high_school_computer_science
- high_school_european_history
- high_school_geography
- high_school_government_and_politics
- high_school_macroeconomics
- high_school_mathematics
- high_school_microeconomics
- high_school_physics
- high_school_psychology
- high_school_statistics
- high_school_us_history
- high_school_world_history
- human_aging
- human_sexuality
- international_law
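A minimal sketch of loading one configuration, assuming the dataset can be fetched through the Hugging Face `datasets` library under the repository ID shown at the top of this page. The helper name `load_subject` is hypothetical, `CONFIGS` only copies the names listed above, and the call needs network access.

```python
# Configuration names copied from the list above.
CONFIGS = (
    "abstract_algebra", "anatomy", "astronomy", "business_ethics",
    "clinical_knowledge", "college_biology", "college_chemistry",
    "college_computer_science", "college_mathematics", "college_medicine",
    "college_physics", "computer_security", "conceptual_physics",
    "econometrics", "electrical_engineering", "elementary_mathematics",
    "formal_logic", "global_facts", "high_school_biology",
    "high_school_chemistry", "high_school_computer_science",
    "high_school_european_history", "high_school_geography",
    "high_school_government_and_politics", "high_school_macroeconomics",
    "high_school_mathematics", "high_school_microeconomics",
    "high_school_physics", "high_school_psychology",
    "high_school_statistics", "high_school_us_history",
    "high_school_world_history", "human_aging", "human_sexuality",
    "international_law",
)


def load_subject(config: str, split: str = "test"):
    """Load one subject configuration (hypothetical helper, needs network)."""
    if config not in CONFIGS:
        raise ValueError(f"unknown configuration: {config!r}")
    # Imported lazily so the name check works without the library installed.
    from datasets import load_dataset

    # Assumption: the repository ID matches the page title.
    return load_dataset("swap-uniba/mmlu_ita", config, split=split)
```

Rejecting unknown names before the download starts gives a clear local error in place of a remote lookup failure.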
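Data structure
Each configuration has the following features:
- question: the question, a string.
- choices: the answer options, a sequence of strings.
- answer: the answer, a class label with possible values A, B, C, D.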
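The three features above can be sketched as a typed record. This assumes the class label is stored as an integer index into A, B, C, D, as class-label features in the Hugging Face `datasets` library normally are. The names `Record`, `format_record`, and `answer_letter` are hypothetical, and the sample record is invented for illustration, not taken from the dataset.

```python
from typing import Sequence, TypedDict

LETTERS = ("A", "B", "C", "D")


class Record(TypedDict):
    question: str
    choices: Sequence[str]
    answer: int  # assumption: class-label index into LETTERS


def format_record(record: Record) -> str:
    """Render the question followed by its lettered options."""
    lines = [record["question"]]
    for letter, choice in zip(LETTERS, record["choices"]):
        lines.append(f"{letter}. {choice}")
    return "\n".join(lines)


def answer_letter(record: Record) -> str:
    """Map the stored class-label index back to its letter."""
    return LETTERS[record["answer"]]


# Invented sample record, for illustration only.
sample: Record = {
    "question": "2 + 2 = ?",
    "choices": ["3", "4", "5", "6"],
    "answer": 1,
}
print(format_record(sample))
print("Gold answer:", answer_letter(sample))  # Gold answer: B
```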
Data splits
Each configuration has the following splits:
- auxiliary_train: auxiliary training set
- test: test set
- validation: validation set
- dev: development set
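This page does not say how the splits are meant to be used. One common convention for MMLU-style benchmarks is to take few-shot demonstrations from dev and score on test. The sketch below follows that convention as an assumption. All function names are hypothetical, `predict` stands in for any model call that returns a letter, and the answer is assumed to be stored as an integer index.

```python
from typing import Callable, Mapping, Sequence

LETTERS = "ABCD"


def render(example: Mapping, with_answer: bool) -> str:
    """Render one question; show the gold letter only for demonstrations."""
    options = "\n".join(
        f"{letter}. {choice}"
        for letter, choice in zip(LETTERS, example["choices"])
    )
    answer = f" {LETTERS[example['answer']]}" if with_answer else ""
    return f"{example['question']}\n{options}\nAnswer:{answer}"


def few_shot_prompt(dev: Sequence[Mapping], target: Mapping) -> str:
    """Prefix the target question with solved dev examples."""
    shots = [render(example, with_answer=True) for example in dev]
    return "\n\n".join(shots + [render(target, with_answer=False)])


def accuracy(
    dev: Sequence[Mapping],
    test: Sequence[Mapping],
    predict: Callable[[str], str],
) -> float:
    """Fraction of test questions where predict returns the gold letter."""
    if not test:
        raise ValueError("test split is empty")
    correct = sum(
        predict(few_shot_prompt(dev, example)) == LETTERS[example["answer"]]
        for example in test
    )
    return correct / len(test)
```

Keeping `predict` as a plain callable means the same scoring loop works with any model, or with a fixed-answer baseline.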
Example
Taking the abstract_algebra configuration as an example:
- auxiliary_train: 99842 examples, 160601377 bytes
- test: 100 examples, 19328 bytes
- validation: 11 examples, 2024 bytes
- dev: 5 examples, 830 bytes
Dataset size
- Download size: 166184960 bytes
- Dataset size: 160623559 bytes
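The figures above can be cross-checked with a little arithmetic. The four split byte counts for abstract_algebra sum exactly to the reported dataset size, and the example counts sum to 99958, which falls inside the stated 10K<n<100K range. The variable names in this sketch are illustrative only.

```python
# (examples, bytes) per split of abstract_algebra, copied from the figures above.
SPLIT_STATS = {
    "auxiliary_train": (99842, 160601377),
    "test": (100, 19328),
    "validation": (11, 2024),
    "dev": (5, 830),
}
REPORTED_DATASET_BYTES = 160623559

total_examples = sum(count for count, _ in SPLIT_STATS.values())
total_bytes = sum(size for _, size in SPLIT_STATS.values())

# Share of the configuration's bytes taken up by the auxiliary training split.
aux_share = SPLIT_STATS["auxiliary_train"][1] / total_bytes

print(total_examples)                         # 99958
print(total_bytes == REPORTED_DATASET_BYTES)  # True
print(f"{aux_share:.2%}")                     # 99.99%
```

Almost all of the bytes belong to auxiliary_train. The subject-specific test, validation, and dev splits together hold 116 examples.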



