XiaHan19/cmmlu
CMMLU Dataset Overview
Basic Information
- License: cc-by-nc-4.0
- Task categories:
  - multiple-choice
  - question-answering
- Language: Chinese
- Tags:
  - Chinese
  - LLM
  - Evaluation
- Name: CMMLU
- Size: 10K<n<100K
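Introduction
CMMLU is a comprehensive Chinese evaluation suite designed specifically to assess the advanced knowledge and reasoning abilities of large language models (LLMs) in the context of Chinese language and culture. It covers 67 topics ranging from elementary to advanced professional level, including physics and mathematics, which require computational expertise, as well as the humanities and social sciences. Many of the tasks are not easily translatable from other languages because of their specific contextual nuances and wording. In addition, many answers in CMMLU are specific to China and may not be applicable or considered correct in other regions or languages.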
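Data Structure
CMMLU provides a development set and a test set for each topic. Each development set contains 5 questions, and each test set contains more than 100 questions. Every question is multiple-choice with 4 options, exactly one of which is correct.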
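The record layout described above (one question, four options, one correct answer) can be sketched as a prompt formatter. This is a minimal illustration, not the authors' evaluation code: the field names `Question`, `A` to `D`, and `Answer` are assumptions, as is the idea of using the 5 development questions as few-shot examples, so check the actual column names on a loaded record before relying on it.

```python
from typing import Mapping, Sequence

# Option labels; the section states each question has exactly 4 options.
CHOICES = ("A", "B", "C", "D")


def format_question(record: Mapping[str, str], include_answer: bool = False) -> str:
    """Render one multiple-choice record as text.

    The keys "Question", "A".."D", and "Answer" are assumed field names.
    """
    lines = [record["Question"]]
    for label in CHOICES:
        lines.append(f"{label}. {record[label]}")
    answer = record["Answer"] if include_answer else ""
    # rstrip() drops the trailing space when the answer is left blank.
    lines.append(f"Answer: {answer}".rstrip())
    return "\n".join(lines)


def build_few_shot_prompt(dev_records: Sequence[Mapping[str, str]],
                          test_record: Mapping[str, str]) -> str:
    """Prepend answered development questions to one unanswered test question."""
    shots = [format_question(r, include_answer=True) for r in dev_records]
    shots.append(format_question(test_record))
    return "\n\n".join(shots)


# Hypothetical record used only for illustration; it is not taken from CMMLU.
example = {"Question": "1 + 1 = ?", "A": "1", "B": "2", "C": "3", "D": "4", "Answer": "B"}
print(build_few_shot_prompt([example], example))
```

Keeping the formatter independent of the `datasets` library means it can be exercised on plain dictionaries before any data is downloaded.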
Data Loading
The dataset can be loaded with the datasets library, for example:
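```python
from datasets import load_dataset

# Load a single subject; the second argument is the subject name.
cmmlu = load_dataset(r"haonan-li/cmmlu", "agronomy")
# Print the first question of the test split.
print(cmmlu["test"][0])
```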
All subjects can also be loaded at once:
```python
from datasets import load_dataset

task_list = [
    "agronomy", "anatomy", "ancient_chinese", "arts", "astronomy",
    "business_ethics", "chinese_civil_service_exam", "chinese_driving_rule",
    "chinese_food_culture", "chinese_foreign_policy", "chinese_history",
    "chinese_literature", "chinese_teacher_qualification", "clinical_knowledge",
    "college_actuarial_science", "college_education",
    "college_engineering_hydrology", "college_law", "college_mathematics",
    "college_medical_statistics", "college_medicine", "computer_science",
    "computer_security", "conceptual_physics", "construction_project_management",
    "economics", "education", "electrical_engineering", "elementary_chinese",
    "elementary_commonsense", "elementary_information_and_technology",
    "elementary_mathematics", "ethnology", "food_science", "genetics",
    "global_facts", "high_school_biology", "high_school_chemistry",
    "high_school_geography", "high_school_mathematics", "high_school_physics",
    "high_school_politics", "human_sexuality", "international_law", "journalism",
    "jurisprudence", "legal_and_moral_basis", "logical", "machine_learning",
    "management", "marketing", "marxist_theory", "modern_chinese", "nutrition",
    "philosophy", "professional_accounting", "professional_law",
    "professional_medicine", "professional_psychology", "public_relations",
    "security_study", "sociology", "sports_science",
    "traditional_chinese_medicine", "virology", "world_history", "world_religions",
]

# Map each subject name to its loaded dataset.
cmmlu = {k: load_dataset(r"haonan-li/cmmlu", k) for k in task_list}
```
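Once every subject is loaded, results are usually summarized per subject and then averaged. The sketch below shows one common way to do that with plain Python; the section does not prescribe a metric, so the unweighted (macro) average and the case-insensitive matching are assumptions, and the toy predictions are hypothetical rather than real model output.

```python
from typing import Dict, Sequence


def accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Fraction of predicted option letters that match the gold letters."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        return 0.0
    correct = sum(
        p.strip().upper() == a.strip().upper()
        for p, a in zip(predictions, answers)
    )
    return correct / len(answers)


def macro_average(per_subject: Dict[str, float]) -> float:
    """Unweighted mean over subjects, so small subjects count as much as large ones."""
    if not per_subject:
        return 0.0
    return sum(per_subject.values()) / len(per_subject)


# Hypothetical gold answers and predictions for two subjects (not real CMMLU data).
gold = {"agronomy": ["A", "B", "C", "D"], "anatomy": ["B", "B"]}
pred = {"agronomy": ["A", "B", "D", "D"], "anatomy": ["b", "C"]}

scores = {subject: accuracy(pred[subject], gold[subject]) for subject in gold}
print(scores)
print(macro_average(scores))
```

A micro average (total correct over total questions) would weight larger subjects more heavily; which of the two to report is a choice to state alongside the results.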
Citation
@misc{li2023cmmlu,
  title={CMMLU: Measuring massive multitask language understanding in Chinese},
  author={Haonan Li and Yixuan Zhang and Fajri Koto and Yifei Yang and Hai Zhao and Yeyun Gong and Nan Duan and Timothy Baldwin},
  year={2023},
  eprint={2306.09212},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
License
The CMMLU dataset is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.




