
cmmlu

ModelScope community; updated 2026-05-16; indexed 2024-05-15
Download link:
https://modelscope.cn/datasets/opencompass/cmmlu
Description:
# CMMLU: Measuring massive multitask language understanding in Chinese

- **Homepage:** [https://github.com/haonan-li/CMMLU](https://github.com/haonan-li/CMMLU)
- **Repository:** [https://huggingface.co/datasets/haonan-li/cmmlu](https://huggingface.co/datasets/haonan-li/cmmlu)
- **Paper:** [CMMLU: Measuring Chinese Massive Multitask Language Understanding](https://arxiv.org/abs/2306.09212)

## Table of Contents

- [Introduction](#introduction)
- [Leaderboard](#leaderboard)
- [Data](#data)
- [Citation](#citation)
- [License](#license)

## Introduction

CMMLU is a comprehensive Chinese evaluation suite designed to assess the advanced knowledge and reasoning abilities of LLMs within the Chinese language and cultural context.
CMMLU comprises 67 topics that span from elementary to advanced professional levels. It includes subjects that require computational expertise, such as physics and mathematics, as well as disciplines within the humanities and social sciences.
Many of these tasks are not easily translatable from other languages because of their specific contextual nuances and wording.
Furthermore, numerous tasks within CMMLU have answers that are specific to China and may not be universally applicable or considered correct in other regions or languages.

## Leaderboard

The latest leaderboard is on our [GitHub](https://github.com/haonan-li/CMMLU).

## Data

We provide a development set and a test set for each of the 67 subjects, with 5 questions in each development set and 100+ questions in each test set.

Each question in the dataset is a multiple-choice question with 4 choices, exactly one of which is the correct answer.

Here are two examples:

```
题目:同一物种的两类细胞各产生一种分泌蛋白,组成这两种蛋白质的各种氨基酸含量相同,但排列顺序不同。其原因是参与这两种蛋白质合成的:
A. tRNA种类不同
B. 同一密码子所决定的氨基酸不同
C. mRNA碱基序列不同
D. 核糖体成分不同
答案是:C
```

```
题目:某种植物病毒V是通过稻飞虱吸食水稻汁液在水稻间传播的。稻田中青蛙数量的增加可减少该病毒在水稻间的传播。下列叙述正确的是:
A. 青蛙与稻飞虱是捕食关系
B. 水稻和病毒V是互利共生关系
C. 病毒V与青蛙是寄生关系
D. 水稻与青蛙是竞争关系
答案是:
```

#### Load data

```python
from datasets import load_dataset

# Load a single subject by its configuration name.
cmmlu = load_dataset("haonan-li/cmmlu", 'agronomy')
print(cmmlu['test'][0])
```

#### Load all data at once

```python
from datasets import load_dataset

# One configuration name per subject (67 in total).
task_list = [
    'agronomy', 'anatomy', 'ancient_chinese', 'arts', 'astronomy', 'business_ethics',
    'chinese_civil_service_exam', 'chinese_driving_rule', 'chinese_food_culture',
    'chinese_foreign_policy', 'chinese_history', 'chinese_literature',
    'chinese_teacher_qualification', 'clinical_knowledge', 'college_actuarial_science',
    'college_education', 'college_engineering_hydrology', 'college_law',
    'college_mathematics', 'college_medical_statistics', 'college_medicine',
    'computer_science', 'computer_security', 'conceptual_physics',
    'construction_project_management', 'economics', 'education', 'electrical_engineering',
    'elementary_chinese', 'elementary_commonsense', 'elementary_information_and_technology',
    'elementary_mathematics', 'ethnology', 'food_science', 'genetics', 'global_facts',
    'high_school_biology', 'high_school_chemistry', 'high_school_geography',
    'high_school_mathematics', 'high_school_physics', 'high_school_politics',
    'human_sexuality', 'international_law', 'journalism', 'jurisprudence',
    'legal_and_moral_basis', 'logical', 'machine_learning', 'management', 'marketing',
    'marxist_theory', 'modern_chinese', 'nutrition', 'philosophy',
    'professional_accounting', 'professional_law', 'professional_medicine',
    'professional_psychology', 'public_relations', 'security_study', 'sociology',
    'sports_science', 'traditional_chinese_medicine', 'virology', 'world_history',
    'world_religions',
]

# Map each subject name to its loaded dataset.
cmmlu = {k: load_dataset("haonan-li/cmmlu", k) for k in task_list}
```

## Citation

```
@misc{li2023cmmlu,
      title={CMMLU: Measuring massive multitask language understanding in Chinese},
      author={Haonan Li and Yixuan Zhang and Fajri Koto and Yifei Yang and Hai Zhao and Yeyun Gong and Nan Duan and Timothy Baldwin},
      year={2023},
      eprint={2306.09212},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

## License

The CMMLU dataset is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-nc-sa/4.0/).
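
As a companion to the Data section, here is a minimal sketch of how a record could be rendered in the question layout used by the two examples, and how few-shot prompts and accuracy might be computed. It assumes each record is a mapping with `Question`, `A`, `B`, `C`, `D`, and `Answer` keys; that schema is an assumption, so check it against a loaded record first. The helper names (`format_example`, `build_prompt`, `accuracy`) and the sample records are hypothetical and are not part of the dataset or its official evaluation code.

```python
from typing import Mapping, Sequence

CHOICES = ("A", "B", "C", "D")


def format_example(record: Mapping[str, str], include_answer: bool = True) -> str:
    """Render one record in the "题目:... 答案是:" layout of the examples.

    Assumes the record has "Question", "A".."D" and "Answer" keys.
    """
    lines = [f"题目:{record['Question']}"]
    lines += [f"{choice}. {record[choice]}" for choice in CHOICES]
    # Leave the answer slot empty when the model is expected to fill it in.
    answer = record["Answer"] if include_answer else ""
    lines.append(f"答案是:{answer}")
    return "\n".join(lines)


def build_prompt(dev_records: Sequence[Mapping[str, str]],
                 test_record: Mapping[str, str]) -> str:
    """Join answered development examples with one unanswered test question."""
    parts = [format_example(r, include_answer=True) for r in dev_records]
    parts.append(format_example(test_record, include_answer=False))
    return "\n\n".join(parts)


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predicted letters that match the reference letters."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(p.strip().upper() == r.strip().upper()
                  for p, r in zip(predictions, references))
    return correct / len(references)


if __name__ == "__main__":
    # Hypothetical records for illustration only; not taken from CMMLU.
    dev = [{"Question": "1 + 1 等于多少?", "A": "1", "B": "2", "C": "3", "D": "4",
            "Answer": "B"}]
    test = {"Question": "2 + 2 等于多少?", "A": "2", "B": "3", "C": "4", "D": "5",
            "Answer": "C"}
    print(build_prompt(dev, test))
```

With the five development questions of a subject passed as `dev_records`, this yields a five-shot prompt whose final answer slot is left empty, matching the second example in the Data section.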

Provider: maas
Created: 2024-05-13