Taiwan Massive Multitask Language Understanding (TMMLU)
Source: arXiv. Updated: 2023-10-02. Indexed: 2024-08-06.
Download link:
http://arxiv.org/abs/2309.08448v2
Resource overview:
The Taiwan Massive Multitask Language Understanding (TMMLU) dataset was created by MediaTek Research to evaluate how language models perform in Traditional Chinese. It covers 55 subjects drawn from exams ranging from high school entrance examinations to professional certification exams, spanning multiple disciplines and difficulty levels from basic education to professional practice. Like exams used to assess people, its curated questions are meant to expose blind spots in a model's knowledge and problem-solving. The dataset is used mainly for benchmarking language models, particularly to guide improvements in Traditional Chinese language processing.
Provider:
MediaTek Research
Created:
2023-09-15



