MALAMUTE
Source: arXiv. Updated: 2024-12-13. Indexed: 2024-12-25.
Download link:
http://arxiv.org/abs/2412.10105v1
Description:
MALAMUTE is a multilingual, template-free, highly fine-grained educational probing dataset created by a research team at the University of Colorado Boulder. It is drawn from 71 university-level textbooks in three languages (English, Spanish, and Polish) and contains 33,361 curriculum concepts and 116,887 expert-written, peer-reviewed probing prompts. The dataset was built by extracting textbook content from the OpenStax project and then applying strict filtering and quality control to keep the prompts at high quality. MALAMUTE is designed to evaluate how well language models have mastered educational knowledge, particularly fine-grained knowledge within specific subjects, so that they can be used safely and effectively in the classroom.
Provider:
University of Colorado Boulder
Created:
2024-12-13



