MBZUAI/Dialectal-Arabic-MMLU
Source: Hugging Face. Updated: 2026-03-30. Indexed: 2026-02-07.
Download link:
https://hf-mirror.com/datasets/MBZUAI/Dialectal-Arabic-MMLU
Description:
Dialectal-Arabic-MMLU is a large-scale, human-translated extension of MMLU-Redux into five major Arabic dialects: Syrian, Egyptian, Emirati, Saudi, and Moroccan. It contains 21K question-answer pairs across 32 academic and professional domains. The dataset follows a multiple-choice question-answering format with four candidate choices per question, and the correct label is stored in the answer field. Each question is assigned a qid that is unique within its domain, and these IDs are parallel across all dialects. In addition to the five dialects, the dataset supports English and Modern Standard Arabic (MSA).
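The record layout described above can be illustrated with a minimal Python sketch. Only the `answer`, `qid`, and domain fields are named in the description; the `dialect`, `question`, and `choices` keys, the sample values, and the assumption that `answer` is a zero-based choice index are hypothetical and may not match the published schema. The sketch shows why questions should be aligned on the (domain, qid) pair rather than on qid alone, and how accuracy could be scored against the stored answer.

```python
from collections import defaultdict

# Hypothetical rows. Only `qid`, `domain`, and `answer` come from the dataset
# description; every other key and all values are placeholders for illustration.
records = [
    {"dialect": "egyptian", "domain": "astronomy", "qid": 0,
     "question": "<Egyptian text>", "choices": ["a", "b", "c", "d"], "answer": 2},
    {"dialect": "syrian", "domain": "astronomy", "qid": 0,
     "question": "<Syrian text>", "choices": ["a", "b", "c", "d"], "answer": 2},
    # Same qid as above but a different domain: qid is unique only within a domain.
    {"dialect": "egyptian", "domain": "law", "qid": 0,
     "question": "<Egyptian text>", "choices": ["a", "b", "c", "d"], "answer": 1},
]


def align_by_question(rows):
    """Group rows sharing (domain, qid) into one dict keyed by dialect."""
    groups = defaultdict(dict)
    for row in rows:
        groups[(row["domain"], row["qid"])][row["dialect"]] = row
    return dict(groups)


def accuracy(rows, predictions):
    """Fraction of rows whose predicted choice index equals the stored answer.

    Assumes `answer` is an integer index into `choices` (an assumption).
    """
    if not rows:
        return 0.0
    correct = sum(1 for row, pred in zip(rows, predictions) if pred == row["answer"])
    return correct / len(rows)


aligned = align_by_question(records)
score = accuracy(records, [2, 2, 0])
```

Here `aligned[("astronomy", 0)]` holds the Egyptian and Syrian versions of the same question, which is the kind of parallel lookup the shared IDs make possible.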
Provided by:
MBZUAI



