AraDICE-ArabicMMLU-lev

Name: AraDICE-ArabicMMLU-lev
Creator: maas
Published: 2025-12-05 16:38:47
License: no description available

ModelScope community. Updated 2025-12-05; indexed 2025-06-21.

Download link:

https://modelscope.cn/datasets/QCRI/AraDICE-ArabicMMLU-lev

Resource description:

# AraDiCE: Benchmarks for Dialectal and Cultural Capabilities in LLMs -- ArabicMMLU - Levantine dialect

## Overview

The **AraDiCE** dataset is designed to assess the dialectal and cultural understanding of large language models (LLMs) in Arabic-speaking contexts. It includes post-edited adaptations of several benchmark datasets, curated to validate LLM performance in culturally and dialectally relevant scenarios for Arabic. Within the AraDiCE collection, this subset is designated **ArabicMMLU - Levantine Dialect**.

## Dataset Usage

The AraDiCE dataset is intended for benchmarking and evaluating large language models, specifically focusing on:

- Assessing the performance of LLMs on Arabic-specific dialectal and cultural details.
- Dialectal variations in the Arabic language.
- Cultural context awareness in reasoning.

## Evaluation

We used the [lm-harness](https://github.com/EleutherAI/lm-evaluation-harness) evaluation framework for the benchmarking. The related evaluation resources will be released soon. Stay tuned!

## Machine Translation Models

We will soon be releasing all our *machine translation models*. Stay tuned! For early access, feel free to contact us.

## License

The dataset is distributed under the **Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)**. The full license text can be found in the accompanying `licenses_by-nc-sa_4.0_legalcode.txt` file.

## Citation

Please find the paper [here](https://arxiv.org/pdf/2409.11404).

```
@article{mousi2024aradicebenchmarksdialectalcultural,
  title={{AraDiCE}: Benchmarks for Dialectal and Cultural Capabilities in LLMs},
  author={Basel Mousi and Nadir Durrani and Fatema Ahmad and Md. Arid Hasan and Maram Hasanain and Tameem Kabbani and Fahim Dalvi and Shammur Absar Chowdhury and Firoj Alam},
  year={2024},
  publisher={arXiv:2409.11404},
  url={https://arxiv.org/abs/2409.11404},
}
```
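To illustrate the benchmarking use described under Dataset Usage, the following is a minimal sketch of how multiple-choice accuracy might be scored. It is not the lm-evaluation-harness pipeline the authors used. The `Item` schema, the field names, the placeholder questions, and the `always_first` baseline are all assumptions made for illustration; they do not reflect the dataset's actual columns or contents.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Item:
    """One multiple-choice question (hypothetical schema, not the dataset's)."""
    question: str
    choices: Sequence[str]
    answer_index: int  # index of the correct entry in `choices`


def accuracy(items: Sequence[Item], predict: Callable[[Item], int]) -> float:
    """Fraction of items for which `predict` returns the gold answer index."""
    if not items:
        raise ValueError("no items to score")
    correct = sum(1 for item in items if predict(item) == item.answer_index)
    return correct / len(items)


# Placeholder items; real questions would come from the dataset itself.
sample = [
    Item("Q1", ["a", "b", "c", "d"], 2),
    Item("Q2", ["a", "b", "c", "d"], 0),
    Item("Q3", ["a", "b", "c", "d"], 1),
    Item("Q4", ["a", "b", "c", "d"], 0),
]


def always_first(item: Item) -> int:
    """Trivial baseline that always picks the first choice."""
    return 0


baseline_accuracy = accuracy(sample, always_first)
print(f"baseline accuracy: {baseline_accuracy:.2f}")
```

In practice `predict` would wrap a model call that maps a question and its choices to a chosen index; comparing the resulting accuracy across dialect subsets is one way to surface the dialectal gaps the benchmark targets.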

Provider:

maas

Created:

2025-06-17
