
QCRI/AraDICE-ArabicMMLU-lev

Hugging Face | Updated 2024-11-08 | Indexed 2025-04-12
Download link:
https://hf-mirror.com/datasets/QCRI/AraDICE-ArabicMMLU-lev
Resource description:
---
license: cc-by-nc-sa-4.0
task_categories:
- text-classification
- question-answering
language:
- ar
tags:
- MMLU
- reading-comprehension
- commonsense-reasoning
- capabilities
- cultural-understanding
- world-knowledge
pretty_name: 'AraDiCE -- Arabic Dialect and Cultural Evaluation -- ArabicMMLU - Levantine dialect'
size_categories:
- 10K<n<100K
dataset_info:
- config_name: high_humanities_history
  splits:
  - name: test
    num_examples: 760
- config_name: high_humanities_islamic-studies
  splits:
  - name: test
    num_examples: 334
- config_name: high_humanities_philosophy
  splits:
  - name: test
    num_examples: 39
- config_name: high_language_arabic-language
  splits:
  - name: test
    num_examples: 390
- config_name: high_social-science_civics
  splits:
  - name: test
    num_examples: 87
- config_name: high_social-science_economics
  splits:
  - name: test
    num_examples: 360
- config_name: high_social-science_geography
  splits:
  - name: test
    num_examples: 1038
- config_name: high_stem_biology
  splits:
  - name: test
    num_examples: 1409
- config_name: high_stem_computer-science
  splits:
  - name: test
    num_examples: 261
- config_name: high_stem_physics
  splits:
  - name: test
    num_examples: 255
- config_name: middle_humanities_history
  splits:
  - name: test
    num_examples: 203
- config_name: middle_humanities_islamic-studies
  splits:
  - name: test
    num_examples: 238
- config_name: middle_language_arabic-language
  splits:
  - name: test
    num_examples: 27
- config_name: middle_other_general-knowledge
  splits:
  - name: test
    num_examples: 172
- config_name: middle_social-science_civics
  splits:
  - name: test
    num_examples: 236
- config_name: middle_social-science_economics
  splits:
  - name: test
    num_examples: 87
- config_name: middle_social-science_geography
  splits:
  - name: test
    num_examples: 272
- config_name: middle_social-science_social-science
  splits:
  - name: test
    num_examples: 241
- config_name: middle_stem_computer-science
  splits:
  - name: test
    num_examples: 27
- config_name: middle_stem_natural-science
  splits:
  - name: test
    num_examples: 242
- config_name: na_humanities_islamic-studies
  splits:
  - name: test
    num_examples: 639
- config_name: na_language_arabic-language-general
  splits:
  - name: test
    num_examples: 612
- config_name: na_language_arabic-language-grammar
  splits:
  - name: test
    num_examples: 365
- config_name: na_other_driving-test
  splits:
  - name: test
    num_examples: 1211
- config_name: na_other_general-knowledge
  splits:
  - name: test
    num_examples: 864
- config_name: primary_humanities_history
  splits:
  - name: test
    num_examples: 102
- config_name: primary_humanities_islamic-studies
  splits:
  - name: test
    num_examples: 999
- config_name: primary_language_arabic-language
  splits:
  - name: test
    num_examples: 252
- config_name: primary_other_general-knowledge
  splits:
  - name: test
    num_examples: 162
- config_name: primary_social-science_geography
  splits:
  - name: test
    num_examples: 57
- config_name: primary_social-science_social-science
  splits:
  - name: test
    num_examples: 705
- config_name: primary_stem_computer-science
  splits:
  - name: test
    num_examples: 190
- config_name: primary_stem_math
  splits:
  - name: test
    num_examples: 409
- config_name: primary_stem_natural-science
  splits:
  - name: test
    num_examples: 336
- config_name: prof_humanities_law
  splits:
  - name: test
    num_examples: 314
- config_name: univ_other_management
  splits:
  - name: test
    num_examples: 75
- config_name: univ_social-science_accounting
  splits:
  - name: test
    num_examples: 74
- config_name: univ_social-science_economics
  splits:
  - name: test
    num_examples: 137
- config_name: univ_social-science_political-science
  splits:
  - name: test
    num_examples: 210
- config_name: univ_stem_computer-science
  splits:
  - name: test
    num_examples: 64
configs:
- config_name: high_humanities_history
  data_files:
  - split: test
    path: high_humanities_history/test.json
- config_name: high_humanities_islamic-studies
  data_files:
  - split: test
    path: high_humanities_islamic-studies/test.json
- config_name: high_humanities_philosophy
  data_files:
  - split: test
    path: high_humanities_philosophy/test.json
- config_name: high_language_arabic-language
  data_files:
  - split: test
    path: high_language_arabic-language/test.json
- config_name: high_social-science_civics
  data_files:
  - split: test
    path: high_social-science_civics/test.json
- config_name: high_social-science_economics
  data_files:
  - split: test
    path: high_social-science_economics/test.json
- config_name: high_social-science_geography
  data_files:
  - split: test
    path: high_social-science_geography/test.json
- config_name: high_stem_biology
  data_files:
  - split: test
    path: high_stem_biology/test.json
- config_name: high_stem_computer-science
  data_files:
  - split: test
    path: high_stem_computer-science/test.json
- config_name: high_stem_physics
  data_files:
  - split: test
    path: high_stem_physics/test.json
- config_name: middle_humanities_history
  data_files:
  - split: test
    path: middle_humanities_history/test.json
- config_name: middle_humanities_islamic-studies
  data_files:
  - split: test
    path: middle_humanities_islamic-studies/test.json
- config_name: middle_language_arabic-language
  data_files:
  - split: test
    path: middle_language_arabic-language/test.json
- config_name: middle_other_general-knowledge
  data_files:
  - split: test
    path: middle_other_general-knowledge/test.json
- config_name: middle_social-science_civics
  data_files:
  - split: test
    path: middle_social-science_civics/test.json
- config_name: middle_social-science_economics
  data_files:
  - split: test
    path: middle_social-science_economics/test.json
- config_name: middle_social-science_geography
  data_files:
  - split: test
    path: middle_social-science_geography/test.json
- config_name: middle_social-science_social-science
  data_files:
  - split: test
    path: middle_social-science_social-science/test.json
- config_name: middle_stem_computer-science
  data_files:
  - split: test
    path: middle_stem_computer-science/test.json
- config_name: middle_stem_natural-science
  data_files:
  - split: test
    path: middle_stem_natural-science/test.json
- config_name: na_humanities_islamic-studies
  data_files:
  - split: test
    path: na_humanities_islamic-studies/test.json
- config_name: na_language_arabic-language-general
  data_files:
  - split: test
    path: na_language_arabic-language-general/test.json
- config_name: na_language_arabic-language-grammar
  data_files:
  - split: test
    path: na_language_arabic-language-grammar/test.json
- config_name: na_other_driving-test
  data_files:
  - split: test
    path: na_other_driving-test/test.json
- config_name: na_other_general-knowledge
  data_files:
  - split: test
    path: na_other_general-knowledge/test.json
- config_name: primary_humanities_history
  data_files:
  - split: test
    path: primary_humanities_history/test.json
- config_name: primary_humanities_islamic-studies
  data_files:
  - split: test
    path: primary_humanities_islamic-studies/test.json
- config_name: primary_language_arabic-language
  data_files:
  - split: test
    path: primary_language_arabic-language/test.json
- config_name: primary_other_general-knowledge
  data_files:
  - split: test
    path: primary_other_general-knowledge/test.json
- config_name: primary_social-science_geography
  data_files:
  - split: test
    path: primary_social-science_geography/test.json
- config_name: primary_social-science_social-science
  data_files:
  - split: test
    path: primary_social-science_social-science/test.json
- config_name: primary_stem_computer-science
  data_files:
  - split: test
    path: primary_stem_computer-science/test.json
- config_name: primary_stem_math
  data_files:
  - split: test
    path: primary_stem_math/test.json
- config_name: primary_stem_natural-science
  data_files:
  - split: test
    path: primary_stem_natural-science/test.json
- config_name: prof_humanities_law
  data_files:
  - split: test
    path: prof_humanities_law/test.json
- config_name: univ_other_management
  data_files:
  - split: test
    path: univ_other_management/test.json
- config_name: univ_social-science_accounting
  data_files:
  - split: test
    path: univ_social-science_accounting/test.json
- config_name: univ_social-science_economics
  data_files:
  - split: test
    path: univ_social-science_economics/test.json
- config_name: univ_social-science_political-science
  data_files:
  - split: test
    path: univ_social-science_political-science/test.json
- config_name: univ_stem_computer-science
  data_files:
  - split: test
    path: univ_stem_computer-science/test.json
---

# AraDiCE: Benchmarks for Dialectal and Cultural Capabilities in LLMs -- ArabicMMLU - Levantine dialect

## Overview

The **AraDiCE** dataset is crafted to assess the dialectal and cultural understanding of large language models (LLMs) within Arabic-speaking contexts. It includes post-edited adaptations of several benchmark datasets, specifically curated to validate LLM performance in culturally and dialectally relevant scenarios for Arabic. Within the AraDiCE collection, this particular subset is designated as **ArabicMMLU - Levantine Dialect**.

## Dataset Usage

The AraDiCE dataset is intended for benchmarking and evaluating large language models, with a focus on:

- LLM performance on Arabic dialectal and cultural specifics.
- Dialectal variation in the Arabic language.
- Cultural context awareness in reasoning.

## Evaluation

We used the [lm-harness](https://github.com/EleutherAI/lm-evaluation-harness) evaluation framework for benchmarking. We will soon release the benchmarking setup. Stay tuned!

## Machine Translation Models

We will soon be releasing all our *machine translation models*. Stay tuned! For early access, feel free to contact us.

## License

The dataset is distributed under the **Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)**. The full license text can be found in the accompanying `licenses_by-nc-sa_4.0_legalcode.txt` file.

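
## Loading a Configuration

Each configuration in the card metadata has a single `test` split stored at `<config_name>/test.json`, and every configuration name contains exactly two underscores. The sketch below is illustrative rather than an official loader. It assumes the three underscore-separated parts can be read as education level, subject group, and subject (this interpretation is not stated in the card), and the helper names `parse_config_name`, `data_file_path`, and `load_test_split` are hypothetical. Loading relies on the `load_dataset` function from the Hugging Face `datasets` library and needs network access.

```python
from typing import NamedTuple


class ConfigParts(NamedTuple):
    level: str    # assumed meaning, e.g. "high", "middle", "primary", "univ", "prof", "na"
    group: str    # assumed meaning, e.g. "stem", "humanities", "social-science"
    subject: str  # assumed meaning, e.g. "biology", "islamic-studies"


def parse_config_name(config_name: str) -> ConfigParts:
    """Split a config name on its two underscores (assumed naming pattern)."""
    parts = config_name.split("_")
    if len(parts) != 3:
        raise ValueError(f"expected 'level_group_subject', got {config_name!r}")
    return ConfigParts(*parts)


def data_file_path(config_name: str) -> str:
    """Relative path of the test split, as listed under `configs` in the metadata."""
    return f"{config_name}/test.json"


def load_test_split(config_name: str):
    """Load one configuration's test split from the Hugging Face Hub."""
    # Imported lazily so the helpers above work without the `datasets` package.
    from datasets import load_dataset

    return load_dataset("QCRI/AraDICE-ArabicMMLU-lev", config_name, split="test")
```

For example, `load_test_split("high_stem_biology")` would be expected to return 1,409 rows if the hosted files match the `num_examples` value in the metadata. The record fields are not documented in the card, so inspect the first row before building prompts from it.
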
## Citation

Please find the paper <a href="https://arxiv.org/pdf/2409.11404" target="_blank" style="margin-right: 15px; margin-left: 10px">here.</a>

```
@article{mousi2024aradicebenchmarksdialectalcultural,
  title={{AraDiCE}: Benchmarks for Dialectal and Cultural Capabilities in LLMs},
  author={Basel Mousi and Nadir Durrani and Fatema Ahmad and Md. Arid Hasan and Maram Hasanain and Tameem Kabbani and Fahim Dalvi and Shammur Absar Chowdhury and Firoj Alam},
  year={2024},
  publisher={arXiv:2409.11404},
  url={https://arxiv.org/abs/2409.11404},
}
```
Provider:
QCRI