PEARL-LITE
PEARL-LITE Dataset Overview
Basic Information
- Dataset ID: UBC-NLP/PEARL-LITE
- License: cc-by-nc-nd-4.0
- Task category: Visual Question Answering
- Languages: Arabic (questions and answers), English (metadata)
- Tags: Culture, Arabic, VQA
- Size category: 1K < n < 10K
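Dataset Description
PEARL-LITE is a lightweight subset of the main PEARL benchmark, designed for users who need quick evaluation or shorter iteration cycles. It retains the cultural richness and question-type diversity of the main PEARL benchmark while reducing the number of examples.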
Key Features
- Purpose: quick evaluation and shorter iteration cycles
- Content: a subset of the PEARL benchmark
- Modality: image-text pairs
Dataset Structure
Features
- category: category (string)
- country: country (string)
- image: image (image)
- image_id: image ID (string)
- augmented_caption: augmented caption (string)
- question: question (string)
- answer: answer (string)
- answer_letter: answer letter (string)
- choices: choices (sequence of string)
- question_type: question type (string)
- annotation_id: annotation ID (string)
- qa_index: QA index (int32)
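The fields above suggest a multiple-choice setup: each record carries a question, a list of choices, and the letter of the correct choice. The following is a minimal sketch of how such records might be turned into prompts and scored, not the authors' evaluation code. The helper names `format_prompt` and `accuracy_by` are hypothetical, and the label set `A` to `D` is an assumption, since the card does not state which alphabet `answer_letter` uses or how many choices each question has.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Tuple

# Assumption: choices are labelled with Latin letters in order.
# Pass a different label sequence if `answer_letter` uses another alphabet.
DEFAULT_LABELS: Tuple[str, ...] = ("A", "B", "C", "D")


def format_prompt(record: dict, labels: Sequence[str] = DEFAULT_LABELS) -> str:
    """Render the question and its choices as a lettered multiple-choice prompt."""
    choices = record["choices"]
    if len(choices) > len(labels):
        raise ValueError("more choices than labels")
    lines = [record["question"]]
    lines += [f"{label}. {choice}" for label, choice in zip(labels, choices)]
    return "\n".join(lines)


def accuracy_by(
    records: Iterable[dict], predictions: Iterable[str], field: str
) -> Dict[str, float]:
    """Exact-match accuracy against `answer_letter`, grouped by a string field."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for record, predicted in zip(records, predictions):
        key = record[field]
        totals[key] += 1
        hits[key] += int(predicted.strip() == record["answer_letter"].strip())
    return {key: hits[key] / totals[key] for key in totals}
```

Grouping by `country`, `category`, or `question_type` gives a per-group breakdown; the image itself would be passed to the model alongside the prompt text, which this sketch leaves out.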
Data Splits
- Test split (test):
  - Number of examples: 6,867
  - Dataset size: 3,607,317,256.405 bytes
  - Download size: 1,432,676,863 bytes
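To put the byte counts above in more familiar terms, the short sketch below converts them to GiB and derives the download-to-dataset ratio and the average size per example. It also includes a loader that streams the test split rather than downloading it up front. The function names `size_summary` and `stream_test_split` are hypothetical; the loader assumes the Hugging Face `datasets` package is installed and that the dataset ID listed on this card is accessible from your account.

```python
# Figures copied from the split listing on this card.
NUM_EXAMPLES = 6_867
DATASET_BYTES = 3_607_317_256.405
DOWNLOAD_BYTES = 1_432_676_863
GIB = 1024 ** 3


def size_summary() -> dict:
    """Derive human-readable size figures from the card's byte counts."""
    return {
        "dataset_gib": DATASET_BYTES / GIB,
        "download_gib": DOWNLOAD_BYTES / GIB,
        "download_ratio": DOWNLOAD_BYTES / DATASET_BYTES,
        "avg_bytes_per_example": DATASET_BYTES / NUM_EXAMPLES,
    }


def stream_test_split():
    """Return an iterable over the test split without a full download.

    Assumption: requires the third-party `datasets` package and network access.
    """
    from datasets import load_dataset  # imported lazily so the arithmetic runs without it

    return load_dataset("UBC-NLP/PEARL-LITE", split="test", streaming=True)


if __name__ == "__main__":
    for name, value in size_summary().items():
        print(f"{name}: {value:.4f}")
```

The summary works out to roughly 3.4 GiB for the dataset and 1.3 GiB for the download, or about half a megabyte per example on average, most of which is presumably image data.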
Related Resources
- Paper: Pearl: A Multimodal Culturally-Aware Arabic Instruction Dataset
- ArXiv link: http://arxiv.org/abs/2505.21979
- GitHub repository: https://github.com/UBC-NLP/pearl
Citation
@article{Alwajih2025pearl,
  title={Pearl: A Multimodal Culturally-Aware {A}rabic Instruction Dataset},
  author={Fakhraddin Alwajih and Samar M. Magdy and Abdellah El Mekki and Omer Nacar and Youssef Nafea and Safaa Taher Abdelfadil and Abdulfattah Mohammed Yahya and Hamzah Luqman and Nada Almarwani and Samah Aloufi and Baraah Qawasmeh and Houdaifa Atou and Serry Sibaee and Hamzah A. Alsayadi and Walid Al-Dhabyani and Maged S. Al-shaibani and Aya El aatar and Nour Qandos and Rahaf Alhamouri and Samar Ahmad and Razan Khassib and Lina Hamad and Mohammed Anwar AL-Ghrawi and Fatimah Alshamari and Cheikh Malainine and Doaa Qawasmeh and Aminetou Yacoub and Tfeil moilid and Ruwa AbuHweidi and Ahmed Aboeitta and Vatimetou Mohamed Lemin and Reem Abdel-Salam and Ahlam Bashiti and Adel Ammar and Aisha Alansari and Ahmed Ashraf and Nora Alturayeif and Sara Shatnawi and Alcides Alcoba Inciarte and AbdelRahim A. Elmadany and Mohamedou cheikh tourad and Ismail Berrada and Mustafa Jarrar and Shady Shehata and Muhammad Abdul-Mageed},
  journal={arXiv preprint arXiv:2505.21979},
  year={2025}
}




