CVQA

Source: arXiv | Updated: 2024-06-10 | Indexed: 2024-06-12
Download link:
https://huggingface.co/datasets/afaji/cvqa
Overview:
CVQA is a large-scale multilingual visual question answering (VQA) benchmark created by MBZUAI. It covers the cultures of 28 countries and 26 languages. The dataset contains 9,044 samples across 10 categories, annotated and validated by fluent native speakers and cultural experts. In addition to culturally driven images and questions, CVQA places particular emphasis on low-resource languages, and it was built in collaboration with local communities to ensure quality and diversity. It is used mainly to evaluate the understanding and reasoning abilities of multimodal large language models in cross-cultural and multilingual settings, and it supports research on cultural awareness and linguistic diversity in AI.
Provider:
MBZUAI
Created:
2024-06-10