Robust VQA (RVQA) Dataset with Language Prior and Compositional Reasoning Labels
Source: ieee-dataport.org (indexed 2025-01-22)
Download link:
https://ieee-dataport.org/documents/robust-vqa-rvqa-dataset-language-prior-and-compositional-reasoning-labels
Description:
This dataset is designed to advance research in Visual Question Answering (VQA), specifically the challenges of language priors and compositional reasoning. Each question carries a label indicating which of the two issues it is susceptible to, allowing targeted evaluation of VQA models. The dataset consists of 33,051 training images and 14,165 validation images, along with 571,244 training questions and 245,087 validation questions. Of the training questions, 313,664 focus on compositional reasoning and 257,580 pertain to language priors. The validation questions are likewise split into 134,313 for compositional reasoning and 110,774 for language priors. The dataset serves as a benchmark for evaluating model performance on each of these two challenges separately, showing where a model needs further improvement. The dataset preparation process, including image collection, caption generation, prompt creation, QA pair generation, and quality control, is outlined in the accompanying algorithm. The dataset is designed to be extensible to other image sources and can be a valuable resource for researchers working on VQA tasks that involve complex reasoning.
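As a quick sanity check on the figures quoted in the description, the sketch below confirms that the two label categories add up to the stated question totals and derives the share of each category per split. The counts are copied from the description; the dictionary layout and the `summarize` helper are illustrative assumptions and are not part of the dataset's own tooling or file format.

```python
# Question and image counts copied from the dataset description.
# The structure and names below are hypothetical, chosen for illustration only.
COUNTS = {
    "train": {
        "compositional_reasoning": 313_664,
        "language_prior": 257_580,
        "total": 571_244,
    },
    "validation": {
        "compositional_reasoning": 134_313,
        "language_prior": 110_774,
        "total": 245_087,
    },
}
IMAGES = {"train": 33_051, "validation": 14_165}


def summarize(split: str) -> dict:
    """Check label totals and compute per-category shares for one split."""
    c = COUNTS[split]
    labeled = c["compositional_reasoning"] + c["language_prior"]
    return {
        # True when the two label categories account for every question.
        "labels_sum_to_total": labeled == c["total"],
        "compositional_share": c["compositional_reasoning"] / c["total"],
        "language_prior_share": c["language_prior"] / c["total"],
        "questions_per_image": c["total"] / IMAGES[split],
    }


if __name__ == "__main__":
    for split in COUNTS:
        s = summarize(split)
        print(
            f"{split}: consistent={s['labels_sum_to_total']}, "
            f"compositional={s['compositional_share']:.1%}, "
            f"language_prior={s['language_prior_share']:.1%}, "
            f"questions/image={s['questions_per_image']:.1f}"
        )
```

In both splits the category counts sum exactly to the stated totals, with roughly 55% of questions labeled compositional reasoning and 45% labeled language prior, at about 17 questions per image.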
Provider:
IEEE Dataport



