VQA 2.0 Training and Validation Set
Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/microsoft/unilm/tree/master/beit3
Description:
This dataset targets the Visual Question Answering (VQA) task. Human uncertainty labels are grouped into three levels: low, medium, and high. The dataset also includes samples with multiple human responses, so that model performance can be evaluated against the distribution of human uncertainty. In terms of scale, it comprises 443,757 training samples and 213,954 validation samples (the source page's English text gives 213,957), plus a filtered BEiT3 subset of 3,248 samples and an LXMERT subset of 15,408 samples.
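The description does not say how the low, medium, and high uncertainty levels are derived. As a rough illustration only, one common way to quantify disagreement among multiple human responses is the Shannon entropy of the answer distribution, bucketed by thresholds. The function names and the thresholds `low_max` and `medium_max` below are hypothetical and are not taken from the dataset.

```python
import math
from collections import Counter


def answer_entropy(answers):
    """Shannon entropy (in bits) of the empirical distribution of human answers."""
    if not answers:
        raise ValueError("answers must contain at least one response")
    counts = Counter(answers)
    total = len(answers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def uncertainty_level(answers, low_max=0.5, medium_max=1.5):
    """Bucket a question into low / medium / high human uncertainty.

    The thresholds are illustrative assumptions, not the dataset's definition.
    """
    h = answer_entropy(answers)
    if h <= low_max:
        return "low"
    if h <= medium_max:
        return "medium"
    return "high"


# Example: ten annotators split evenly between two answers.
level = uncertainty_level(["yes"] * 5 + ["no"] * 5)
```

With these assumed thresholds, full agreement among annotators maps to the low bucket, and the more evenly the responses spread over distinct answers, the higher the bucket.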
Provider:
VQA 2.0



