RetVQA

Name: RetVQA
Creator: Curated by authors from Visual Genome annotations
License: Not specified

Source: arXiv (indexed 2025-09-30)

Download link:

https://vl2g.github.io/projects/retvqa/


Description:

RetVQA is a dataset of multi-image, metadata-agnostic questions posed over a heterogeneous pool of images, with answers that mix classification-oriented and open-ended generative formats. It covers five question categories: color, shape, count, object attributes, and relations. Answers are either binary generative or open-ended generative, and the data is divided by a random 80%-10%-10% split into training, validation, and test sets. The listing describes it as the largest dataset for this task, with diverse multi-image questions. The corresponding task is retrieval-based visual question answering (RetVQA).
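The random 80%-10%-10% split mentioned in the description can be illustrated with a minimal sketch. This is not the authors' split procedure, which the listing does not describe. The function name `split_80_10_10`, the seed, and the integer question IDs are hypothetical placeholders chosen for illustration only.

```python
import random


def split_80_10_10(items, seed=0):
    """Shuffle a copy of `items` and cut it into 80% / 10% / 10% parts.

    Hypothetical helper: the seed and the rounding rule (integer floor,
    remainder goes to the test part) are assumptions, not dataset facts.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = len(shuffled) * 8 // 10
    n_val = len(shuffled) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Placeholder IDs standing in for real question records.
question_ids = list(range(1000))
train, val, test = split_80_10_10(question_ids)
print(len(train), len(val), len(test))
```

Integer arithmetic is used for the cut points so the part sizes do not depend on floating-point rounding; any remainder falls into the test part.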

Provider:

Curated by authors from Visual Genome annotations

Summary

Dataset Introduction

Background and Challenges

Background Overview

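RetVQA is a dataset for the retrieval-based visual question answering (RetVQA) task. It is characterized by multi-image retrieval requirements, diverse question types, and mixed answer formats, and is described as the largest dataset for this task to date.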

The summary above was collected and generated by 遇见数据集.
