Toloka Visual Question Answering Dataset
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/7057740
Description:
Our dataset consists of images paired with textual questions. One entry (instance) in our dataset is a question-image pair labeled with the ground truth coordinates of a bounding box containing the visual answer to the given question. The images were obtained from a CC BY-licensed subset of the Microsoft Common Objects in Context dataset, MS COCO. All data labeling was performed on the Toloka crowdsourcing platform, https://toloka.ai/.
Our dataset has 45,199 instances split among three subsets: train (38,990 instances), public test (1,705 instances), and private test (4,504 instances). The entire train dataset has been available to everyone since the start of the challenge. The public test dataset became available in the evaluation phase of the competition, but without any ground truth labels. After the end of the competition, both the public and private test sets were released.
The datasets are provided as files in the comma-separated values (CSV) format containing the following columns.
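As a quick sanity check on the figures above, the following minimal sketch confirms that the three subsets add up to the stated total and reports each subset's share. The counts are taken from the description; the names `SPLITS` and `split_shares` are illustrative only and not part of the dataset.

```python
# Split sizes exactly as stated in the dataset description.
SPLITS = {"train": 38_990, "public test": 1_705, "private test": 4_504}


def split_shares(splits):
    """Return each split's share of the total as a fraction between 0 and 1."""
    total = sum(splits.values())
    return {name: count / total for name, count in splits.items()}


total = sum(SPLITS.values())
shares = split_shares(SPLITS)

print(f"total: {total} instances")
for name, share in shares.items():
    print(f"{name}: {SPLITS[name]} instances ({share:.1%})")
```

The train subset accounts for roughly 86% of all instances, with the two test subsets making up the rest.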
Column     Type      Description
image      string    URL of an image on a public content delivery network
width      integer   image width
height     integer   image height
left       integer   bounding box coordinate: left
top        integer   bounding box coordinate: top
right      integer   bounding box coordinate: right
bottom     integer   bounding box coordinate: bottom
question   string    question in English
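The column layout above could be read with Python's standard `csv` module, as in the minimal sketch below. Several things here are assumptions rather than facts from the dataset description: that the CSV files carry a header row using exactly these column names, that coordinates are pixel offsets from the top-left corner of the image, and that a valid box satisfies `left < right` and `top < bottom` within the image bounds. The `Instance` class, the helper functions, and the sample row (including its URL and question) are made up for illustration.

```python
import csv
import io
from dataclasses import dataclass

# Columns documented as integers in the table above.
INT_COLUMNS = ("width", "height", "left", "top", "right", "bottom")


@dataclass
class Instance:
    """One question-image pair with its bounding box (hypothetical container)."""
    image: str
    width: int
    height: int
    left: int
    top: int
    right: int
    bottom: int
    question: str


def read_instances(handle):
    """Parse CSV rows with the documented columns into Instance objects."""
    instances = []
    for row in csv.DictReader(handle):
        numbers = {name: int(row[name]) for name in INT_COLUMNS}
        instances.append(
            Instance(image=row["image"], question=row["question"], **numbers)
        )
    return instances


def box_is_valid(inst):
    """Assumed convention: the box is non-empty and lies inside the image."""
    return (0 <= inst.left < inst.right <= inst.width
            and 0 <= inst.top < inst.bottom <= inst.height)


def box_area(inst):
    """Area of the bounding box, in the same units as the coordinates."""
    return (inst.right - inst.left) * (inst.bottom - inst.top)


# Made-up sample data; a real file would be opened with open(path, newline="").
SAMPLE = (
    "image,width,height,left,top,right,bottom,question\n"
    "https://example.com/0001.jpg,640,480,100,120,300,360,What do you use to drink tea?\n"
)

instances = read_instances(io.StringIO(SAMPLE))
for inst in instances:
    print(inst.question, box_is_valid(inst), box_area(inst))
```

For the public test file as distributed during the competition, the bounding box columns may be absent or empty since ground truth labels were withheld, so a reader like this one would need to handle missing values in that case.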
This upload also contains a ZIP file with the images from MS COCO.
Created: 2023-10-10



