DocVQA

Source: arXiv, indexed 2025-09-30

Download link:

https://cvit.iiit.ac.in/docvqa/

Description:

DocVQA is a benchmark dataset for visual question answering (VQA) on documents. To answer a given question, a model must understand both the textual and the visual content of the document. Models are fine-tuned on the DocVQA training set and evaluated on its validation split.
