DocVQA
Source: arXiv. Indexed 2025-09-30.
Download link:
https://cvit.iiit.ac.in/docvqa/
Description:
DocVQA is a benchmark dataset for visual question answering (VQA) on documents. To answer a given question, a model must understand both the textual and the visual content of a document. Models are fine-tuned on the DocVQA training set and evaluated on its validation set.



