cmarkea/doc-vqa
Dataset Overview
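Dataset Information
- Features:
  - id: string
  - paper_id: string
  - source: string
  - image: image
  - qa: struct with the following fields:
    - en: list of entries, each with:
      - answer: string
      - question: string
    - fr: list of entries, each with:
      - answer: string
      - question: string
- Splits:
  - train: 9688 examples, 2435754786.096 bytes (about 2.44 GB)
  - test: 2421 examples, 611923621.391 bytes (about 0.61 GB)
- Dataset size:
  - Download size: 2185509114 bytes (about 2.19 GB)
  - Total dataset size: 3047678407.4870005 bytes (about 3.05 GB)
- Configurations:
  - default, with the following data files:
    - train: data/train-*
    - test: data/test-*
- License: Apache 2.0
- Task category: visual-question-answering
- Languages: English (en), French (fr)
- Tags: AFTdb, infoVQA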
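The splits listed above can be explored with the Hugging Face `datasets` library. The following is a minimal sketch, assuming that library is installed and that the repository id is `cmarkea/doc-vqa`; the helper names `split_fractions` and `load_doc_vqa` are hypothetical and not part of the dataset card. Streaming is used so that the roughly 2.2 GB download is not fetched up front.

```python
from typing import Dict

# Example counts per split, copied from the metadata above.
SPLIT_EXAMPLES: Dict[str, int] = {"train": 9688, "test": 2421}


def split_fractions(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each split's share of the total number of examples."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def load_doc_vqa(split: str = "train"):
    """Stream one split of the dataset (assumes `pip install datasets`)."""
    # Imported lazily so the helper above works without the library.
    from datasets import load_dataset

    # streaming=True yields records on demand instead of downloading all files.
    return load_dataset("cmarkea/doc-vqa", split=split, streaming=True)


if __name__ == "__main__":
    for name, share in split_fractions(SPLIT_EXAMPLES).items():
        print(f"{name}: {share:.1%}")
```

With the counts above, the train and test splits come out at roughly 80% and 20% of the examples.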
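Dataset Description
- This dataset combines images from the Infographic_vqa dataset and the AFTDB dataset. Each image is associated with five questions and answers on average, available in both English and French.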
Sample Record

    {
      "id": "31311a78fb5a4daa93e85d31620fad17",
      "paper_id": "2303.12112v3",
      "source": "aftdb_figure",
      "image": [PIL.Image],
      "qa": {
        "en": [
          {
            "answer": "A man riding an orange snowboard jumping off a snow ramp.",
            "question": "What is the real image of the generated image A person on a snowboard in the air?"
          },
          {
            "answer": "A pizza with basil leaves.",
            "question": "What kind of pizza is in the real image?"
          },
          {
            "answer": "A brown grizzly bear.",
            "question": "What animal is in the real images?"
          },
          {
            "answer": "The cat is on some green grass.",
            "question": "Where is the black and white cat in the real image?"
          },
          {
            "answer": "Two cups on saucers.",
            "question": "What is on top of the wooden table in the real image?"
          }
        ],
        "fr": [
          {
            "answer": "Un homme sur un snowboard orange sautant d'une rampe de neige.",
            "question": "Quelle est l'image réelle de l'image générée Une personne sur un snowboard dans les airs?"
          },
          {
            "answer": "Une pizza avec des feuilles de basilic.",
            "question": "Quel type de pizza est dans l'image réelle?"
          },
          {
            "answer": "Un grizzli brun.",
            "question": "Quel animal est dans les vraies images?"
          },
          {
            "answer": "Le chat est sur de l'herbe verte.",
            "question": "Où est le chat noir et blanc sur la vraie image?"
          },
          {
            "answer": "Deux tasses sur des soucoupes.",
            "question": "Qu'est-ce qu'il y a sur la table en bois sur la vraie image?"
          }
        ]
      }
    }
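The nested `qa` field in the sample can be flattened into one row per question, which is often easier to feed to a training or evaluation loop. The sketch below assumes the layout shown in the sample record (a mapping from language code to a list of `question`/`answer` entries); `iter_qa_pairs` is a hypothetical helper, and the record is a trimmed copy of the sample with the image omitted.

```python
from typing import Dict, Iterator, List, Tuple

QA = Dict[str, List[Dict[str, str]]]


def iter_qa_pairs(qa: QA) -> Iterator[Tuple[str, str, str]]:
    """Yield (language, question, answer) for every entry in a `qa` struct."""
    for lang, entries in qa.items():
        for entry in entries:
            yield lang, entry["question"], entry["answer"]


# Trimmed copy of the sample record: one entry per language, no image.
record = {
    "id": "31311a78fb5a4daa93e85d31620fad17",
    "source": "aftdb_figure",
    "qa": {
        "en": [
            {"answer": "A brown grizzly bear.",
             "question": "What animal is in the real images?"},
        ],
        "fr": [
            {"answer": "Un grizzli brun.",
             "question": "Quel animal est dans les vraies images?"},
        ],
    },
}

pairs = list(iter_qa_pairs(record["qa"]))
for lang, question, answer in pairs:
    print(f"[{lang}] {question} -> {answer}")
```

Keeping the language code in each row makes it straightforward to filter for English-only or French-only questions later.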
Dataset Statistics
| Dataset | Images | QA pairs |
|---|---|---|
| infoVQA | 2,096 | 21,074 |
| aftdb_figure | 10,016 | 101,218 |
| doc-vqa (Train) | 9,688 | 97,842 |
| doc-vqa (Test) | 2,421 | 24,452 |
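The figures in the table can be cross-checked with a little arithmetic. The sketch below only uses the numbers from the table; the names `BY_SOURCE`, `BY_SPLIT`, `totals`, and `pairs_per_image` are hypothetical and chosen for illustration.

```python
# (images, QA pairs) per row, copied from the statistics table.
BY_SOURCE = {"infoVQA": (2096, 21074), "aftdb_figure": (10016, 101218)}
BY_SPLIT = {"train": (9688, 97842), "test": (2421, 24452)}


def totals(rows):
    """Sum the image and QA-pair columns over all rows."""
    images = sum(i for i, _ in rows.values())
    pairs = sum(q for _, q in rows.values())
    return images, pairs


def pairs_per_image(rows):
    """Average number of QA pairs per image for each row."""
    return {name: q / i for name, (i, q) in rows.items()}


source_total = totals(BY_SOURCE)
split_total = totals(BY_SPLIT)
print("by source:", source_total)
print("by split: ", split_total)
for name, avg in pairs_per_image({**BY_SOURCE, **BY_SPLIT}).items():
    print(f"{name}: {avg:.2f} QA pairs per image")
```

The two totals differ slightly (12,112 vs. 12,109 images and 122,292 vs. 122,294 QA pairs), and the table does not explain the gap. Every row averages about ten QA pairs per image, which would be consistent with five per language if the table counts English and French entries together; that reading is an assumption.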
Citation

    @online{Dedoc-vqa,
      AUTHOR = {Loïc SOKOUDJOU SONAG},
      URL = {https://huggingface.co/datasets/cmarkea/doc-vqa},
      YEAR = {2024},
      KEYWORDS = {NLP ; Multimodal}
    }



