path-vqa

Name: path-vqa
Creator: maas
Published: 2026-05-12 12:24:00
License: no description provided

ModelScope community: updated 2026-05-12, indexed 2024-06-08

Download link:

https://modelscope.cn/datasets/swift/path-vqa

Resource description:

# Dataset Card for PathVQA

## Dataset Description

PathVQA is a dataset of question-answer pairs on pathology images. The dataset is intended to be used for training and testing Medical Visual Question Answering (VQA) systems. The dataset includes both open-ended questions and binary "yes/no" questions. The dataset is built from two publicly-available pathology textbooks: "Textbook of Pathology" and "Basic Pathology", and a publicly-available digital library: "Pathology Education Informational Resource" (PEIR). The copyrights of images and captions belong to the publishers and authors of these two books, and the owners of the PEIR digital library.

**Repository:** [PathVQA Official GitHub Repository](https://github.com/UCSD-AI4H/PathVQA)

**Paper:** [PathVQA: 30000+ Questions for Medical Visual Question Answering](https://arxiv.org/abs/2003.10286)

**Leaderboard:** [Papers with Code Leaderboard](https://paperswithcode.com/sota/medical-visual-question-answering-on-pathvqa)

### Dataset Summary

The dataset was obtained from the updated Google Drive link shared by the authors on Feb 15, 2023, see the [commit](https://github.com/UCSD-AI4H/PathVQA/commit/117e7f4ef88a0e65b0e7f37b98a73d6237a3ceab) in the GitHub repository. This version of the dataset contains a total of 5,004 images and 32,795 question-answer pairs. Out of the 5,004 images, 4,289 images are referenced by a question-answer pair, while 715 images are not used. There are a few image-question-answer triplets which occur more than once in the same split (training, validation, test). After dropping the duplicate image-question-answer triplets, the dataset contains 32,632 question-answer pairs on 4,289 images.

#### Supported Tasks and Leaderboards

The PathVQA dataset has an active leaderboard on [Papers with Code](https://paperswithcode.com/sota/medical-visual-question-answering-on-pathvqa) where models are ranked based on three metrics: "Yes/No Accuracy", "Free-form accuracy" and "Overall accuracy".

- "Yes/No Accuracy" is the accuracy of a model's generated answers for the subset of binary "yes/no" questions.
- "Free-form accuracy" is the accuracy of a model's generated answers for the subset of open-ended questions.
- "Overall accuracy" is the accuracy of a model's generated answers across all questions.

#### Languages

The question-answer pairs are in English.

## Dataset Structure

### Data Instances

Each instance consists of an image-question-answer triplet.

```
{
    'image': <PIL.JpegImagePlugin.JpegImageFile image mode=CMYK size=309x272>,
    'question': 'where are liver stem cells (oval cells) located?',
    'answer': 'in the canals of hering'
}
```

### Data Fields

- `'image'`: the image referenced by the question-answer pair.
- `'question'`: the question about the image.
- `'answer'`: the expected answer.

### Data Splits

The dataset is split into training, validation and test. The split is provided directly by the authors.

|        | Training Set | Validation Set | Test Set |
|--------|:------------:|:--------------:|:--------:|
| QAs    |    19,654    |     6,259      |  6,719   |
| Images |    2,599     |      832       |   858    |

## Additional Information

### Licensing Information

The authors have released the dataset under the [MIT License](https://github.com/UCSD-AI4H/PathVQA/blob/master/LICENSE).

### Citation Information

```
@article{he2020pathvqa,
  title={PathVQA: 30000+ Questions for Medical Visual Question Answering},
  author={He, Xuehai and Zhang, Yichen and Mou, Luntian and Xing, Eric and Xie, Pengtao},
  journal={arXiv preprint arXiv:2003.10286},
  year={2020}
}
```
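The three leaderboard metrics described in the card ("Yes/No Accuracy", "Free-form accuracy", "Overall accuracy") can be illustrated with a minimal sketch. The card does not say how answers are matched or how binary questions are identified, so two assumptions are made here: a prediction counts as correct on a case-insensitive exact match, and a question is treated as binary when its reference answer is `yes` or `no`. The function name `pathvqa_accuracies` and the returned keys are hypothetical and not part of any official evaluation script.

```python
def normalize(text):
    """Lowercase and trim an answer so that 'Yes ' and 'yes' compare equal."""
    return text.strip().lower()


def pathvqa_accuracies(predictions, references):
    """Return yes/no, free-form, and overall exact-match accuracy.

    Assumption of this sketch: a question is binary when its reference
    answer is 'yes' or 'no'; every other question is open-ended.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")

    # Each bucket holds [number correct, number of questions].
    counts = {"yes_no": [0, 0], "free_form": [0, 0]}
    for pred, ref in zip(predictions, references):
        ref_norm = normalize(ref)
        bucket = "yes_no" if ref_norm in {"yes", "no"} else "free_form"
        counts[bucket][1] += 1
        if normalize(pred) == ref_norm:
            counts[bucket][0] += 1

    def ratio(correct, total):
        # An empty subset has no defined accuracy, so report NaN.
        return correct / total if total else float("nan")

    total_correct = counts["yes_no"][0] + counts["free_form"][0]
    total = counts["yes_no"][1] + counts["free_form"][1]
    return {
        "yes_no_accuracy": ratio(*counts["yes_no"]),
        "free_form_accuracy": ratio(*counts["free_form"]),
        "overall_accuracy": ratio(total_correct, total),
    }
```

Called with one predicted answer and one reference answer per question, the function returns a dictionary with the three scores. Published leaderboard numbers may rely on different matching rules, so results from this sketch are not directly comparable to them.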

Provider:

maas

Created:

2024-06-05
