path-vqa
Source: ModelScope community (updated 2026-05-12, indexed 2024-06-08)
Download link: https://modelscope.cn/datasets/swift/path-vqa
# Dataset Card for PathVQA
## Dataset Description
PathVQA is a dataset of question-answer pairs on pathology images. It is intended for training and testing
Medical Visual Question Answering (VQA) systems and includes both open-ended questions and binary "yes/no" questions.
The dataset is built from two publicly available pathology textbooks, "Textbook of Pathology" and "Basic Pathology", and a
publicly available digital library, the "Pathology Education Informational Resource" (PEIR). The copyrights of the images and captions
belong to the publishers and authors of these two books and to the owners of the PEIR digital library.<br>
**Repository:** [PathVQA Official GitHub Repository](https://github.com/UCSD-AI4H/PathVQA)<br>
**Paper:** [PathVQA: 30000+ Questions for Medical Visual Question Answering](https://arxiv.org/abs/2003.10286)<br>
**Leaderboard:** [Papers with Code Leaderboard](https://paperswithcode.com/sota/medical-visual-question-answering-on-pathvqa)
### Dataset Summary
The dataset was obtained from the updated Google Drive link shared by the authors on Feb 15, 2023
(see the [commit](https://github.com/UCSD-AI4H/PathVQA/commit/117e7f4ef88a0e65b0e7f37b98a73d6237a3ceab)
in the GitHub repository). This version of the dataset contains a total of 5,004 images and 32,795 question-answer pairs.
Of the 5,004 images, 4,289 are referenced by at least one question-answer pair, while the remaining 715 are not used.
A few image-question-answer triplets occur more than once within the same split (training, validation, or test).
After dropping these duplicate triplets, the dataset contains 32,632 question-answer pairs on 4,289 images.
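The deduplication step described above can be sketched as follows. This is a minimal illustration, not the authors' script: the `image_id` key and the sample records are hypothetical, and it assumes two triplets count as duplicates only when the image, question, and answer all match exactly within one split.

```python
def drop_duplicate_triplets(records):
    """Return records with repeated (image, question, answer) triplets removed.

    Keeps the first occurrence of each triplet and preserves the original order.
    """
    seen = set()
    unique = []
    for rec in records:
        # Hypothetical key: an image identifier stands in for the image itself.
        key = (rec["image_id"], rec["question"], rec["answer"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical records from a single split.
train = [
    {"image_id": "img_0", "question": "is fibrosis present?", "answer": "yes"},
    {"image_id": "img_0", "question": "is fibrosis present?", "answer": "yes"},
    {"image_id": "img_1", "question": "is fibrosis present?", "answer": "yes"},
]

deduplicated = drop_duplicate_triplets(train)
print(len(train), len(deduplicated))
```

Applying the function to each split separately mirrors the description above, where duplicates are counted within a split rather than across splits.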
#### Supported Tasks and Leaderboards
The PathVQA dataset has an active leaderboard on [Papers with Code](https://paperswithcode.com/sota/medical-visual-question-answering-on-pathvqa)
where models are ranked based on three metrics: "Yes/No Accuracy", "Free-form accuracy" and "Overall accuracy". "Yes/No Accuracy" is
the accuracy of a model's generated answers for the subset of binary "yes/no" questions. "Free-form accuracy" is the accuracy
of a model's generated answers for the subset of open-ended questions. "Overall accuracy" is the accuracy of a model's generated
answers across all questions.
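As a rough illustration of how the three metrics relate, the sketch below computes them from lists of predicted and reference answers. It rests on two assumptions that the leaderboard does not confirm: an answer counts as correct only on an exact match after lowercasing and trimming whitespace, and a question is treated as binary when its reference answer is "yes" or "no". The function names and sample predictions are hypothetical.

```python
def normalize(text):
    """Lowercase and trim an answer string before comparison."""
    return text.strip().lower()


def accuracy(pairs):
    """Fraction of (prediction, reference) pairs that match; None if empty."""
    if not pairs:
        return None
    correct = sum(normalize(p) == normalize(r) for p, r in pairs)
    return correct / len(pairs)


def pathvqa_metrics(predictions, references):
    """Compute yes/no, free-form, and overall accuracy (assumed exact match)."""
    pairs = list(zip(predictions, references))
    # Assumption: a question is binary when its reference answer is yes or no.
    yes_no = [(p, r) for p, r in pairs if normalize(r) in {"yes", "no"}]
    free_form = [(p, r) for p, r in pairs if normalize(r) not in {"yes", "no"}]
    return {
        "yes_no_accuracy": accuracy(yes_no),
        "free_form_accuracy": accuracy(free_form),
        "overall_accuracy": accuracy(pairs),
    }


# Hypothetical reference answers and model predictions.
references = ["yes", "no", "in the canals of hering", "yes"]
predictions = ["yes", "yes", "In the canals of Hering", "yes"]

metrics = pathvqa_metrics(predictions, references)
print(metrics)
```

Published results may use a different matching rule, so numbers from this sketch are not directly comparable with leaderboard entries.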
#### Languages
The question-answer pairs are in English.
## Dataset Structure
### Data Instances
Each instance consists of an image-question-answer triplet.
```
{
'image': <PIL.JpegImagePlugin.JpegImageFile image mode=CMYK size=309x272>,
'question': 'where are liver stem cells (oval cells) located?',
'answer': 'in the canals of hering'
}
```
### Data Fields
- `'image'`: the image referenced by the question-answer pair.
- `'question'`: the question about the image.
- `'answer'`: the expected answer.
### Data Splits
The dataset is split into training, validation, and test sets. The split is the one provided by the authors.
| | Training Set | Validation Set | Test Set |
|-------------------------|:------------:|:--------------:|:--------:|
| QAs |19,654 |6,259 |6,719 |
| Images |2,599 |832 |858 |
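The split sizes can be checked against the totals given in the Dataset Summary. The short script below uses only figures stated in this card; the variable names are illustrative.

```python
# Per-split counts, taken from the table above (after deduplication).
qa_pairs = {"train": 19_654, "validation": 6_259, "test": 6_719}
images = {"train": 2_599, "validation": 832, "test": 858}

total_qa = sum(qa_pairs.values())
total_images = sum(images.values())

# Totals before cleaning, taken from the Dataset Summary.
duplicates_removed = 32_795 - total_qa
unused_images = 5_004 - total_images

print(total_qa, total_images, duplicates_removed, unused_images)
```

The sums reproduce the 32,632 question-answer pairs and 4,289 referenced images reported in the summary, and the differences correspond to 163 dropped duplicate triplets and 715 unused images.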
## Additional Information
### Licensing Information
The authors have released the dataset under the [MIT License](https://github.com/UCSD-AI4H/PathVQA/blob/master/LICENSE).
### Citation Information
```
@article{he2020pathvqa,
title={PathVQA: 30000+ Questions for Medical Visual Question Answering},
author={He, Xuehai and Zhang, Yichen and Mou, Luntian and Xing, Eric and Xie, Pengtao},
journal={arXiv preprint arXiv:2003.10286},
year={2020}
}
```
Provider: maas
Created: 2024-06-05



