
path-vqa

ModelScope community · Updated 2026-05-12 · Indexed 2024-06-08
Download link:
https://modelscope.cn/datasets/swift/path-vqa
Resource description:
# Dataset Card for PathVQA

## Dataset Description

PathVQA is a dataset of question-answer pairs on pathology images. The dataset is intended to be used for training and testing Medical Visual Question Answering (VQA) systems. The dataset includes both open-ended questions and binary "yes/no" questions. The dataset is built from two publicly-available pathology textbooks: "Textbook of Pathology" and "Basic Pathology", and a publicly-available digital library: "Pathology Education Informational Resource" (PEIR). The copyrights of images and captions belong to the publishers and authors of these two books, and the owners of the PEIR digital library.<br>
**Repository:** [PathVQA Official GitHub Repository](https://github.com/UCSD-AI4H/PathVQA)<br>
**Paper:** [PathVQA: 30000+ Questions for Medical Visual Question Answering](https://arxiv.org/abs/2003.10286)<br>
**Leaderboard:** [Papers with Code Leaderboard](https://paperswithcode.com/sota/medical-visual-question-answering-on-pathvqa)

### Dataset Summary

The dataset was obtained from the updated Google Drive link shared by the authors on Feb 15, 2023, see the [commit](https://github.com/UCSD-AI4H/PathVQA/commit/117e7f4ef88a0e65b0e7f37b98a73d6237a3ceab) in the GitHub repository. This version of the dataset contains a total of 5,004 images and 32,795 question-answer pairs. Out of the 5,004 images, 4,289 images are referenced by a question-answer pair, while 715 images are not used. There are a few image-question-answer triplets which occur more than once in the same split (training, validation, test). After dropping the duplicate image-question-answer triplets, the dataset contains 32,632 question-answer pairs on 4,289 images.

#### Supported Tasks and Leaderboards

The PathVQA dataset has an active leaderboard on [Papers with Code](https://paperswithcode.com/sota/medical-visual-question-answering-on-pathvqa) where models are ranked based on three metrics: "Yes/No Accuracy", "Free-form accuracy" and "Overall accuracy". "Yes/No Accuracy" is the accuracy of a model's generated answers for the subset of binary "yes/no" questions. "Free-form accuracy" is the accuracy of a model's generated answers for the subset of open-ended questions. "Overall accuracy" is the accuracy of a model's generated answers across all questions.

#### Languages

The question-answer pairs are in English.

## Dataset Structure

### Data Instances

Each instance consists of an image-question-answer triplet.

```
{
    'image': <PIL.JpegImagePlugin.JpegImageFile image mode=CMYK size=309x272>,
    'question': 'where are liver stem cells (oval cells) located?',
    'answer': 'in the canals of hering'
}
```

### Data Fields

- `'image'`: the image referenced by the question-answer pair.
- `'question'`: the question about the image.
- `'answer'`: the expected answer.

### Data Splits

The dataset is split into training, validation and test. The split is provided directly by the authors.

|                         | Training Set | Validation Set | Test Set |
|-------------------------|:------------:|:--------------:|:--------:|
| QAs                     |19,654        |6,259           |6,719     |
| Images                  |2,599         |832             |858       |

## Additional Information

### Licensing Information

The authors have released the dataset under the [MIT License](https://github.com/UCSD-AI4H/PathVQA/blob/master/LICENSE).

### Citation Information

```
@article{he2020pathvqa,
  title={PathVQA: 30000+ Questions for Medical Visual Question Answering},
  author={He, Xuehai and Zhang, Yichen and Mou, Luntian and Xing, Eric and Xie, Pengtao},
  journal={arXiv preprint arXiv:2003.10286},
  year={2020}
}
```

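The three leaderboard metrics described under "Supported Tasks and Leaderboards" can be sketched in a few lines of Python. This is a minimal illustration, not the leaderboard's official scoring script: it assumes exact-match scoring after lowercasing and whitespace normalization, and it assumes a question counts as binary when its expected answer is "yes" or "no". The function names `normalize` and `pathvqa_accuracies` are hypothetical; only the `'answer'` field comes from the card.

```python
from typing import Dict, Mapping, Optional, Sequence


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(text.lower().split())


def pathvqa_accuracies(
    examples: Sequence[Mapping[str, str]],
    predictions: Sequence[str],
) -> Dict[str, Optional[float]]:
    """Return yes/no, free-form and overall exact-match accuracy.

    Each example needs an 'answer' field, as in the dataset card.
    A subset with no questions yields None instead of a ratio.
    """
    if len(examples) != len(predictions):
        raise ValueError("examples and predictions must have the same length")

    correct = {"yes_no": 0, "free_form": 0}
    total = {"yes_no": 0, "free_form": 0}
    for example, prediction in zip(examples, predictions):
        gold = normalize(example["answer"])
        # Assumption: binary questions are those whose gold answer is yes/no.
        subset = "yes_no" if gold in {"yes", "no"} else "free_form"
        total[subset] += 1
        correct[subset] += int(normalize(prediction) == gold)

    def ratio(hits: int, count: int) -> Optional[float]:
        return hits / count if count else None

    return {
        "yes_no": ratio(correct["yes_no"], total["yes_no"]),
        "free_form": ratio(correct["free_form"], total["free_form"]),
        "overall": ratio(sum(correct.values()), sum(total.values())),
    }


# Toy usage with made-up predictions for illustration.
demo_examples = [
    {"answer": "yes"},
    {"answer": "in the canals of hering"},
]
demo_predictions = ["no", "In the canals of Hering"]
print(pathvqa_accuracies(demo_examples, demo_predictions))
```

Published results may use a different answer normalization or a softer match for open-ended answers, so scores from this sketch are not necessarily comparable with leaderboard numbers.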
Provider:
maas
Created:
2024-06-05