iamlkx/vqa-rad

Name: iamlkx/vqa-rad
Creator: iamlkx
Published: 2026-04-20 02:23:37
License: no description provided

Hugging Face, updated 2026-04-20, indexed 2026-04-26

Download link:

https://hf-mirror.com/datasets/iamlkx/vqa-rad


Resource description:

---
license: cc0-1.0
task_categories:
- visual-question-answering
language:
- en
paperswithcode_id: vqa-rad
tags:
- medical
pretty_name: VQA-RAD
size_categories:
- 1K<n<10K
dataset_info:
  features:
  - name: image
    dtype: image
  - name: question
    dtype: string
  - name: answer
    dtype: string
  splits:
  - name: train
    num_bytes: 95883938.139
    num_examples: 1793
  - name: test
    num_bytes: 23818877.0
    num_examples: 451
  download_size: 34496718
  dataset_size: 119702815.139
---

# Dataset Card for VQA-RAD

## Dataset Description

VQA-RAD is a dataset of question-answer pairs on radiology images. The dataset is intended to be used for training and testing Medical Visual Question Answering (VQA) systems. The dataset includes both open-ended questions and binary "yes/no" questions. The dataset is built from [MedPix](https://medpix.nlm.nih.gov/), which is a free open-access online database of medical images. The question-answer pairs were manually generated by a team of clinicians.

**Homepage:** [Open Science Framework Homepage](https://osf.io/89kps/)<br>
**Paper:** [A dataset of clinically generated visual questions and answers about radiology images](https://www.nature.com/articles/sdata2018251)<br>
**Leaderboard:** [Papers with Code Leaderboard](https://paperswithcode.com/sota/medical-visual-question-answering-on-vqa-rad)

### Dataset Summary

The dataset was downloaded from the [Open Science Framework Homepage](https://osf.io/89kps/) on June 3, 2023. The dataset contains 2,248 question-answer pairs and 315 images. Out of the 315 images, 314 images are referenced by a question-answer pair, while 1 image is not used. The training set contains 3 duplicate image-question-answer triplets. The training set also has 1 image-question-answer triplet in common with the test set. After dropping these 4 image-question-answer triplets from the training set, the dataset contains 2,244 question-answer pairs on 314 images.
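
The cleaning step described in the summary (dropping duplicate training triplets and the triplet shared with the test set) could be sketched as follows. This is a minimal illustration, not the card author's actual script: the function name `clean_training_set`, the use of a hashable image key in place of the image itself, and the toy rows are all assumptions made for the example.

```python
from typing import Hashable, Iterable, List, Set, Tuple

# (image key, question, answer); the image key is a hypothetical stand-in,
# e.g. a file name or a hash of the image bytes.
Triplet = Tuple[Hashable, str, str]


def clean_training_set(train: Iterable[Triplet], test: Iterable[Triplet]) -> List[Triplet]:
    """Keep the first copy of each training triplet and drop any that also occur in test."""
    test_set: Set[Triplet] = set(test)
    seen: Set[Triplet] = set()
    cleaned: List[Triplet] = []
    for triplet in train:
        if triplet in seen or triplet in test_set:
            continue  # duplicate within train, or leaked into test
        seen.add(triplet)
        cleaned.append(triplet)
    return cleaned


# Toy rows (not taken from VQA-RAD): one in-train duplicate, one train/test overlap.
toy_train = [
    ("img_a", "is there a fracture?", "no"),
    ("img_a", "is there a fracture?", "no"),     # duplicate within train
    ("img_b", "what organ is shown?", "lung"),
    ("img_c", "is the heart enlarged?", "yes"),  # also present in test
]
toy_test = [("img_c", "is the heart enlarged?", "yes")]

toy_cleaned = clean_training_set(toy_train, toy_test)
print(len(toy_cleaned))

# Counts stated in the card: 2,248 pairs minus the 4 dropped triplets
# equals the 1,793 training plus 451 test examples.
assert 2248 - 4 == 1793 + 451
```

Only exact triplet matches are removed in this sketch; near-duplicate questions or visually identical images stored under different keys would need a separate check.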
#### Supported Tasks and Leaderboards

This dataset has an active leaderboard on [Papers with Code](https://paperswithcode.com/sota/medical-visual-question-answering-on-vqa-rad) where models are ranked based on three metrics: "Close-ended Accuracy", "Open-ended accuracy" and "Overall accuracy". "Close-ended Accuracy" is the accuracy of a model's generated answers for the subset of binary "yes/no" questions. "Open-ended accuracy" is the accuracy of a model's generated answers for the subset of open-ended questions. "Overall accuracy" is the accuracy of a model's generated answers across all questions.

#### Languages

The question-answer pairs are in English.

## Dataset Structure

### Data Instances

Each instance consists of an image-question-answer triplet.

```
{
  'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=566x555>,
  'question': 'are regions of the brain infarcted?',
  'answer': 'yes'
}
```

### Data Fields

- `'image'`: the image referenced by the question-answer pair.
- `'question'`: the question about the image.
- `'answer'`: the expected answer.

### Data Splits

The dataset is split into training and test. The split is provided directly by the authors.

|        | Training Set | Test Set |
|--------|:------------:|:--------:|
| QAs    | 1,793        | 451      |
| Images | 313          | 203      |

## Additional Information

### Licensing Information

The authors have released the dataset under the CC0 1.0 Universal License.

### Citation Information

```
@article{lau2018dataset,
  title={A dataset of clinically generated visual questions and answers about radiology images},
  author={Lau, Jason J and Gayen, Soumya and Ben Abacha, Asma and Demner-Fushman, Dina},
  journal={Scientific data},
  volume={5},
  number={1},
  pages={1--10},
  year={2018},
  publisher={Nature Publishing Group}
}
```
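
The three leaderboard metrics described under "Supported Tasks and Leaderboards" can be illustrated with a short sketch. It rests on two assumptions that the card does not state: a question is treated as closed-ended when its reference answer is "yes" or "no", and a prediction counts as correct on an exact match after lowercasing and whitespace normalization. The leaderboard's own scoring may differ, and the names `vqa_rad_accuracies`, `normalize` and `CLOSED_ANSWERS` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Assumption: closed-ended means the reference answer is a binary yes/no.
CLOSED_ANSWERS = {"yes", "no"}


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def accuracy(pairs: List[Tuple[str, str]]) -> Optional[float]:
    """Fraction of (reference, prediction) pairs that match; None for an empty subset."""
    if not pairs:
        return None
    return sum(ref == pred for ref, pred in pairs) / len(pairs)


def vqa_rad_accuracies(
    references: Sequence[str], predictions: Sequence[str]
) -> Dict[str, Optional[float]]:
    """Closed-ended, open-ended and overall exact-match accuracy."""
    if len(references) != len(predictions):
        raise ValueError("references and predictions must have the same length")
    pairs = [(normalize(r), normalize(p)) for r, p in zip(references, predictions)]
    closed = [pair for pair in pairs if pair[0] in CLOSED_ANSWERS]
    open_ended = [pair for pair in pairs if pair[0] not in CLOSED_ANSWERS]
    return {
        "closed": accuracy(closed),
        "open": accuracy(open_ended),
        "overall": accuracy(pairs),
    }


# Toy answers (not taken from VQA-RAD).
toy_references = ["yes", "no", "left lung", "axial"]
toy_predictions = ["yes", "no", "Left lung", "coronal"]
scores = vqa_rad_accuracies(toy_references, toy_predictions)
print(scores)
```

If the card is loaded with the Hugging Face `datasets` library, the `answer` column of the test split could serve as `references`. Exact match is a strict criterion for open-ended answers, so a softer comparison may be preferable when paraphrases should count as correct.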

Provider:

iamlkx
