
abhay2812/vqa-rad

Source: Hugging Face. Updated 2026-03-24; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/abhay2812/vqa-rad
Resource description:
---
license: cc0-1.0
task_categories:
- visual-question-answering
language:
- en
tags:
- medical
- radiology
- vqa
- medical-vqa
- clinical
- chest-xray
- ct-scan
- mri
pretty_name: "VQA-RAD Full: Visual Question Answering on Radiology Images"
size_categories:
- 1K<n<10K
dataset_info:
  features:
  - name: qid
    dtype: int64
  - name: image_name
    dtype: string
  - name: image_organ
    dtype: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: answer_normalized
    dtype: string
  - name: answer_type
    dtype: string
  - name: question_type_primary
    dtype: string
  - name: question_type_raw
    dtype: string
  - name: phrase_type
    dtype: string
  - name: evaluation
    dtype: string
  - name: split
    dtype: string
  - name: image
    dtype: image
  splits:
  - name: train
    num_bytes: 178000000
    num_examples: 1794
  - name: test
    num_bytes: 45000000
    num_examples: 450
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
---

# VQA-RAD Full: Visual Question Answering on Radiology Images

## Dataset Description

This is a **cleaned and comprehensive** version of the [VQA-RAD dataset](https://doi.org/10.17605/OSF.IO/89KPS), the first manually constructed dataset where clinicians asked naturally occurring questions about radiology images and provided reference answers.

Unlike the existing [flaviagiammarino/vqa-rad](https://huggingface.co/datasets/flaviagiammarino/vqa-rad) on Hugging Face, which only contains image-question-answer triplets, this version **preserves all original metadata** from the source (question types, answer types, image organ labels, phrase types, and evaluation status), enabling fine-grained evaluation of Medical VQA systems.

- **Paper:** [A dataset of clinically generated visual questions and answers about radiology images](https://www.nature.com/articles/sdata2018251) (Scientific Data, 2018)
- **Original Source:** [Open Science Framework](https://doi.org/10.17605/OSF.IO/89KPS)
- **License:** [CC0 1.0 Universal](https://creativecommons.org/publicdomain/zero/1.0/)

## Dataset Summary

| Statistic | Train | Test | Total |
|---|---|---|---|
| QA pairs | 1,794 | 450 | 2,244 |
| Unique images | 313 | 203 | 314 |

The dataset contains **2,244 question-answer pairs** (after deduplication) on **314 radiology images** sourced from [MedPix®](https://medpix.nlm.nih.gov/), an open-access database of medical images and teaching cases. Questions and answers were manually generated by 15 clinical trainees (medical students and fellows) who had completed core clinical rotations.

## Data Fields

| Field | Type | Description |
|---|---|---|
| `qid` | int | Unique question ID |
| `image` | image | The radiology image (JPEG) |
| `image_name` | string | Original filename (e.g., `synpic54610.jpg`) |
| `image_organ` | string | Body region: `HEAD`, `CHEST`, or `ABD` |
| `question` | string | The clinical question about the image |
| `answer` | string | Ground truth answer (original casing) |
| `answer_normalized` | string | Lowercase, stripped answer for evaluation |
| `answer_type` | string | `CLOSED` (yes/no) or `OPEN` (free-form) |
| `question_type_primary` | string | Primary question category (see taxonomy below) |
| `question_type_raw` | string | Original question type label (may contain multi-labels) |
| `phrase_type` | string | `freeform`, `para` (paraphrase), `test_freeform`, or `test_para` |
| `evaluation` | string | `evaluated`, `not evaluated`, or `given` |
| `split` | string | `train` or `test` |

## Question Type Taxonomy

As defined in the original paper:

| Question Type | Description | Example |
|---|---|---|
| **PRES** | Object/condition presence | *"Is there a pneumothorax present?"* |
| **POS** | Positional reasoning | *"Where is the lesion located?"* |
| **ABN** | Abnormality | *"Is there something wrong with the image?"* |
| **MODALITY** | Imaging modality | *"Is this a CT or an MRI?"* |
| **PLANE** | Image orientation | *"Is this an axial image?"* |
| **SIZE** | Size/measurement | *"Is the heart enlarged?"* |
| **ORGAN** | Organ system | *"What organ system is pictured?"* |
| **ATTRIB** | Attribute (other) | *"Is the mass well circumscribed?"* |
| **COLOR** | Signal intensity/color | *"Is the lesion more or less dense than the liver?"* |
| **COUNT** | Counting | *"How many lesions are there?"* |
| **OTHER** | Other | Catch-all category |

## Dataset Distributions

### Answer Type

| Split | CLOSED | OPEN |
|---|---|---|
| Train | 1,297 | 497 |
| Test | 275 | 175 |

### Image Organ

| HEAD | CHEST | ABD |
|---|---|---|
| 715 | 794 | 739 |

These three counts sum to 2,248, which is 4 more than the 2,244 deduplicated QA pairs, so they appear to have been computed before the 4 duplicates were removed.

### Question Type (Test Free-form)

| Type | CLOSED | OPEN | Total |
|---|---|---|---|
| PRES | 82 | 29 | 111 |
| POS | 3 | 35 | 38 |
| ABN | 25 | 9 | 34 |
| SIZE | 27 | 3 | 30 |
| MODALITY | 15 | 14 | 29 |
| PLANE | 12 | 11 | 23 |
| OTHER | 9 | 11 | 20 |
| ORGAN | 2 | 8 | 10 |
| ATTRIB | 6 | 2 | 8 |
| COUNT | 2 | 1 | 3 |
| COLOR | 2 | 0 | 2 |

## Usage

```python
from datasets import load_dataset

# Download (or load from cache) both the train and test splits
ds = load_dataset("abhay2812/vqa-rad")

# Access a sample
sample = ds['train'][0]
print(sample['question'])               # "Are regions of the brain infarcted?"
print(sample['answer'])                 # "Yes"
print(sample['question_type_primary'])  # "PRES"
print(sample['answer_type'])            # "CLOSED"
print(sample['image_organ'])            # "HEAD"

# Filter by question type
pres_questions = ds['test'].filter(lambda x: x['question_type_primary'] == 'PRES')

# Filter by answer type for separate evaluation
closed = ds['test'].filter(lambda x: x['answer_type'] == 'CLOSED')
open_ended = ds['test'].filter(lambda x: x['answer_type'] == 'OPEN')

# Filter test free-form only (standard benchmark split)
test_freeform = ds['test'].filter(lambda x: x['phrase_type'] == 'test_freeform')
```

## Evaluation

Following the original paper and the [Papers with Code leaderboard](https://paperswithcode.com/dataset/vqa-rad), models are typically evaluated on three metrics:

- **Closed-ended Accuracy**: Accuracy on yes/no questions
- **Open-ended Accuracy**: Accuracy on free-form answer questions
- **Overall Accuracy**: Accuracy across all questions

The `answer_normalized` field provides lowercased, whitespace-stripped answers for consistent evaluation matching.

## Cleaning Steps Applied

1. Renamed columns from uppercase Excel headers to clean lowercase names
2. Extracted image filenames from full MedPix URLs
3. Fixed `answer_type` inconsistency (trailing whitespace)
4. Handled 1 null answer (marked as `"unanswerable"`)
5. Converted numeric answers to strings (COUNT-type answers like `4`, `12`, `0.05`)
6. Added `answer_normalized` (lowercase, stripped) for evaluation
7. Fixed question type typos: `ATRIB` → `ATTRIB`, `Other` → `OTHER`, `PRSE` → `PRES`
8. Created `question_type_primary` from multi-label entries (e.g., `SIZE, PRES` → `SIZE`)
9. Removed 4 duplicate image-question-answer triplets
10. Verified all 314 images load correctly

## Citation

If you use this dataset, please cite the original paper:

```bibtex
@article{lau2018dataset,
  title={A dataset of clinically generated visual questions and answers about radiology images},
  author={Lau, Jason J and Gayen, Soumya and Ben Abacha, Asma and Demner-Fushman, Dina},
  journal={Scientific Data},
  volume={5},
  number={1},
  pages={1--10},
  year={2018},
  publisher={Nature Publishing Group},
  doi={10.1038/sdata.2018.251}
}
```

## Acknowledgments

Dataset cleaned and uploaded by [abhay2812](https://huggingface.co/abhay2812). The original dataset was created by researchers at the Lister Hill National Center for Biomedical Communications, National Library of Medicine, and is archived on the [Open Science Framework](https://doi.org/10.17605/OSF.IO/89KPS).
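
## Evaluation Sketch

The three metrics named in the Evaluation section (closed-ended, open-ended, and overall accuracy) can be sketched roughly as follows. This is a minimal illustration and not the scoring code of the original paper or of any leaderboard: it assumes exact string match against `answer_normalized`, and the helper names `normalize` and `accuracy_report`, as well as the toy rows and predictions, are hypothetical.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and strip, mirroring the card's description of `answer_normalized`."""
    return str(text).strip().lower()


def accuracy_report(rows, predictions):
    """Exact-match accuracy per answer type plus an OVERALL figure.

    `rows` is any sequence of dicts exposing `answer_normalized` and
    `answer_type`; `predictions` holds one model answer per row.
    Exact matching is an assumption made for this sketch.
    """
    rows = list(rows)
    if len(rows) != len(predictions):
        raise ValueError("rows and predictions must have the same length")

    correct = defaultdict(int)
    total = defaultdict(int)
    for row, pred in zip(rows, predictions):
        hit = normalize(pred) == normalize(row["answer_normalized"])
        # Count every question under its own answer type and under OVERALL.
        for key in (row["answer_type"], "OVERALL"):
            total[key] += 1
            correct[key] += int(hit)
    return {key: correct[key] / total[key] for key in total}


# Hypothetical toy rows (not taken from the dataset) to show the call shape.
toy_rows = [
    {"answer_normalized": "yes", "answer_type": "CLOSED"},
    {"answer_normalized": "no", "answer_type": "CLOSED"},
    {"answer_normalized": "axial", "answer_type": "OPEN"},
    {"answer_normalized": "right lung", "answer_type": "OPEN"},
]
toy_predictions = ["Yes", "No", " Axial ", "left lung"]

report = accuracy_report(toy_rows, toy_predictions)
print(report)
```

With the real data, the same function could be called as `accuracy_report(ds['test'], my_predictions)`, where `my_predictions` is a hypothetical list of model outputs in dataset order. Exact match tends to be strict for `OPEN` answers, so a softer matching rule may be preferable there.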
Provider:
abhay2812