Linn5412/CXR-GroundVQA

Name: Linn5412/CXR-GroundVQA
Creator: Linn5412
Published: 2026-04-02 06:48:28
License: No description provided

Hugging Face · Updated 2026-04-02 · Indexed 2026-04-12

Download link:

https://hf-mirror.com/datasets/Linn5412/CXR-GroundVQA


Resource description:

---
license: mit
task_categories:
- visual-question-answering
language:
- en
tags:
- chest-x-ray
- phrase-grounding
- medical-imaging
- synthetic-data
- VQA
size_categories:
- 1K<n<10K
---

# CXR-GroundVQA

A synthetic VQA dataset for chest X-ray phrase grounding, containing 6,006 question-answer pairs across 995 radiologist-annotated CXR images from VinDr-CXR.

## Overview

| Property | Value |
|----------|-------|
| Total QA pairs | 6,006 |
| Images | 995 (from VinDr-CXR training split) |
| Question types | 8 |
| Finding categories | 22 |
| Bbox format | Normalized [x, y, w, h] in [0, 1] |
| Language | English |
| Quality score | Mean 0.986, 100% verification pass |

## Question Types

| Category | Type | Count | % |
|----------|------|-------|---|
| Bbox-localization | Open-ended Localization | 836 | 13.9 |
| Bbox-localization | Single-choice Localization | 673 | 11.2 |
| Bbox-localization | Multi-choice Localization | 446 | 7.4 |
| Bbox-localization | Zero-knowledge Detection | 481 | 8.0 |
| Label-identification | Open-ended Identification | 1,092 | 18.2 |
| Label-identification | Single-choice Identification | 836 | 13.9 |
| Label-identification | Multi-choice Identification | 419 | 7.0 |
| Polarity | Polarity Judgment | 1,223 | 20.4 |

## Data Format

Each sample is a JSON object with `messages` (conversation turns) and `images` (file paths):

```json
{
  "messages": [
    {
      "role": "user",
      "content": "<image> What abnormality can be observed within the region designated by [0.735, 0.565, 0.035, 0.038]?"
    },
    {
      "role": "assistant",
      "content": "The region [0.735, 0.565, 0.035, 0.038] shows Nodule/Mass."
    }
  ],
  "images": [
    "images/0021df30f3fddef551eb3df4354b1d06.png"
  ]
}
```

Bounding boxes use normalized [x, y, w, h] coordinates, where (x, y) is the top-left corner and (w, h) are the width and height relative to the image dimensions, all in [0, 1].

## Image Setup

**Images are not included in this repository** due to the PhysioNet Credentialed Health Data License of VinDr-CXR.

To use this dataset:

1. Obtain credentialed access to [VinDr-CXR v1.0.0](https://physionet.org/content/vindr-cxr/1.0.0/) on PhysioNet.
2. Download the training split DICOM files.
3. Convert the 995 referenced DICOM files to PNG format and place them in an `images/` directory.

The `images` field in each sample specifies the expected relative path (e.g., `images/xxxx.png`).

## Source

Generated from the training split of [VinDr-CXR](https://physionet.org/content/vindr-cxr/1.0.0/) using an automated LLM-driven pipeline with placeholder-based factual grounding and 7-check quality assurance. See the paper for details.

## Citation

```bibtex
@inproceedings{lin2026cxrgroundvqa,
  title={CXR-GroundVQA: An LLM-Synthesized VQA Dataset for Chest X-ray Phrase Grounding},
  author={Lin, Jiaming},
  booktitle={ACM Multimedia},
  year={2026}
}
```
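The normalized [x, y, w, h] convention described under Data Format can be turned into pixel coordinates once the image size is known. The following is a minimal sketch, not part of the dataset's own tooling: the helper names `extract_bboxes` and `to_pixel_corners` are hypothetical, and the regular expression assumes boxes always appear in message text as a bracketed list of four non-negative decimals, as in the sample shown above.

```python
import re
from typing import List, Tuple

BBox = Tuple[float, float, float, float]

# Assumption: a box is written as "[x, y, w, h]" with four non-negative decimals.
_NUM = r"(\d*\.?\d+)"
BBOX_PATTERN = re.compile(
    r"\[\s*" + r"\s*,\s*".join([_NUM] * 4) + r"\s*\]"
)


def extract_bboxes(text: str) -> List[BBox]:
    """Return every normalized [x, y, w, h] box found in a message string."""
    return [
        tuple(float(value) for value in match.groups())
        for match in BBOX_PATTERN.finditer(text)
    ]


def to_pixel_corners(
    bbox: BBox, image_width: int, image_height: int
) -> Tuple[int, int, int, int]:
    """Convert a normalized top-left [x, y, w, h] box to pixel (left, top, right, bottom)."""
    x, y, w, h = bbox
    left = round(x * image_width)
    top = round(y * image_height)
    right = round((x + w) * image_width)
    bottom = round((y + h) * image_height)
    return left, top, right, bottom


if __name__ == "__main__":
    question = (
        "<image> What abnormality can be observed within the region "
        "designated by [0.735, 0.565, 0.035, 0.038]?"
    )
    # The 1024 x 1024 image size is an illustrative assumption; use the
    # actual dimensions of the converted PNG.
    for box in extract_bboxes(question):
        print(box, to_pixel_corners(box, 1024, 1024))
```

Rounding to the nearest pixel is one possible choice; a pipeline that needs boxes to fully enclose the annotated region might instead floor the top-left corner and ceil the bottom-right corner.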

Provider:

Linn5412
