ReXGradient-160K
Source: ModelScope community (魔搭社区). Updated 2026-01-07; indexed 2025-05-10.
Download link:
https://modelscope.cn/datasets/AI-ModelScope/ReXGradient-160K
Resource description:
## Overview
ReXGradient-160K is the largest publicly available multi-site chest X-ray dataset, containing 273,004 unique chest X-ray images from 160,000 radiological studies of 109,487 unique patients, collected across 3 U.S. health systems (79 medical sites). Studies can include multiple images and come with detailed radiology reports, which makes the dataset particularly valuable for developing and evaluating medical imaging AI systems and automated report generation models.
The PNG images are distributed as multiple part files (`deid_png.part*`). To access the images:
- Concatenate the part files into a single archive: `cat deid_png.part* > deid_png.tar`
- Extract the archive: `tar -xf deid_png.tar`
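
Where a shell is not available, the same two steps can be sketched in Python with the standard library only. This is a minimal sketch, not an official loader: the function names `join_parts` and `extract_archive` are hypothetical, and it assumes the part files sort correctly by file name, which is the same assumption the `cat deid_png.part*` glob makes.

```python
import glob
import shutil
import tarfile
from pathlib import Path


def join_parts(part_pattern: str, archive_path: str) -> list[str]:
    """Concatenate part files, sorted by name, into a single archive.

    Mirrors `cat deid_png.part* > deid_png.tar`.
    """
    parts = sorted(glob.glob(part_pattern))
    if not parts:
        raise FileNotFoundError(f"no files match {part_pattern!r}")
    with open(archive_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream in chunks so large parts never sit fully in memory.
                shutil.copyfileobj(src, out)
    return parts


def extract_archive(archive_path: str, dest_dir: str) -> None:
    """Extract the joined archive, mirroring `tar -xf deid_png.tar`."""
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path) as tar:
        tar.extractall(dest_dir)


# Example usage (paths are placeholders):
# join_parts("deid_png.part*", "deid_png.tar")
# extract_archive("deid_png.tar", "deid_png")
```

The joined archive temporarily needs as much disk space as all part files combined, so make sure enough space is free before running it.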
## Dataset Composition
The dataset is divided into three splits:
- **Training**: 140,000 studies, 238,968 images, 95,716 patients
- **Validation**: 10,000 studies, 17,007 images, 6,964 patients
- **Public Test**: 10,000 studies, 17,029 images, 6,807 patients
An additional private test set (10,000 studies) is reserved for model evaluation on the [ReXrank](https://rexrank.ai) benchmark.
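
As a quick consistency check, the per-split counts listed above can be summed and compared with the totals quoted in the overview. The dictionary below simply restates those numbers; the key names are illustrative and are not taken from the dataset's metadata files.

```python
# Per-split counts as listed in the dataset card.
splits = {
    "train": {"studies": 140_000, "images": 238_968, "patients": 95_716},
    "validation": {"studies": 10_000, "images": 17_007, "patients": 6_964},
    "public_test": {"studies": 10_000, "images": 17_029, "patients": 6_807},
}

# Sum each quantity over the three public splits.
totals = {
    key: sum(split[key] for split in splits.values())
    for key in ("studies", "images", "patients")
}
print(totals)
# → {'studies': 160000, 'images': 273004, 'patients': 109487}

# Average number of images per study in each split.
for name, split in splits.items():
    print(name, round(split["images"] / split["studies"], 3))
```

The three sums match the overview figures (160,000 studies, 273,004 images, 109,487 patients), so the private test set is not counted in them. The patient counts adding up exactly is also consistent with patients not being shared between the public splits, although the card does not state this explicitly.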
> **Note**: A file with radiologist-annotated bounding boxes for interstitial patterns, covering ~400 examples and originally created for [RadGame](https://arxiv.org/abs/2509.13270), is available at `metadata/interstitial_pattern_bbox.json`. Coordinates in the JSON follow Label Studio's coordinate format.
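
Label Studio stores rectangle coordinates as percentages (0 to 100) of the image width and height, with `x` and `y` marking the top-left corner. The helper below is a minimal sketch of converting one such box to pixel coordinates. The function name is hypothetical, and the exact key layout of `interstitial_pattern_bbox.json` is not described here, so inspect the file before wiring this in.

```python
def percent_box_to_pixels(x, y, width, height, image_width, image_height):
    """Convert a Label Studio style percentage box to pixel corner coordinates.

    x, y, width, height are percentages (0-100) of the image size.
    Returns (left, top, right, bottom) in pixels.
    """
    left = x / 100.0 * image_width
    top = y / 100.0 * image_height
    right = (x + width) / 100.0 * image_width
    bottom = (y + height) / 100.0 * image_height
    return left, top, right, bottom


# A box starting at 25% / 50% and spanning 50% x 25% of a 400 x 200 image.
print(percent_box_to_pixels(25, 50, 50, 25, 400, 200))
# → (100.0, 100.0, 300.0, 150.0)
```

Because the values are relative to the image size, the same annotation applies to an image at any resolution as long as the aspect ratio is unchanged.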
## Image Characteristics
In accordance with our data use agreement, images were downsampled to 25% of their original dimensions using cubic interpolation with anti-aliasing to maintain important structural details. Original images are commercially available through Gradient Health (Durham, NC, USA).
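
To make the scale concrete: reducing each dimension to 25% leaves 1/16 (6.25%) of the original pixel count. The sketch below only does this size bookkeeping and does not reproduce the resampling itself. The rounding rule is an assumption, since the exact rule applied to sizes that are not divisible by 4 is not stated, and the function names are hypothetical.

```python
SCALE = 0.25  # per-dimension scale factor stated in the dataset card


def downsampled_size(width, height, scale=SCALE):
    """Estimate the released image size from the original size.

    Assumption: sizes are rounded to the nearest integer, at least 1 pixel.
    """
    return max(1, round(width * scale)), max(1, round(height * scale))


def pixel_fraction(scale=SCALE):
    """Fraction of the original pixel count that remains after downsampling."""
    return scale * scale


print(downsampled_size(3000, 2400))  # → (750, 600)
print(pixel_fraction())              # → 0.0625
```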
## Report Structure
Each radiological report is structured into four key sections:
- **Indication**: Provides relevant patient background and reason for examination
- **Comparison**: References to previous studies for comparison
- **Findings**: Detailed radiological observations
- **Impression**: Summary of key conclusions and recommendations
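
The four sections map naturally onto a small record type. The class below is an illustrative sketch: the class name, field names, and `to_text` method are hypothetical and are not confirmed to match the column names in the dataset's metadata files.

```python
from dataclasses import dataclass


@dataclass
class RadiologyReport:
    """One report, split into the four sections described above."""

    indication: str = ""
    comparison: str = ""
    findings: str = ""
    impression: str = ""

    def to_text(self) -> str:
        """Render the non-empty sections as 'Section: text' lines."""
        sections = [
            ("Indication", self.indication),
            ("Comparison", self.comparison),
            ("Findings", self.findings),
            ("Impression", self.impression),
        ]
        return "\n".join(f"{name}: {text}" for name, text in sections if text)


# Placeholder text, not taken from the dataset.
report = RadiologyReport(findings="Clear lungs.", impression="No acute disease.")
print(report.to_text())
```

Keeping the sections separate makes it straightforward to train or evaluate on a single section, for example generating only the Findings or only the Impression.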
## Citation
If you use this dataset in your research, please cite:
```
@inproceedings{zhang2025rexgradient,
  title={ReXGradient-160K: A Large-Scale Publicly Available Dataset of Chest Radiographs with Free-text Reports},
  author={Zhang, Xiaoman and Acosta, Juli{\'a}n N. and Miller, Josh and Huang, Ouwen and Rajpurkar, Pranav},
  booktitle={arXiv:2505.00228v1},
  year={2025}
}
@article{zhang2024rexrank,
  title={ReXrank: A Public Leaderboard for AI-Powered Radiology Report Generation},
  author={Zhang, Xiaoman and Zhou, Hong-Yu and Yang, Xiaoli and Banerjee, Oishi and Acosta, Juli{\'a}n N. and Miller, Josh and Huang, Ouwen and Rajpurkar, Pranav},
  journal={AAAI Bridge Program AIMedHealth},
  year={2025}
}
```
## License and Terms of Use
This dataset is subject to the terms and conditions outlined in the Non-Commercial Data Access and Use Agreement at the top of this page.
## Acknowledgments
This dataset was provided to Harvard by Gradient Health. Users are required to recognize (i) the ReXGradient-160K Data Repository and (ii) the contribution of Gradient Health as the source of the Data in all publications or presentations.
This work was supported by the Biswas Family Foundation's Transformative Computational Biology Grant in Collaboration with the Milken Institute. We extend our sincerest thanks for making this work possible.
## Contact
For questions regarding the dataset, please contact: xiaoman_zhang@hms.harvard.edu with "ReXGradient-160K" in the subject line.
Learn more at [gradienthealth.io](https://gradienthealth.io)
Provider: maas
Created: 2025-05-03



