ReXGradient-160K

ModelScope community · Updated 2026-01-07 · Indexed 2025-05-10
Download link:
https://modelscope.cn/datasets/AI-ModelScope/ReXGradient-160K
Description:
## Overview

ReXGradient-160K is the largest publicly available multi-site chest X-ray dataset, containing 273,004 unique chest X-ray images from 160,000 radiological studies, collected from 109,487 unique patients across 3 U.S. health systems (79 medical sites). This comprehensive dataset includes multiple images per study and detailed radiology reports, making it particularly valuable for the development and evaluation of AI systems for medical imaging and automated report generation models.

To access the PNG images from these part files, you can:

- Concatenate the files into a single archive: `cat deid_png.part* > deid_png.tar`
- Extract the archive: `tar -xf deid_png.tar`

## Dataset Composition

The dataset is divided into three splits:

- **Training**: 140,000 studies, 238,968 images, 95,716 patients
- **Validation**: 10,000 studies, 17,007 images, 6,964 patients
- **Public Test**: 10,000 studies, 17,029 images, 6,807 patients

An additional private test set (10,000 studies) is reserved for model evaluation on the [ReXrank](https://rexrank.ai) benchmark.

> **Note**: a file containing radiologist-annotated bounding boxes for interstitial patterns for ~400 examples originally made for [RadGame](https://arxiv.org/abs/2509.13270) is available at `metadata/interstitial_pattern_bbox.json`. The JSON uses values following Label Studio's coordinate format.

## Image Characteristics

In accordance with our data use agreement, images were downsampled to 25% of their original dimensions using cubic interpolation with anti-aliasing to maintain important structural details. Original images are commercially available through Gradient Health (Durham, NC, USA).
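For environments without `cat` and `tar` (for example, Windows without a Unix shell), the two extraction steps from the Overview can be approximated in Python. This is a minimal sketch, not part of the dataset's tooling: the function names `reassemble_parts` and `extract_archive` are hypothetical, and it assumes the part files sort into the correct order by name, which the shell glob `deid_png.part*` also relies on.

```python
import glob
import shutil
import tarfile
from pathlib import Path


def reassemble_parts(part_glob: str, archive_path: str) -> int:
    """Concatenate split archive parts into one file; return the part count.

    Assumption: sorting the part names lexicographically gives the
    original order (the same assumption `cat deid_png.part*` makes).
    """
    parts = sorted(glob.glob(part_glob))
    if not parts:
        raise FileNotFoundError(f"no files match {part_glob!r}")
    with open(archive_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream in chunks so large parts never sit fully in memory.
                shutil.copyfileobj(src, out)
    return len(parts)


def extract_archive(archive_path: str, dest_dir: str) -> None:
    """Extract a tar archive into dest_dir, creating the directory if needed."""
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path) as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support safer extraction filters.
            tar.extractall(dest_dir, filter="data")
        else:
            tar.extractall(dest_dir)
```

A call such as `reassemble_parts("deid_png.part*", "deid_png.tar")` followed by `extract_archive("deid_png.tar", "deid_png")` mirrors the two shell commands; the output directory name here is only an example.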
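The bounding-box note in Dataset Composition says the JSON follows Label Studio's coordinate format. In Label Studio, a rectangle's `x`, `y`, `width`, and `height` are stored as percentages (0 to 100) of the image dimensions rather than as pixels. The sketch below converts one such box to pixel coordinates. The helper name and the `PixelBox` type are hypothetical, and the exact layout of `metadata/interstitial_pattern_bbox.json` is not described here, so the caller is assumed to have already read the four values out of it.

```python
from typing import NamedTuple


class PixelBox(NamedTuple):
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    left: float
    top: float
    right: float
    bottom: float


def label_studio_to_pixels(
    x: float,
    y: float,
    width: float,
    height: float,
    image_width: int,
    image_height: int,
) -> PixelBox:
    """Convert a percentage-based rectangle to pixel coordinates.

    x, y, width, height are percentages (0-100) of the image size;
    (x, y) is the top-left corner. Rotation is ignored in this sketch.
    """
    left = x * image_width / 100.0
    top = y * image_height / 100.0
    right = left + width * image_width / 100.0
    bottom = top + height * image_height / 100.0
    return PixelBox(left, top, right, bottom)
```

Because the values are relative, the same box should apply to the downsampled PNGs as long as `image_width` and `image_height` are those of the image actually loaded.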
## Report Structure

Each radiological report is structured into four key sections:

- **Indication**: Provides relevant patient background and reason for examination
- **Comparison**: References to previous studies for comparison
- **Findings**: Detailed radiological observations
- **Impression**: Summary of key conclusions and recommendations

## Citation

If you use this dataset in your research, please cite:

```
@inproceedings{zhang2025rexgradient,
  title={ReXGradient-160K: A Large-Scale Publicly Available Dataset of Chest Radiographs with Free-text Reports},
  author={Zhang, Xiaoman and Acosta, Julián N. and Miller, Josh and Huang, Ouwen and Rajpurkar, Pranav},
  booktitle={arXiv:2505.00228v1},
  year={2025}
}

@article{zhang2024rexrank,
  title={ReXrank: A Public Leaderboard for AI-Powered Radiology Report Generation},
  author={Zhang, Xiaoman and Zhou, Hong-Yu and Yang, Xiaoli and Banerjee, Oishi and Acosta, Juli{\'a}n N and Miller, Josh and Huang, Ouwen and Rajpurkar, Pranav},
  journal={AAAI Bridge Program AIMedHealth},
  year={2025}
}
```

## License and Terms of Use

This dataset is subject to the terms and conditions outlined in the Non-Commercial Data Access and Use Agreement at the top of this page.

## Acknowledgments

This dataset was provided to Harvard by Gradient Health. Users are required to recognize (i) the ReXGradient-160K Data Repository and (ii) the contribution of Gradient Health as the source of the Data in all publications or presentations.

This work was supported by the Biswas Family Foundation's Transformative Computational Biology Grant in Collaboration with the Milken Institute. We extend our sincerest thanks for making this work possible.

## Contact

For questions regarding the dataset, please contact: xiaoman_zhang@hms.harvard.edu with "ReXGradient-160K" in the subject line.

Learn more at [gradienthealth.io](https://gradienthealth.io)
Provider:
maas
Created:
2025-05-03