RadGraph-XL: A Large-Scale Expert-Annotated Dataset for Entity and Relation Extraction from Radiology Reports

Source: DataCite Commons. Updated: 2025-09-12. Indexed: 2026-05-04.
Download link:
https://physionet.org/content/radgraph-xl/
Description:
Radiology reports are essential for clinical care but pose challenges for automated processing due to their unstructured nature. Existing datasets like RadGraph-1.0 focus narrowly on chest X-rays (CXR), limiting their applicability. We introduce RadGraph-XL, a large-scale, expert-annotated dataset of 2,300 radiology reports with over 410,000 labeled entities and relations, spanning four anatomy-modality pairs: chest computed tomography (CT), abdomen/pelvis CT, brain magnetic resonance imaging (MR), and CXR. Each report is annotated by board-certified radiologists using a detailed schema that captures observations, anatomical references, and their relationships. A novel post-processing step identifies measurement-related entities, a clinically valuable category. Models trained on RadGraph-XL outperform prior methods and GPT-4, and generalize well to out-of-domain data such as deep vein thrombosis (DVT) ultrasound reports. RadGraph-XL is released publicly with models and annotations to support applications in clinical natural language processing (NLP), medical imaging artificial intelligence, and foundation model evaluation, setting a new benchmark for structured information extraction in radiology.
Provider:
PhysioNet
Created:
2025-08-29