RadGraph-XL: A Large-Scale Expert-Annotated Dataset for Entity and Relation Extraction from Radiology Reports
Source: DataCite Commons. Updated: 2025-09-12. Indexed: 2026-05-04.
Download link:
https://physionet.org/content/radgraph-xl/
Description:
Radiology reports are essential for clinical care but pose challenges for
automated processing due to their unstructured nature. Existing datasets like
RadGraph-1.0 focus narrowly on chest X-rays (CXR), limiting their
applicability. We introduce RadGraph-XL, a large-scale, expert-annotated
dataset of 2,300 radiology reports with over 410,000 labeled entities and
relations, spanning four anatomy-modality pairs: chest computed tomography
(CT), abdomen/pelvis CT, brain magnetic resonance imaging (MR), and CXR.
Each report is annotated by board-certified radiologists using a detailed
schema that captures observations, anatomical references, and their
relationships. A novel post-processing step identifies measurement-related
entities, a clinically valuable category. Models trained on RadGraph-XL
outperform prior methods and GPT-4, and generalize well to out-of-domain data
such as deep vein thrombosis (DVT) ultrasound reports.
RadGraph-XL is released publicly with models and annotations to support
applications in clinical natural language processing (NLP), medical imaging
artificial intelligence, and foundation model evaluation, setting a new
benchmark for structured information extraction in radiology.
Provider:
PhysioNet
Created:
2025-08-29



