RadGraph: Extracting Clinical Entities and Relations from Radiology Reports
Source: DataCite Commons. Updated: 2021-12-16. Indexed: 2025-04-16.
Download link:
https://physionet.org/content/radgraph/1.0.0/
Description:
RadGraph is a dataset of entities and relations in full-text radiology
reports. We designed a novel information extraction (IE) schema to structure
clinical information in a radiology report with four entities and three
relations. Our train set consists of 500 MIMIC-CXR radiology reports annotated
according to our schema by board-certified radiologists. Our test set consists
of 50 MIMIC-CXR and 50 CheXpert reports, which are independently annotated by
two board-certified radiologists. Additionally, we release annotations
generated by a benchmark deep learning model that achieves a micro F1 of 0.82
(MIMIC-CXR test set) and 0.73 (CheXpert test set) on an evaluation metric for
end-to-end relation extraction, where entity boundaries, entity types, and
relation type must be correct. We use our model to automatically generate
entity and relation labels across 220,763 MIMIC-CXR reports and 500 CheXpert
reports, where annotations can be mapped to associated chest radiographs in
the MIMIC-CXR and CheXpert datasets, respectively. The dataset, which includes
reports, entities, and relations, is de-identified according to the US Health
Insurance Portability and Accountability Act (HIPAA). This dataset is intended to support the
development of natural language processing (NLP) methods for entity and
relation extraction in radiology as well as enable multi-modal use cases that
can leverage entities, relations, and associated radiographs.
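
The end-to-end relation extraction metric mentioned above counts a predicted relation as correct only when both entity spans, both entity types, and the relation type match the reference. The following is a minimal sketch of such a strict micro F1, not the authors' evaluation script. The `Entity` and `Relation` structures, the `micro_f1` function, and the labels `ENTITY_A`, `ENTITY_B`, and `REL_1` are hypothetical placeholders chosen for illustration, and the sketch does not reflect the dataset's actual file format or label names.

```python
from typing import List, NamedTuple, Set


class Entity(NamedTuple):
    start: int   # index of the first token of the span (hypothetical convention)
    end: int     # index of the last token of the span
    label: str   # entity type


class Relation(NamedTuple):
    head: Entity
    tail: Entity
    label: str   # relation type


def micro_f1(gold_docs: List[Set[Relation]], pred_docs: List[Set[Relation]]) -> float:
    """Strict micro F1: counts are pooled over all reports before computing F1.

    Two relations are equal only if spans, entity types, and the relation
    type all match, because NamedTuple equality compares every field.
    """
    tp = fp = fn = 0
    for gold, pred in zip(gold_docs, pred_docs):
        tp += len(gold & pred)   # exact matches
        fp += len(pred - gold)   # predicted but not in the reference
        fn += len(gold - pred)   # in the reference but not predicted
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data with placeholder labels (assumptions, not the dataset's schema).
a = Entity(0, 1, "ENTITY_A")
b = Entity(3, 3, "ENTITY_B")
b_wrong_type = Entity(3, 3, "ENTITY_A")  # same span as b, wrong entity type
c = Entity(5, 6, "ENTITY_B")

gold_docs = [
    {Relation(a, b, "REL_1"), Relation(a, c, "REL_1")},
    {Relation(a, b, "REL_1")},
]
pred_docs = [
    {Relation(a, b_wrong_type, "REL_1"), Relation(a, c, "REL_1")},
    {Relation(a, b, "REL_1")},
]

score = micro_f1(gold_docs, pred_docs)
print(f"strict micro F1: {score:.2f}")
```

In this toy example, the prediction with the wrong entity type on its tail counts as both a false positive and a false negative, even though its span and relation type are right. This is what makes an end-to-end metric stricter than scoring entities and relations separately.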
Provider:
PhysioNet
Created:
2021-06-03



