MIMIC-IV-Ext-PE: Pulmonary Embolism Labels for CT Pulmonary Angiography Radiology Reports
Source: DataCite Commons. Updated: 2026-03-24. Indexed: 2026-05-04.
Download link:
https://physionet.org/content/mimic-iv-ext-pe/
Description:
Pulmonary embolism (PE) is a leading cause of preventable in-hospital
mortality. Advances in diagnosis, risk stratification, and prevention can
improve outcomes. Large, publicly available datasets are needed to move
research forward but are lacking in the field of hemostasis and thrombosis. In
this study, we added PE labels to computed tomography pulmonary angiography
(CTPA) radiology reports in MIMIC-IV. We used regular expressions (RegEx) to
identify CTPA radiology reports (n=19,942) and extracted sentences containing
PE-related words ("snippets"). Two physicians manually reviewed these
snippets, referring to the full report as needed, and labeled each report as
PE positive or negative. Positive labels included any acute PE (n=1,591).
Acute PE that involved only subsegmental arteries was labeled as
subsegmental. Negative labels included chronic PE, equivocal findings, and no
PE (n=18,351). Using these labels as a gold standard, we then compared the
performance of a fine-tuned transformer model with that of diagnosis codes in
classifying the reports as PE positive or negative.
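
The snippet-extraction step described above can be illustrated with a minimal Python sketch. The pattern `PE_PATTERN`, the sentence splitter, and the function name `extract_snippets` are all assumptions made for illustration; the description does not give the authors' actual expressions, and real radiology reports would likely need a more careful sentence splitter.

```python
import re

# Hypothetical pattern for PE-related words; NOT the expressions used by the
# dataset authors, which are not given in this description.
PE_PATTERN = re.compile(
    r"\b(?:pulmonary\s+embol\w*|embol\w*|PE|filling\s+defects?)\b",
    re.IGNORECASE,
)


def extract_snippets(report: str) -> list[str]:
    """Return the sentences of a report that contain a PE-related word."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", report.strip())
    return [s for s in sentences if PE_PATTERN.search(s)]


if __name__ == "__main__":
    example = (
        "CT chest with contrast. No evidence of pulmonary embolism. "
        "The lungs are clear."
    )
    for snippet in extract_snippets(example):
        print(snippet)
```

In a workflow like the one described, the extracted sentences would then go to human reviewers, who consult the full report when a snippet alone is ambiguous.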
Provider:
PhysioNet
Created:
2026-01-06



