Application of Med-PaLM 2 in the refinement of MIMIC-CXR labels

Name: Application of Med-PaLM 2 in the refinement of MIMIC-CXR labels
Creator: PhysioNet
Published: 2025-02-04 18:56:29
License: No description available

DataCite Commons. Updated 2025-02-04; indexed 2025-04-16.

Download link:

https://physionet.org/content/med-palm2-mimic-cxr/


Description:

MIMIC-CXR is a large, open-source dataset that is widely used in medical AI research. One limitation of this dataset is the lack of ground truth labels for the chest X-ray studies. Prior work has extracted structured labels from the MIMIC-CXR radiology report text using CheXpert, a natural language processing (NLP) model. As comprehensive expert validation of these labels is cost-prohibitive, there is a need for scalable methods of identifying NLP-derived labels that would benefit from manual review. We have developed prompts for extracting clinically relevant labels using a clinically trained large language model, Med-PaLM 2, which we selectively applied to MIMIC-CXR radiology reports. A subset of cases where the Med-PaLM 2 results differed from the previously published CheXpert labels was reviewed by three US board-certified radiologists to establish a ground truth. On these differing labels, Med-PaLM 2 achieved an accuracy of 66%, compared with 19% for CheXpert. Our results demonstrate the potential use of medically oriented large language models such as Med-PaLM 2 both in label extraction and in identifying cases for manual review. This dataset contributes 1,378 radiologist-verified ground truth labels to the MIMIC-CXR project.
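The comparison described above (flag the labels on which two extractors disagree, then score each extractor against radiologist ground truth on that subset only) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the `(study_id, finding)` key layout, the label strings, the function names, and the toy values are all assumptions made for the example and are not taken from the dataset files.

```python
from typing import Dict, List, Tuple

# Hypothetical key layout: (study_id, finding). The real files may differ.
Key = Tuple[str, str]


def find_disagreements(a: Dict[Key, str], b: Dict[Key, str]) -> List[Key]:
    """Return keys labeled by both extractors whose label values differ."""
    return sorted(k for k in a.keys() & b.keys() if a[k] != b[k])


def accuracy_on(keys: List[Key], predicted: Dict[Key, str],
                truth: Dict[Key, str]) -> float:
    """Fraction of the given keys where the prediction matches ground truth.

    Keys without a ground-truth entry are skipped.
    """
    scored = [k for k in keys if k in truth]
    if not scored:
        raise ValueError("no ground-truth labels for the given keys")
    return sum(predicted[k] == truth[k] for k in scored) / len(scored)


# Toy values, invented purely for illustration.
chexpert = {
    ("s1", "Edema"): "positive",
    ("s1", "Pneumonia"): "negative",
    ("s2", "Edema"): "uncertain",
    ("s3", "Pneumothorax"): "negative",
}
llm = {
    ("s1", "Edema"): "positive",
    ("s1", "Pneumonia"): "positive",
    ("s2", "Edema"): "negative",
    ("s3", "Pneumothorax"): "positive",
}
# Ground truth exists only for the reviewed (disagreeing) cases.
truth = {
    ("s1", "Pneumonia"): "positive",
    ("s2", "Edema"): "positive",
    ("s3", "Pneumothorax"): "negative",
}

to_review = find_disagreements(chexpert, llm)
llm_acc = accuracy_on(to_review, llm, truth)
chexpert_acc = accuracy_on(to_review, chexpert, truth)
```

Because each label can take more than two values, both extractors can be wrong on the same disagreeing case (as for `("s2", "Edema")` in the toy data), so the two accuracies need not sum to 100%. This is consistent with the 66% and 19% figures reported above.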

Provider:

PhysioNet

Created:

2025-01-30
