MIMIC-III-Ext-VeriFact-BHC: Labeled Propositions From Brief Hospital Course Summaries for Long-form Clinical Text Evaluation
DataCite Commons. Updated 2025-04-09. Indexed 2025-04-16.
Download link:
https://physionet.org/content/mimic-iii-ext-verifact-bhc/
Resource description:
The _VeriFact-BHC_ dataset is designed to verify the factuality of long-form
text written about a patient against their own electronic health record. There
is increasing interest in using large language models (LLMs) to generate
clinical text in patient care applications, yet this text needs to be
evaluated for factual errors and hallucinations prior to committing text to a
patient's permanent medical record. Text written about a patient should be
internally consistent with information already known about the patient, such
as that stored in their medical records.
_VeriFact-BHC_ contains long-form Brief Hospital Course (BHC) clinical
narratives typically found in a discharge summary that have been decomposed
into text proposition statements. From 100 patients in the MIMIC-III Clinical
Database v1.4, we consider two types of BHC text: a human-written BHC and an
LLM-generated BHC. The original human clinician-written BHC is extracted from
the discharge summary note. The LLM-generated BHC is composed by an LLM using
the patient's longitudinal clinical notes from the hospital admission. Each
BHC is decomposed in two ways: sentence propositions and atomic claim
propositions. The remaining electronic health record (EHR) notes for each
patient serve as a patient-specific reference of facts that is used by
clinicians and _VeriFact_ to assign labels. A total of 13,070 propositions are
annotated by multiple clinicians with a ground truth established via majority
voting and manual adjudication. Also provided are labels assigned by the
_VeriFact_ artificial intelligence system and labels assessing whether
propositions are valid from a first-order logic standpoint. The reference EHR
for each patient is provided in both machine-readable and PDF formats.
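As a rough illustration of the majority-voting step described above, the sketch below shows one way several annotator labels for a single proposition could be reduced to one label, with unresolved cases set aside for manual adjudication. The function name `majority_label`, the strict-majority rule, and the label strings in the usage note are assumptions made for illustration; they are not taken from the dataset's documentation and may not match the authors' actual procedure.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(votes: Sequence[str]) -> Optional[str]:
    """Return the label chosen by a strict majority of annotators.

    Returns None when no label has more than half of the votes, which
    in this sketch marks the proposition for manual adjudication.
    (Hypothetical helper; not part of the dataset or of VeriFact.)
    """
    if not votes:
        return None
    # most_common(1) yields the single most frequent label and its count.
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None
```

With three annotators, for example, `majority_label(["label_a", "label_a", "label_b"])` returns `"label_a"`, while three different labels return `None` and would go to adjudication under this assumed rule.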
By offering this dataset, we hope to spur further investigation and creation
of computational systems for automatic chart review and patient-specific fact
verification. We invite the research community to utilize this dataset to
develop better methods to guardrail patient-specific LLM-generated clinical
text.
Provider:
PhysioNet
Created:
2025-03-24



