MIMIC-III-Ext-VeriFact-BHC: Labeled Propositions From Brief Hospital Course Summaries for Long-form Clinical Text Evaluation
Source: DataCite Commons. Updated: 2025-04-09. Indexed: 2025-04-16.
Download link:
https://physionet.org/content/mimic-iii-ext-verifact-bhc/1.0.0/
Description:
The _VeriFact-BHC_ dataset is designed to verify the factuality of long-form
text written about a patient against their own electronic health record. There
is increasing interest in using large language models (LLMs) to generate
clinical text in patient care applications, yet this text needs to be
evaluated for factual errors and hallucinations prior to committing text to a
patient's permanent medical record. Text written about a patient should be
internally consistent with information already known about the patient, such
as that stored in their medical records.
_VeriFact-BHC_ contains long-form Brief Hospital Course (BHC) clinical
narratives typically found in a discharge summary that have been decomposed
into text proposition statements. From 100 patients in the MIMIC-III Clinical
Database v1.4, we consider two types of BHC text: a human-written BHC and an
LLM-generated BHC. The original human clinician-written BHC is extracted from
the discharge summary note. The LLM-generated BHC is composed by an LLM using
the patient's longitudinal clinical notes from the hospital admission. Each
BHC is decomposed in two ways: sentence propositions and atomic claim
propositions. The remaining electronic health record (EHR) notes for each
patient serve as a patient-specific reference of facts that is used by
clinicians and _VeriFact_ to assign labels. A total of 13,070 propositions are
annotated by multiple clinicians with a ground truth established via majority
voting and manual adjudication. Also provided are labels assigned by the
_VeriFact_ artificial intelligence system and labels assessing whether
propositions are valid from a first-order logic standpoint. The reference EHR
for each patient is provided in both machine-readable and PDF formats.
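
The majority-voting step described above can be illustrated with a minimal Python sketch. It is not the dataset authors' code: the label names (`supported`, `not_supported`, `not_addressed`), the proposition IDs, and the strict-majority rule are assumptions made for illustration, and propositions without a majority are simply flagged for manual adjudication.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(votes: Sequence[str]) -> Optional[str]:
    """Return the label chosen by a strict majority of annotators, else None."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    # Require more than half of the votes; ties and splits return None.
    return label if count > len(votes) / 2 else None


# Hypothetical annotations: label names and proposition IDs are illustrative only.
votes_by_proposition = {
    "prop-001": ["supported", "supported", "not_supported"],
    "prop-002": ["supported", "not_supported", "not_addressed"],
}

ground_truth = {pid: majority_label(v) for pid, v in votes_by_proposition.items()}

# Propositions with no majority would go to manual adjudication.
needs_adjudication = [pid for pid, label in ground_truth.items() if label is None]
```

In this sketch, `prop-001` resolves by majority, while `prop-002` has no majority and lands in `needs_adjudication`.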
By offering this dataset, we hope to spur further investigation and creation
of computational systems for automatic chart review and patient-specific fact
verification. We invite the research community to utilize this dataset to
develop better methods to guardrail patient-specific LLM-generated clinical
text.
Provider:
PhysioNet
Created:
2025-03-24



