ArchEHR-QA: A Dataset for Addressing Patient's Information Needs related to Clinical Course of Hospitalization
Source: DataCite Commons. Updated: 2026-01-02. Indexed: 2026-05-04.
Download link:
https://physionet.org/content/archehr-qa-bionlp-task-2025/1.3/
Description:
Patients' unique information needs about their hospitalization can be
addressed using clinical evidence from electronic health records (EHRs) and
artificial intelligence (AI). However, robust datasets to assess the
factuality and relevance of AI-generated responses are lacking and, to our
knowledge, none capture patient information needs in the context of their
EHRs. To address this gap, we introduce ArchEHR-QA, an expert-annotated
dataset of 134 cases from intensive care unit and emergency department
settings to evaluate the grounding capabilities of models for responding to
patient-initiated queries. The dataset consists of patient-initiated questions
posted in the public domain, the corresponding clinician-interpreted questions,
the excerpts of the EHRs annotated at the sentence-level with relevance to the
question, and clinician-generated free-text answers to the questions grounded
with EHR sentences. We collect true patient health information needs expressed
in real-world health forum messages and then align the messages to publicly
accessible real EHRs. To our knowledge, this is the first public dataset that
encapsulates patient questions and relevant clinical evidence from EHRs. We
further provide an evaluation framework to assess two critical aspects of a
grounded EHR QA system: does it identify relevant information from given
clinical evidence and does it use this information in responding to user
queries.
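
As an illustration of the first aspect (identifying relevant evidence), the following is a minimal sketch of a sentence-level precision, recall, and F1 score. It compares the EHR sentence IDs a system cites against the sentence IDs annotated as relevant. The function name `citation_prf`, the use of string IDs, and the choice of F1 as the metric are assumptions made for illustration; the description above does not specify the official scoring procedure or data format.

```python
def citation_prf(cited_ids, relevant_ids):
    """Score cited sentence IDs against annotated relevant sentence IDs.

    Hypothetical helper: both arguments are iterables of sentence
    identifiers for a single case. Returns (precision, recall, f1).
    """
    cited = set(cited_ids)
    relevant = set(relevant_ids)
    true_positives = len(cited & relevant)

    # Guard against empty sets so the function never divides by zero.
    precision = true_positives / len(cited) if cited else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up IDs for one case: the system cites sentences 1, 3, and 4,
# while annotators marked sentences 1, 2, and 3 as relevant.
p, r, f1 = citation_prf(["1", "3", "4"], ["1", "2", "3"])
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

The second aspect, whether the answer text actually uses the cited evidence, would need a separate comparison between the generated answer and the clinician-written answer, which this sketch does not attempt.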
Provider:
PhysioNet
Created:
2025-12-17



