
ArchEHR-QA: A Dataset for Addressing Patient's Information Needs related to Clinical Course of Hospitalization

Source: DataCite Commons. Updated: 2026-01-02. Indexed: 2026-05-04.
Download link:
https://physionet.org/content/archehr-qa-bionlp-task-2025/1.3/
Description:
Patients' unique information needs about their hospitalization can be addressed using clinical evidence from electronic health records (EHRs) and artificial intelligence (AI). However, robust datasets to assess the factuality and relevance of AI-generated responses are lacking and, to our knowledge, none capture patient information needs in the context of their EHRs. To address this gap, we introduce ArchEHR-QA, an expert-annotated dataset of 134 cases from intensive care unit and emergency department settings for evaluating how well models ground their responses to patient-initiated queries. The dataset consists of patient-initiated questions posted in the public domain, the corresponding clinician-interpreted questions, EHR excerpts annotated at the sentence level for relevance to the question, and clinician-written free-text answers grounded in EHR sentences. We collect true patient health information needs expressed in real-world health forum messages and then align the messages to publicly accessible real EHRs. To our knowledge, this is the first public dataset that encapsulates patient questions and relevant clinical evidence from EHRs. We further provide an evaluation framework to assess two critical aspects of a grounded EHR QA system: whether it identifies relevant information in the given clinical evidence, and whether it uses this information in responding to user queries.
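The first evaluation aspect, identifying relevant evidence, can be illustrated with a minimal Python sketch. It is not the dataset's official evaluation code. The `Case` class, its field names, the `citation_prf` helper, and the sentence IDs are all hypothetical, and precision/recall/F1 over cited sentence IDs is assumed here only as one plausible way to score evidence identification.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Case:
    """Hypothetical in-memory view of one case; field names are assumptions."""
    patient_question: str
    clinician_question: str
    relevant_sentence_ids: set[int]  # EHR sentences annotated as relevant


def citation_prf(cited: set[int], relevant: set[int]) -> tuple[float, float, float]:
    """Precision, recall, and F1 of cited sentence IDs against annotated ones."""
    true_pos = len(cited & relevant)
    precision = true_pos / len(cited) if cited else 0.0
    recall = true_pos / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative placeholder values only; not taken from the dataset.
case = Case(
    patient_question="(patient question text)",
    clinician_question="(clinician-interpreted question text)",
    relevant_sentence_ids={1, 4, 5},
)
precision, recall, f1 = citation_prf(
    cited={1, 2, 4}, relevant=case.relevant_sentence_ids
)
```

Scoring cited sentences against the annotations only covers whether a system found the right evidence. Whether the answer text actually uses that evidence would need a separate comparison against the clinician-written answers.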
Provider:
PhysioNet
Created:
2025-12-17