MIMIC-IV-Note: Deidentified free-text clinical notes
Source: DataCite Commons. Updated: 2024-12-22. Indexed: 2024-07-13.
Download link:
https://physionet.org/content/mimic-iv-note/2.2/
Description:
The advent of large, open access text databases has driven advances in state-
of-the-art model performance in natural language processing (NLP). The
relatively limited amount of clinical data available for NLP has been cited as
a significant barrier to the field's progress. Here we describe MIMIC-IV-Note:
a collection of deidentified free-text clinical notes for patients included in
the MIMIC-IV clinical database. MIMIC-IV-Note contains 331,794 deidentified
discharge summaries from 145,915 patients admitted to the hospital and
emergency department at the Beth Israel Deaconess Medical Center in Boston,
MA, USA. The database also contains 2,321,355 deidentified radiology reports
for 237,427 patients. All notes have had protected health information removed
in accordance with the Health Insurance Portability and Accountability Act
(HIPAA) Safe Harbor provision. All notes are linkable to MIMIC-IV, providing
important context to the clinical data therein. The database is intended to
stimulate research in clinical natural language processing and associated
areas.
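
The statement that notes are linkable to MIMIC-IV can be illustrated with a minimal sketch. It assumes that each note row and each admission row share patient and admission identifiers; the column names used here (note_id, subject_id, hadm_id, text, admittime) and all rows are hypothetical, synthetic stand-ins and should be checked against the actual files before use.

```python
import csv
import io

# Synthetic stand-ins for a notes table and an admissions table.
# Column names and values are assumptions for illustration only;
# they are not taken from the real dataset.
NOTES_CSV = """note_id,subject_id,hadm_id,text
N1,100,5001,Example discharge summary A
N2,100,5002,Example discharge summary B
N3,200,6001,Example discharge summary C
N4,300,7001,Example note with no matching admission
"""

ADMISSIONS_CSV = """subject_id,hadm_id,admittime
100,5001,2180-01-01 10:00:00
100,5002,2180-06-15 08:30:00
200,6001,2175-03-20 14:45:00
"""


def read_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def link_notes_to_admissions(notes, admissions):
    """Inner-join notes to admissions on (subject_id, hadm_id)."""
    by_key = {(a["subject_id"], a["hadm_id"]): a for a in admissions}
    linked = []
    for note in notes:
        adm = by_key.get((note["subject_id"], note["hadm_id"]))
        if adm is not None:
            # Attach the admission time as clinical context for the note.
            linked.append({**note, "admittime": adm["admittime"]})
    return linked


notes = read_rows(NOTES_CSV)
admissions = read_rows(ADMISSIONS_CSV)
linked = link_notes_to_admissions(notes, admissions)

for row in linked:
    print(row["note_id"], row["hadm_id"], row["admittime"])
```

The sketch uses an inner join, so a note without a matching admission row is dropped; a left join would keep such notes with an empty admission time instead.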
Provider:
PhysioNet
Created:
2023-01-06
Summary
Dataset Introduction

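Background and Challenges
Background Overview
MIMIC-IV-Note is a large-scale deidentified clinical text dataset containing more than 330,000 discharge summaries and more than 2.32 million radiology reports, intended to support clinical natural language processing research. All data have been deidentified and are linked to the MIMIC-IV clinical database, which provides important clinical context for research.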
The summary above was collected and generated by 遇见数据集.



