MIMIC-III and eICU-CRD: Feature Representation by FIDDLE Preprocessing
Source: physionet.org (indexed 2025-01-21)
Download link:
https://physionet.org/content/mimic-eicu-fiddle-feature/1.0.0/
Description:
This is a preprocessed dataset derived from patient records in MIMIC-III and eICU, two large-scale electronic health record (EHR) databases. It contains features and labels for 5 prediction tasks involving 3 adverse outcomes (prediction times listed in parentheses): in-hospital mortality (48h), acute respiratory failure (4h and 12h), and shock (4h and 12h). We extracted comprehensive, high-dimensional feature representations (up to ~8,000 features) using FIDDLE (FlexIble Data-Driven pipeLinE), an open-source preprocessing pipeline for structured clinical data. These 5 prediction tasks were designed in consultation with a critical care physician for their clinical importance, and were used as part of the proof-of-concept experiments in the original paper to demonstrate FIDDLE's utility in aiding the feature engineering step of machine learning model development. The intent of this release is to share preprocessed MIMIC-III and eICU datasets used in the experiments to support and enable reproducible machine learning research on EHR data.
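As a rough illustration of how the five tasks described above break down, the following sketch enumerates each outcome with its prediction time. The class and variable names (PredictionTask, TASKS) are hypothetical and chosen only for this example; they do not correspond to file names or identifiers in the released dataset or in FIDDLE.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictionTask:
    """One (outcome, prediction time) pair from the dataset description."""
    outcome: str
    horizon_hours: int


# The five tasks as listed in the description; names here are illustrative only.
TASKS = [
    PredictionTask("in-hospital mortality", 48),
    PredictionTask("acute respiratory failure", 4),
    PredictionTask("acute respiratory failure", 12),
    PredictionTask("shock", 4),
    PredictionTask("shock", 12),
]

# Group prediction times by outcome to show the 3-outcome / 5-task structure.
horizons_by_outcome = {}
for task in TASKS:
    horizons_by_outcome.setdefault(task.outcome, []).append(task.horizon_hours)

for outcome, horizons in horizons_by_outcome.items():
    print(f"{outcome}: {horizons} h")
```

Grouping by outcome makes the structure explicit: three adverse outcomes, two of which are predicted at two different times, for five tasks in total.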
Provider:
physionet.org



