EchoNext: A Dataset for Detecting Echocardiogram-Confirmed Structural Heart Disease from ECGs
Source: DataCite Commons; updated 2026-04-30; indexed 2026-05-04
Download link:
https://physionet.org/content/echonext/1.1.1/
Description:
This dataset contains a de-identified collection of **100,000** 12-lead
electrocardiograms (ECGs) with paired structural heart disease (SHD) labels
derived from echocardiography, collected at Columbia University Irving Medical
Center. Each ECG is provided with raw waveform data sampled at 250 Hz across
all 12 leads, along with accompanying demographic and ECG-specific tabular
metadata, including age, sex, heart rate, PR interval, QRS duration, and
corrected QT interval. Each ECG is annotated with a binary label indicating
the presence or absence of structural heart disease based on echocardiographic
findings, including reduced left ventricular ejection fraction, increased
ventricular wall thickness, significant valvular disease, right ventricular
dysfunction, pulmonary hypertension, or pericardial effusion.
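
As a minimal sketch of the labeling described above, the snippet below treats the binary SHD label as positive when any one of the six listed echocardiographic findings is present, and computes the waveform array shape implied by 250 Hz sampling across 12 leads. The field names, the "any finding" rule, and the recording duration passed to the helper are illustrative assumptions; the released files may name and encode these differently.

```python
from typing import Mapping, Tuple

# Hypothetical field names for the six findings listed in the description.
# The actual column names in the released metadata may differ.
SHD_COMPONENTS = (
    "reduced_lvef",
    "increased_wall_thickness",
    "significant_valvular_disease",
    "rv_dysfunction",
    "pulmonary_hypertension",
    "pericardial_effusion",
)

SAMPLING_RATE_HZ = 250  # stated in the dataset description
N_LEADS = 12            # stated in the dataset description


def composite_shd_label(findings: Mapping[str, bool]) -> int:
    """Return 1 if any listed finding is present, else 0 (assumed rule)."""
    return int(any(findings.get(name, False) for name in SHD_COMPONENTS))


def expected_waveform_shape(duration_s: float) -> Tuple[int, int]:
    """Return (samples, leads) for a recording of the given duration.

    The duration is a caller-supplied assumption; the description does not
    state how long each recording is.
    """
    return (round(duration_s * SAMPLING_RATE_HZ), N_LEADS)
```

For example, `composite_shd_label({"pericardial_effusion": True})` gives a positive label under this assumed rule, while an empty mapping gives a negative one.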
This dataset was developed as part of the creation of the **Columbia
Mini-Model**, a lightweight deep learning model for SHD detection from ECGs.
The dataset represents a simplified, focused subset of the larger EchoNext
training population and was used to evaluate model performance in
resource-constrained settings or smaller-scale deployment environments. It is being
released to promote transparency and reproducibility, support further research
in cardiovascular AI, and enable benchmarking of lightweight ECG-based
screening models for structural heart disease.
Provider:
PhysioNet
Created:
2026-04-22



