Preprocessed CPSC and PTB-XL Data
Source: DataCite Commons. Updated 2025-06-01; indexed 2025-01-06.
Download link:
https://figshare.com/articles/dataset/Preprocessed_CPSC_Data/25532869/3
Description:
CPSC 2018

The first dataset is a preprocessed version of the CPSC 2018 dataset, which contains 6877 ECG recordings. We preprocessed the dataset by resampling the ECG signals to 250 Hz and equalizing the signal length to 60 seconds, yielding T = 15,000 data points per recording (250 Hz × 60 s). For the hyperparameter study, we used a fixed train-valid-test split with a 60-20-20 ratio; for the final evaluations, including the comparison with state-of-the-art methods and the ablation studies, we used 10-fold cross-validation.

The raw CPSC 2018 dataset can be downloaded from the website of the PhysioNet/Computing in Cardiology Challenge 2020 (License: Creative Commons Attribution 4.0 International Public License).

PTB-XL (Super-Diag.)

The second dataset is a preprocessed version of PTB-XL, a large multi-label dataset of 21,799 clinical 12-lead ECG records of 10 seconds each. PTB-XL contains 71 ECG statements, categorized into 44 diagnostic, 19 form, and 12 rhythm classes (the categories overlap, which is why these counts sum to more than 71). In addition, the diagnostic category can be divided into 24 sub-classes and 5 coarse-grained super-classes. In our preprocessed version, we use the super-diagnostic labels for classification and the recommended train-valid-test splits, with signals sampled at 100 Hz. We keep only samples with at least one label in the super-diagnostic category and apply no further preprocessing.

The raw PTB-XL dataset can be downloaded from the PhysioNet/PTB-XL website (License: Creative Commons Attribution 4.0 International Public License).
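The CPSC 2018 preprocessing described above (resampling to 250 Hz, then equalizing to 60 seconds so that T = 15,000) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the description does not state the resampling method or how short and long recordings are equalized, so linear interpolation, cropping at the end, and trailing zero-padding are all assumptions, as are the function names.

```python
import numpy as np

TARGET_FS = 250                           # Hz, from the dataset description
TARGET_SECONDS = 60                       # s, from the dataset description
TARGET_LEN = TARGET_FS * TARGET_SECONDS   # T = 15,000 samples


def resample_linear(signal, fs_in, fs_out=TARGET_FS):
    """Resample a (leads, samples) array by linear interpolation.

    ASSUMPTION: the source does not name a resampling method.
    """
    n_in = signal.shape[-1]
    n_out = int(round(n_in * fs_out / fs_in))
    t_in = np.arange(n_in) / fs_in      # original sample times in seconds
    t_out = np.arange(n_out) / fs_out   # target sample times in seconds
    return np.stack([np.interp(t_out, t_in, lead) for lead in signal])


def equalize_length(signal, target_len=TARGET_LEN):
    """Crop long recordings and zero-pad short ones at the end.

    ASSUMPTION: the source only says lengths are equalized to 60 s.
    """
    n = signal.shape[-1]
    if n >= target_len:
        return signal[..., :target_len]
    pad = [(0, 0)] * (signal.ndim - 1) + [(0, target_len - n)]
    return np.pad(signal, pad, mode="constant")


def preprocess(signal, fs_in):
    """Apply both steps to one recording of shape (leads, samples)."""
    signal = np.asarray(signal, dtype=float)
    return equalize_length(resample_linear(signal, fs_in))


# Usage: a synthetic 12-lead, 10-second recording at a hypothetical 500 Hz.
recording = np.random.default_rng(0).standard_normal((12, 500 * 10))
processed = preprocess(recording, fs_in=500)
print(processed.shape)
```

Padding at the end keeps the original samples aligned to time zero; a real pipeline might instead pad symmetrically or repeat the signal, which would change what a model sees for short recordings.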
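The PTB-XL label selection described above (keeping only records with at least one super-diagnostic label) can be sketched as follows. This is a hedged illustration rather than the authors' code: the mapping table is a small hand-written excerpt for demonstration, the record layout and function names are hypothetical, and a real pipeline would derive the mapping from the metadata files shipped with PTB-XL.

```python
# The five coarse-grained diagnostic super-classes of PTB-XL.
SUPERCLASSES = ["NORM", "MI", "STTC", "CD", "HYP"]

# ASSUMPTION: illustrative excerpt of a statement -> super-class mapping.
# Statements without a diagnostic super-class (e.g. the rhythm statement
# "SR") are simply absent from this table.
STATEMENT_TO_SUPERCLASS = {
    "NORM": "NORM",
    "IMI": "MI",
    "NDT": "STTC",
    "CLBBB": "CD",
    "LVH": "HYP",
}


def superdiag_labels(scp_codes):
    """Return the super-classes present in one record's statements."""
    found = {
        STATEMENT_TO_SUPERCLASS[code]
        for code in scp_codes
        if code in STATEMENT_TO_SUPERCLASS
    }
    return sorted(found, key=SUPERCLASSES.index)


def multi_hot(labels):
    """Encode a list of super-classes as a 5-dimensional 0/1 vector."""
    return [1 if name in labels else 0 for name in SUPERCLASSES]


def filter_and_encode(records):
    """Keep records with at least one super-diagnostic label.

    `records` is a list of (record_id, scp_codes) pairs, where scp_codes
    maps statement names to likelihoods (hypothetical layout).
    """
    kept = []
    for record_id, scp_codes in records:
        labels = superdiag_labels(scp_codes)
        if labels:  # drop records with no super-diagnostic label
            kept.append((record_id, multi_hot(labels)))
    return kept


# Usage with three toy records; record 2 carries only a rhythm statement.
toy = [
    (1, {"NORM": 100.0, "SR": 0.0}),
    (2, {"SR": 0.0}),
    (3, {"IMI": 50.0, "LVH": 100.0}),
]
encoded = filter_and_encode(toy)
print(encoded)
```

Because the task is multi-label, a record such as the third one above is encoded with two active classes instead of being forced into a single category.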
Provider:
figshare
Created:
2024-11-06
Collected summary
Dataset introduction

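Background and challenges
Background overview
This resource contains two preprocessed ECG datasets: CPSC 2018 (6877 recordings, resampled to 250 Hz and equalized to 60 seconds) and PTB-XL (Super-Diag.) (21,799 12-lead ECG records, classified with super-diagnostic labels). The raw versions of both are available via PhysioNet under the CC BY 4.0 license, and the data are suited to deep learning and healthcare analytics.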
The content above was collected and summarized by 遇见数据集.



