Preprocessed CPSC and PTB-XL Data
Source: DataCite Commons. Updated 2024-11-06; indexed 2024-08-19.
Download link:
https://figshare.com/articles/dataset/Preprocessed_CPSC_Data/25532869
Description:
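**CPSC 2018**
The first dataset is a preprocessed version of the CPSC 2018 dataset, which contains 6877 ECG recordings. We resampled the ECG signals to 250 Hz and equalized every recording to a length of 60 seconds, yielding a signal length of T = 15,000 data points per recording (250 Hz × 60 s).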
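The two preprocessing steps above can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the raw recordings are sampled at 500 Hz, that short recordings are zero-padded at the end, and that long ones are cropped from the start; the description states none of these. Linear interpolation stands in for a proper resampler (a polyphase filter such as `scipy.signal.resample_poly` would be a more typical choice), and all function names are hypothetical.

```python
import numpy as np

TARGET_FS = 250                           # Hz, from the description
TARGET_SECONDS = 60                       # s, from the description
TARGET_LEN = TARGET_FS * TARGET_SECONDS   # T = 15,000 samples


def resample_linear(signal: np.ndarray, fs_in: float, fs_out: float) -> np.ndarray:
    """Resample a (leads, samples) array along time by linear interpolation."""
    n_in = signal.shape[1]
    n_out = int(round(n_in * fs_out / fs_in))
    t_in = np.arange(n_in) / fs_in     # timestamps of the input samples
    t_out = np.arange(n_out) / fs_out  # timestamps of the output samples
    return np.stack([np.interp(t_out, t_in, lead) for lead in signal])


def equalize_length(signal: np.ndarray, target_len: int = TARGET_LEN) -> np.ndarray:
    """Crop long recordings and zero-pad short ones at the end (assumed policy)."""
    cropped = signal[:, :target_len]
    pad = target_len - cropped.shape[1]  # never negative after cropping
    return np.pad(cropped, ((0, 0), (0, pad)))


def preprocess(signal: np.ndarray, fs_in: float = 500.0) -> np.ndarray:
    """Resample to 250 Hz, then force a length of 15,000 samples."""
    return equalize_length(resample_linear(signal, fs_in, TARGET_FS))


# A synthetic 12-lead, 10-second recording at the assumed raw rate of 500 Hz.
raw = np.random.default_rng(0).standard_normal((12, 5000))
print(preprocess(raw).shape)
```

Every recording leaves `preprocess` with the same shape, which is what allows the records to be stacked into a single array for training.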
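For the hyperparameter study, we used a fixed train-valid-test split with a 60-20-20 ratio. For the final evaluations, including the comparison with state-of-the-art methods and the ablation studies, we used 10-fold cross-validation.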
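Both evaluation protocols reduce to partitioning record indices, which the sketch below illustrates. The description does not say whether the splits were stratified by label or which random seed was used, so plain shuffling with an arbitrary seed is assumed here, and the function names are hypothetical.

```python
import random


def fixed_split(n_records, ratios=(0.6, 0.2, 0.2), seed=0):
    """Shuffle record indices once and cut them into train/valid/test lists."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    n_train = int(ratios[0] * n_records)
    n_valid = int(ratios[1] * n_records)
    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]  # remainder, so rounding loses no record
    return train, valid, test


def k_fold(n_records, k=10, seed=0):
    """Yield (train, test) index lists; every record is held out exactly once."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        train = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        yield train, folds[held_out]


train, valid, test = fixed_split(6877)
print(len(train), len(valid), len(test))

for fold_id, (cv_train, cv_test) in enumerate(k_fold(6877, k=10)):
    print(fold_id, len(cv_train), len(cv_test))
```

With 6877 recordings, this fixed split yields 4126, 1375, and 1376 records. Because the labels are multi-label and imbalanced, a stratified variant may be preferable in practice.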
The raw CPSC 2018 dataset can be downloaded from the website of the PhysioNet/Computing in Cardiology Challenge 2020 (License: Creative Commons Attribution 4.0 International Public License).
**PTB-XL (Super-Diag.)**
The second dataset is a preprocessed version of PTB-XL, a large multi-label dataset of 21,799 clinical 12-lead ECG records of 10 seconds each.
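PTB-XL contains 71 ECG statements, categorized into 44 diagnostic, 19 form, and 12 rhythm classes. The three counts sum to 75 rather than 71 because some statements belong to more than one category. In addition, the diagnostic category can be divided into 24 sub-classes and 5 coarse-grained super-classes.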
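In our preprocessed version, we use the super-diagnostic labels for classification and the recommended train-valid-test splits, with signals sampled at 100 Hz. We keep only samples with at least one label in the super-diagnostic category and apply no further preprocessing.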
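The filtering and splitting described above can be sketched on in-memory records. This is an illustration under stated assumptions, not the authors' pipeline. The field names `scp_codes` and `strat_fold` mirror the raw PTB-XL metadata, and the recommended split is assumed to be folds 1-8 for training, fold 9 for validation, and fold 10 for testing. The mapping is a small illustrative excerpt, the toy records are made up, and no likelihood threshold is applied because the description does not mention one.

```python
# Illustrative excerpt of a statement -> super-class mapping (assumed; the
# full mapping ships with the raw PTB-XL metadata).
SUPERCLASS_OF = {
    "NORM": "NORM",
    "IMI": "MI",
    "ASMI": "MI",
    "LVH": "HYP",
    "NDT": "STTC",
    "IRBBB": "CD",
}


def superdiagnostic_labels(scp_codes):
    """Map a record's statements to a sorted list of unique super-classes."""
    return sorted({SUPERCLASS_OF[code] for code in scp_codes if code in SUPERCLASS_OF})


def build_splits(records, valid_fold=9, test_fold=10):
    """Drop records without a super-diagnostic label, then split by fold."""
    splits = {"train": [], "valid": [], "test": []}
    for rec in records:
        labels = superdiagnostic_labels(rec["scp_codes"])
        if not labels:
            continue  # no label in the super-diagnostic category
        if rec["strat_fold"] == test_fold:
            name = "test"
        elif rec["strat_fold"] == valid_fold:
            name = "valid"
        else:
            name = "train"
        splits[name].append((rec["ecg_id"], labels))
    return splits


# Hypothetical toy records; real ones come from the raw PTB-XL metadata.
records = [
    {"ecg_id": 1, "scp_codes": {"NORM": 100.0, "SR": 0.0}, "strat_fold": 3},
    {"ecg_id": 2, "scp_codes": {"SR": 0.0}, "strat_fold": 9},  # rhythm only
    {"ecg_id": 3, "scp_codes": {"IMI": 50.0, "LVH": 100.0}, "strat_fold": 9},
    {"ecg_id": 4, "scp_codes": {"IRBBB": 100.0}, "strat_fold": 10},
]
print(build_splits(records))
```

Record 2 carries only a rhythm statement and is dropped, which is why the preprocessed version can contain fewer than the 21,799 records of the raw dataset.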
The raw PTB-XL dataset can be downloaded from the PhysioNet/PTB-XL website (License: Creative Commons Attribution 4.0 International Public License).
Provider:
figshare
Created:
2024-04-03
Background overview
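This record contains two preprocessed ECG datasets derived from CPSC 2018 (6877 recordings) and PTB-XL (21,799 records in the raw dataset, of which only those with at least one super-diagnostic label are kept). It is suited to deep learning and healthcare data analysis.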
The summary above was collected and generated by 遇见数据集.



