Abdominal Electromyograms (EMGs) Dataset: Breathing Patterns of Sleeping Adults
Source: doi.org (indexed 2025-03-22)
Download link:
http://doi.org/10.17632/pmspdmgcd4.3
Resource description:
This data set supports machine learning for defining breathing patterns in sleeping adults, using preprocessed abdominal electromyograms (EMGs). Its 40 records were randomly selected from a larger database (Computing in Cardiology Challenge 2018: Training/Test Sets. 2018. URL: https://archive.physionet.org/physiobank/database/challenge/2018/).
The optimal exponential smoothing model was the same for all records: additive errors, a small undamped trend, and no seasonality. Once trend and noise were removed, the signals had autocorrelation functions with power-law decay, which made it possible to estimate their persistence factors (Hurst exponents).
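The link between power-law decay of the autocorrelation function and the Hurst exponent can be illustrated with a small sketch. The source does not say which estimator was used; the version below assumes the common long-memory relation rho(k) ~ c * k^(-beta) with H = 1 - beta/2, and fits beta by least squares in log-log coordinates. The function name `hurst_from_acf` and the synthetic input are illustrative only.

```python
import math

def hurst_from_acf(acf):
    """Estimate H from ACF values at lags 1..len(acf).

    Assumes rho(k) ~ c * k**(-beta), so log(rho) is linear in log(k)
    with slope -beta, and uses H = 1 - beta / 2 (an assumed relation,
    not necessarily the estimator used for this data set).
    """
    # Keep only positive ACF values, since the logarithm needs them.
    pts = [(math.log(k), math.log(r)) for k, r in enumerate(acf, start=1) if r > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    beta = -(sxy / sxx)  # least-squares slope is -beta
    return 1.0 - beta / 2.0

# Synthetic ACF with beta = 0.4, so the expected H is 1 - 0.4 / 2 = 0.8.
synthetic_acf = [0.9 * k ** -0.4 for k in range(1, 51)]
h = hurst_from_acf(synthetic_acf)
print(round(h, 3))
```

On real signals the fit is usually restricted to a range of lags where the decay actually looks linear in log-log coordinates; that choice is left out of this sketch.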
Most of the signals (38 of 40) showed frequent outliers, with outlier shares ranging from a few percent up to 24.6%. The wide variability of the data was measured with the median absolute deviation, which is the most robust statistic in such a case. Such high variability is somewhat unexpected, given the fairly low noise levels.
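As a rough illustration of these two quantities, the sketch below computes the median absolute deviation (MAD) of a signal and the share of samples flagged as outliers. The outlier rule is an assumption: the source does not state one, so the code uses a hypothetical cutoff of 3 scaled MADs from the median, where 1.4826 is the usual factor that makes the MAD comparable to a standard deviation for normal data. The sample values are made up.

```python
from statistics import median

def mad(values):
    """Median absolute deviation from the median."""
    med = median(values)
    return median([abs(v - med) for v in values])

def outlier_fraction(values, threshold=3.0):
    """Share of samples farther than `threshold` scaled MADs from the median.

    The threshold and the 1.4826 scaling are assumptions made for this
    sketch, not the rule used to build the data set.
    """
    med = median(values)
    scale = 1.4826 * mad(values)
    if scale == 0:
        return 0.0
    flagged = sum(1 for v in values if abs(v - med) > threshold * scale)
    return flagged / len(values)

# Made-up signal with two obvious spikes (8.0 and -6.0).
signal = [1.0, 1.2, 0.9, 1.1, 1.0, 8.0, 0.8, 1.1, 1.0, -6.0]
print(mad(signal), outlier_fraction(signal))
```

Because both the center and the scale are medians, the two spikes barely affect the MAD, which is why it remains usable even when a large share of samples are outliers.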
The outlier percentage, variability, SNR (signal-to-noise ratio), and persistence factor were z-scored using medians and median absolute deviations. Linear combinations of these scores then form three independent principal components: the numeric attributes z_1, z_2, and z_3 of the data set.
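A minimal sketch of z-scoring with a median and a median absolute deviation is shown below. It assumes the plain form (x - median) / MAD; whether a consistency constant such as 1.4826 was also applied is not stated in the source. The function name `robust_z` and the sample values are hypothetical, and the principal component step is not reproduced here.

```python
from statistics import median

def robust_z(values):
    """Z-score each value with the median as center and the MAD as scale."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; robust z-scores are undefined")
    return [(v - med) / mad for v in values]

# Made-up per-subject SNR values with one extreme subject.
snr = [10.0, 12.0, 11.0, 30.0, 9.0]
z = robust_z(snr)
print(z)
```

The extreme subject receives a large score without distorting the scores of the others, which would not be the case with a mean and standard deviation.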
The matrix of Manhattan distances between the subjects' vectors in the 4D attribute space allows the data set to be represented as a weighted biconnected graph whose vertices are the subjects. The edge weights reflect the distance between each pair of subjects. The closeness centrality of the vertices, a well-known measure in graph theory, allowed us to split the data into two clusters of 11 and 29 subjects. These form two biconnected subgraphs, peripheral and core, respectively. Membership in one of them is recorded in the binary (nominal) attribute z_4: 0 labels the peripheral subgraph and 1 the core subgraph.
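The construction can be sketched as follows, under two assumptions that the source does not confirm: the graph is complete (every pair of subjects is joined by an edge), and closeness centrality is taken as (n - 1) divided by the sum of shortest-path distances. Since Manhattan distance satisfies the triangle inequality, the direct edge is already a shortest path in a complete graph, so no path search is needed. The points are made up, and the rule that turns centralities into the 11/29 split is not described in the source, so it is not reproduced.

```python
def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def distance_matrix(points):
    """Full pairwise Manhattan distance matrix."""
    return [[manhattan(p, q) for q in points] for p in points]

def closeness(dist):
    """Closeness centrality per vertex of a complete weighted graph.

    Assumes the direct edge is the shortest path (true for a metric)
    and that at least two points are distinct.
    """
    n = len(dist)
    return [(n - 1) / sum(row) for row in dist]

# Made-up 4D attribute vectors: three close subjects and one distant one.
points = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0), (5, 5, 5, 5)]
dist = distance_matrix(points)
central = closeness(dist)
print(central)
```

In this toy example the distant subject has the lowest closeness centrality, which is the sense in which low-centrality vertices can be read as peripheral and high-centrality ones as core.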
The periodograms of the EMGs allowed us to identify 10 subjects with regular breathing and 30 with irregular breathing, defining two unequal classes recorded in the nominal attribute z_5.
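For readers unfamiliar with periodograms, the sketch below computes one with a direct discrete Fourier transform and reports the dominant frequency. The sampling rate, the test signal, and the 1 / (n * fs) normalization are assumptions chosen for the example; the source does not describe how the periodograms were computed or what criterion separated regular from irregular breathing.

```python
import math

def periodogram(x, fs):
    """Periodogram of x sampled at fs Hz, for positive frequencies only.

    Uses a direct DFT (O(n^2)), which is fine for short examples, and
    one common normalization, |X_k|^2 / (n * fs).
    """
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]  # remove the DC component
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        re = sum(c * math.cos(2 * math.pi * k * t / n) for t, c in enumerate(centered))
        im = -sum(c * math.sin(2 * math.pi * k * t / n) for t, c in enumerate(centered))
        freqs.append(k * fs / n)
        power.append((re * re + im * im) / (n * fs))
    return freqs, power

# Synthetic "regular breathing": a 0.25 Hz sine (15 cycles per minute)
# sampled at an assumed 4 Hz for 16 seconds.
fs = 4.0
x = [math.sin(2 * math.pi * 0.25 * t / fs) for t in range(64)]
freqs, power = periodogram(x, fs)
peak = freqs[power.index(max(power))]
print(peak)
```

A regular rhythm concentrates power in one narrow peak, as here, while an irregular one spreads it across many frequencies; how that difference was quantified for z_5 is not stated.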
We therefore offer this data set for machine learning in ARFF format, containing 40 instances with five attributes, whose meaning is described above.
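For orientation, the sketch below shows what an ARFF file with these five attributes might look like and how it could be read without extra libraries. Everything in `SAMPLE` other than the attribute names z_1 to z_5 is a placeholder: the relation name, the nominal values assumed for z_5, and the two data rows are invented for illustration and are not taken from the published file. The parser handles only the simple dense layout shown, not quoted names or sparse rows.

```python
# Hypothetical ARFF content; only the attribute names come from the description.
SAMPLE = """\
% Placeholder layout: relation name and values are illustrative only.
@relation breathing_patterns
@attribute z_1 numeric
@attribute z_2 numeric
@attribute z_3 numeric
@attribute z_4 {0,1}
@attribute z_5 {0,1}
@data
0.12,-0.40,1.05,1,0
-1.30,0.22,0.08,0,1
"""

def parse_arff(text):
    """Parse a simple dense ARFF document into (attributes, rows)."""
    attributes, rows, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if in_data:
            rows.append([field.strip() for field in line.split(",")])
        elif line.lower().startswith("@attribute"):
            _, name, kind = line.split(None, 2)
            attributes.append((name, kind))
        elif line.lower().startswith("@data"):
            in_data = True
    return attributes, rows

attributes, rows = parse_arff(SAMPLE)
print([name for name, _ in attributes], len(rows))
```

Tools such as Weka read ARFF directly, so a hand-written parser like this is only needed when no such tool is available.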
Provider:
Mendeley Data



