Women's dataset from the "Predicting increased blood pressure using Machine Learning" paper
Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2024-08-17.
Download link:
https://figshare.com/articles/dataset/Women_s_dataset_from_the_Predicting_increased_blood_pressure_using_Machine_Learning_paper/845664/1
Description:
This dataset was part of a study that investigated the prediction of increased blood pressure (systolic blood pressure > 120 mmHg for women, and systolic blood pressure > 139 mmHg for men) from body mass index (BMI), waist circumference (WC), hip circumference (HC), and waist-hip ratio (WHR), using a machine learning technique called the classification tree. Data were collected from 400 college students (56.3% women) aged 16 to 63 years (Mean = 23.14, Standard Deviation = 6.03). The sample for each sex was divided into two sets (training and test) for cross-validation. Fifteen trees were fitted in the training group for each sex, using different numbers and combinations of predictors. The results show that, for women, the combination of BMI, WC and WHR produces the best prediction, since it has the lowest deviance (87.42) and misclassification rate (.19), and the highest pseudo R² (.43). This model had a sensitivity of 80.86% and a specificity of 81.22% in the training set, and 45.65% and 65.15%, respectively, in the test set. For men, the combination of BMI, WC, HC and WHR gave the best prediction, with the lowest deviance (57.25) and misclassification rate (.16), and the highest pseudo R² (.46). This model had a sensitivity of 72% and a specificity of 86.25% in the training set, and 58.38% and 69.70%, respectively, in the test set. Finally, the results of the classification tree analysis were compared with traditional logistic regression, indicating that the former outperformed the latter in predictive power. The prediction algorithms provided by the paper can be used to estimate increased blood pressure in women and hypertension in men. This can be of special interest to health professionals who have scarce material resources (such as blood pressure monitors) and who need to rely on inexpensive, easy-to-use diagnostic methods.
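To make the quantities in the description concrete, the following is a minimal Python sketch of how the two derived predictors (BMI and WHR), the sex-specific outcome label, and the reported sensitivity and specificity could be computed. The function names, the `"female"`/`"male"` keys, and the input units are assumptions made for illustration; this record does not list the dataset's variable names, and the sketch is not the authors' code.

```python
from typing import Sequence, Tuple

# Systolic blood pressure cut-offs (mmHg) as stated in the description.
# The dictionary keys are hypothetical labels, not the dataset's coding.
SBP_THRESHOLD = {"female": 120.0, "male": 139.0}


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


def waist_hip_ratio(waist_cm: float, hip_cm: float) -> float:
    """Waist-hip ratio: waist circumference divided by hip circumference."""
    return waist_cm / hip_cm


def has_increased_bp(sbp_mmhg: float, sex: str) -> bool:
    """Outcome label: True when systolic pressure exceeds the cut-off for that sex."""
    return sbp_mmhg > SBP_THRESHOLD[sex]


def sensitivity_specificity(
    actual: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for paired true and predicted labels."""
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    tn = sum(1 for a, p in zip(actual, predicted) if not a and not p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)
```

Sensitivity is the share of truly increased-pressure cases that the model flags, and specificity is the share of normal-pressure cases it correctly leaves unflagged, which is why the description reports both for the training and test sets.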
Provider:
figshare
Created:
2016-01-18



