DataSheet1_Utilizing machine learning algorithms for the prediction of carotid artery plaques in a Chinese population.docx

Indexed by NIAID Data Ecosystem on 2026-05-01
Download link:
https://figshare.com/articles/dataset/DataSheet1_Utilizing_machine_learning_algorithms_for_the_prediction_of_carotid_artery_plaques_in_a_Chinese_population_docx/24503287
Description:
Background: Ischemic stroke is a significant global health issue, imposing substantial social and economic burdens. Carotid artery plaques (CAP) serve as an important risk factor for stroke, and early screening can effectively reduce stroke incidence. However, China lacks nationwide data on carotid artery plaques. Machine learning (ML) can offer an economically efficient screening method. This study aimed to develop ML models using routine health examinations and blood markers to predict the occurrence of carotid artery plaques.

Methods: This study included data from 5,211 participants aged 18–70, encompassing health check-ups and biochemical indicators. Among them, 1,164 participants were diagnosed with carotid artery plaques through carotid ultrasound. We constructed six ML models by employing feature selection with elastic net regression, selecting 13 indicators. Model performance was evaluated using accuracy, sensitivity, specificity, Positive Predictive Value (PPV), Negative Predictive Value (NPV), F1 score, kappa value, and Area Under the Curve (AUC) value. Feature importance was assessed by calculating the root mean square error (RMSE) loss after permutations for each variable in every model.

Results: Among all six ML models, LightGBM achieved the highest accuracy at 91.8%. Feature importance analysis revealed that age, Low-Density Lipoprotein Cholesterol (LDL-c), and systolic blood pressure were important predictive factors in the models.

Conclusion: LightGBM can effectively predict the occurrence of carotid artery plaques using demographic information, physical examination data and biochemistry data.
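Most of the evaluation metrics named in the Methods (accuracy, sensitivity, specificity, PPV, NPV, F1 score, and kappa) can be derived from the four cells of a binary confusion matrix. The sketch below shows one common way to compute them. The function name `binary_metrics` and the counts in the usage example are hypothetical and purely illustrative; they are not the study's code or data. AUC is omitted because it requires predicted probabilities rather than counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common screening metrics from confusion-matrix counts.

    Assumes every denominator is nonzero, i.e. there is at least one
    actual positive, actual negative, predicted positive, and predicted
    negative.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # recall: true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    npv = tn / (tn + fn)           # negative predictive value
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)

    # Cohen's kappa: observed agreement corrected for the agreement
    # expected by chance from the row and column marginals.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((tn + fn) / total) * ((tn + fp) / total)
    p_expected = p_yes + p_no
    kappa = (accuracy - p_expected) / (1 - p_expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not taken from the study).
    metrics = binary_metrics(tp=80, fp=30, tn=870, fn=20)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

When classes are imbalanced, as here where roughly one in five participants has plaques, accuracy alone can look favorable even for a weak classifier, which is one reason to report kappa and the predictive values alongside it.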
Created:
2023-11-06