five

Confusion matrix structure.

收藏
Figshare2025-12-26 更新2026-04-28 收录
下载链接:
https://figshare.com/articles/dataset/_p_Confusion_matrix_structure_p_/30955931
下载链接
链接失效反馈
官方服务:
资源简介:
Coronary heart disease (CHD) remains the leading cause of mortality worldwide, disproportionately affecting low- and middle-income countries where diagnostic resources are limited. Traditional statistical models often fail to deliver adequate predictive accuracy in complex, high-dimensional, and imbalanced health datasets. To develop and evaluate enhanced machine learning and hybrid ensemble models for the prediction of coronary heart disease, with a focus on improving diagnostic performance, interpretability, and applicability in resource-constrained settings. We utilized a nationally representative dataset of 253,680 individuals from the Behavioral Risk Factor Surveillance System. Preprocessing included normalization and balancing via the Synthetic Minority Oversampling Technique (SMOTE). Baseline models—Decision Trees, Random Forests, Gradient Boosting, and Support Vector Machines—were compared against improved versions: Adaptive Noise–Resistant Decision Tree (ADNRT), Hybrid Imbalanced Random Forest (HIRF), Pruned Gradient Boosting Machine (PGBM), and Enhanced Support Vector Machine (ESVM). Ensemble approaches (stacking, boosting, bagging, Bayesian model averaging and majority voting) were implemented and evaluated using accuracy, sensitivity, specificity, and area under the curve (AUC). Calibration and learning curves were also analyzed. Enhanced models consistently outperformed their baseline counterparts. PGBM achieved the highest sensitivity (90.8%), while HIRF demonstrated the best overall calibration and balance (AUC = 0.937; sensitivity = 88.4%; specificity = 82.9%). The stacking ensemble emerged as the best-performing model with an accuracy of 87.2%, sensitivity of 89.6%, specificity of 84.7%, and AUC of 0.94. Calibration and learning curve analyses confirmed strong generalizability and low overfitting across ensemble models. Hybrid ensemble machine learning models significantly outperform traditional classifiers in CHD prediction, offering high accuracy, robustness, and interpretability. These models present a scalable framework for implementing AI-driven diagnostic tools in low–resource environments, potentially transforming early detection and prevention of coronary heart disease.
创建时间:
2025-12-26
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作