
Attributes information of the dataset.

Indexed by NIAID Data Ecosystem on 2026-05-01
Download link:
https://figshare.com/articles/dataset/Attributes_information_of_the_dataset_/24714289
Description:
Chronic kidney disease (CKD) has become a major global health crisis, causing millions of deaths each year. Predicting whether a person is likely to be affected by the disease allows timely diagnosis and precautionary measures, supporting preventive health strategies. Machine learning techniques have been widely applied to the diagnosis and prediction of various diseases, and ensemble learning approaches have proved useful for predicting many complex diseases. In this paper, we use boosting, a popular ensemble learning method, to achieve higher prediction accuracy for CKD. Five boosting algorithms are employed: XGBoost, CatBoost, LightGBM, AdaBoost, and gradient boosting. We experimented with the CKD data set from the UCI Machine Learning Repository. Various preprocessing steps were applied to improve prediction performance, along with suitable hyperparameter tuning and feature selection. We assessed the importance of each feature in the dataset in relation to CKD. The performance of each model was evaluated with accuracy, precision, recall, F1-score, area under the receiver operating characteristic curve (AUC-ROC), and runtime. AdaBoost had the best overall performance among the five algorithms, scoring highest on almost all of the performance measures. It attained 100% accuracy on the training set and 98.47% on the testing set. This model also exhibited better precision, recall, and AUC-ROC performance.
Created:
2023-12-01
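
The description above names accuracy, precision, recall, F1-score, and AUC-ROC as the evaluation measures. As a rough illustration of what those measures compute, here is a minimal pure-Python sketch. It is not the authors' code: the function names `binary_metrics` and `roc_auc`, the label convention (1 = CKD, 0 = not CKD), and the toy labels and scores are all assumptions made for this example, and none of the numbers come from the CKD dataset.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels.

    Assumed convention for this sketch: 1 = CKD, 0 = not CKD.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators by returning 0.0 in degenerate cases.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC computed as the probability that a randomly chosen positive
    receives a higher score than a randomly chosen negative (ties count half).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical labels, predictions and scores, for illustration only.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    scores = [0.9, 0.8, 0.4, 0.2, 0.1, 0.6, 0.7, 0.3]
    print(binary_metrics(y_true, y_pred))
    print(roc_auc(y_true, scores))
```

The pairwise AUC computation is quadratic in the number of samples, which is fine for a dataset of a few hundred records but would normally be replaced by a rank-based or library implementation for larger data.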