
Covariates for algorithm development.

Source: Figshare. Updated: 2026-03-05. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/_p_Covariates_for_algorithm_development_p_/31547516
Description:
Objective: Child stunting remains a substantial global health challenge, requiring multifaceted strategies that combine conventional epidemiological approaches with advanced analytic methods. The aim of this study was to determine the most effective machine learning model for predicting stunting from water, sanitation, and hygiene (WaSH) behaviors and infrastructure, with the goal of identifying high-risk children who would benefit most from targeted interventions.

Methods: This study was a secondary analysis of data from a matched cohort study assessing the effectiveness of combined on-premise piped water and improved sanitation for improved health outcomes in rural Odisha, India. Data for the parent study were collected from 2,398 households with a child under five years of age across 90 villages, and complete data were available for 1,196 children. Four feature engineering techniques were used to identify the most relevant predictors: structural equation modeling, forward selection, backward elimination, and the least absolute shrinkage and selection operator (LASSO). Five machine learning algorithms commonly used for binary classification tasks were compared: logistic regression, classification tree, support vector machine, neural network, and extreme gradient boosting.

Results: Among the 1,196 children analyzed, the extreme gradient boosting model with forward selection feature engineering best predicted stunting from WaSH factors. It correctly identified 81% of stunted children and 92% of non-stunted children, with an overall accuracy of 88%. The model's area under the receiver operating characteristic curve (AUROC) was 0.959 (95% CI: 0.949–0.968), indicating that WaSH factors strongly predict child stunting when analyzed using this advanced machine learning technique. Four WaSH factors were identified as having the strongest power to predict stunting in our sample: improved sanitation coverage, presence of a handwashing station, piped water coverage, and availability of preferred drinking water source.

Conclusions: The results demonstrate the efficacy of machine learning algorithms, especially extreme gradient boosting, and their potential to inform targeted WaSH interventions for reducing childhood stunting in resource-limited settings. However, these findings require external validation in other populations, and the complete-case analysis approach (excluding 35% of children with missing data) may limit generalizability to settings with less systematic data collection.
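
The share of stunted children correctly identified (sensitivity), the share of non-stunted children correctly identified (specificity), and overall accuracy all derive from the same confusion matrix. The following is a minimal Python sketch of that relationship. The `ConfusionCounts` class, the `classification_metrics` function, and the counts themselves are hypothetical: the counts were chosen only to mirror the reported rates and are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (stunted = positive class)."""
    tp: int  # stunted children predicted as stunted
    fn: int  # stunted children predicted as non-stunted
    tn: int  # non-stunted children predicted as non-stunted
    fp: int  # non-stunted children predicted as stunted


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return sensitivity, specificity, and accuracy for the given counts."""
    positives = c.tp + c.fn
    negatives = c.tn + c.fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "sensitivity": c.tp / positives,
        "specificity": c.tn / negatives,
        "accuracy": (c.tp + c.tn) / (positives + negatives),
    }


# Hypothetical counts (NOT study data): 100 stunted and 200 non-stunted children.
example = ConfusionCounts(tp=81, fn=19, tn=184, fp=16)
metrics = classification_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because accuracy is a prevalence-weighted average of sensitivity and specificity, it always lies between the two, which is consistent with the reported 88% falling between 81% and 92%.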
Created: 2026-03-05