Data_Sheet_1_Predicting Obesity in Adults Using Machine Learning Techniques: An Analysis of Indonesian Basic Health Research 2018.pdf
Source: frontiersin.figshare.com | Updated: 2023-06-01 | Indexed: 2025-03-22
Download link:
https://frontiersin.figshare.com/articles/dataset/Data_Sheet_1_Predicting_Obesity_in_Adults_Using_Machine_Learning_Techniques_An_Analysis_of_Indonesian_Basic_Health_Research_2018_pdf/14814054/1
Description:
Obesity is strongly associated with multiple risk factors and contributes significantly to an increased risk of chronic disease morbidity and mortality worldwide. Understanding the association between risk factors and the occurrence of obesity remains challenging. The traditional regression approach limits analysis to a small number of predictors and imposes assumptions of independence and linearity. Machine Learning (ML) methods offer an alternative approach to analyzing obesity data. This study aims to assess the ability of three ML methods, namely Logistic Regression, Classification and Regression Trees (CART), and Naïve Bayes, to identify the presence of obesity using publicly available health data, going beyond traditional prediction models, and to compare the performance of the three methods. The main objective is to establish a set of risk factors for obesity in adults among the available study variables. We address data imbalance using the Synthetic Minority Oversampling Technique (SMOTE) when predicting obesity status based on the risk factors available in the dataset. Logistic Regression shows the highest performance of the three methods. Nevertheless, kappa coefficients show only moderate concordance between predicted and measured obesity. Location, marital status, age group, education, sweet drinks, fatty/oily foods, grilled foods, preserved foods, seasoning powders, soft/carbonated drinks, alcoholic drinks, mental emotional disorders, diagnosed hypertension, physical activity, smoking, and fruit and vegetable consumption are significant in predicting obesity status in adults.
Identifying these risk factors could inform health authorities in designing or modifying policies to better control chronic diseases, especially in relation to risk factors associated with obesity. Moreover, applying ML methods to publicly available health data, such as the Indonesian Basic Health Research (RISKESDAS), is a promising strategy for building a more robust understanding of the associations of multiple risk factors in predicting health outcomes.
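The description mentions two techniques that are easy to illustrate in isolation: SMOTE oversampling of the minority class and the kappa coefficient used to judge agreement between predicted and measured obesity. The following is a minimal sketch using only the Python standard library and made-up toy data. It is not the authors' code and does not use the RISKESDAS data; the function names `smote` and `cohen_kappa` and all values are hypothetical choices made for illustration. In practice, a maintained implementation such as the `SMOTE` class in the imbalanced-learn package would normally be used instead.

```python
import random
from collections import Counter


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (toy SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


def cohen_kappa(measured, predicted):
    """Cohen's kappa: agreement beyond chance, (p_o - p_e) / (1 - p_e)."""
    n = len(measured)
    # Observed agreement: share of cases where both labels match.
    p_o = sum(m == p for m, p in zip(measured, predicted)) / n
    # Expected chance agreement from the marginal label frequencies.
    m_counts, p_counts = Counter(measured), Counter(predicted)
    p_e = sum(m_counts[c] * p_counts[c] for c in m_counts) / (n * n)
    if p_e == 1:  # both raters used a single identical label
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Toy minority-class samples (two hypothetical numeric features).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote(minority, n_new=6)

# Toy labels: 1 = obese, 0 = not obese (made-up values).
measured = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
kappa = cohen_kappa(measured, predicted)  # (0.8 - 0.52) / (1 - 0.52), about 0.58
```

On these toy labels the raw agreement is 80%, yet kappa is only about 0.58 because much of that agreement would be expected by chance. This is why a model can look accurate and still show only moderate concordance, as the description reports.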
Provider:
Frontiers



