Data Sheet 1_Nine-year risk stratification and prediction of Helicobacter pylori infection using Group-Based Trajectory Modeling and machine learning in 35,206 adults.docx
Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://figshare.com/articles/dataset/Data_Sheet_1_Nine-year_risk_stratification_and_prediction_of_Helicobacter_pylori_infection_using_Group-Based_Trajectory_Modeling_and_machine_learning_in_35_206_adults_docx/30607634
Description:
Background: Helicobacter pylori (H. pylori) infection remains prevalent in regions such as Shanxi, China, contributing to gastrointestinal morbidity. Accurately identifying high-risk individuals is essential for effective screening and early intervention.
Methods: We conducted a retrospective longitudinal cohort study of 35,206 adults who underwent repeated annual health checkups with H. pylori testing at a single center from 2016 to 2024. Group-Based Trajectory Modeling (GBTM) was used to identify risk subgroups. Multivariable logistic regression identified predictors of high-risk trajectories; alcohol consumption was assessed as an effect modifier. Five machine learning models, including Light Gradient Boosting Machine (LightGBM), Extreme Gradient Boosting, and logistic regression, were trained using a 7:3 training/validation split. Temporal validation (2016–2020 training, 2021–2024 validation) assessed generalizability. SHapley Additive exPlanations (SHAP) were used to improve interpretability. A prediction tool was deployed via R Shiny.
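The two validation schemes described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the record layout (`id`, `year`), the helper names `random_split` and `temporal_split`, and the choice to split at the record level are all assumptions made for the example; the source does not say whether splitting was done per person or per checkup.

```python
import random


def random_split(records, train_fraction=0.7, seed=0):
    """Shuffle records and split them by fraction (7:3 by default)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def temporal_split(records, last_train_year=2020):
    """Train on checkups up to last_train_year, validate on later ones."""
    train = [r for r in records if r["year"] <= last_train_year]
    valid = [r for r in records if r["year"] > last_train_year]
    return train, valid


# Hypothetical toy data: 10 records for each year from 2016 to 2024.
records = [{"id": i, "year": 2016 + i % 9} for i in range(90)]

rand_train, rand_valid = random_split(records)
temp_train, temp_valid = temporal_split(records)
```

The random split mixes all years in both partitions, whereas the temporal split keeps every validation record strictly later than the training period, which is why it is the stricter test of generalizability.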
Results: GBTM identified high-risk (14.63%) and low-risk (85.37%) groups. Protective factors included female sex (OR = 0.042, 95% CI: 0.039–0.046) and unmarried status (OR = 0.092, 95% CI: 0.085–0.099); risk factors included obesity (OR = 1.138, 95% CI: 1.070–1.210), blue-collar occupation (OR = 1.557, 95% CI: 1.454–1.666), and alcohol consumption (OR = 1.277, 95% CI: 1.165–1.401). Alcohol consumption interacted with all significant factors in subgroup analysis (all p < 0.001), with the strongest interaction observed for being married (OR = 8.622, 95% CI: 7.872–9.437). LightGBM achieved AUCs of 0.851 (training), 0.843 (validation), 0.863 (temporal training), and 0.831 (temporal validation). SHAP ranked marital status and sex as top predictors. The tool is available at: https://prediction-model-for-hp.shinyapps.io/hp_shinyapp-/.
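As a reading aid for the odds ratios above, the sketch below shows how a reported 95% CI relates to its point estimate on the log scale. It assumes the intervals are Wald-type (symmetric around ln OR), which the source does not state; the helper name `wald_summary` is hypothetical. Under that assumption, the geometric mean of the CI bounds should land close to the reported OR, and the CI width gives the standard error of ln OR.

```python
import math


def wald_summary(lower, upper, z=1.96):
    """Return (standard error of ln OR, implied OR) from 95% CI bounds.

    Assumes a Wald interval: ln OR +/- z * SE, so the bounds are
    symmetric around ln OR on the log scale.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    implied_or = math.sqrt(lower * upper)  # geometric mean of the bounds
    return se, implied_or


# Obesity estimate as reported: OR = 1.138, 95% CI 1.070-1.210.
se_obesity, implied_obesity = wald_summary(1.070, 1.210)

# Alcohol consumption as reported: OR = 1.277, 95% CI 1.165-1.401.
se_alcohol, implied_alcohol = wald_summary(1.165, 1.401)
```

For both examples the implied OR agrees with the reported value to within rounding of the published bounds, which is consistent with, but does not prove, log-scale Wald intervals.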
Conclusion: We developed an online, interpretable risk prediction tool with validated accuracy to support precision screening of H. pylori infection.
Created:
2025-11-13



