
Additional file 1 of Predicting clinical outcomes in Helicobacter pylori-positive patients using supervised learning through the integration of demographic and genomic features

NIAID Data Ecosystem, indexed 2026-05-10
Download link:
https://figshare.com/articles/dataset/Additional_file_1_of_Predicting_clinical_outcomes_in_Helicobacter_pylori-positive_patients_using_supervised_learning_through_the_integration_of_demographic_and_genomic_features/31374966
Description:
Supplementary Material 1:
Supplementary Table S1: Clinical metadata of the patients, including demographics (age, sex, geographical location) and disease phenotype.
Supplementary Table S2: Presence/absence data of H. pylori virulence genes implicated in gastric cancer.
Supplementary Table S3: Sequence-derived features extracted from H. pylori genomes using iFeatureOmega and MathFeature.
Supplementary Table S4: Aggregate variant categories of H. pylori genomes.
Supplementary Table S5: The final dataset integrating all feature types.
Supplementary Table S6: ShapRFECV feature elimination report, listing the features selected and eliminated based on the recall metric.
Supplementary Table S7: Optimal hyperparameters selected using Bayesian optimization for the XGBoost and Random Forest models.
Supplementary Table S8: Detailed evaluation metrics for all three models, with class-specific precision, recall, F1-score, AUROC and AUPRC scores and their lower and upper confidence bounds.
Supplementary Table S9: Classification reports of all three models.
Supplementary Table S10: Detailed evaluation metrics (AUROC, AUPRC, Precision, Recall, F1) for all models trained (i) only on demographic features and (ii) only on genomic features.
Supplementary Table S11: Detailed evaluation metrics (AUROC, AUPRC, Precision, Recall, F1) for Logistic Regression, XGBoost and Random Forest, stratified by geographical region.
Supplementary Table S12: Model performance metrics on the held-out test set, stratified by data source.
Supplementary Table S13: Leave-one-source-out (LOSO) cross-validation performance metrics.
Supplementary Table S14: Quantitative SHAP summary tables that rank features by mean absolute SHAP value for all three models (Logistic Regression, XGBoost, and Random Forest). Each table presents the top 10 features along with their mean |SHAP| values, mean SHAP values (indicating direction of effect), standard deviations, and median absolute SHAP values.
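The summary statistics described for Table S14 can be illustrated with a minimal sketch. It is not the authors' code: the function name shap_summary, the toy values, and the feature names "gene_x" and "region" are hypothetical, and the standard deviation is assumed to be the sample standard deviation of the signed SHAP values, which the description does not specify.

```python
from statistics import mean, median, stdev


def shap_summary(shap_values, feature_names, top_k=10):
    """Rank features by mean |SHAP| and return the top_k summary rows.

    shap_values: list of per-sample lists, one SHAP value per feature.
    """
    rows = []
    for j, name in enumerate(feature_names):
        column = [sample[j] for sample in shap_values]
        abs_column = [abs(v) for v in column]
        rows.append({
            "feature": name,
            "mean_abs_shap": mean(abs_column),      # ranking criterion
            "mean_shap": mean(column),              # sign gives direction of effect
            "std_shap": stdev(column),              # assumption: std of signed values
            "median_abs_shap": median(abs_column),
        })
    rows.sort(key=lambda r: r["mean_abs_shap"], reverse=True)
    return rows[:top_k]


# Toy SHAP values for three samples and three hypothetical features.
toy_values = [
    [0.2, -0.5, 0.0],
    [0.4, -0.1, 0.1],
    [-0.3, -0.6, 0.0],
]
summary = shap_summary(toy_values, ["age", "gene_x", "region"], top_k=2)
for row in summary:
    print(row)
```

A negative mean SHAP value alongside a large mean |SHAP| value, as for the hypothetical "gene_x" here, would indicate a feature that consistently pushes predictions away from the positive class.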
Created:
2026-01-29