Test sets result of machine learning models.
Source: Figshare | Updated: 2026-01-06 | Indexed: 2026-04-28
Download link:
https://figshare.com/articles/dataset/_p_Test_sets_result_of_machine_learning_models_p_/31010349
Description:
Background: Accurate diagnosis of Parkinson’s Disease (PD) remains challenging due to its biological complexity. Integrating machine learning with multi-omics and network topological analyses may enhance diagnostic precision.

Objective: To classify early-stage PD patients and healthy controls (HCs) by integrating multi-omics data and network-based machine learning.

Methods: We analyzed 305 participants from the PPMI cohort (213 PD, 92 HCs) using DNA methylation, gene expression, and proteomic data. Feature selection was conducted through sPLS-DA, followed by integration via DIABLO. Regulatory gene networks were constructed using STRING-based topology analysis. XGBoost classifiers were trained and optimized on 244 samples and validated on 61 independent samples. Furthermore, we conducted external validation using data from 26 participants (12 PD, 14 HCs) in the GEO dataset, incorporating DNA methylation and gene expression profiles.

Results: DIABLO-based integration identified 56 CpG sites, 61 genes, and 70 proteins. Network topology analysis revealed 59 key regulators. Among three XGBoost models (based on multi-omics signatures, topological regulators, and their combination), the multi-omics model achieved the best test-set performance (AUC 0.72, 95% CI 0.51-0.85; accuracy 0.74, 95% CI 0.61-0.82). In the validation set, the topological regulators model achieved the best performance (AUC 0.57, 95% CI 0.37-0.73; accuracy 0.62, 95% CI 0.38-0.73).

Conclusion: Combining machine learning with integrative multi-omics and network topology analysis enables effective biomarker identification and PD classification, with strong potential for clinical diagnostic applications.
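The description reports AUC values with 95% confidence intervals but does not say how the intervals were computed. One common option is a percentile bootstrap over the test samples, sketched below in plain Python. Everything in the sketch is an assumption made for illustration: the helper names `auc` and `bootstrap_ci`, the 43 PD / 18 HC split of a 61-sample test set, and the synthetic scores. None of it comes from the dataset or reproduces the authors' procedure.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, metric, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `metric`, resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains a single class, so AUC is undefined
        stats.append(metric(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic stand-in for a 61-sample test set (hypothetical 43 PD / 18 HC split).
rng = random.Random(42)
labels = [1] * 43 + [0] * 18
scores = [rng.gauss(0.60, 0.20) if y == 1 else rng.gauss(0.45, 0.20) for y in labels]

point = auc(labels, scores)
lower, upper = bootstrap_ci(labels, scores, auc)
print(f"AUC {point:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

With only 61 test samples, or 26 in the external set, intervals of this kind tend to be wide, which is consistent with the ranges quoted in the Results.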
Created:
2026-01-06



