Machine learning-driven prognostic and diagnostic models for lung adenocarcinoma using intratumor heterogeneity and multi-omics data
Taylor & Francis Group. Updated: 2025-07-31. Indexed: 2026-04-16.
Download link:
https://tandf.figshare.com/articles/dataset/Machine_learning-driven_prognostic_and_diagnostic_models_for_lung_adenocarcinoma_using_intratumor_heterogeneity_and_multi-omics_data/29750034/1
Description:
Intratumor heterogeneity (ITH) significantly impacts cancer prognosis and treatment response. Focusing on lung adenocarcinoma (LUAD), this study investigates the relationship between ITH and clinical outcomes, and constructs machine learning-based prognostic and diagnostic models. ITH scores were calculated using the DEPTH2 package, and weighted gene co-expression network analysis (WGCNA) was applied to identify ITH-associated core genes. A 19-gene prognostic model was developed using Elastic Net (Enet), and a 7-gene diagnostic model was built through a combination of LASSO and Random Forest (RF). The prognostic model was validated across six independent datasets, while the diagnostic model was tested in three. ITH was found to correlate significantly with clinical characteristics such as gender, M stage, and overall survival. WGCNA revealed the black and lightgreen modules as key to ITH, contributing 126 core genes. Both models demonstrated strong predictive performance and generalizability, accurately stratifying LUAD patients and distinguishing them from healthy controls. These findings underscore the clinical value of incorporating ITH and multi-omics data into model construction to enhance precision in LUAD diagnosis and prognosis.
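As a rough illustration of how a gene-signature prognostic model of this kind is typically applied, the sketch below computes a linear risk score (coefficient times expression, summed over the signature genes) and splits a cohort at the median score. The gene names, coefficients, expression values, and the median cutoff are all hypothetical placeholders; the description above does not state the study's 19 genes, their weights, or its stratification rule.

```python
from statistics import median

# Hypothetical coefficients: placeholders, NOT the genes or weights from the study.
COEFFICIENTS = {"GENE_A": 0.5, "GENE_B": -0.25, "GENE_C": 1.0}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(weight * expression[gene] for gene, weight in coefficients.items())


def stratify(cohort, coefficients=COEFFICIENTS):
    """Label each patient 'high' or 'low' risk relative to the cohort median score.

    The median cutoff is an assumption made for this sketch.
    """
    scores = {pid: risk_score(expr, coefficients) for pid, expr in cohort.items()}
    cutoff = median(scores.values())
    groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
    return scores, groups


# Made-up expression values for four illustrative patients.
cohort = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 1.0},
    "P2": {"GENE_A": 4.0, "GENE_B": 0.0, "GENE_C": 3.0},
    "P3": {"GENE_A": 0.0, "GENE_B": 8.0, "GENE_C": 0.0},
    "P4": {"GENE_A": 6.0, "GENE_B": 4.0, "GENE_C": 2.0},
}
scores, groups = stratify(cohort)
for pid in cohort:
    print(pid, scores[pid], groups[pid])
```

In practice the coefficients would come from a fitted penalized regression such as Elastic Net, and the cutoff would be chosen on the training cohort before being applied to validation datasets.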
Provided by:
Guo, Hua; Wang, Yuntao; Gan, Shiwei; Li, Xiaohua; Zeng, Xuefeng; Nie, Xiaohong
Created:
2025-07-31



