Interpretable machine learning model integrating CT radiomics, CTR, and clinical features for EGFR mutation prediction in ≤3 cm lung adenocarcinoma nodules
Source: DataCite Commons. Updated: 2026-01-21. Indexed: 2026-04-25.
Download link:
https://tandf.figshare.com/articles/dataset/Interpretable_machine_learning_model_integrating_CT_radiomics_CTR_and_clinical_features_for_EGFR_mutation_prediction_in_3_cm_lung_adenocarcinoma_nodules/30921070
Description:
Non-invasive prediction of EGFR mutation status in lung adenocarcinoma (LUAD) is critical for treatment planning, particularly in small pulmonary nodules where tissue genotyping is limited. However, the consolidation-to-tumor ratio (CTR), a clinically relevant imaging biomarker, has rarely been incorporated into radiomics-based models. This study aimed to develop and validate an interpretable CT radiomics model incorporating CTR and clinical features for predicting EGFR mutation status in LUAD patients with nodules ≤3 cm. This retrospective study included 492 patients with pathologically confirmed LUAD who underwent preoperative non-contrast chest CT between January 2017 and December 2022. Tumors were manually segmented for radiomic feature extraction, and CTR was measured for each lesion. Radiomic texture features were computed with PyRadiomics using a fixed gray-level bin width. Feature selection used analysis of variance (ANOVA) and mutual information filtering, followed by recursive feature elimination (RFE) with a random forest base estimator. Three random forest classifiers were constructed: a radiomics-only model, a clinical-only model, and a combined radiomics-clinical model. Model performance was assessed by the area under the receiver operating characteristic curve (AUC) with 95% confidence intervals (CI), and interpretability was evaluated using SHapley Additive exPlanations (SHAP). The combined model achieved the best performance (AUC, 0.74 [95% CI: 0.69–0.79] in training; 0.76 [95% CI: 0.66–0.85] in testing), outperforming the radiomics-only (AUC, 0.69) and clinical-only (AUC, 0.60) models in the testing cohort. CTR was the most influential feature according to SHAP analysis. An interpretable radiomics model integrating CTR and clinical features enables effective non-invasive prediction of EGFR mutation status in small LUAD nodules, supporting molecular risk stratification when tissue genotyping is unavailable.
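The selection and modelling steps named in the description (ANOVA filter, mutual information filter, RFE with a random forest, a random forest classifier, AUC with a 95% CI) can be sketched with scikit-learn roughly as follows. This is an illustrative sketch only, not the authors' code: the data are synthetic, the feature counts (`k=30`, `k=15`, 8 retained features), the tree count, and the 70/30 split are arbitrary assumptions, and the percentile bootstrap for the confidence interval is one common choice that the description does not confirm. The helper names `build_pipeline` and `bootstrap_auc_ci` are hypothetical.

```python
from functools import partial

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectKBest, f_classif, mutual_info_classif
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline


def build_pipeline(seed=0):
    """ANOVA filter -> mutual information filter -> RFE -> random forest.

    All sizes below are assumed for illustration, not taken from the study.
    """
    make_rf = partial(RandomForestClassifier, n_estimators=50, random_state=seed)
    mi_score = partial(mutual_info_classif, random_state=seed)
    return Pipeline([
        ("anova", SelectKBest(f_classif, k=30)),
        ("mutual_info", SelectKBest(mi_score, k=15)),
        ("rfe", RFE(make_rf(), n_features_to_select=8)),
        ("clf", make_rf()),
    ])


def bootstrap_auc_ci(y_true, y_score, n_boot=500, alpha=0.05, seed=0):
    """Point-estimate AUC plus a percentile-bootstrap confidence interval."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))
        # AUC is undefined when a resample contains only one class.
        if len(np.unique(y_true[idx])) < 2:
            continue
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.percentile(aucs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return roc_auc_score(y_true, y_score), lo, hi


# Synthetic stand-in for a table of radiomic, CTR, and clinical features.
X, y = make_classification(n_samples=492, n_features=60, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Fitting the whole pipeline on the training split keeps feature selection
# from seeing the test labels.
model = build_pipeline().fit(X_train, y_train)
test_scores = model.predict_proba(X_test)[:, 1]
auc, lo, hi = bootstrap_auc_ci(y_test, test_scores)
print(f"test AUC {auc:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

Because the inputs are synthetic, the printed AUC has no relation to the values reported in the description. Wrapping the filters and RFE in a single `Pipeline` is a design choice that makes every selection step refit on the training data only.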
Provider:
Taylor & Francis
Created:
2025-12-19



