Table_4_An online tool for survival prediction of extrapulmonary small cell carcinoma with random forest.xlsx
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://figshare.com/articles/dataset/Table_4_An_online_tool_for_survival_prediction_of_extrapulmonary_small_cell_carcinoma_with_random_forest_xlsx/23600853
Description:
Purpose: Extrapulmonary small cell carcinoma (EPSCC) is rare, and current knowledge of it is largely extrapolated from small cell lung carcinoma. Reliable survival prediction tools are lacking.
Methods: A total of 3,921 EPSCC cases were collected from the Surveillance, Epidemiology, and End Results (SEER) database to form the training and internal validation cohorts of the survival prediction model. The endpoint was overall survival at 0.5–5 years. The internal validation performance of several machine learning algorithms was compared, and the best model was selected. External validation (n = 68) was then performed to evaluate the generalization ability of the selected model.
Results: Among the machine learning algorithms compared, the random forest model performed best on internal validation, with an area under the curve (AUC) of 0.736–0.800. In decision curve analysis, its net benefit was higher than that of the TNM classification. On the external validation cohort, the model's AUC was 0.739–0.811. The model was then deployed online as a free, publicly available prediction tool for EPSCC (http://42.192.80.13:4399/).
Conclusion: Using machine learning and large-scale data, this study provides an excellent online survival prediction tool for EPSCC. Age, TNM stage, and surgery (which may carry implicit performance status information) are the most critical factors in the prediction model.
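The description compares the model with the TNM classification by net benefit in decision curve analysis but does not say how net benefit was computed. As a minimal sketch only, the standard definition for a binary outcome at a single time point is shown below: net benefit = TP/n − (FP/n) · pt/(1 − pt), where pt is the threshold probability. The function names and toy data are hypothetical, and censoring is ignored, which the study's survival endpoint would likely require handling.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on every case whose predicted risk >= threshold.

    y_true: 0/1 outcomes (1 = event occurred by the time point of interest)
    y_prob: predicted event probabilities, same order as y_true
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # Count true and false positives among cases flagged at this threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference curve: net benefit when every case is flagged."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data (hypothetical, not taken from the study).
outcomes = [1, 1, 1, 0, 0, 0, 1, 0]
risks = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

for pt in (0.2, 0.5, 0.8):
    print(pt, net_benefit(outcomes, risks, pt), net_benefit_treat_all(outcomes, pt))
```

A decision curve plots these values across a range of thresholds; a model is preferred at a given threshold when its net benefit exceeds both the treat-all reference and zero (treat none).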
Created:
2023-06-29



