Code for Rapid and non-invasive early detection of lung cancer by integration of machine learning and salivary metabolic fingerprints using MS LOC platform
Source: doi.org | Updated: 2024-11-22 | Indexed: 2025-03-24
Download link:
http://doi.org/10.17632/5thzsdjc5f.2
Description:
Most lung cancer (LC) patients are diagnosed at an advanced stage because effective screening methods are lacking. A non-invasive method for LC screening and early detection that is suitable for large-scale clinical use is therefore needed. Herein, a total of 1043 saliva samples were collected from 334 LC patients and 709 non-LC volunteers at six hospitals, and their metabolomics data were obtained using mass spectrometry Lab-on-a-Chip (MS LOC). This approach offers high speed and high throughput (96 samples per batch) for stable acquisition of salivary metabolic fingerprints. Using machine learning-based feature screening, we identified 35 metabolic features for LC, indicating that metabolism is disturbed in the saliva of LC patients. Subsequently, a classification model named SalivaMLD was developed using an ensemble voting strategy based on multiple machine learning algorithms. By combining the predictions of the individual models, the voting mechanism improved classification accuracy and robustness. In the validation set, SalivaMLD demonstrated strong diagnostic performance, achieving an area under the curve (AUC) of 0.850, a sensitivity of 83.33%, and a specificity of 74.39%. In the test set, the model performed comparably, with an AUC, sensitivity, and specificity of 0.849, 81.69%, and 74.23%, respectively, outperforming conventional tumor markers such as carcinoembryonic antigen (CEA) and carbohydrate antigen 125 (CA125). Notably, SalivaMLD distinguished early-stage LC with an accuracy of 77.42%-81.97% and effectively differentiated LC of different pathological types in both the validation and test sets. Hence, this method for screening LC by integrating machine learning with MS LOC-based salivary metabolic fingerprints may be widely applicable in clinical practice for rapid and non-invasive detection.
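The description mentions an ensemble voting strategy and reports sensitivity and specificity, but does not say which base algorithms were used or whether the vote was hard or soft. The following is only a minimal sketch of one common variant, soft voting (averaging predicted probabilities), together with the two metrics. The function names, the 0.5 threshold, and the toy numbers are all hypothetical and are not taken from the dataset or from SalivaMLD.

```python
def soft_vote(probabilities_per_model, threshold=0.5):
    """Average each sample's predicted LC probability across models, then threshold.

    Assumption: soft voting with equal weights; the source does not specify this.
    """
    n_models = len(probabilities_per_model)
    n_samples = len(probabilities_per_model[0])
    averaged = [
        sum(model[i] for model in probabilities_per_model) / n_models
        for i in range(n_samples)
    ]
    labels = [1 if p >= threshold else 0 for p in averaged]
    return averaged, labels


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity), treating label 1 as LC and 0 as non-LC."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: three models' predicted LC probabilities for six samples.
model_a = [0.9, 0.2, 0.6, 0.4, 0.1, 0.7]
model_b = [0.8, 0.4, 0.3, 0.7, 0.2, 0.6]
model_c = [0.7, 0.1, 0.7, 0.3, 0.4, 0.5]
y_true = [1, 0, 1, 1, 0, 0]

averaged, labels = soft_vote([model_a, model_b, model_c])
sens, spec = sensitivity_specificity(y_true, labels)
```

Sensitivity here is the fraction of true LC samples the vote flags as LC, and specificity is the fraction of non-LC samples it leaves unflagged, which is how the percentages in the description are conventionally defined.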
Provider:
doi.org



