
Supplementary Material for: Screening of Serum miRNAs as Diagnostic Biomarkers for Lung Cancer Using the Minimal-Redundancy-Maximal-Relevance Algorithm and Random Forest Classifier Based on a Public Database

Mendeley Data. Updated 2024-06-25; indexed 2024-06-27.
Download link:
https://karger.figshare.com/articles/dataset/Supplementary_Material_for_Screening_of_Serum_miRNAs_as_Diagnostic_Biomarkers_for_Lung_Cancer_Using_the_Minimal-Redundancy-Maximal-Relevance_Algorithm_and_Random_Forest_Classifier_Based_on_a_Public_Database/20418567/1
Description:
Background: Lung cancer is one of the deadliest cancers, and early diagnosis can substantially improve patient survival. We aimed to screen serum miRNAs as diagnostic biomarkers for patients with lung cancer.

Methods: A total of 416 significantly differentially expressed miRNAs were identified using the limma package, and the features were then ranked by the minimal-redundancy-maximal-relevance (mRMR) method. An incremental feature selection algorithm based on a random forest (RF) classifier was used to select the top 5 miRNA combination with the best predictive performance. The performance of the RF classifier built on the top 5 miRNAs was assessed using the receiver operating characteristic (ROC) curve. The classification effect of the 5-miRNA combination was then validated through principal component analysis and hierarchical clustering analysis. Expression of the top 5 miRNAs in lung cancer patients versus healthy controls was analyzed using the GSE137140 dataset and validated by qPCR. Hierarchical clustering analysis was used to assess the similarity of the expression profiles of the 5 miRNAs. ROC analysis was also performed on each miRNA individually.

Results: The final set of 5 miRNAs achieved a Matthews correlation coefficient (MCC) of 0.988 and an area under the curve (AUC) of 0.996. The 5 feature miRNAs were able to distinguish most cancer patients from healthy controls. Except for miR-6875-5p, which was expressed at low levels in lung cancer tissue, the other 4 miRNAs were all highly expressed in cancer patients. Performance analysis showed that the individual AUC values of the 5 miRNAs were 0.92, 0.96, 0.94, 0.95, and 0.93, respectively.

Conclusion: Overall, the 5 feature miRNAs screened here are expected to be effective biomarkers for lung cancer.
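
The two headline metrics in the abstract, the Matthews correlation coefficient (MCC) and the area under the ROC curve (AUC), can be computed from first principles. The sketch below is an illustration only, not the authors' code: the function names and the toy labels and scores are hypothetical and are not taken from GSE137140 or from the study's results. It uses the standard confusion-matrix formula for MCC and the pairwise-ranking definition of AUC.

```python
import math


def matthews_corrcoef(tp, fp, tn, fn):
    """MCC from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined 0/0 case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = cancer, 0 = control; scores are classifier outputs.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)

# Threshold the scores at 0.5 to obtain hard predictions for the MCC.
preds = [1 if s >= 0.5 else 0 for s in scores]
tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
mcc = matthews_corrcoef(tp, fp, tn, fn)

print(f"AUC = {auc:.4f}, MCC = {mcc:.4f}")
```

AUC is threshold-free, while MCC depends on the chosen decision threshold, so the two values need not agree; reporting both, as the abstract does, gives a fuller picture of classifier performance.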

Created:
2023-06-28