Supplementary Material for: Screening of Serum miRNAs as Diagnostic Biomarkers for Lung Cancer Using the Minimal-Redundancy-Maximal-Relevance Algorithm and Random Forest Classifier Based on a Public Database
Source: Mendeley Data. Updated: 2024-06-25. Indexed: 2024-06-27.
Download link:
https://karger.figshare.com/articles/dataset/Supplementary_Material_for_Screening_of_Serum_miRNAs_as_Diagnostic_Biomarkers_for_Lung_Cancer_Using_the_Minimal-Redundancy-Maximal-Relevance_Algorithm_and_Random_Forest_Classifier_Based_on_a_Public_Database/20418567
Description:
Background: Lung cancer is one of the deadliest cancers, and early diagnosis can effectively improve patient survival. We aimed to identify serum miRNAs that can serve as diagnostic biomarkers for patients with lung cancer.
Methods: A total of 416 significantly differentially expressed miRNAs were obtained using the limma package, and a feature ranking was then derived with the minimal-redundancy-maximal-relevance method. An incremental feature selection algorithm with a random forest (RF) classifier was used to choose the top 5 miRNA combination with the optimum predictive performance. The performance of the RF classifier built on the top 5 miRNAs was analyzed using the receiver operating characteristic (ROC) curve. The classification effect of the 5-miRNA combination was then validated through principal component analysis and hierarchical clustering analysis. Expression of the top 5 miRNAs in lung cancer patients and normal individuals was compared using the GSE137140 dataset, and their expression was validated by qPCR. Hierarchical clustering analysis was used to assess the similarity of the expression profiles of the 5 miRNAs, and ROC analysis was performed on each miRNA.
Results: The final 5-miRNA combination achieved a Matthews correlation coefficient of 0.988 and an area under the curve (AUC) of 0.996. The 5 feature miRNAs were able to distinguish most cancer patients from normal individuals. Furthermore, except for miR-6875-5p, which was expressed at low levels in lung cancer tissue, the other 4 miRNAs were all highly expressed in cancer patients. Performance analysis showed that the AUC values of the 5 individual miRNAs were 0.92, 0.96, 0.94, 0.95, and 0.93, respectively.
Conclusion: Overall, the 5 feature miRNAs screened here are expected to be effective biomarkers for lung cancer.
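The two summary metrics quoted in the Results, the Matthews correlation coefficient and the AUC, can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names `matthews_corrcoef` and `roc_auc` are chosen here for illustration, and the labels and scores are made-up toy values, not data from GSE137140.

```python
from math import sqrt


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (an empty row or column),
    which is a common convention and an assumption of this sketch.
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(labels, scores):
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = cancer, 0 = normal.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]

# Threshold the scores at 0.5 to get hard predictions for the MCC.
preds = [1 if s >= 0.5 else 0 for s in scores]
tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)

mcc_value = matthews_corrcoef(tp, tn, fp, fn)
auc_value = roc_auc(labels, scores)
```

The MCC depends on a fixed decision threshold, while the AUC summarizes ranking quality across all thresholds, which is why a classifier can report both values side by side.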
Created:
2023-06-28



