
Improved lung cancer classification by employing diverse molecular features of microRNAs

NIAID Data Ecosystem, indexed 2026-05-01
Download link:
https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1022393
Description:
Lung adenocarcinoma (LUAD) is one of the most common pathological and histological subtypes of primary lung cancer, with high morbidity and mortality. MicroRNAs (miRNAs) are endogenous small non-coding RNAs that regulate gene expression at the post-transcriptional level. A-to-I miRNA editing has been reported to decrease in tumors, suggesting that miRNA editing may be valuable for cancer classification. However, existing miRNA-based cancer classification models mainly use the frequencies of miRNAs. To validate the contribution of miRNA editing information to cancer classification, we extracted three types of miRNA features: the abundances of original miRNAs, the abundances of edited miRNAs, and the editing levels of miRNA editing sites. Our results show that the four selected classification algorithms (kNN, C4.5, RF and SVM) generally performed better on all features than on the abundances of miRNAs alone. Because the number of features was large, we used three feature selection (FS) methods to further improve the classification models. One of the FS methods, the DFL algorithm, selected only three features from 316 training samples: the frequencies of hsa-miR-135b-5p, hsa-miR-210-3p and hsa-miR-182 48u (an edited miRNA). All four classification algorithms achieved 100% accuracy on these three features for 79 independent testing samples. These results indicate that the additional information from miRNA editing is useful in improving the classification of LUAD samples, and that the three miRNAs selected by DFL potentially represent an effective molecular signature for LUAD diagnosis.

Overall design: Small RNA-Seq profiles were obtained for 19 lung adenocarcinoma (LUAD) and 19 adjacent normal tissues, which were put into liquid nitrogen immediately after resection. Total RNA was extracted, and the small RNA sequencing libraries were prepared and sequenced by BGI (Shenzhen, China). Next, the mutation and editing sites of miRNAs were analyzed with the MiRME algorithm for these 38 sRNA-seq profiles and 357 public LUAD and normal samples. The abundances of original and edited miRNAs and the editing levels of the identified miRNA editing sites were then obtained for these 395 samples. Four machine learning algorithms were used to classify these samples as LUAD or normal. Three feature selection algorithms were used to select molecular features that accurately predicted the sample class.
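As a rough illustration of classifying samples on a three-feature signature, the sketch below runs a plain k-nearest-neighbour (kNN) vote in standard-library Python. It is not the study's pipeline: the function names, the choice of k = 3, Euclidean distance, and every numeric value are assumptions made up for this example, and the toy numbers carry no biological meaning.

```python
import math
from collections import Counter

# Feature order used in the toy vectors below (names taken from the description).
FEATURES = ("hsa-miR-135b-5p", "hsa-miR-210-3p", "hsa-miR-182 48u")


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training samples."""
    # Rank training samples by Euclidean distance to x and keep the closest k.
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test samples whose predicted label matches the true label."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y
               for x, y in zip(test_X, test_y))
    return hits / len(test_y)


# Made-up toy values (NOT study data): one tuple per sample, ordered as FEATURES.
train_X = [(9.1, 8.7, 2.0), (8.8, 9.0, 2.3), (9.4, 8.5, 1.9),
           (3.2, 4.1, 5.0), (2.9, 3.8, 5.4), (3.5, 4.4, 4.8)]
train_y = ["LUAD", "LUAD", "LUAD", "normal", "normal", "normal"]
test_X = [(9.0, 8.9, 2.1), (3.1, 4.0, 5.1)]
test_y = ["LUAD", "normal"]

print(accuracy(train_X, train_y, test_X, test_y))
```

With real data, `train_X` and `test_X` would hold the measured feature values for the 316 training and 79 testing samples, and the same held-out evaluation would apply to the other classifiers named in the description.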

Created:
2023-09-29