
An integrated framework for the identification of potential miRNA-disease association based on novel negative samples extraction strategy

Source: DataCite Commons; updated 2024-02-14; indexed 2024-07-27
Download link:
https://tandf.figshare.com/articles/dataset/An_integrated_framework_for_the_identification_of_potential_miRNA-disease_association_based_on_novel_negative_samples_extraction_strategy/7593017
Description:
MicroRNAs (miRNAs) play an important role in the prevention, diagnosis and treatment of complex human diseases. Predicting potential miRNA-disease associations can provide important prior information for medical researchers, so reliable computational models are expected to be an effective supplementary tool for inferring associations between miRNAs and diseases. In this study, we developed a novel computational model named Negative Samples Extraction based MiRNA-Disease Association prediction (NSEMDA). NSEMDA filtered reliable negative samples using two positive-unlabeled learning techniques, Spy and Rocchio, and calculated similarity weights for ambiguous samples. The positive samples, the reliable negative samples and the ambiguous samples with similarity weights were then used to construct a Support Vector Machine-Similarity Weight model to predict miRNA-disease associations. By introducing ambiguous samples with similarity weights into the training of the prediction model, NSEMDA improved the credibility of negative samples and reduced the impact of noisy samples. NSEMDA achieved an AUC of 0.8899 in global leave-one-out cross validation (LOOCV) and an AUC of 0.8353 in local LOOCV. Over 100 runs of 5-fold cross validation, NSEMDA obtained an average AUC of 0.8878 with a standard deviation of 0.0014. These AUCs are higher than those of many classical models. We also carried out three kinds of case studies to evaluate the performance of NSEMDA. Among the top 50 potentially related miRNAs predicted by NSEMDA for esophageal neoplasms, lung neoplasms and hepatocellular carcinoma, 46, 50 and 45 miRNAs, respectively, were verified by experimental evidence to be associated with the investigated disease. NSEMDA could therefore be a reliable computational model for inferring miRNA-disease associations.
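
The description names the Rocchio technique as one of the two positive-unlabeled learning steps but gives no detail. The following is a minimal sketch of the generic Rocchio approach to picking reliable negatives, not the NSEMDA implementation. It assumes each miRNA-disease pair is already represented as a numeric feature vector; the function name `rocchio_reliable_negatives`, the helper names, and the weights `alpha=16` and `beta=4` are illustrative assumptions and are not taken from this record.

```python
import math


def _normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def _mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def _cosine(a, b):
    """Cosine similarity; defined as 0.0 if either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def rocchio_reliable_negatives(positives, unlabeled, alpha=16.0, beta=4.0):
    """Return indices of unlabeled samples treated as reliable negatives.

    Hypothetical helper: builds one prototype for the positive set and one
    for the unlabeled set (temporarily treated as negative), then keeps the
    unlabeled samples that are closer to the negative prototype.
    alpha and beta are assumed weights, not values from the source.
    """
    pos_mean = _mean([_normalize(v) for v in positives])
    unl_mean = _mean([_normalize(v) for v in unlabeled])
    # Each prototype is pulled toward its own class and pushed away
    # from the other class.
    pos_proto = [alpha * p - beta * u for p, u in zip(pos_mean, unl_mean)]
    neg_proto = [alpha * u - beta * p for p, u in zip(pos_mean, unl_mean)]
    return [
        i
        for i, v in enumerate(unlabeled)
        if _cosine(v, neg_proto) > _cosine(v, pos_proto)
    ]


if __name__ == "__main__":
    # Toy feature vectors, invented purely for illustration.
    known_pairs = [[1.0, 0.1], [0.9, 0.2]]
    unknown_pairs = [[0.95, 0.15], [0.1, 1.0], [0.2, 0.9]]
    print(rocchio_reliable_negatives(known_pairs, unknown_pairs))
```

In this sketch, unlabeled samples that are not selected would remain available as candidates for the ambiguous set that the description says receives similarity weights; how NSEMDA actually combines the Spy and Rocchio outputs is not stated here.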

Provider:
Taylor & Francis
Created:
2019-01-16