Integrated Sequence-Structure Motifs Suffice to Identify microRNA Precursors
Source: NIAID Data Ecosystem (indexed 2026-03-07)
Download link:
https://figshare.com/articles/dataset/Integrated_Sequence_Structure_Motifs_Suffice_to_Identify_microRNA_Precursors/127568
Description:
Background: Upwards of 1200 miRNA loci have hitherto been annotated in the human genome. The specific features that define a miRNA precursor and determine its recognition and subsequent processing have not yet been exhaustively described, and miRNA loci therefore cannot be computationally identified with sufficient confidence.
Results: We rendered pre-miRNA and non-pre-miRNA hairpins as strings of integrated sequence-structure information, and used the software Teiresias to identify sequence-structure motifs (ss-motifs) of variable length in these data sets. Using only ss-motifs as features in a Support Vector Machine (SVM) algorithm for pre-miRNA identification achieved 99.2% specificity and 97.6% sensitivity on a human test data set, which is comparable to previously published algorithms employing combinations of sequence-structure and additional features. Further analysis of the ss-motif information contents revealed strongly significant deviations from those of the respective training sets, providing potentially important clues as to how the sequence and structural information of RNA hairpins are utilized by the miRNA processing apparatus.
Conclusion: Integrated sequence-structure motifs of variable length apparently capture nearly all information required to distinguish miRNA precursors from other stem-loop structures.
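To make the idea of an "integrated sequence-structure string" in the Results more concrete, here is a minimal Python sketch. It is not the authors' encoding or pipeline. The alphabet (uppercase letter for a paired base, lowercase for an unpaired base, taken from dot-bracket notation), the function names, and the toy hairpin are all assumptions made for illustration. Plain overlapping substring counting stands in for Teiresias motif discovery, and the SVM step is omitted; the resulting count vectors are the kind of input an SVM would receive. The last helper shows how sensitivity and specificity are computed from labels and predictions.

```python
from typing import List, Sequence, Tuple


def encode_hairpin(sequence: str, structure: str) -> str:
    """Merge sequence and dot-bracket structure into one string.

    Assumed (hypothetical) alphabet: paired bases become uppercase,
    unpaired bases become lowercase.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    symbols = []
    for nucleotide, pairing in zip(sequence, structure):
        if pairing in "()":
            symbols.append(nucleotide.upper())
        elif pairing == ".":
            symbols.append(nucleotide.lower())
        else:
            raise ValueError(f"unexpected structure symbol: {pairing!r}")
    return "".join(symbols)


def count_overlapping(text: str, motif: str) -> int:
    """Count occurrences of motif in text, allowing overlaps."""
    if not motif:
        raise ValueError("motif must be non-empty")
    count = 0
    start = text.find(motif)
    while start != -1:
        count += 1
        start = text.find(motif, start + 1)
    return count


def motif_features(encoded: str, motifs: Sequence[str]) -> List[int]:
    """Build a feature vector of motif counts (a stand-in for ss-motif features)."""
    return [count_overlapping(encoded, motif) for motif in motifs]


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels 1 (pre-miRNA) and 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Toy hairpin, invented for illustration only.
    encoded = encode_hairpin("GGAUCC", "((..))")
    print(encoded)
    print(motif_features(encoded, ["GG", "au", "CC"]))
```

Sensitivity is the fraction of true pre-miRNAs that are recovered, and specificity is the fraction of non-pre-miRNA hairpins that are correctly rejected; these are the two figures reported for the human test set above.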
Created:
2012-03-15



