An assessment of the complexity of 3' UTRs relative to that of protein-coding sequences: models selected using two procedures
Source: Research Data Australia (indexed 2024-08-17)
Download link:
https://researchdata.edu.au/an-assessment-complexity-using-procedures/504488
Description:
The dataset comes from a study which assessed the complexity of 3′ UTRs (three prime untranslated regions) relative to that of protein-coding sequences, by comparing the extent to which segmental substructures can be detected within these two genomic fractions based on sequence composition and conservation.
For the dataset, two different procedures were applied to select the number of classes for each alignment: examining Deviance Information Criterion V (DICV) values (Procedure 1) and examining the stability of the classes (Procedure 2). The numbers of classes selected for each sequence by each procedure are summarised.
The data indicate that twelve to fourteen segment classes with distinct character frequencies can be distinguished in each of the three coding sequence alignments, using either Procedure 1 or Procedure 2.
Provided by:
Queensland University of Technology



