five

The Random Nature of Genome Architecture: Predicting Open Reading Frame Distributions

收藏
NIAID Data Ecosystem2026-03-06 收录
下载链接:
https://figshare.com/articles/dataset/The_Random_Nature_of_Genome_Architecture_Predicting_Open_Reading_Frame_Distributions/146882
下载链接
链接失效反馈
官方服务:
资源简介:
BackgroundA better understanding of the size and abundance of open reading frames (ORFS) in whole genomes may shed light on the factors that control genome complexity. Here we examine the statistical distributions of open reading frames (i.e. distribution of start and stop codons) in the fully sequenced genomes of 297 prokaryotes, and 14 eukaryotes. Methodology/Principal FindingsBy fitting mixture models to data from whole genome sequences we show that the size-frequency distributions for ORFS are strikingly similar across prokaryotic and eukaryotic genomes. Moreover, we show that i) a large fraction (60–80%) of ORF size-frequency distributions can be predicted a priori with a stochastic assembly model based on GC content, and that (ii) size-frequency distributions of the remaining “non-random” ORFs are well-fitted by log-normal or gamma distributions, and similar to the size distributions of annotated proteins. Conclusions/SignificanceOur findings suggest stochastic processes have played a primary role in the evolution of genome complexity, and that common processes govern the conservation and loss of functional genomics units in both prokaryotes and eukaryotes.
创建时间:
2009-07-30
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作