Unraveling Patterns of Site-to-Site Synonymous Rates Variation and Associated Gene Properties of Protein Domains and Families
NIAID Data Ecosystem, indexed 2026-03-08
Download link:
https://figshare.com/articles/dataset/_Unraveling_Patterns_of_Site_to_Site_Synonymous_Rates_Variation_and_Associated_Gene_Properties_of_Protein_Domains_and_Families_/1045810
Description:
In protein-coding genes, synonymous mutations are often thought not to affect fitness and therefore not to be subject to natural selection. Yet over the last decade, cases of non-neutral evolution at certain synonymous sites have been reported with increasing frequency. To evaluate the extent and nature of site-specific selection on synonymous codons, we computed the site-to-site synonymous rate variation (SRV) and identified gene properties that make SRV more likely in a large database of protein-coding gene families and protein domains. To our knowledge, this is the first study to explore the determinants and patterns of SRV in real data. We show that SRV is widespread in the evolution of protein-coding sequences, which calls into question the validity of the synonymous rate as a standard neutral proxy. While protein domains rarely undergo adaptive evolution, SRV appears to play an important role in optimizing domain function at the level of DNA. In contrast, protein families are more likely to evolve by positive selection but less likely to exhibit SRV. Stronger SRV was detected in genes with stronger codon bias and tRNA reusage, in genes coding for proteins with a larger number of interactions or forming a larger number of structures, in genes located in intracellular components, and in genes involved in typically conserved complex processes and functions. Genes with extreme SRV show higher expression levels in nearly all tissues. This indicates that codon bias in a gene, which often correlates with gene expression, may often be a site-specific phenomenon regulating the speed of translation along the sequence, consistent with the co-translational folding hypothesis. Strikingly, genes with SRV were strongly overrepresented in metabolic pathways and among genes associated with several genetic diseases, particularly cancers and diabetes.
Created:
2016-01-15



