Frameshifts and wild-type protein sequences are always highly similar because the genetic code and genomes were optimized for frameshift tolerance
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://figshare.com/articles/dataset/Frameshifts_and_wild-type_protein_sequences_are_always_highly_similar_because_the_genetic_code_is_optimal_for_frameshift_tolerance/9948050
Description:
Frameshift protein sequences, encoded by the alternative reading frames of coding genes, have been considered meaningless, and frameshift mutations have been considered of little importance for the molecular evolution of coding genes and proteins. However, functional frameshift homologs are widespread. It has been puzzling how a frameshift protein can keep its structure and function when its amino-acid sequence changes substantially. Here we report that the similarities between frameshift and wild-type protein sequences are higher than random similarities and are determined by both the genetic code and the genome. In the standard genetic code, the amino-acid substitutions produced by frameshift codon substitutions are more conservative than those produced by random substitutions. The frameshift tolerability of the standard genetic code ranks in the top 2.0-3.5% of all alternative genetic codes, showing that the natural genetic code was optimized for frameshift tolerance. Moreover, frameshift-tolerable codons and codon pairs appear more frequently in the genomes of higher species than in those of lower species, showing that frameshift tolerability was further optimized through the usage of codons and codon pairs in these genomes.
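To make the notion of a frameshift codon substitution concrete, the following is a minimal Python sketch, not the authors' pipeline. It builds the standard genetic code, translates a sequence in its three forward reading frames, and enumerates the codon pairs that arise when the reading frame moves by one base. The function names, the toy sequence, and the "fraction of pairs encoding the same amino acid" measure are all assumptions made for illustration; the dataset itself reports alignment-based similarities and FSS values, which this sketch does not reproduce.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1) in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, offset=0):
    """Translate dna starting at the given base offset (0, 1 or 2)."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def frameshift_substitutions():
    """Yield (original, shifted) codon pairs for a one-base shift in either direction."""
    for codon in CODON_TABLE:
        for base in BASES:
            yield codon, codon[1:] + base   # frame moved one base downstream
            yield codon, base + codon[:2]   # frame moved one base upstream


def synonymous_fraction(pairs):
    """Crude illustrative proxy: share of codon pairs that encode the same amino acid."""
    pairs = list(pairs)
    same = sum(CODON_TABLE[a] == CODON_TABLE[b] for a, b in pairs)
    return same / len(pairs)


if __name__ == "__main__":
    gene = "ATGGCCAAGCTGACCTTTGAA"  # hypothetical toy sequence, not from the dataset
    for offset in range(3):
        print(offset, translate(gene, offset))
    print("frameshift pairs:", synonymous_fraction(frameshift_substitutions()))
    # Baseline over all ordered codon pairs, including identical codons.
    print("all codon pairs: ", synonymous_fraction(product(CODON_TABLE, repeat=2)))
```

A conservativeness comparison closer to the one described above would score each substitution with an amino-acid substitution matrix rather than counting exact matches; the exact-match fraction is used here only to keep the sketch self-contained.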
S1a: Frame similarities aligned by ClustalW or MSA
S1b: Frame similarities aligned by FrameAlign
S2: FSSs of the natural genetic code
S3: FSSs of the alternative genetic codes
S4: FSSs of different codon usages
S5: FSSs of different usages of codon pairs
Created: 2019-10-08



