five

Improving Evolutionary Models for Mitochondrial Protein Data with Site-Class Specific Amino Acid Exchangeability Matrices

收藏
NIAID Data Ecosystem2026-03-07 收录
下载链接:
https://figshare.com/articles/dataset/Improving_Evolutionary_Models_for_Mitochondrial_Protein_Data_with_Site_Class_Specific_Amino_Acid_Exchangeability_Matrices__/155262
下载链接
链接失效反馈
官方服务:
资源简介:
Adequate modeling of mitochondrial sequence evolution is an essential component of mitochondrial phylogenomics (comparative mitogenomics). There is wide recognition within the field that lineage-specific aspects of mitochondrial evolution should be accommodated through lineage-specific amino-acid exchangeability matrices (e.g., mtMam for mammalian data). However, such a matrix must be applied to all sites and this implies that all sites are subject to the same, or largely similar, evolutionary constraints. This assumption is unjustified. Indeed, substantial differences are expected to arise from three-dimensional structures that impose different physiochemical environments on individual amino acid residues. The objectives of this paper are (1) to investigate the extent to which amino acid evolution varies among sites of mitochondrial proteins, and (2) to assess the potential benefits of explicitly modeling such variability. To achieve this, we developed a novel method for partitioning sites based on amino acid physiochemical properties. We apply this method to two datasets derived from complete mitochondrial genomes of mammals and fish, and use maximum likelihood to estimate amino acid exchangeabilities for the different groups of sites. Using this approach we identified large groups of sites evolving under unique physiochemical constraints. Estimates of amino acid exchangeabilities differed significantly among such groups. Moreover, we found that joint estimates of amino acid exchangeabilities do not adequately represent the natural variability in evolutionary processes among sites of mitochondrial proteins. Significant improvements in likelihood are obtained when the new matrices are employed. We also find that maximum likelihood estimates of branch lengths can be strongly impacted. We provide sets of matrices suitable for groups of sites subject to similar physiochemical constraints, and discuss how they might be used to analyze real data. We also discuss how the general approach might be employed to improve a variety of mitogenomic-based research activities.
创建时间:
2013-01-31
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作