The Community Coevolution Model with application to the study of evolutionary relationships between genes based on phylogenetic profiles

NIAID Data Ecosystem, indexed 2026-03-13
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.p8cz8w9rd
Description:
Organismal traits can evolve in a coordinated way, with correlated patterns of gains and losses reflecting important evolutionary associations. Discovering these associations can reveal important information about the functional and ecological linkages among traits. Phylogenetic profiles treat individual genes as traits distributed across sets of genomes and can provide a fine-grained view of the genetic underpinnings of evolutionary processes in a set of genomes. Phylogenetic profiling has been used to identify genes that are functionally linked, and to identify common patterns of lateral gene transfer in microorganisms. However, comparative analysis of phylogenetic profiles and other trait distributions should take into account the phylogenetic relationships among the organisms under consideration. Here we propose the Community Coevolution Model (CCM), a new coevolutionary model to analyze the evolutionary associations among traits, with a focus on phylogenetic profiles. In the CCM, traits are considered to evolve as a community with interactions, and the transition rate for each trait depends on the current states of other traits. Unlike other comparative methods, which are limited to pairwise trait analysis, CCM can examine multiple traits as a community and so reveal more dependency relationships. We also develop a simulation procedure to generate phylogenetic profiles with correlated evolutionary patterns that can be used as benchmark data for evaluation purposes. A simulation study demonstrates that CCM is more accurate than other methods, including the Jaccard Index and three tree-aware methods. The parameterization of CCM makes the interpretation of the relations between genes more direct, so that Darwin's scenario can be identified easily from the estimated parameters. We show that CCM is more efficient and fits real data better than other methods, yielding higher likelihood scores with fewer parameters.
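To make concrete the idea that each trait's transition rate depends on the current states of the other traits, here is a minimal sketch of a state-dependent rate matrix for a small community of binary traits. The function name `build_rate_matrix`, the parameters `gain`, `loss`, and `beta`, and the exponential form of the dependence are all assumptions chosen for illustration; the description does not give the CCM's actual parameterization, so this should not be read as the authors' model.

```python
import math
from itertools import product


def build_rate_matrix(gain, loss, beta):
    """Toy rate matrix over joint states of n binary traits (hypothetical form).

    gain[i], loss[i]: baseline gain and loss rates of trait i (assumed names).
    beta: assumed interaction strength; beta > 0 means that the presence of
    other traits speeds up gains and slows down losses of a trait.
    """
    n = len(gain)
    states = list(product((0, 1), repeat=n))
    index = {state: k for k, state in enumerate(states)}
    q = [[0.0] * len(states) for _ in states]

    for state in states:
        row = index[state]
        for i in range(n):
            # Number of other traits currently present in this joint state.
            partners_present = sum(state[j] for j in range(n) if j != i)
            if state[i] == 0:
                rate = gain[i] * math.exp(beta * partners_present)
            else:
                rate = loss[i] * math.exp(-beta * partners_present)
            # Only one trait changes per transition.
            target = list(state)
            target[i] = 1 - state[i]
            q[row][index[tuple(target)]] = rate
        # The diagonal makes each row sum to zero, as in any rate matrix.
        q[row][row] = -sum(q[row])
    return states, q


states, q = build_rate_matrix(gain=[0.5, 0.5], loss=[0.5, 0.5], beta=1.0)
```

With two traits the joint states are ordered (0, 0), (0, 1), (1, 0), (1, 1), so `q[1][3]` is the gain rate of the first trait when the second is present and `q[0][2]` is the same gain when the second is absent. Setting `beta` to zero makes the two rates equal, which corresponds to independent evolution in this toy form.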
An examination of 3786 phylogenetic profiles across a set of 659 bacterial genomes highlights linkages between genes with common functions, including many patterns that would not have been identified under a non-phylogenetic model of common distribution. We also applied the CCM to 44 proteins in the well-studied Mitochondrial Respiratory Complex I and recovered associations that mapped well onto the structural associations that exist in the complex.

Methods:
The Community Coevolution Model (CCM) is a new coevolutionary model to analyze the evolutionary associations among binary traits.

Supplementary Figures and Tables files: Supplementary Figures and Tables; Table S3_Predictions_Unannotated_Genes_LZ. The files contain the figures and tables that are mentioned in the CCM paper.

LZ data sets: Data file: phylogenetic tree (LZ); Data file: profile matrix (LZ).

LZ denotes the draft assembly of the bacterium “Lachnospiraceae bacterium 3-1-57FAACT1”, which was isolated from a biopsy retrieved from the transverse colon of a female Crohn’s Disease patient at the time of colonoscopy (Liu et al. 2018). For the comparative analysis of LZ, 658 completed and draft genomes from class Clostridia were retrieved from the National Center for Biotechnology Information (NCBI). The phylogenetic tree was built through the AMPHORA2 pipeline (Wu and Scott 2012) and RAxML-HPC (Stamatakis 2006) using their concatenated, conserved protein sequences; another set of eight outgroup genomes from class Bacilli and phyla Actinobacteria and Proteobacteria was used for rooting. The phylogenetic profiles were constructed by comparing the complete set of LZ against all other genomes using RAPSearch (Ye et al. 2011).
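For comparison with the tree-aware approach, the Jaccard Index named in the description as a non-phylogenetic baseline can be sketched as follows. It scores two binary presence/absence profiles by how many genomes contain both genes relative to how many contain either. The function name `jaccard_index` and the toy profiles are assumptions for illustration; the layout of the actual profile matrix file is not stated in the description.

```python
def jaccard_index(profile_a, profile_b):
    """Jaccard Index of two binary phylogenetic profiles of equal length.

    Each profile lists presence (1) or absence (0) of a gene across the
    same ordered set of genomes.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same set of genomes")
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    # Two genes that are absent everywhere share no evidence; score them 0.
    return both / either if either else 0.0


# Toy profiles over five genomes (made-up values, not from the data set).
gene_x = [1, 1, 0, 1, 0]
gene_y = [1, 0, 0, 1, 1]
score = jaccard_index(gene_x, gene_y)  # 2 shared presences / 4 genomes with either
```

Because the score treats every genome as an independent observation, closely related genomes that inherited the same gene content count several times over. That is the limitation the description refers to when it says comparative analysis should account for phylogenetic relationships.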
Created:
2022-08-18