
Species trees of 571 Rhizobiaceae genomes, with a focus on 41 Pseudorhizobium and Neorhizobium, based on core genome gene concatenates

Figshare | Updated: 2019-06-25 | Indexed: 2026-04-29
Download link:
https://figshare.com/articles/dataset/Species_trees_of_571_Rhizobiaceae_genomes_with_a_focus_on_41_Pseudorhizobium_and_Neorhizobium_based_on_core_genome_gene_concatenates/8316827
Resource description:
Genomic dataset and gene family classification

We assembled a complete bacterial genome dataset covering all known representatives of the subgroup in the alphaproteobacterial families Rhizobiaceae and (sister group) Aurantimonadaceae. This dataset comprises all 564 genomes available from the NCBI RefSeq Assembly database on 23 Apr 2018, filtering out anomalous genomes and those with a contig N50 [...]

Reference species trees

From the 571Rhizob genome dataset, we define the pseudo-core genome as genes occurring only in a single copy and present in at least 561 out of the 571 genomes (98%). The resulting pseudo-core genome gene set (hereafter referred to as pCG571) includes 155 loci, whose protein alignments were concatenated. This concatenated protein alignment was used to compute a reference species tree (SML571) with RAxML (Stamatakis 2014) under the model PROTCATLGX; branch supports were estimated by generating 200 rapid bootstraps under the same parameters.

From the SML571 tree, we identified the well-supported clade grouping 41 genomes, including all representatives of Neorhizobium spp. and Pseudorhizobium spp. and our new isolates (dataset ‘41NeoPseudo’). To gain further phylogenetic resolution in this clade of interest, we restricted the pCG571 concatenated alignment to the 41 genomes of this smaller genomic dataset, which we used as input to the Phylobayes program for a more accurate (but computationally more expensive) Bayesian phylogenetic inference under the CAT-GTR+G4 model (Lartillot et al. 2007). This provided us with a robust non-ultrametric tree for the 41 genomes (SBA41). We finally used this SBA41 tree as a fixed input topology for Phylobayes to infer an ultrametric tree (unitless ‘time’ tree) under the CIR clock model (Lepage et al. 2007), further referred to as TBA41.
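
The pseudo-core selection and concatenation steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the input layout (a per-family table of gene copy counts and per-family aligned sequences), and the toy data are all hypothetical. It also assumes that "single copy" means a family is never multi-copy in any genome where it occurs, and that genomes lacking a family are padded with gaps in the concatenate.

```python
from typing import Dict, List


def pseudo_core_families(copy_counts: Dict[str, Dict[str, int]],
                         min_genomes: int) -> List[str]:
    """Select families that are single-copy wherever present and that
    occur in at least `min_genomes` genomes (e.g. 561 of 571)."""
    selected = []
    for family, counts in copy_counts.items():
        present = [n for n in counts.values() if n > 0]
        # Assumption: a family is rejected if any genome carries >1 copy.
        if all(n == 1 for n in present) and len(present) >= min_genomes:
            selected.append(family)
    return sorted(selected)


def concatenate(alignments: Dict[str, Dict[str, str]],
                genomes: List[str]) -> Dict[str, str]:
    """Concatenate per-family alignments into one supermatrix.
    Assumption: a genome missing from a family is filled with gaps."""
    parts: Dict[str, List[str]] = {g: [] for g in genomes}
    for family in sorted(alignments):
        aln = alignments[family]
        length = len(next(iter(aln.values())))  # alignment width
        for g in genomes:
            parts[g].append(aln.get(g, "-" * length))
    return {g: "".join(p) for g, p in parts.items()}


# Toy data (hypothetical): 4 genomes, threshold of 3 instead of 561.
genomes = ["g1", "g2", "g3", "g4"]
copy_counts = {
    "famA": {"g1": 1, "g2": 1, "g3": 1, "g4": 1},  # core, single copy
    "famB": {"g1": 1, "g2": 1, "g3": 1},           # missing in g4, kept
    "famC": {"g1": 2, "g2": 1, "g3": 1, "g4": 1},  # multi-copy, rejected
    "famD": {"g1": 1, "g2": 1},                    # too rare, rejected
}
all_alignments = {
    "famA": {"g1": "MKT", "g2": "MKS", "g3": "MRT", "g4": "MKT"},
    "famB": {"g1": "LV", "g2": "LI", "g3": "LV"},
}

core = pseudo_core_families(copy_counts, min_genomes=3)
supermatrix = concatenate({f: all_alignments[f] for f in core}, genomes)

# Restricting the concatenate to a clade of interest is a plain row filter.
clade = {g: supermatrix[g] for g in ["g1", "g2"]}
```

With the values given in the description, the call would use `min_genomes=561` over 571 genomes. The last line mirrors how the pCG571 alignment is restricted to the 41 genomes of the clade of interest before the Bayesian inference.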
Created:
2019-06-25