
Phylogenomic dataset, Core-genome alignment and Species tree of 155 Bradyrhizobia

Source: Figshare. Updated 2021-04-08. Indexed 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Phylogenomic_dataset_Core-genome_alignment_and_Species_tree_of_155_Bradyrhizobia/14388440
Description:
Result of task 05 of the bioinformatic pipeline Pantagruel (Lassalle et al., 2019) for the reconstruction of the species tree of 155 Bradyrhizobium spp. and close relatives based on their core genome.

The genomes listed in 'assemblies_list' were downloaded from the National Center for Biotechnology Information (NCBI) RefSeq or GenBank databases and used as input for Pantagruel to build a phylogenomic database. In short, coding sequences (CDSs) and the corresponding protein sequences were extracted from the RefSeq or GenBank annotation, and the proteins were clustered into homologous gene families using MMseqs2 (Steinegger and Söding, 2017). Homologous protein sequences were aligned with Clustal Omega (Sievers et al., 2011), and the protein alignments were reverse-translated into CDS alignments with PAL2NAL (Suyama et al., 2006). A total of 453 single-copy core gene families were selected based on their presence in at least 153 of the 155 genomes, thus allowing for rare losses or incomplete genome sequences. The CDS alignments for these families were concatenated, resulting in 378,951 aligned nucleotide positions. From this core alignment, a maximum-likelihood (ML) tree was inferred using RAxML (v8.2.4) (Stamatakis et al., 2014) under the GTRCATX model; branch lengths were then refined under the GTRGAMMAX model, and branch supports were estimated from 200 rapid bootstrap trees.

Parameters for replicating the Pantagruel run are provided in the shell script file 'environ_pantagruel_Brady2019.sh'.
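The core-gene selection and concatenation steps described above can be illustrated with a minimal Python sketch. This is not Pantagruel's code: the function names (`select_core_families`, `concatenate_alignments`), the in-memory data layout, and the reading of "single-copy" as "no genome carries more than one member of the family" are all assumptions made for illustration. Genomes missing from a family are padded with gaps, which is one common way to handle families absent from a few genomes.

```python
from collections import Counter


def select_core_families(family_members, min_present):
    """Return IDs of single-copy families found in at least `min_present` genomes.

    family_members: dict mapping family ID -> list of (genome_id, gene_id) pairs.
    Hypothetical layout; Pantagruel stores this information in its own database.
    """
    core = []
    for family_id, members in family_members.items():
        copies_per_genome = Counter(genome for genome, _gene in members)
        # Assumption: "single-copy" means no genome has more than one member.
        is_single_copy = all(n == 1 for n in copies_per_genome.values())
        if is_single_copy and len(copies_per_genome) >= min_present:
            core.append(family_id)
    return sorted(core)


def concatenate_alignments(alignments, genomes):
    """Concatenate per-family alignments into one sequence per genome.

    alignments: list of dicts mapping genome_id -> aligned sequence
    (all sequences within one dict have equal length).
    Genomes absent from a family are filled with gap characters.
    """
    parts = {genome: [] for genome in genomes}
    for alignment in alignments:
        length = len(next(iter(alignment.values())))
        for genome in genomes:
            parts[genome].append(alignment.get(genome, "-" * length))
    return {genome: "".join(chunks) for genome, chunks in parts.items()}
```

With the thresholds reported for this dataset, the selection step would be called as `select_core_families(families, min_present=153)` on families drawn from the 155 genomes.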
Created:
2021-04-08