ML tree of 456 mapped 7PET genomes
Source: DataCite Commons. Updated 2025-06-01; indexed 2024-08-18.
Download link:
https://figshare.com/articles/dataset/ML_tree_of_456_mapped_7PET_genomes/16595999/2
Description:
Maximum-likelihood phylogeny of the 456 mapped 7PET genomes.<br>For variant calling, Illumina short reads were mapped against the genome of the novel reference strain CNRVC190243. We mapped all 242 short-read sets from 2018-2019 Yemeni <i>V. cholerae</i> isolates, retaining those that mapped at a sufficient depth (see below); we also mapped read sets from 218 contextual <i>V. cholerae</i> isolates linked to the 7PET-T13 sublineage. Reads were trimmed with Trimmomatic and mapped to both CNRVC190243 reference chromosomes with BWA-MEM. Mapped genomes with an average read depth below 5x over the two chromosomes (n = 4, all from the novel Yemen read sets) were deemed of insufficient read depth and were excluded, leaving a final set of 456 mapped <i>V. cholerae</i> 7PET genomes. We used the software suite samtools/bcftools v1.9 to call variants with a minimum read depth of 10x, excluding indels. The resulting consensus sequences were combined and processed with snp-sites (Page et al., 2016) to produce a single nucleotide polymorphism (SNP) alignment featuring 2,092 positions.<br>Alternative topology hypotheses were formulated based on the distribution of branch supports. The topology of the ML tree output by RAxML-NG (file with suffix tag "full.rooted") was compared to the topologies featured in files with the tags "full.rooted.H9hsisterH9g" and "full.rooted.H9hsisterH9guniteH9c". 
Shimodaira-Hasegawa tests were conducted, showing that the "full.rooted.H9hsisterH9guniteH9c" topology had a better likelihood, and this topology was retained for further analyses.<br>The tree in files whose names include the keyword *full* includes the 456 genomes; the tree in files whose names include the keyword *subcladeH9* is a subtree of the former and includes only the 352 of 456 genomes corresponding to the 7PET-T13 sublineage and close relatives.<br>BactDating v1.1 was used to estimate a timed phylogeny (using 100,000 Markov chain Monte Carlo iterations and otherwise default parameters) of the Yemen 2016-2019 genomes and relatives, using the ML mapped-genome tree (restricted to the 7PET-T13 genome tips) and day-resolved dates as input; the median day of the year of isolation was used for isolates where these data were missing.
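
The handling of missing isolation days described above could be sketched as follows. This is only an illustration, not the authors' script: it assumes that "median day of the year of isolation" means the middle day of the isolation year, and that tip dates are supplied to the dating step as decimal years. The function names `decimal_year` and `tip_date` are hypothetical.

```python
from datetime import date
from typing import Optional


def decimal_year(d: date) -> float:
    """Convert a calendar date to a decimal year (1 January -> year + 0.0)."""
    start = date(d.year, 1, 1)
    end = date(d.year + 1, 1, 1)
    return d.year + (d - start).days / (end - start).days


def tip_date(year: int, iso_date: Optional[str] = None) -> float:
    """Return a decimal tip date for one isolate.

    If a day-resolved date (ISO format, YYYY-MM-DD) is available, use it.
    Otherwise fall back to the median day of the isolation year
    (assumed interpretation of the rule stated in the description).
    """
    if iso_date:
        return decimal_year(date.fromisoformat(iso_date))
    days_in_year = (date(year + 1, 1, 1) - date(year, 1, 1)).days
    # Median offset from 1 January, in days, is (days_in_year - 1) / 2.
    return year + (days_in_year - 1) / (2 * days_in_year)
```

For a 365-day year the fallback corresponds to 2 July; for a leap year it falls halfway between 1 and 2 July.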
Provider:
figshare
Created:
2023-08-03



