P. nodorum WA pangenome: supplementary material

Mendeley Data. Updated 2024-01-31; indexed 2024-06-28.

Download link:

https://figshare.com/articles/dataset/P_nodorum_WA_pangenome_supplementary_material/13325915/6


Description:

Fig1. The structure and features of the Western Australian (WA) Parastagonospora nodorum population. The tree on the left shows the predicted phylogeny of WA and internationally sampled P. nodorum isolates, with sampling time and locations indicated alongside. Locations are also indicated in the map of Australia (above). Sub-population groups from this and a previous study [47] are indicated alongside by colour and number. Effector presence-absence profiles (based on orthology) are indicated in black and white. Protein matches derived from a MetaEuk search against 5 known effectors, indicating effector protein isoform diversity, are shown in the right columns in red and white. Isoform sequence similarity is indicated by trees above the column headings, with labels present for representative SNOO sequences used as MetaEuk queries, or SNOO sequences representing effector-like paralogs.

Fig2. A circos plot showing SNP density over each of the 23 chromosomes in the SN15 genome assembly. The innermost track shows the proportion of bases covered by genes (CDS features, red) and transposable elements (TE, blue dots) in non-overlapping 10 kb windows. For TEs, windows with TE base coverage less than 10% are not plotted. The heatmap in the outer track shows SNP counts in 50 kb non-overlapping windows for each of the Western Australian isolates (Table S8), with the colour scale boundaries set by the 10th, 50th, and 90th percentiles (50, 184, and 392, respectively).

Fig3. A circos plot showing the proportion of RIP-like (CA↔TA or TG↔TA) mutations over transition (C↔T or G↔A) mutations for each of the 23 chromosomes in the SN15 genome assembly. The innermost track shows the proportion of bases covered by genes (CDS features) and transposable elements (TE) in 10 kb non-overlapping windows. Windows with TE base coverage less than 10% are not plotted. The heatmap in the outermost track shows the proportion of RIP-like mutations over the number of transition mutations in 50 kb non-overlapping windows for each isolate (Table S8). Windows with fewer than 20 SNPs are plotted in white to avoid high ratios caused by a small number of RIP-like mutations. By chance, 25% of transition mutations would be expected to be part of a RIP-like dinucleotide pair change.

Fig4. A circos plot showing the alignment coverage of each Parastagonospora nodorum genome assembly over each of the 23 chromosomes in P. nodorum SN15. The innermost track shows the proportion of bases covered by genes (CDS features) and transposable elements (TE) in 10 kb non-overlapping windows. Windows with TE base coverage less than 10% are not plotted. The heatmap in the outer track shows the average alignment coverage of each isolate genome assembly to SN15 in 50 kb non-overlapping windows (Table S8).

Fig5. Dispensable and multi-copy orthogroups for each isolate in the P. nodorum pan-genome, omitting orthogroups present in all isolates. Heatmap rows represent each P. nodorum isolate, and columns indicate each of the dispensable or multi-copy orthogroups. Heatmap colour indicates the number of copies of an orthogroup each isolate has: absent (blue), present with single-copy genes (white), or present with multi-copy genes (orange). Orthogroups corresponding to the ToxA (SNOO_16571A), Tox1 (SNOO_20078A), Tox3 (SNOO_08981AB) and Tox5 (SNOO_50320) loci are indicated below. Isolate collection locations are indicated by colour (right), and reference isolate IDs are also indicated (right). The top 3 scatter plots indicate orthogroups with any members with significant positive selection tests (p < 0.01). “dN/dS” indicates the overall dN/dS for the whole orthogroup. “dN/dS branch” indicates the dN/dS at the branch predicted to be under the highest selection. “dN/dS branch prop” indicates the proportion of sequences in the orthogroup predicted to be under positive selection. “Predector score” indicates orthogroups whose highest-scoring member had a score greater than 0.

Stab1. Additional published genomes used in this study.
Stab2. Summary of Illumina sequencing read contamination detection.
Stab3. Parameters used to filter short variants by quality, and statistics of variants in the filtered set.
Stab4. Population diversity statistics and results of STRUCTURE analysis.
Stab5. Genome assembly statistics for all isolates sequenced in this study. Statistics were collected using BBtools stats and QUAST.
Stab6. Summary statistics of transposable elements, rRNA and tRNA genes, and repeat annotations for each assembled genome.
Stab7. Summary statistics of gene predictions for each isolate. Numbers are provided for each prediction method. EVM refers to EvidenceModeler predictions.
Stab8. SNP counts, RIP-like SNP ratios, and genome assembly alignment coverage data used to plot the circular heatmaps in Fig2, Fig3, and Fig4.
Stab9. Orthogroup counts for each isolate used to plot Fig5.
Stab10. Functional annotation, selection, and presence-absence data for each orthogroup.
Stab11. GO term and effector enrichment tests for predicted functions and groups of orthogroups.

Sdat1. MultiQC reports of read trimming and quality control for Illumina sequencing reads.
Sdat2. Boxplots showing short variant (SNP, insertion/deletion, mixed) genotype quality (GQ) statistics for each isolate. Each chromosome in SN15 is shown on a separate page in the PDF.
Sdat3. Violin plots showing short variant (SNP, insertion/deletion, mixed) genotype read depth (DP) statistics for each isolate. Each chromosome in SN15 is shown on a separate page in the PDF.
Sdat4. Bar plots showing amounts of missing short variant genotype information for each isolate. Each chromosome in SN15 is shown on a separate page in the PDF.
Sdat5. SNP locus quality statistics visualised for each chromosome in SN15 on separate pages in the PDF.
Sdat6. Insertion and deletion (INDEL) locus quality statistics visualised for each chromosome in SN15 on separate pages in the PDF.
Sdat7. Mixed variant (multi-nucleotide variations, or insertions/deletions with SNPs at the same locus) locus quality statistics visualised for each chromosome in SN15 on separate pages in the PDF.
Sdat8. Kernel density estimate plots showing the distributions of short variant locus quality statistics.
Sdat9. Maximum likelihood phylogenetic tree estimated from 45,194 SNPs using IQTree.
Sdat10. Multiple sequence alignments (MSA) and trees of the ToxA, Tox1, and Tox3 CDS/codon-aligned regions from the pan-genome, supporting the prevalence of RIP-like SNPs across the pan-genome in confirmed effector loci.
Sdat11. Example dot plot alignments between scaffolds and chromosomes containing orthogroups in PAV clusters selected from Fig5.
Sdat12. Phylogenetic tree estimated using ASTRAL, combining gene trees computed using FastTree from all single-copy orthogroups.

Sfig1. Phylogeographic representation of the WA P. nodorum populations.
Sfig2. Tanglegram comparison of the predicted SNP phylogeny with the SSR-predicted tree from Phan et al. (2020).
Sfig3. Comparison of population cluster assignments between this study and those identified by Phan et al. (2020).
Sfig4. Number of isolates in each cluster vs location.
Sfig5. Number of isolates in each cluster vs year.
Sfig6. The first six principal components computed from bi-allelic SNPs vs location.
Sfig7. The first six principal components computed from bi-allelic SNPs vs year.
Sfig8. The first six principal components computed from bi-allelic SNPs vs year.
Sfig9. Multi-gene tree computed by ASTRAL from all single-copy orthogroups, compared with the tree predicted using SSR data presented by Phan et al. (2020).
Sfig10. Plots showing the relationship between RIP-like mutations and selection or effector predictions.
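The windowed RIP-like ratio described for Fig3 can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the input tuple layout (chromosome, 1-based position, reference and alternate alleles, flanking bases), and the choice to apply the 20-SNP threshold to all SNPs in a window are assumptions made for illustration only. A C↔T change is counted as RIP-like when the next base is A (CA↔TA), and a G↔A change when the previous base is T (TG↔TA), which is why roughly one in four transitions would qualify by chance if flanking bases were uniformly distributed.

```python
from collections import defaultdict

WINDOW = 50_000  # 50 kb non-overlapping windows, as in the Fig2-Fig4 captions
MIN_SNPS = 20    # windows with fewer SNPs are masked (plotted white in Fig3)


def is_transition(ref, alt):
    """True for C<->T or G<->A changes."""
    return {ref, alt} in ({"C", "T"}, {"G", "A"})


def is_rip_like(ref, alt, prev_base, next_base):
    """CA<->TA: a C<->T change followed by A. TG<->TA: a G<->A change preceded by T."""
    if {ref, alt} == {"C", "T"}:
        return next_base == "A"
    if {ref, alt} == {"G", "A"}:
        return prev_base == "T"
    return False


def rip_ratio_by_window(snps):
    """Hypothetical helper. snps yields (chrom, pos, ref, alt, prev_base, next_base).

    Returns {(chrom, window_start): ratio}, with None for masked windows.
    """
    counts = defaultdict(lambda: [0, 0, 0])  # total SNPs, transitions, RIP-like
    for chrom, pos, ref, alt, prev_base, next_base in snps:
        window_start = (pos - 1) // WINDOW * WINDOW
        tally = counts[(chrom, window_start)]
        tally[0] += 1
        if is_transition(ref, alt):
            tally[1] += 1
            if is_rip_like(ref, alt, prev_base, next_base):
                tally[2] += 1
    ratios = {}
    for key, (total, transitions, rip) in counts.items():
        # Assumption: the threshold applies to all SNPs in the window.
        if total < MIN_SNPS or transitions == 0:
            ratios[key] = None
        else:
            ratios[key] = rip / transitions
    return ratios
```

Masked windows are returned as None rather than 0 so that a plotting step can distinguish "too few SNPs" from "no RIP-like mutations".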

Created:

2024-01-31
