Datasets and tools: Genotypes, Tannin Capacity, and Seasonality Influence the Structure and Function of Symptomless Fungal Communities in Aspen Leaves, Regardless of Historical Nitrogen Addition
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/10839668
Description:
The repository associated with the following study:
Genotype, Tannin Capacity, and Seasonality Influence the Structure and Function of Symptomless Fungal Communities in Aspen Leaves, Regardless of Historical Nitrogen Addition
Abu Bakar Siddique1, Abu Bakar Siddique2,3, Benedicte Riber Albrectsen2*, Lovely Mahawar2*
1. Department of Plant Biology, Swedish University of Agricultural Sciences, 75007, Uppsala, Sweden.
2. Umeå Plant Science Centre (UPSC), Department of Plant Physiology, Umeå University, 90187 Umeå, Sweden.
3. Tasmanian Institute of Agriculture (TIA), University of Tasmania, Prospect 7250, Tasmania, Australia.
*Correspondence: benedicte.albrectsen@umu.se & lovely.mahawar@umu.se
Data guidance: A reproducible, Nextflow-based nf-core/ampliseq pipeline was used to analyze the raw sequencing data, followed by guild analysis and R analysis. A full summary report of the bioinformatic analysis (step-by-step methods and description) is provided as an HTML file named summary_report.html. The bioinformatic results and the entire R analysis are in sub-folders of the zip archive bioinformatic_and_ranalysis_submission.zip (extract the archive after downloading). The guild analysis is in the 'guild' subfolder of the 'r_analysis' folder inside the archive. The R and statistical analyses are presented in a Quarto document; see r_analysis_script_full_run_final.qmd. For the downsampled (rarefied) bioinformatic and R analyses, see the 'rarefy' subfolder.
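As a convenience, the extraction step described above can be scripted. The following is a minimal sketch using only the Python standard library; the destination folder name "submission" and the helper names are illustrative assumptions, not part of the repository.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract a zip archive into dest_dir and return the sorted member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())


def find_paths(root, name):
    """Return every file or folder called `name` anywhere below root."""
    return sorted(Path(root).rglob(name))


if __name__ == "__main__":
    # File name taken from the data guidance; "submission" is an assumed target folder.
    archive = Path("bioinformatic_and_ranalysis_submission.zip")
    if archive.exists():
        extract_archive(archive, "submission")
        for folder in ("guild", "rarefy"):
            print(folder, [str(p) for p in find_paths("submission", folder)])
```

Searching by folder name avoids hard-coding the exact nesting inside the archive, which is only partly described here.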
Bioinformatics: Data were processed using nf-core/ampliseq version 2.11.0dev, revision ce811bec9b (doi: 10.5281/zenodo.1493841) (Straub et al., 2020) of the nf-core collection of workflows (Ewels et al., 2020), utilising reproducible software environments from the Bioconda (Grüning et al., 2018) and Biocontainers (da Veiga Leprevost et al., 2017) projects.
In brief, raw Illumina data (MiSeq v3, 2 × 300 bp paired-end reads) were demultiplexed by SciLifeLab and delivered as sample-specific FASTQ files (submitted to SRA: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1090416), which were individually quality checked with FastQC (Andrews, 2010).
Cutadapt (Martin, 2011) trimmed primers, and all untrimmed sequences were discarded. Sequences that did not contain primer sequences were considered artifacts. Less than 100% of the sequences were discarded per sample and a mean of 96.4% of the sequences per sample passed the filtering. Adapter- and primer-free sequences were processed as one pool (pooled) with DADA2 (Callahan et al., 2016) to eliminate PhiX contamination, trim reads (forward reads at 223 bp and reverse reads at 162 bp; reads shorter than this were discarded), discard reads with > 2 expected errors, correct errors, merge read pairs, and remove polymerase chain reaction (PCR) chimeras; ultimately, 2199 amplicon sequencing variants (ASVs) were obtained across all samples. Between 55.56% and 100% of reads per sample (average 82.3%) were retained. The ASV count table contained 32632582 counts in total, at least 1 and at most 964860 per sample (average 87020).
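The truncation and expected-error criteria above can be illustrated with a short sketch. The expected number of errors in a read is the sum of its per-base error probabilities, EE = sum(10^(-Q/10)) over the Phred scores Q. The code below is a plain-Python illustration of that rule, not the DADA2 implementation; the function names are hypothetical, and it assumes the expected-error check is applied to the read after truncation.

```python
def expected_errors(quals):
    """Sum of per-base error probabilities for a list of Phred scores."""
    return sum(10 ** (-q / 10) for q in quals)


def passes_filter(quals, trunc_len, max_ee=2.0):
    """Truncate to trunc_len, drop shorter reads, then apply the expected-error cap."""
    if len(quals) < trunc_len:
        return False  # reads shorter than the truncation length are discarded
    return expected_errors(quals[:trunc_len]) <= max_ee


# Truncation lengths reported above: 223 bp (forward) and 162 bp (reverse).
forward_ok = passes_filter([30] * 250, trunc_len=223)  # uniform Q30 read
reverse_ok = passes_filter([20] * 250, trunc_len=162)  # uniform Q20 read
print(forward_ok, reverse_ok)
```

With uniform Q20 bases, a read truncated to 162 bp carries about 1.62 expected errors and is kept, whereas the same quality at 223 bp gives about 2.23 and is discarded. This shows why truncation length and the error cap interact.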
VSEARCH (Rognes et al., 2016) clustered the 2199 ASVs into 770 centroids at a pairwise identity of 0.97. Barrnap (Seemann, 2013) filtered ASVs for bac, arc, mito, euk (bac: Bacteria, arc: Archaea, mito: Mitochondria, euk: Eukaryotes); 5 ASVs, accounting for less than 0.02% of counts per sample, were removed (765 ASVs passed).
Taxonomic classification was performed by DADA2 and the database ‘UNITE general FASTA release for Fungi - Version 9.0’ (Abarenkov, Kessy; Zirk, Allan; Piirmann, Timo; Pöhönen, Raivo; Ivanov, Filipp; Nilsson, R. Henrik; Kõljalg, Urmas (2023): UNITE general FASTA release for Fungi. Version 18.07.2023. UNITE Community. https://doi.org/10.15156/BIO/2938067).
ASV sequences, abundances, and DADA2 taxonomic assignments were loaded into QIIME2 (Bolyen et al., 2019). Of the 765 ASVs, 160 were removed because their taxonomic string contained any of (mitochondria, chloroplast, archaea, bacteria), they had fewer than 5 total read counts over all samples (Brown et al., 2015), or they were present in fewer than 2 samples (605 ASVs passed). Within QIIME2, the final microbial community data were visualized in a barplot.
The bioinformatic code is saved in the GitHub repository, which contains step-by-step descriptions of the bioinformatic setup on an HPC (computer cluster) and of the execution of the nf-core/ampliseq pipeline.
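The three removal criteria can be expressed compactly. The sketch below applies them to a toy table in plain Python. It is an illustration of the stated thresholds, not the QIIME2 code path; the ASV identifiers, taxonomy strings, and counts are invented for the example, and case-insensitive substring matching is an assumption.

```python
EXCLUDE_TERMS = ("mitochondria", "chloroplast", "archaea", "bacteria")
MIN_TOTAL_COUNT = 5  # minimum total reads over all samples
MIN_SAMPLES = 2      # minimum number of samples with at least one read


def keep_asv(taxonomy, counts):
    """Return True if an ASV survives all three filters."""
    tax = taxonomy.lower()  # assumption: matching ignores case
    if any(term in tax for term in EXCLUDE_TERMS):
        return False
    if sum(counts) < MIN_TOTAL_COUNT:
        return False
    if sum(1 for c in counts if c > 0) < MIN_SAMPLES:
        return False
    return True


def filter_table(table):
    """Keep only the ASVs that pass; table maps ASV id -> (taxonomy, per-sample counts)."""
    return {asv: row for asv, row in table.items() if keep_asv(*row)}


# Hypothetical toy data: four ASVs counted in three samples.
toy = {
    "ASV_a": ("k__Fungi;p__Ascomycota", [10, 4, 0]),
    "ASV_b": ("k__Bacteria;p__Proteobacteria", [50, 40, 30]),
    "ASV_c": ("k__Fungi;p__Basidiomycota", [2, 1, 1]),
    "ASV_d": ("k__Fungi;p__Ascomycota", [9, 0, 0]),
}
print(sorted(filter_table(toy)))
```

In this toy table only ASV_a passes: ASV_b is excluded by taxonomy, ASV_c has fewer than 5 reads in total, and ASV_d occurs in a single sample.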
Tools or software versions:
ASSIGNSH: python: 3.9.1, pandas: 1.1.5
BARRNAP: barrnap: 0.9
BARRNAPSUMMARY: python: 3.9.1
COMBINE_TABLE_DADA2: R: 4.0.3
CUTADAPT_BASIC: cutadapt: 4.6
CUTADAPT_SUMMARY_STD: python: 3.8.3
DADA2_DENOISING: R: 4.3.2, dada2: 1.30.0
DADA2_ERR: R: 4.3.2, dada2: 1.30.0
DADA2_FILTNTRIM: R: 4.3.2, dada2: 1.30.0
DADA2_MERGE: R: 4.1.1, dada2: 1.22.0
DADA2_RMCHIMERA: R: 4.3.2, dada2: 1.30.0
DADA2_STATS: R: 4.3.2, dada2: 1.30.0
DADA2_TAXONOMY: R: 4.3.2, dada2: 1.30.0
FILTER_CLUSTERS: python: 3.9.1, pandas: 1.1.5
FILTER_SSU: R: 4.0.3, Biostrings: 2.58.0
FILTER_STATS: python: 3.9.1, pandas: 1.1.5
FORMAT_TAXONOMY: bash: 5.0.16
FORMAT_TAXRESULTS_STD: python: 3.9.1, pandas: 1.1.5
ITSX_CUTASV: ITSx: 1.1.3
MERGE_STATS_FILTERSSU: R: 4.3.2
MERGE_STATS_FILTERTAXA: R: 4.3.2
MERGE_STATS_STD: R: 4.3.2
PHYLOSEQ: R: 4.3.2, phyloseq: 1.46.0
QIIME2_BARPLOT: qiime2: 2023.7.0
QIIME2_EXPORT_ABSOLUTE: qiime2: 2023.7.0
QIIME2_EXPORT_RELASV: qiime2: 2023.7.0
QIIME2_EXPORT_RELTAX: qiime2: 2023.7.0
QIIME2_INASV: qiime2: 2023.7.0
QIIME2_INSEQ: qiime2: 2023.7.0
QIIME2_SEQFILTERTABLE: qiime2: 2023.7.0
QIIME2_TABLEFILTERTAXA: qiime2: 2023.7.0
RENAME_RAW_DATA_FILES: sed: 4.7
VSEARCH_CLUSTER: vsearch: 2.21.1
VSEARCH_USEARCHGLOBAL: vsearch: 2.21.1
Workflow: nf-core/ampliseq: v2.11.0dev-g6549c5b, Nextflow: 24.04.4
List of references (Tools):
Pipeline
nf-core/ampliseq
Straub D, Blackwell N, Langarica-Fuentes A, Peltzer A, Nahnsen S, Kleindienst S. Interpretations of Environmental Microbial Community Studies Are Biased by the Selected 16S rRNA (Gene) Amplicon Sequencing Pipeline. Front Microbiol. 2020 Oct 23;11:550420. doi: 10.3389/fmicb.2020.550420. PMID: 33193131; PMCID: PMC7645116.
nf-core
Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031.
Nextflow
Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311.
Pipeline tools
Core tools
FastQC
Andrews S. FastQC: a quality control tool for high throughput sequence data. 2010. https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
Cutadapt
Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal. 2011;17(1):10-12. doi: 10.14806/ej.17.1.200.
Barrnap
Seemann T. barrnap 0.9 : rapid ribosomal RNA prediction.
DADA2
Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJ, Holmes SP. DADA2: High-resolution sample inference from Illumina amplicon data. Nat Methods. 2016 Jul;13(7):581-3. doi: 10.1038/nmeth.3869. Epub 2016 May 23. PMID: 27214047; PMCID: PMC4927377.
Taxonomic classification and database (only one database)
Classification by QIIME2 classifier
Bokulich NA, Kaehler BD, Rideout JR, Dillon M, Bolyen E, Knight R, Huttley GA, Gregory Caporaso J. Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2's q2-feature-classifier plugin. Microbiome. 2018 May 17;6(1):90. doi: 10.1186/s40168-018-0470-z. PMID: 29773078; PMCID: PMC5956843.
UNITE - eukaryotic nuclear ribosomal ITS region
Kõljalg U, Larsson KH, Abarenkov K, Nilsson RH, Alexander IJ, Eberhardt U, Erland S, Høiland K, Kjøller R, Larsson E, Pennanen T, Sen R, Taylor AF, Tedersoo L, Vrålstad T, Ursing BM. UNITE: a database providing web-based methods for the molecular identification of ectomycorrhizal fungi. New Phytol. 2005 Jun;166(3):1063-8. doi: 10.1111/j.1469-8137.2005.01376.x. PMID: 15869663.
Abarenkov, K; Zirk, Al; Piirmann, T; Pöhönen, R; Ivanov, F; Nilsson, RH; Kõljalg, U (2023): UNITE general FASTA release for Fungi. Version 18.07.2023. UNITE Community. https://doi.org/10.15156/BIO/2938067
Downstream analysis
QIIME2
Bolyen E, Rideout JR, Dillon MR, Bokulich NA, Abnet CC, Al-Ghalith GA, Alexander H, Alm EJ, Arumugam M, Asnicar F, Bai Y, Bisanz JE, Bittinger K, Brejnrod A, Brislawn CJ, Brown CT, Callahan BJ, Caraballo-Rodríguez AM, Chase J, Cope EK, Da Silva R, Diener C, Dorrestein PC, Douglas GM, Durall DM, Duvallet C, Edwardson CF, Ernst M, Estaki M, Fouquier J, Gauglitz JM, Gibbons SM, Gibson DL, Gonzalez A, Gorlick K, Guo J, Hillmann B, Holmes S, Holste H, Huttenhower C, Huttley GA, Janssen S, Jarmusch AK, Jiang L, Kaehler BD, Kang KB, Keefe CR, Keim P, Kelley ST, Knights D, Koester I, Kosciolek T, Kreps J, Langille MGI, Lee J, Ley R, Liu YX, Loftfield E, Lozupone C, Maher M, Marotz C, Martin BD, McDonald D, McIver LJ, Melnik AV, Metcalf JL, Morgan SC, Morton JT, Naimey AT, Navas-Molina JA, Nothias LF, Orchanian SB, Pearson T, Peoples SL, Petras D, Preuss ML, Pruesse E, Rasmussen LB, Rivers A, Robeson MS 2nd, Rosenthal P, Segata N, Shaffer M, Shiffer A, Sinha R, Song SJ, Spear JR, Swafford AD, Thompson LR, Torres PJ, Trinh P, Tripathi A, Turnbaugh PJ, Ul-Hasan S, van der Hooft JJJ, Vargas F, Vázquez-Baeza Y, Vogtmann E, von Hippel M, Walters W, Wan Y, Wang M, Warren J, Weber KC, Williamson CHD, Willis AD, Xu ZZ, Zaneveld JR, Zhang Y, Zhu Q, Knight R, Caporaso JG. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nat Biotechnol. 2019 Aug;37(8):852-857. doi: 10.1038/s41587-019-0209-9. Erratum in: Nat Biotechnol. 2019 Sep;37(9):1091. PMID: 31341288; PMCID: PMC7015180.
Adonis and VEGAN
Marti J Anderson. A new method for non-parametric multivariate analysis of variance. Austral ecology, 26(1):32–46, 2001.
Jari Oksanen, F. Guillaume Blanchet, Michael Friendly, Roeland Kindt, Pierre Legendre, Dan McGlinn, Peter R. Minchin, R. B. O’Hara, Gavin L. Simpson, Peter Solymos, M. Henry H. Stevens, Eduard Szoecs, and Helene Wagner. vegan: Community Ecology Package. 2018. R package version 2.5-3.
Non-default tools
ITSx
Bengtsson-Palme, J., Ryberg, M., Hartmann, M., Branco, S., Wang, Z., Godhe, A., De Wit, P., Sánchez-García, M., Ebersberger, I., de Sousa, F., Amend, A., Jumpponen, A., Unterseher, M., Kristiansson, E., Abarenkov, K., Bertrand, Y.J.K., Sanli, K., Eriksson, K.M., Vik, U., Veldre, V. and Nilsson, R.H.. Improved software detection and extraction of ITS1 and ITS2 from ribosomal ITS sequences of fungi and other eukaryotes for analysis of environmental sequencing data. Methods Ecol Evol 2013, 4: 914-919. doi: 10.1111/2041-210X.12073.
Summarizing software
MultiQC
Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.
Singularity
Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.
Created:
2025-01-16



