DADA2 formatted taxonomy from GTDBr95
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://zenodo.org/record/4392221
Description:
DADA2 requires taxonomy files in a specific format. This dataset provides the files required to assign taxonomy using the Genome Taxonomy Database (GTDB).
GTDB release 95.0, released on July 17th, 2020, is the latest version of the database. GTDB-r95 contains 30,238 bacterial and 1,672 archaeal species clusters, which span 194,600 genomes. FASTA files of 16S rRNA gene sequences identified within the representative genomes of Bacteria (21,965) and Archaea (1,126) were downloaded from this link on 24-12-2020. The link provides resources for GTDB species representatives only, which limits the data to one sequence per organism. The sequence headers were modified to meet DADA2 requirements using regular-expression-based find and replace in Notepad++ (I was too lazy to do the same through awk/sed).
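For readers who would rather script the header rewrite than use an editor, the following is a minimal Python sketch of the same idea. It is not the procedure used to build these files, which were prepared in Notepad++. It assumes the downloaded headers look like ">ID d__...;p__...;c__...;o__...;f__...;g__...;s__Genus species [key=value ...]"; that layout, the function name to_dada2_headers, and the accession in the example are illustrative assumptions. It also assumes the rank prefixes (d__, p__, and so on) should be stripped, which the description above does not state. The target formats follow DADA2 conventions: semicolon-separated ranks ending in ";" for assignTaxonomy, and ">ID Genus species" for addSpecies.

```python
import re

# Assumed input layout (not confirmed by the dataset description):
#   >ID d__Domain;p__Phylum;...;g__Genus;s__Genus species [key=value] ...
HEADER_RE = re.compile(r"^>(\S+)\s+([^\[]+?)\s*(?:\[.*)?$")


def to_dada2_headers(header):
    """Return (genus_level_header, species_level_header) for one FASTA header."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        raise ValueError(f"unrecognised header: {header!r}")
    seq_id, lineage = match.groups()

    # Drop the single-letter rank prefixes (d__, p__, c__, ...) from each level.
    ranks = [re.sub(r"^[a-z]__", "", level.strip()) for level in lineage.split(";")]
    if len(ranks) != 7:
        raise ValueError(f"expected 7 ranks, found {len(ranks)}: {header!r}")

    # assignTaxonomy-style header: domain through genus, each followed by ';'.
    genus_header = ">" + ";".join(ranks[:6]) + ";"
    # addSpecies-style header: sequence ID followed by the binomial name.
    species_header = f">{seq_id} {ranks[6]}"
    return genus_header, species_header


# Hypothetical header, made up for illustration only.
example = (
    ">RS_GCF_000000001.1 d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;"
    "o__Enterobacterales;f__Enterobacteriaceae;g__Escherichia;"
    "s__Escherichia coli [location=1..1500]"
)
genus_line, species_line = to_dada2_headers(example)
print(genus_line)
print(species_line)
```

Applying the function to every line that starts with ">" and copying sequence lines unchanged would yield one genus-level file and one species-level file from the same input.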
The files GTDBr95-Genus.fna and GTDBr95-Species.fna are to be used with the assignTaxonomy and addSpecies functions of DADA2, respectively.
Prepared files were checked and found compatible when run on DADA2 v1.14.0 (R v3.6.3).
Created:
2020-12-24



