Indexed reference databases for KMA and CCMetagen
Source: Research Data Australia (indexed 2024-12-14)
Download link:
https://researchdata.edu.au/indexed-reference-databases-kma-ccmetagen/1371207
Description:
This database was built to identify taxa in metagenome samples using the CCMetagen pipeline. Using the whole NCBI nt collection gives a complete taxonomic overview, including microbial eukaryotes that may be present in the dataset. The database is already indexed and ready to use with KMA and CCMetagen.
A manual describing how to use this dataset can be found at: https://github.com/vrmarcelino/CCMetagen
A tutorial covering the full analysis of a set of metatranscriptome samples can be found at: https://github.com/vrmarcelino/CCMetagen/tree/master/tutorial
The database was built as follows:
The partially non-redundant nucleotide database was downloaded from the NCBI website (ftp://ftp.ncbi.nih.gov/blast/db/FASTA/nt.gz) in January 2018. This database was formatted to include taxids in sequence headers.
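The record does not say how the taxids were added to the headers. As a rough sketch only, the function below prefixes each FASTA header with a taxid looked up from a two-column table. The `taxid|original header` layout, the function name `add_taxids`, and the tab-separated map file (accession, then taxid) are all assumptions made for illustration, not the authors' actual procedure.

```shell
# add_taxids MAP FASTA
# Prefix each FASTA header with "<taxid>|" (assumed layout, see note above).
# MAP is a hypothetical tab-separated file: accession<TAB>taxid.
add_taxids() {
    awk -F '\t' '
        NR == FNR { taxid[$1] = $2; next }   # first file: load accession -> taxid
        /^>/ {
            acc = substr($0, 2)              # header text without the leading ">"
            sub(/[ \t].*/, "", acc)          # keep only the first token (accession)
            if (acc in taxid)
                print ">" taxid[acc] "|" substr($0, 2)
            else
                print                        # no taxid known: leave header unchanged
            next
        }
        { print }                            # sequence lines pass through untouched
    ' "$1" "$2"
}
```

Called as `add_taxids map.tsv nt.fas > nt_taxid.fas`, this writes the reformatted FASTA to standard output. The accessions in the map must match the header tokens exactly, including any version suffix.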
Indexing was then performed with KMA using the command:
kma_index -i nt_taxid.fas -o ncbi_nt -NI -Sparse TG
Three indexed databases are provided:
NCBI nucleotide collection
RefSeq database of bacterial and fungal genomes
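Once a database has been downloaded and unpacked, a typical run maps the reads with KMA and then passes the `.res` output to CCMetagen. The sketch below follows the command pattern given in the CCMetagen manual linked above. The file names, the database prefix, and the thread count are placeholders, and the `run` helper is a hypothetical convenience that prints each command and executes it only if the tool is installed.

```shell
#!/bin/sh
# Placeholder paths: adjust to your own files and unpacked database location.
DB="${DB:-ncbi_nt/ncbi_nt}"          # database prefix, as given to kma_index -o
R1="${R1:-sample_R1.fastq.gz}"       # forward reads
R2="${R2:-sample_R2.fastq.gz}"       # reverse reads
OUT="${OUT:-sample_out}"             # output prefix
THREADS="${THREADS:-4}"

# Print the command; run it only when the executable is found on PATH.
run() {
    printf '+ %s\n' "$*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    fi
}

# Step 1: map paired-end reads against the indexed database with KMA.
run kma -ipe "$R1" "$R2" -o "${OUT}_kma" -t_db "$DB" -t "$THREADS" \
    -1t1 -mem_mode -and -apm f

# Step 2: assign taxonomy to the KMA results with CCMetagen.
run CCMetagen.py -i "${OUT}_kma.res" -o "${OUT}_ccmetagen"
```

The variables can be overridden from the environment, for example `DB=/data/ncbi_nt/ncbi_nt THREADS=16 sh run_ccmetagen.sh`, where the script name is again only an example. Check the manual for the options that suit your KMA and CCMetagen versions.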
Provider:
The University of Sydney



