Investigation of machine learning algorithms for taxonomic classification of marine metagenomes
Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://zenodo.org/record/7429931
Description:
Training, testing, and blind datasets used for machine learning algorithms for taxonomic classification of marine metagenomes:
K12.kmers.txt - 12 bp k-mer vocabulary constructed by Jellyfish v1.1.11 from 47,894 genomes in GTDB release 202
MarRef_1.6.tsv - Metadata file downloaded from MarRef v1.6
MarRef.genustrain.fasta - Training set from MarRef v1.6 (seed=808) used for genus classification
MarRef.genustest.fasta - Testing set from MarRef v1.6 (seed=747) used for genus classification
MarRef.speciestrain.fasta - Training set from MarRef v1.6 (seed=808) used for species classification
MarRef.speciestest.fasta - Testing set from MarRef v1.6 (seed=747) used for species classification
MarRef.traintest.key.tsv - Table containing MarRef accession, GenBank accession, GenBank taxonomy ID, taxonomic information, and labels used for species and genus testing and training
anonymous_reads_*.fq - Blind datasets (1-10) in interleaved fastq format
reads_mapping_*.tsv - Key for blind datasets 1-10. Each sequence header is mapped to its corresponding MarRef accession and NCBI taxonomic ID.
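
The blind datasets are described as interleaved fastq files, and the vocabulary as 12 bp k-mers. A minimal Python sketch of how such files might be read is given here; it is not the authors' pipeline. It assumes standard four-line FASTQ records with the two mates of a pair stored consecutively, and it counts plain (non-canonical) k-mers. The function names `read_fastq`, `read_pairs`, and `kmer_counts` are hypothetical, and the `vocabulary` argument stands in for whatever set of k-mers is loaded from K12.kmers.txt, whose exact layout is not stated in this record.

```python
from collections import Counter
from itertools import islice
from typing import Iterable, Iterator, Optional, Set, Tuple

Record = Tuple[str, str, str]  # (header without '@', sequence, quality)


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield records from 4-line FASTQ text; blank lines are skipped."""
    it = (ln.rstrip("\n") for ln in lines if ln.strip())
    while True:
        chunk = list(islice(it, 4))
        if not chunk:
            return
        if (len(chunk) < 4 or not chunk[0].startswith("@")
                or not chunk[2].startswith("+")):
            raise ValueError("malformed FASTQ record")
        yield chunk[0][1:], chunk[1], chunk[3]


def read_pairs(lines: Iterable[str]) -> Iterator[Tuple[Record, Record]]:
    """Yield (mate1, mate2), assuming mates are stored consecutively."""
    records = read_fastq(lines)
    for first in records:
        second = next(records, None)
        if second is None:
            raise ValueError("odd number of records in interleaved file")
        yield first, second


def kmer_counts(seq: str, k: int = 12,
                vocabulary: Optional[Set[str]] = None) -> Counter:
    """Count overlapping k-mers; keep only vocabulary members if given."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    if vocabulary is not None:
        counts = Counter({km: n for km, n in counts.items()
                          if km in vocabulary})
    return counts
```

With a real file, `read_pairs(open(path))` would iterate over read pairs, where `path` points at one of the anonymous_reads_*.fq files. Headers could then be looked up in the matching reads_mapping_*.tsv, although the column layout of that table would need to be checked first.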
Created:
2022-12-15



