Additional file 3 of Concatenation of paired-end reads improves taxonomic classification of amplicons for profiling microbial communities

Name: Additional file 3 of Concatenation of paired-end reads improves taxonomic classification of amplicons for profiling microbial communities
Creator: figshare
Published: 2021-10-13 03:48:26
License: No description available

DataCite Commons · Updated 2021-10-13 · Indexed 2024-07-28

Download link:

https://springernature.figshare.com/articles/dataset/Additional_file_3_of_Concatenation_of_paired-end_reads_improves_taxonomic_classification_of_amplicons_for_profiling_microbial_communities/16800212

Resource description:

Additional file 3.

Table S1. Mock community sequence information and trimmed length positions. Primer information includes the name of each primer in the literature and the associated sequence, read lengths of paired reads, length of the reads after length-trimming by median Q20 score, SRA and BioProject accessions, and citation.

Table S2. Bacteria present within each mock community.

Table S3. DADA2 statistics for each pipeline per mock community. Input is the number of reads used after the trimming and merging/concatenating steps; Filtered is the number of sequences remaining after DADA2 filtering based on maxee=2; Denoised is the number of sequences remaining after DADA2 denoising; Merged and % of input merged apply only to pipelines with merging by DADA2 (NMd, LMd, QMd, and QdMd); Non-chimeric output sequences were passed into the taxonomic classifier; % output is the percentage of input sequences left for taxonomic classification.

Table S4. Genera that were originally classified as false positives (FP) but whose names are synonyms of taxa in the mock communities, and that were therefore renamed to match those taxa (the modified names now count as true positives, TPs).

Table S5. True positive (TP), false positive (FP), and false negative (FN) genera that are shared and unique between the GG and SILVA reference databases. Mocks are grouped per study.

Table S6. TP, FP, and FN counts for each pipeline per mock, along with the calculated precision, recall, and F-measure. The SILVA reference database and an ASV abundance threshold of 0.01% were used to filter out low-abundance ASVs before these calculations were performed. Pipelines are arranged by descending F-measure score per mock.

Table S7. True positive (TP), false positive (FP), and false negative (FN) bacterial genera found in each mock per pipeline with the SILVA database.
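
The metrics reported in Table S6 can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that the 0.01% threshold is applied to each ASV's relative abundance within a sample and that the F-measure is the balanced F1 score (harmonic mean of precision and recall); the description above specifies neither. The function names and the genus names in the usage example are hypothetical.

```python
def filter_low_abundance(asv_counts, threshold=0.0001):
    """Keep ASVs whose relative abundance is at least `threshold`.

    Assumption: 0.01% (0.0001) is a fraction of the total read count
    in the sample; the source does not state the denominator.
    """
    total = sum(asv_counts.values())
    if total == 0:
        return {}
    return {asv: n for asv, n in asv_counts.items() if n / total >= threshold}


def genus_metrics(expected_genera, observed_genera):
    """Compute TP, FP, FN, precision, recall, and F-measure at genus level.

    Assumption: F-measure is the balanced F1 score.
    """
    expected = set(expected_genera)
    observed = set(observed_genera)

    tp = len(expected & observed)   # genera in the mock and detected
    fp = len(observed - expected)   # detected but not in the mock
    fn = len(expected - observed)   # in the mock but not detected

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0

    return {
        "TP": tp,
        "FP": fp,
        "FN": fn,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Usage with made-up values (not taken from the supplementary tables).
counts = {"asv_a": 99995, "asv_b": 5}
kept = filter_low_abundance(counts)          # asv_b falls below 0.01%

mock = {"Bacillus", "Escherichia", "Listeria", "Pseudomonas"}
found = {"Bacillus", "Escherichia", "Listeria", "Shigella"}
metrics = genus_metrics(mock, found)
```

In this toy example three of the four mock genera are recovered and one unexpected genus is reported, so TP = 3, FP = 1, FN = 1, and precision, recall, and F-measure all equal 0.75.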

Provider:

figshare

Created:

2021-10-13