Biodiversity Soup II: A bulk‐sample metabarcoding pipeline emphasizing error reduction
Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2025-06-15.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.ncjsxksrc
Description:
1. Despite widespread recognition of its great promise to aid
decision-making in environmental management, the applied use of
metabarcoding requires improvements to reduce the multiple errors that
arise during PCR amplification, sequencing, and library generation. We
present a co-designed wet-lab and bioinformatic workflow for metabarcoding
bulk samples that removes both false-positive (tag jumps, chimeras,
erroneous sequences) and false-negative (‘dropout’) errors. However, we
find that it is not possible to recover relative-abundance information
from amplicon data, due to persistent species-specific biases.
2. To present and validate our workflow, we created eight mock
arthropod soups, all containing the same 248 arthropod morphospecies but
differing in absolute and relative DNA
concentrations, and we ran them under five different PCR
conditions. Our pipeline includes qPCR-optimized PCR annealing temperature
and cycle number, twin-tagging, multiple independent PCR replicates per
sample, and negative and positive controls. In the bioinformatic portion,
we introduce Begum, which is a new version
of DAMe (Zepeda-Mendoza et al.
2016. BMC Res. Notes 9:255) that ignores
heterogeneity spacers, allows primer mismatches when demultiplexing
samples, and is more efficient.
Like DAMe, Begum removes tag-jumped reads and
removes sequence errors by keeping only sequences that appear in more than
one PCR above a minimum copy number per PCR. The filtering thresholds are
user-configurable.
3. We report that OTU
dropout frequency and taxonomic amplification bias are both reduced by
using a PCR annealing temperature and cycle number on the low ends of the
ranges currently used for the Leray-FolDegenRev primers.
We also report that tag jumps and erroneous sequences can be nearly
eliminated with Begum filtering, at the cost of only a
small rise in dropouts. We replicate published findings that uneven size
distribution of input biomasses leads to greater dropout frequency and
that OTU size is a poor predictor of species input biomass. Finally, we
find no evidence for ‘tag-biased’ PCR amplification.
4. To aid
learning, reproducibility, and the design and testing of alternative
metabarcoding pipelines, we provide our Illumina and input-species
sequence datasets, scripts, a spreadsheet for designing primer tags, and a
tutorial.
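
The replicate-based filter described in point 2 (keep only sequences that
appear in more than one PCR above a minimum copy number per PCR) can be
illustrated with a minimal Python sketch. This is not Begum's code or its
command-line interface: the function name filter_by_replicates, the
parameters min_pcrs and min_copies, the input layout, and the toy read
counts are all assumptions made for illustration, and "above a minimum copy
number" is read here as "at least min_copies reads".

```python
from collections import Counter


def filter_by_replicates(counts, min_pcrs=2, min_copies=2):
    """Keep sequences supported by enough independent PCR replicates.

    counts maps (sample, pcr_replicate) -> {sequence: read_count}.
    A sequence is kept for a sample if it reaches at least min_copies reads
    in at least min_pcrs of that sample's PCRs. Both thresholds are
    hypothetical names, not Begum options.
    Returns {sample: {sequence: reads summed over the passing PCRs}}.
    """
    hits = {}   # sample -> Counter: number of PCRs in which a sequence passes
    reads = {}  # sample -> Counter: reads summed over those passing PCRs
    for (sample, _pcr), seq_counts in counts.items():
        sample_hits = hits.setdefault(sample, Counter())
        sample_reads = reads.setdefault(sample, Counter())
        for seq, n in seq_counts.items():
            if n >= min_copies:
                sample_hits[seq] += 1
                sample_reads[seq] += n
    return {
        sample: {
            seq: reads[sample][seq]
            for seq, n_pcrs in sample_hits.items()
            if n_pcrs >= min_pcrs
        }
        for sample, sample_hits in hits.items()
    }


# Toy data (invented): one soup, three PCR replicates.
counts = {
    ("soupA", 1): {"ACGT": 120, "TTGA": 1, "GGCC": 9},
    ("soupA", 2): {"ACGT": 95, "GGCC": 1},
    ("soupA", 3): {"ACGT": 110, "TTGA": 3},
}
kept = filter_by_replicates(counts, min_pcrs=2, min_copies=2)
```

In this toy input only ACGT passes in two or more PCRs, so TTGA and GGCC are
discarded as likely errors. Raising either threshold makes the filter
stricter, which mirrors the trade-off reported in point 3: fewer erroneous
sequences at the cost of some additional dropouts.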
提供机构:
Dryad
创建时间:
2021-03-17



