Data from: Degenerate adaptor sequences for detecting PCR duplicates in reduced representation sequencing data improve genotype calling accuracy
Source: DataCite Commons | Updated: 2025-06-01 | Indexed: 2025-06-15
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.34dt1
Description:
RAD-tag is a powerful tool for high-throughput genotyping. It relies on
PCR amplification of the starting material, following enzymatic digestion
and sequencing adaptor ligation. Amplification introduces duplicate reads
into the data, which arise from the same template molecule and are
statistically nonindependent, potentially introducing errors into genotype
calling. In shotgun sequencing data, duplicates are removed by filtering
reads starting at the same position in the alignment. However, restriction
enzymes target specific locations within the genome, causing reads to
start in the same place, and making it difficult to estimate the extent of
PCR duplication. Here, we introduce a slight change to the Illumina
sequencing adaptor chemistry, appending a unique four-base tag to the
first index read, which allows duplicate discrimination in aligned data.
This approach was validated on the Illumina MiSeq platform, using
double-digest libraries of ants (Wasmannia auropunctata) and yeast
(Saccharomyces cerevisiae) with known genotypes, producing modest though
statistically significant gains in the odds of calling a genotype
accurately. More importantly, removing duplicates also corrected for
strong sample-to-sample variability of genotype calling accuracy seen in
the ant samples. For libraries prepared from low-input degraded museum
bird samples (Mixornis gularis), which had low complexity, having been
generated from relatively few starting molecules, adaptor tags show that
virtually all of the genotypes were called with inflated confidence as a
result of PCR duplicates. Quantification of library complexity by adaptor
tagging does not significantly increase the difficulty of the overall
workflow or its cost, but corrects for differences in quality between
samples and permits analysis of low-input material.
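
The tag-based duplicate discrimination described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Read` record, the `locus` string, and both function names are hypothetical, and the sketch assumes the four-base tag has already been extracted from the index read and attached to each aligned read. Reads that share both an alignment start and a tag are treated as copies of one template molecule. With only 4^4 = 256 possible tags, distinct molecules at a deeply covered locus can share a tag by chance, so this rule may over-count duplicates there.

```python
from typing import List, NamedTuple, Tuple


class Read(NamedTuple):
    """Hypothetical minimal record for one aligned read."""
    name: str
    locus: str  # assumed label for alignment start, e.g. "chr1:100:+"
    tag: str    # four-base degenerate tag taken from the index read


def mark_duplicates(reads: List[Read]) -> Tuple[List[Read], List[Read]]:
    """Split reads into (unique, duplicates).

    The first read seen for each (locus, tag) pair is kept as unique;
    every later read with the same pair is counted as a PCR duplicate.
    """
    seen = set()
    unique, duplicates = [], []
    for read in reads:
        key = (read.locus, read.tag)
        if key in seen:
            duplicates.append(read)
        else:
            seen.add(key)
            unique.append(read)
    return unique, duplicates


def duplication_rate(reads: List[Read]) -> float:
    """Fraction of reads flagged as duplicates (0.0 for empty input)."""
    if not reads:
        return 0.0
    _, duplicates = mark_duplicates(reads)
    return len(duplicates) / len(reads)


example_reads = [
    Read("r1", "chr1:100:+", "ACGT"),
    Read("r2", "chr1:100:+", "ACGT"),  # same locus and tag as r1
    Read("r3", "chr1:100:+", "TTGA"),  # same locus, different tag
    Read("r4", "chr2:50:-", "ACGT"),   # same tag, different locus
]
unique_reads, duplicate_reads = mark_duplicates(example_reads)
```

In this toy input only `r2` is flagged. Position-only filtering, as used for shotgun data, would also have discarded `r3`, even though its tag indicates a different starting molecule. The duplication rate gives a rough per-sample indication of library complexity.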
Provider:
Dryad
Created:
2014-08-08



