Optimized SMRT-UMI protocol produces highly accurate sequence datasets from diverse populations – application to HIV-1 quasispecies
DataCite Commons | Updated 2026-03-05 | Indexed 2026-04-25
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.w3r2280w0
Description:
Pathogen diversity resulting in quasispecies can enable persistence and
adaptation to host defenses and therapies. However, accurate quasispecies
characterization can be impeded by errors introduced during sample
handling and sequencing, which can require extensive optimization to
overcome. We present complete laboratory and bioinformatics workflows to
overcome many of these hurdles. The Pacific Biosciences single molecule
real-time platform was used to sequence PCR amplicons derived from cDNA
templates tagged with universal molecular identifiers (SMRT-UMI).
Optimized laboratory protocols were developed through extensive testing of
different sample preparation conditions to minimize between-template
recombination during PCR. The use of UMIs allowed accurate template
quantitation as well as removal of point mutations introduced during PCR
and sequencing, producing a highly accurate consensus sequence from each
template. Handling of the large datasets produced from SMRT-UMI sequencing
was facilitated by a novel bioinformatic pipeline, Probabilistic Offspring
Resolver for Primer IDs (PORPIDpipeline), that automatically filters and
parses reads by sample, identifies and discards reads with UMIs likely
created from PCR and sequencing errors, generates consensus sequences,
checks for contamination within the dataset, and removes any sequence with
evidence of PCR recombination or early cycle PCR errors, resulting in
highly accurate sequence datasets. The optimized SMRT-UMI sequencing
method presented here represents a highly adaptable and established
starting point for accurate sequencing of diverse pathogens. These methods
are illustrated through characterization of human immunodeficiency virus
(HIV) quasispecies.
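
The idea of collapsing reads that share a UMI into one consensus sequence per template can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the PORPIDpipeline implementation: the function names, the `min_family_size` threshold, and the toy reads are all hypothetical, the reads are assumed to be already aligned to equal length, and a plain size cutoff stands in for the probabilistic filtering of erroneous UMIs described above.

```python
from collections import Counter, defaultdict


def group_by_umi(reads):
    """Group (umi, sequence) pairs into families keyed by UMI."""
    families = defaultdict(list)
    for umi, seq in reads:
        families[umi].append(seq)
    return dict(families)


def majority_consensus(seqs):
    """Column-wise majority vote over pre-aligned, equal-length sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be pre-aligned to equal length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def umi_consensus(reads, min_family_size=3):
    """Return one consensus per UMI family that has enough reads.

    Small families are dropped as a crude stand-in (an assumption of this
    sketch) for discarding UMIs likely created by PCR or sequencing errors.
    """
    return {
        umi: majority_consensus(seqs)
        for umi, seqs in group_by_umi(reads).items()
        if len(seqs) >= min_family_size
    }


# Toy input: (UMI, read) pairs, invented for illustration only.
reads = [
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACTTAC"),  # one read carries a point error at position 3
    ("CCTA", "ACGAAC"),
    ("CCTA", "ACGAAC"),
    ("CCTA", "ACGAAC"),
    ("AAGA", "ACGTAC"),  # singleton UMI, dropped by the size cutoff
]
consensus = umi_consensus(reads)
print(consensus)
```

In this toy case the point error in the `AAGT` family is outvoted, and the number of surviving families (two) serves as the template count.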
Provider:
Dryad
Created:
2023-12-07



