A fast machine-learning-guided primer design pipeline for selective whole genome amplification
DataCite Commons · Updated 2025-04-01 · Indexed 2025-04-09
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.3n5tb2rm2
Description:
Addressing many of the major outstanding questions in the fields of
microbial evolution and pathogenesis will require analyses of populations
of microbial genomes. Although population genomic studies provide the
analytical resolution to investigate evolutionary and mechanistic
processes at fine spatial and temporal scales – precisely the scales at
which these processes occur – microbial population genomic research is
currently hindered by the practicalities of obtaining sufficient
quantities of the relatively pure microbial genomic DNA necessary for
next-generation sequencing. Here we present swga2.0, an optimized and
parallelized pipeline to design selective whole genome amplification
(SWGA) primer sets. Unlike previous methods, swga2.0 incorporates active
learning and machine learning methods to evaluate the amplification
efficacy of individual primers and primer sets. Additionally, swga2.0
optimizes primer set search and evaluation strategies, including
parallelization at each
stage of the pipeline, to dramatically decrease program runtime from weeks
to minutes. Here we describe the swga2.0 pipeline, including the empirical
data used to identify primer and primer set characteristics that improve
amplification performance. Additionally, we evaluated the novel swga2.0
pipeline by designing primer sets that successfully amplify Prevotella
melaninogenica, an important component of the lung microbiome in cystic
fibrosis patients, from samples dominated by human DNA.
Provider:
Dryad
Created:
2022-08-29



