Data from: RADcap: sequence capture of dual-digest RADseq libraries with identifiable duplicates and reduced missing data

Source: DataONE
Updated: 2016-07-15
Indexed: 2024-06-26
Download link:
https://search.dataone.org/view/null
Description:
Molecular ecologists seek to genotype hundreds to thousands of loci from hundreds to thousands of individuals at minimal cost per sample. Current methods, such as restriction site associated DNA sequencing (RADseq) and sequence capture, are constrained by costs associated with inefficient use of sequencing data and sample preparation. Here, we introduce RADcap, an approach that combines the major benefits of RADseq (low cost with specific start positions) with those of sequence capture (repeatable sequencing of specific loci) to significantly increase efficiency and reduce costs relative to current approaches. RADcap uses a new version of dual-digest RADseq (3RAD) to identify candidate SNP loci for capture bait design, and subsequently uses custom sequence capture baits to consistently enrich candidate SNP loci across many individuals. We combined this approach with a new library preparation method for identifying and removing PCR duplicates from 3RAD libraries, which allows researchers to process RADseq data using traditional pipelines, and we tested the RADcap method by genotyping sets of 96 to 384 Wisteria plants. Our results demonstrate that our RADcap method: (1) methodologically reduces (to <5%) and allows computational removal of PCR duplicate reads from data; (2) achieves 80-90% reads-on-target in 11 of 12 enrichments; (3) returns consistent coverage (≥4x) across >90% of individuals at up to 99.8% of the targeted loci; (4) produces consistently high occupancy matrices of genotypes across hundreds of individuals; and (5) costs significantly less than current approaches.
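
Result (3) above combines two thresholds (a minimum depth per individual and a minimum share of individuals per locus), which can be easier to read as a calculation. The following is a minimal sketch of how such a summary could be computed, not the authors' pipeline: the function name `fraction_loci_passing`, its parameters, and the toy depth values are all hypothetical and chosen only for illustration.

```python
def fraction_loci_passing(depth, min_depth=4, min_individual_fraction=0.9):
    """Return the share of loci with adequate coverage in enough individuals.

    depth: list of loci; each locus is a non-empty list holding one read
           depth per individual (hypothetical input layout).
    A locus passes when MORE than `min_individual_fraction` of individuals
    have depth >= `min_depth`.
    """
    if not depth:
        return 0.0
    passing = 0
    for per_individual in depth:
        # Count individuals that reach the minimum depth at this locus.
        covered = sum(1 for d in per_individual if d >= min_depth)
        if covered / len(per_individual) > min_individual_fraction:
            passing += 1
    return passing / len(depth)


# Made-up toy data: 3 loci x 20 individuals (not values from the dataset).
toy_depth = [
    [8] * 20,             # 20/20 individuals at >=4x: passes
    [4] * 19 + [0],       # 19/20 = 95% at >=4x: passes
    [12] * 10 + [2] * 10, # 10/20 = 50% at >=4x: fails
]

result = fraction_loci_passing(toy_depth)
print(f"{result:.3f}")
```

With real data, the per-locus depths would come from whatever depth-reporting step the analysis pipeline already uses; the thresholds are exposed as parameters so that other cutoffs can be tried.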

Created: 2016-07-15