RADcap: sequence capture of dual-digest RADseq libraries with identifiable duplicates and reduced missing data

Source: NIAID Data Ecosystem (indexed 2026-03-09)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.ss6c9
Description:
Molecular ecologists seek to genotype hundreds to thousands of loci from hundreds to thousands of individuals at minimal cost per sample. Current methods, such as restriction site associated DNA sequencing (RADseq) and sequence capture, are constrained by costs associated with inefficient use of sequencing data and sample preparation. Here, we introduce RADcap, an approach that combines the major benefits of RADseq (low cost with specific start positions) with those of sequence capture (repeatable sequencing of specific loci) to significantly increase efficiency and reduce costs relative to current approaches. RADcap uses a new version of dual-digest RADseq (3RAD) to identify candidate SNP loci for capture bait design, and subsequently uses custom sequence capture baits to consistently enrich candidate SNP loci across many individuals. We combined this approach with a new library preparation method for identifying and removing PCR duplicates from 3RAD libraries, which allows researchers to process RADseq data using traditional pipelines, and we tested the RADcap method by genotyping sets of 96 to 384 Wisteria plants. Our results demonstrate that our RADcap method: (1) methodologically reduces (to <5%) and allows computational removal of PCR duplicate reads from data; (2) achieves 80-90% reads-on-target in 11 of 12 enrichments; (3) returns consistent coverage (≥4x) across >90% of individuals at up to 99.8% of the targeted loci; (4) produces consistently high occupancy matrices of genotypes across hundreds of individuals; and (5) costs significantly less than current approaches.
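
Result (3) above describes a per-locus occupancy criterion: a locus counts as consistently covered when more than 90% of individuals reach at least 4x depth there. The following is a minimal sketch of how that fraction could be computed from a depth table. It is not the authors' pipeline; the function name `locus_occupancy`, its parameters, and the layout of the `depth` matrix (rows are individuals, columns are targeted loci) are all assumptions made for illustration.

```python
from typing import Sequence


def locus_occupancy(
    depth: Sequence[Sequence[int]],
    min_depth: int = 4,
    min_frac: float = 0.9,
) -> float:
    """Return the fraction of loci covered at >= min_depth in more than
    min_frac of individuals.

    depth[i][j] is the read depth of individual i at locus j
    (hypothetical layout, assumed for this sketch).
    """
    if not depth or not depth[0]:
        raise ValueError("depth matrix must contain at least one individual and one locus")

    n_individuals = len(depth)
    n_loci = len(depth[0])
    passing_loci = 0

    for j in range(n_loci):
        # Count individuals that reach the depth threshold at this locus.
        covered = sum(1 for i in range(n_individuals) if depth[i][j] >= min_depth)
        # Strictly more than min_frac of individuals must be covered.
        if covered / n_individuals > min_frac:
            passing_loci += 1

    return passing_loci / n_loci
```

For example, with ten individuals that all have depths `[5, 4, 0, 10]` across four loci, three of the four loci meet the criterion, so `locus_occupancy([[5, 4, 0, 10]] * 10)` returns 0.75.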

Created: 2016-07-15