Data_Sheet_2_Predictors of sequence capture in a large-scale anchored phylogenomics project.PDF

Source: frontiersin.figshare.com | Updated: 2023-06-21 | Indexed: 2025-01-15
Download link:
https://frontiersin.figshare.com/articles/dataset/Data_Sheet_2_Predictors_of_sequence_capture_in_a_large-scale_anchored_phylogenomics_project_PDF/21530328/1
Description:
Next-generation sequencing (NGS) technologies have revolutionized phylogenomics by decreasing the cost and time required to generate sequence data from multiple markers or whole genomes. Further, the fragmented DNA of biological specimens collected decades ago can be sequenced with NGS, reducing the need for collecting fresh specimens. Sequence capture, also known as anchored hybrid enrichment, is a method to produce reduced representation libraries for NGS sequencing. The technique uses single-stranded oligonucleotide probes that hybridize with pre-selected regions of the genome that are sequenced via NGS, culminating in a dataset of numerous orthologous loci from multiple taxa. Phylogenetic analyses using these sequences have the potential to resolve deep and shallow phylogenetic relationships. Identifying the factors that affect sequence capture success could save time, money, and valuable specimens that might be destructively sampled despite low likelihood of sequencing success. We investigated the impacts of specimen age, preservation method, and DNA concentration on sequence capture (number of captured sequences and sequence quality) while accounting for taxonomy and extracted tissue type in a large-scale butterfly phylogenomics project. This project used two probe sets to extract 391 loci or a subset of 13 loci from over 6,000 butterfly specimens. We found that sequence capture is a resilient method capable of amplifying loci in samples of varying age (0–111 years), preservation method (alcohol, papered, pinned), and DNA concentration (0.020–316 ng/μl). Regression analyses demonstrate that sequence capture is positively correlated with DNA concentration. However, sequence capture and DNA concentration are negatively correlated with sample age and preservation method. Our findings suggest that sequence capture projects should prioritize the use of alcohol-preserved samples younger than 20 years old when available. In the absence of such specimens, dried samples of any age can yield sequence data, albeit with returns that diminish with increasing age.

Provider:
Frontiers