Data_Sheet_1_Predictors of sequence capture in a large-scale anchored phylogenomics project.PDF

Source: frontiersin.figshare.com. Updated 2023-06-19. Indexed 2025-01-15.
Download link:
https://frontiersin.figshare.com/articles/dataset/Data_Sheet_1_Predictors_of_sequence_capture_in_a_large-scale_anchored_phylogenomics_project_PDF/21530322/1
Resource description:
Next-generation sequencing (NGS) technologies have revolutionized phylogenomics by decreasing the cost and time required to generate sequence data from multiple markers or whole genomes. Further, the fragmented DNA of biological specimens collected decades ago can be sequenced with NGS, reducing the need for collecting fresh specimens. Sequence capture, also known as anchored hybrid enrichment, is a method to produce reduced representation libraries for NGS sequencing. The technique uses single-stranded oligonucleotide probes that hybridize with pre-selected regions of the genome that are sequenced via NGS, culminating in a dataset of numerous orthologous loci from multiple taxa. Phylogenetic analyses using these sequences have the potential to resolve deep and shallow phylogenetic relationships. Identifying the factors that affect sequence capture success could save time, money, and valuable specimens that might be destructively sampled despite low likelihood of sequencing success. We investigated the impacts of specimen age, preservation method, and DNA concentration on sequence capture (number of captured sequences and sequence quality) while accounting for taxonomy and extracted tissue type in a large-scale butterfly phylogenomics project. This project used two probe sets to extract 391 loci or a subset of 13 loci from over 6,000 butterfly specimens. We found that sequence capture is a resilient method capable of amplifying loci in samples of varying age (0–111 years), preservation method (alcohol, papered, pinned), and DNA concentration (0.020–316 ng/μl). Regression analyses demonstrate that sequence capture is positively correlated with DNA concentration. However, sequence capture and DNA concentration are negatively correlated with sample age and preservation method. Our findings suggest that sequence capture projects should prioritize the use of alcohol-preserved samples younger than 20 years old when available. In the absence of such specimens, dried samples of any age can yield sequence data, albeit with returns that diminish with increasing age.

Provided by:
Frontiers