
Assessment of targeted enrichment locus capture across time and museums using odonate specimens

Indexed from NIAID Data Ecosystem on 2026-05-01
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.kprr4xh8z
Description:
The use of gDNA isolated from museum specimens for high-throughput sequencing, especially targeted sequencing in the context of phylogenetics, is common practice. Yet little attention has been paid to comparing DNA quality and sequencing results among museum specimens. Dragonflies and damselflies are ubiquitous in freshwater ecosystems and are commonly collected and preserved in museum collections, hence their use in this study. However, the history of odonate preservation across time and museums has resulted in wide variability in the success of viable DNA extraction, necessitating an assessment of the usefulness of these specimens in genetic studies. Using Anchored Hybrid Enrichment probes, we sequenced DNA from samples held at two museums: 48 from the American Museum of Natural History (AMNH) in NYC, USA, and 46 from the Naturalis Biodiversity Center (RMNH) in Leiden, Netherlands. The samples came from collection localities worldwide and spanned 120 years. We recovered at least 4 loci out of a >1000-locus probe set for all samples, with an average capture of ~385 loci. Neither specimen age nor size was a good predictor of locus capture, but capture rates differed significantly between museums. Samples from the AMNH had lower overall locus capture than those from the RMNH, perhaps due to differences in specimen storage over time.

Methods

For taxon sampling, damselflies and dragonflies from the RMNH and AMNH were selected with an emphasis on a breadth of sizes, families, and ages. We initially selected samples with collection years ranging from 2001 (~20 years old) to 1909 (~112 years old). We chose 94 specimens: 64 Anisoptera and 30 Zygoptera, 48 from the AMNH and 46 from the RMNH. Genomic DNA was isolated and sent to RAPID Genomics (Gainesville, Florida) for library preparation and sequencing using Anchored Hybrid Enrichment probes.

We trimmed adapters from the raw reads of each sample using fastp and checked read quality using MultiQC. Following trimming, we assembled each targeted capture locus. Following assembly, we screened each locus for orthology in two ways: by ensuring that the locus did not have BLAST hits to multiple places in the genome, and by confirming reciprocal best hits between the reference and the query sequence. We generated multiple sequence alignments and concatenated them using FASconCAT (Kück and Meusemann 2010), then generated an optimal partitioning scheme using relaxed clustering with the model fixed to GTR+G for each subset in IQ-TREE v.2.1.3 (Minh et al. 2020). We selected a model for each subset in the partitioning scheme using ModelFinder and estimated a maximum likelihood tree with 1,000 ultrafast bootstrap replicates in IQ-TREE.
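The two orthology screens described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes BLAST results are available in the standard 12-column tabular format (-outfmt 6), and every function name, sequence ID, and the 0.9 bitscore ratio used to decide what counts as a second strong genomic hit is a hypothetical choice made for illustration.

```python
from collections import defaultdict


def parse_blast_tab(lines):
    """Parse BLAST tabular rows (-outfmt 6) into dicts with the fields used below."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        hits.append({
            "query": f[0],
            "subject": f[1],
            "sstart": int(f[8]),       # start coordinate on the subject
            "bitscore": float(f[11]),
        })
    return hits


def best_hit_per_query(hits):
    """Map each query ID to the subject ID of its highest-bitscore hit."""
    best = {}
    for h in hits:
        current = best.get(h["query"])
        if current is None or h["bitscore"] > current["bitscore"]:
            best[h["query"]] = h
    return {q: h["subject"] for q, h in best.items()}


def reciprocal_best_hits(forward_hits, reverse_hits):
    """Keep query -> subject pairs whose subject's best hit is that same query."""
    fwd = best_hit_per_query(forward_hits)
    rev = best_hit_per_query(reverse_hits)
    return {q: s for q, s in fwd.items() if rev.get(s) == q}


def multi_hit_queries(genome_hits, score_ratio=0.9):
    """Flag queries with more than one strong hit location in the genome.

    A hit is 'strong' if its bitscore is at least score_ratio times the
    query's top bitscore (an assumed threshold, not taken from the study).
    """
    by_query = defaultdict(list)
    for h in genome_hits:
        by_query[h["query"]].append(h)
    flagged = set()
    for query, hs in by_query.items():
        top = max(h["bitscore"] for h in hs)
        strong = {(h["subject"], h["sstart"]) for h in hs
                  if h["bitscore"] >= score_ratio * top}
        if len(strong) > 1:
            flagged.add(query)
    return flagged


if __name__ == "__main__":
    # Hypothetical rows: assembled contigs searched against reference loci, and back.
    def row(q, s, sstart, score):
        return "\t".join([q, s, "98.0", "300", "6", "0", "1", "300",
                          str(sstart), str(sstart + 299), "1e-100", str(score)])

    forward = parse_blast_tab([row("contigA", "L1", 1, 540), row("contigB", "L2", 1, 520)])
    reverse = parse_blast_tab([row("L1", "contigA", 1, 540), row("L2", "contigC", 1, 550)])
    print(reciprocal_best_hits(forward, reverse))
```

A locus would pass both screens in this sketch only if it appears in the output of reciprocal_best_hits and is absent from the set returned by multi_hit_queries. Real pipelines usually add e-value and alignment-length cutoffs, which are omitted here.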

Created:
2023-05-18