Data from: Sorting specimen-rich invertebrate samples with cost-effective NGS barcodes: validating a reverse workflow for specimen processing

DataONE | Updated 2018-01-11 | Indexed 2024-06-25

Download link:

https://search.dataone.org/view/null

Description:

Biologists frequently sort specimen-rich samples to species. This process is daunting when based on morphology, and disadvantageous if performed using molecular methods that destroy vouchers (e.g., metabarcoding). An alternative is barcoding every specimen in a bulk sample and then presorting the specimens using DNA barcodes, thus reducing downstream morphological work to the presorted units. Such a “reverse workflow” is too expensive using Sanger sequencing, but we demonstrate here that it is feasible with an NGS barcoding pipeline that allows for cost-effective, high-throughput generation of short specimen-specific barcodes (313 bp of COI; lab cost <$0.50 per specimen) through Next Generation Sequencing of tagged amplicons. We applied our approach to a large sample of tropical ants, obtaining barcodes for 3290 of 4032 specimens (82%). NGS barcodes and their corresponding specimens were then sorted into molecular operational taxonomic units (mOTUs) based on objective clustering and Automated Barcode Gap Discovery (ABGD). A high diversity of 88-90 mOTUs (4% clustering) was found and morphologically validated based on preserved vouchers. The mOTUs were overwhelmingly in agreement with morphospecies (match ratio 0.95 at 4% clustering). Because of a lack of coverage in existing barcode databases, only 18 could be accurately identified to named species, but our study yielded new barcodes for 48 species, including 28 that are potentially new to science. With its low cost and technical simplicity, the NGS barcoding pipeline can be implemented by a large range of laboratories. It accelerates invertebrate species discovery, facilitates downstream taxonomic work, helps with building comprehensive barcode databases, and yields precise abundance information.
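To make the presorting step concrete, the following is a minimal sketch of threshold-based clustering of barcodes into mOTUs. It assumes that objective clustering can be approximated as single-linkage grouping on uncorrected pairwise distances between pre-aligned sequences; it is not the pipeline used to produce this dataset, and the function names (`p_distance`, `cluster_motus`) and the toy sequences are hypothetical.

```python
from __future__ import annotations

from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Uncorrected pairwise distance between two aligned sequences.

    Positions where either sequence has a gap ('-') or an 'N' are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    diffs = 0
    for x, y in zip(a, b):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        diffs += x != y
    # With no comparable positions, treat the pair as maximally distant.
    return diffs / compared if compared else 1.0


def cluster_motus(barcodes: dict[str, str], threshold: float = 0.04) -> list[set[str]]:
    """Single-linkage clustering: two specimens share an mOTU if a chain of
    pairwise distances <= threshold connects them (assumed approximation)."""
    parent = {name: name for name in barcodes}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(barcodes, 2):
        if p_distance(barcodes[a], barcodes[b]) <= threshold:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in barcodes:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical toy barcodes (50 bp rather than the 313 bp COI fragment).
toy = {
    "ant_a": "ACGTA" * 10,
    "ant_b": "ACGTA" * 9 + "ACGTT",  # one mismatch in 50 bp = 2% distance
    "ant_c": "TGCAT" * 10,           # differs from ant_a at every position
}
motus = cluster_motus(toy, threshold=0.04)

# Barcoding success rate reported in the description: 3290 of 4032 specimens.
success_rate = 3290 / 4032

if __name__ == "__main__":
    print(motus)
    print(f"{success_rate:.0%}")
```

The all-pairs comparison is quadratic in the number of specimens, which is acceptable for a few thousand short barcodes but would need indexing or a dedicated tool for much larger samples.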

Created:

2018-01-11
