Unlocking natural history collections to improve eDNA reference databases and biodiversity monitoring
Source: DataONE | Updated: 2025-10-16 | Indexed: 2025-10-25
Download link:
https://search.dataone.org/view/sha256:4239e9b7ba904d01bd99e4b5cca0d5cffe82b3c94b37afb646b5ca2ea237c384
Resource summary:
Biodiversity changes due to human activities highlight the need for efficient biodiversity monitoring approaches. Environmental DNA (eDNA) metabarcoding offers a non-invasive method used for biodiversity monitoring and ecosystem assessment, but its accuracy depends on comprehensive DNA reference databases. Natural history collections often contain rare or difficult-to-obtain samples that can serve as a valuable resource to fill gaps in eDNA reference databases. Here, we discuss the utility of specimens from natural history collections in supporting future eDNA applications. Museomics, the application of -omics techniques to museum specimens, offers a promising avenue for improving eDNA reference databases by increasing species coverage. Furthermore, museomics can provide transferable methodological advancements for extracting genetic material from samples with low and degraded DNA. The integration of natural history collections, museomics, and eDNA approaches has the potential to signific...

Dataset for analyzing the potential of museum specimens to improve the DNA reference database
To examine the cumulative number of species sequenced for a given DNA barcode or mitochondrial genome (mitogenome) over the years, we retrieved all data available from NCBI using the R package rentrez v1.2.3 (Winter 2017). We searched the nucleotide database for the rRNA 12S, rRNA 16S, rRNA 18S, cytochrome B (cytB), and cytochrome oxidase I (COI) barcodes, as well as for complete mitogenomes, for all fish orders. In addition, we retrieved all fish species with data available in the Sequence Read Archive (SRA) using Entrez Direct (Kans 2024), which provides access to the NCBI databases from a Unix terminal window.
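The retrieval above was done with rentrez and Entrez Direct; the exact search strings are not given in this summary. As a rough illustration only, the sketch below builds a request URL for NCBI's public E-utilities ESearch endpoint in Python. The function name `build_esearch_url`, the example order, and the marker phrase `12S[Title]` are assumptions made for illustration, not the authors' queries.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(order, marker_term, retmax=0):
    """Build an ESearch URL for one fish order and one marker phrase.

    `order` is a taxonomic order name and `marker_term` is an Entrez
    search phrase. Both are hypothetical inputs: the search terms
    actually used for the dataset are not stated in the source.
    """
    term = f'"{order}"[Organism] AND {marker_term}'
    params = {
        "db": "nuccore",    # Entrez name of the nucleotide database
        "term": term,
        "retmode": "json",
        "retmax": retmax,   # 0 returns only the hit count, no ID list
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # Illustrative inputs only.
    print(build_esearch_url("Perciformes", "12S[Title]"))
```

The sketch only constructs the URL and sends nothing. A real run would also need to respect NCBI's request rate limits and page through results.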
To highlight the potential of museum specimens for increasing the number of species with an available barcode/mitogenome sequence, we first downloaded all available datasets on the Global Biodiversity Information Facility (GBIF) listing fish specimens store...

# Unlocking natural history collections to improve eDNA reference databases and biodiversity monitoring
### Description of the data and file structure
The dataset consists of a main folder, data.zip.
**Various**
* kit_custom_prices.xlsx - price estimate for DNA extraction and ssDNA library prep using a commercial kit or the custom protocol from Nicolas Straube.
**barcodes_data**
Output from the `cumul_barcodes_plot.R` script.
* *species_with_barcodes.csv* - list of all fishes (marine + freshwater) with a given barcode available, according to NCBI. Columns: (1) species name, (2) NCBI taxon ID, (3) date when the species sequence was first uploaded to NCBI, (4) marker of interest, (5) year the species sequence was first uploaded to NCBI.
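The five columns described above are enough to rebuild a cumulative species count per marker. The following is a minimal Python sketch of that calculation, not the authors' `cumul_barcodes_plot.R` script. It assumes the CSV has a header row and the column order listed above. The helper names and the sample rows are invented placeholders.

```python
import csv
import io
from collections import defaultdict


def load_rows(fileobj):
    """Yield (species, marker, year) tuples from a CSV file object.

    Assumes a header row and the column order described above:
    species, taxon ID, first upload date, marker, year.
    """
    reader = csv.reader(fileobj)
    next(reader)  # skip the assumed header row
    for row in reader:
        yield row[0], row[3], row[4]


def cumulative_species_per_marker(rows):
    """Return {marker: {year: cumulative number of species}}."""
    # Keep the earliest year per (marker, species) pair so that a
    # species is counted once per marker even if it appears twice.
    first_year = {}
    for species, marker, year in rows:
        key = (marker, species)
        year = int(year)
        if key not in first_year or year < first_year[key]:
            first_year[key] = year

    # Count species newly sequenced in each year, per marker.
    new_per_year = defaultdict(lambda: defaultdict(int))
    for (marker, _species), year in first_year.items():
        new_per_year[marker][year] += 1

    # Turn the yearly counts into running totals.
    cumulative = {}
    for marker, counts in new_per_year.items():
        total = 0
        series = {}
        for year in sorted(counts):
            total += counts[year]
            series[year] = total
        cumulative[marker] = series
    return cumulative


if __name__ == "__main__":
    # Placeholder rows for illustration; these are not real records.
    sample = io.StringIO(
        "species,taxid,first_upload_date,marker,year\n"
        "Species A,1,2001-05-02,12S,2001\n"
        "Species B,2,2003-01-10,12S,2003\n"
        "Species A,1,2004-07-21,COI,2004\n"
        "Species C,3,2003-11-30,12S,2003\n"
    )
    print(cumulative_species_per_marker(load_rows(sample)))
```

To run it on the real file, pass an open handle to `species_with_barcodes.csv` to `load_rows` in place of the in-memory sample, after checking that the header and column order match these assumptions.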
**occurence_data**
Contains several kinds of species lists (museum holdings, 12S availability, etc.).
* combined_gbif_species.csv - output from the script `museum_potential/1_process_gbif_datasets.R`. Contains all the species of fish found in the main natural ...
Created: 2025-10-17



