Data from: Towards large-scale monitoring of biodiversity: a Human-Assisted Molecular Identification (HAMI) framework using metabarcoding while accounting for abundances and systemic errors
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/10731928
Resource description:
Publication abstract
Our ability to monitor the biodiversity decline is affected by biases related to the scale of observation and species identification errors, often due to a lack of human expertise. To overcome this issue, DNA-based identification methods have been proposed as an effective way to conduct large-scale biodiversity monitoring. In particular, metabarcoding approaches, applied to mixed arthropod samples, have immense potential for fast and reliable assessment of large-scale biodiversity, but cannot provide reliable abundance estimates. In addition, widespread type I and type II errors (i.e., false positives and false negatives) are rarely taken into account in metabarcoding biodiversity assessments.
To overcome these obstacles, we introduce the Human-Assisted Molecular Identification (HAMI) framework, a rapid and reliable semi-automated molecular identification method based on a combination of metabarcoding and image-based parataxonomic expertise. This approach was tested on a highly diverse pilot group (beetles) within the 500-ENI network, a national biodiversity monitoring initiative covering more than 500 agricultural field margins in mainland France. We assessed the advantages of using HAMI over the exclusive use of molecular identification by examining 491 samples.
When relying exclusively on molecular approaches, on average 23% of the species composition is missed, and the more species a sample contains, the larger the share that is missed. This is mainly associated with primer bias and tissue quality. Moreover, on average, 20% of the species identified by molecular-only approaches are false positives, mostly linked to cross-sample contamination. The remainder result from environmental contaminants or from erroneous barcodes published as Coleoptera that actually correspond to Wolbachia endosymbionts.
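The two error rates reported above can be illustrated with a minimal sketch. It assumes that, for each sample, a reference species list (for example from expert identification) is compared with the molecular-only species list; the function name, the sample data and the exact denominators are illustrative assumptions, not the authors' analysis code, which is provided in the HAMI_results_R archive.

```python
def error_rates(reference: set[str], molecular: set[str]) -> tuple[float, float]:
    """Return (missed_fraction, false_positive_fraction) for one sample.

    Assumed definitions (not taken from the dataset):
      missed_fraction         = species in reference but not detected / reference size
      false_positive_fraction = species detected but not in reference / detected size
    """
    missed = reference - molecular      # false negatives
    spurious = molecular - reference    # false positives
    missed_fraction = len(missed) / len(reference) if reference else 0.0
    false_positive_fraction = len(spurious) / len(molecular) if molecular else 0.0
    return missed_fraction, false_positive_fraction


# Hypothetical sample with placeholder species labels.
reference = {"sp_A", "sp_B", "sp_C", "sp_D"}
molecular = {"sp_A", "sp_B", "sp_C", "sp_X"}

missed_fraction, false_positive_fraction = error_rates(reference, molecular)
print(f"missed: {missed_fraction:.0%}, false positives: {false_positive_fraction:.0%}")
```

Averaging these per-sample fractions over all samples would give figures comparable in kind to the 23% and 20% quoted in the abstract, under the assumed definitions.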
Molecular methodologies have the potential to perform large-scale monitoring and reduce knowledge gaps about community composition, species distribution and the anthropic factors impacting changes in arthropod biodiversity. However, our results underline that their use alone is insufficient. The combination of molecular methodologies and parataxonomic expertise in HAMI significantly reduces metabarcoding bias, identifies specimens requiring further investigation and erroneous barcodes, streamlines the time-consuming taxonomist expertise stage and provides abundance data. This method optimises the parataxonomist’s investment in key check points and could be generalised to other taxonomic groups for more efficient and reliable large-scale monitoring of biodiversity.
File description:
MiSeq raw sequences of the COI barcode from 491 Coleoptera field samples:
The Raw_sequencage_data ZIP directory contains the FASTQ files of the paired-end reads (R1: reads 1; R2: reads 2) produced in duplicate for each Coleoptera field sample using the MiSeq platform GenSeq (ISEM - University of Montpellier).
The HAMI_results_R ZIP directory contains R Markdown script files (.Rmd and .html) and associated data used to analyse the systemic errors of the metabarcoding approach (N = 491 Coleoptera field samples).
The HAMI_FRAMEWORK ZIP directory contains all the code associated with the HAMI pipeline, as well as a ReadMe file and a test data set.
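As a starting point for working with the paired-end FASTQ files, the sketch below groups file names into (R1, R2) pairs. The naming convention it assumes (an `_R1` or `_R2` tag somewhere before a `.fastq` or `.fastq.gz` extension) and the example file names are hypothetical; check the actual names in the archive and the ReadMe before relying on it.

```python
import re
from collections import defaultdict

# Assumed pattern: <stem>_R1<rest>.fastq[.gz] / <stem>_R2<rest>.fastq[.gz]
READ_TAG = re.compile(r"^(?P<stem>.+)_(?P<read>R[12])(?P<rest>.*\.fastq(?:\.gz)?)$")


def pair_fastq(names):
    """Map each library key to its (R1, R2) file names; unpaired files are skipped."""
    groups = defaultdict(dict)
    for name in names:
        match = READ_TAG.match(name)
        if match is None:
            continue  # not a FASTQ file following the assumed convention
        # The key is the file name with the read tag removed.
        key = match.group("stem") + match.group("rest")
        groups[key][match.group("read")] = name
    return {
        key: (reads["R1"], reads["R2"])
        for key, reads in groups.items()
        if "R1" in reads and "R2" in reads
    }


# Hypothetical file names: one complete pair and one orphan R1 file.
names = [
    "sampleA_rep1_R1.fastq.gz",
    "sampleA_rep1_R2.fastq.gz",
    "sampleA_rep2_R1.fastq.gz",
]
pairs = pair_fastq(names)
print(pairs)
```

Files without a mate are dropped rather than raising an error, which makes incomplete downloads easy to spot by comparing the number of pairs with the expected number of libraries.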
Created:
2024-03-01



