Estimation of species abundance based on the number of segregating sites using environmental DNA (eDNA)

NIAID Data Ecosystem, indexed 2026-05-01

Download link:

http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.w3r2280zz

Description:

Advances in environmental DNA (eDNA) methods have enabled rapid, non-invasive species detection in aquatic environments. While most studies focus on detecting species presence or absence, recent research has explored using eDNA data to quantify species abundance. Such estimates are usually based on the concentration of the targeted eDNA. However, eDNA concentration can be influenced by various biotic and abiotic factors, which can obscure the relationship between concentration and species abundance. In this study, we propose using the number of segregating sites as a proxy for species abundance. We investigated this relationship in silico, in vitro, and in situ (mesocosm experiments) using two brackish goby species, Acanthogobius hasta and Tridentiger bifasciatus. Analysis of simulated and in vitro data, where DNA was mixed from a known number of individuals, revealed a strong correlation between the number of segregating sites and species abundance (R² > 0.9; P < 0.01). Results from the mesocosm experiment confirmed this correlation (R² = 0.70, P < 0.01). The correlation remained consistent regardless of biotic factors such as body size and feeding behavior of the fish (P > 0.05). Cross-validation tests demonstrated that the number of segregating sites predicts species abundance more accurately and reliably than eDNA concentration. In conclusion, the number of segregating sites is a more precise and robust indicator of species abundance than eDNA concentration, offering a significant enhancement to the quantitative capabilities of eDNA technology.

Methods

We first assessed the relationship between the number of segregating sites and species abundance using entirely simulated sequences. The length of the simulated sequences was set at 17,000 bp, close to the total size of the 11 target segments. The number of simulated sequences (individuals) was 1000, and sequences were generated at a mutation rate of 10⁻⁶ per bp per generation. To account for variation in mutation rate among species, we also generated two further datasets at mutation rates of 10⁻⁷ and 10⁻⁸ per bp per generation. All data were generated using the software Fastsimcoal2 (Excoffier & Foll, 2011). Subsets of sequences were randomly chosen from the simulated data, ranging from 20 to 980 sequences at intervals of 20. The selected sequences were aligned using MUSCLE v1.0 (Edgar, 2004), and the number of segregating sites was then counted from the alignments. The simulation process was repeated three times at each specified number of sequences. The correlation between the number of segregating sites and the number of individuals (sequences) was estimated by regression analysis in Microsoft Excel, with the number of segregating sites as the dependent variable (y) and the number of individuals (sequences) as the independent variable (x). The strength and significance of the correlation were assessed by R² and P values.
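The subsampling and counting steps described above can be sketched in Python. This is only an illustration under stated assumptions, not the authors' pipeline, which used MUSCLE for alignment and Microsoft Excel for regression. The function names (`count_segregating_sites`, `subsample_curve`, `linear_fit`) are hypothetical, the input sequences are assumed to be already aligned to equal length, and gaps and `N` are ignored when deciding whether a column is polymorphic. Only R² is computed here; the P value is left out.

```python
import random


def count_segregating_sites(alignment):
    """Count alignment columns containing more than one distinct base.

    Assumption: sequences are pre-aligned strings of equal length;
    gap ('-') and ambiguous ('N') characters are ignored.
    """
    if not alignment:
        return 0
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be aligned to the same length")
    count = 0
    for column in zip(*alignment):
        bases = {base.upper() for base in column} - {"-", "N"}
        if len(bases) > 1:
            count += 1
    return count


def subsample_curve(pool, sizes, replicates=3, seed=0):
    """Draw random subsets of each size and count segregating sites.

    Returns parallel lists: x = subset size, y = segregating sites.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for n in sizes:
        for _ in range(replicates):
            sample = rng.sample(pool, n)  # sampling without replacement
            xs.append(n)
            ys.append(count_segregating_sites(sample))
    return xs, ys


def linear_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return intercept, slope, r_squared
```

With a pool of 1000 aligned sequences, calling `subsample_curve(pool, range(20, 1000, 20))` would follow the design described above (subset sizes 20 to 980 in steps of 20, three replicates each), and `linear_fit` can then be applied to the returned lists.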

Created:

2024-04-24