Data from: Counting with DNA in metabarcoding studies: how should we convert sequence reads to dietary data?

DataONE | Updated: 2018-05-31 | Indexed: 2024-06-08

Download link:

https://search.dataone.org/view/null


Description:

Advances in DNA sequencing technology have revolutionised the field of molecular analysis of trophic interactions and it is now possible to recover counts of food DNA sequences from a wide range of dietary samples. But what do these counts mean? To obtain an accurate estimate of a consumer’s diet should we work strictly with datasets summarising frequency of occurrence of different food taxa, or is it possible to use relative number of sequences? Both approaches are applied to obtain semi-quantitative diet summaries, but occurrence data is often promoted as a more conservative and reliable option due to taxa-specific biases in recovery of sequences. We explore representative dietary metabarcoding datasets and point out that diet summaries based on occurrence data often overestimate the importance of food consumed in small quantities (potentially including low-level contaminants) and are sensitive to the count threshold used to define an occurrence. Our simulations indicate that using relative read abundance (RRA) information often provides a more accurate view of population-level diet even with moderate recovery biases incorporated; however, RRA summaries are sensitive to recovery biases impacting common diet taxa. Both approaches are more accurate when the mean number of food taxa in samples is small. The ideas presented here highlight the need to consider all sources of bias and to justify the methods used to interpret count data in dietary metabarcoding studies. We encourage researchers to continue addressing methodological challenges, and acknowledge unanswered questions to help spur future investigations in this rapidly developing area of research.
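To make the two kinds of summary concrete, the following is a minimal sketch of how frequency of occurrence and relative read abundance might be computed from per-sample read counts. The sample data, the function names, and the `min_reads` parameter are all hypothetical and are not taken from the dataset. The definitions used here (occurrence as the fraction of samples containing a taxon, RRA as the mean per-sample proportion of reads) are one common choice and may differ from the exact formulas used in the study.

```python
# Hypothetical read counts: one dict per dietary sample, mapping taxon -> reads.
# These numbers are invented for illustration only.
samples = [
    {"fish": 900, "squid": 95, "krill": 5},
    {"fish": 500, "squid": 500},
    {"fish": 990, "krill": 10},
]


def all_taxa(samples):
    """Return the sorted list of every taxon seen in any sample."""
    return sorted({taxon for sample in samples for taxon in sample})


def frequency_of_occurrence(samples, min_reads=1):
    """Fraction of samples in which each taxon has at least `min_reads` reads.

    `min_reads` is the (assumed) count threshold that defines an occurrence.
    """
    n = len(samples)
    return {
        taxon: sum(1 for s in samples if s.get(taxon, 0) >= min_reads) / n
        for taxon in all_taxa(samples)
    }


def relative_read_abundance(samples):
    """Mean, across samples, of each taxon's share of that sample's reads.

    Samples with zero total reads are skipped to avoid division by zero.
    """
    usable = [s for s in samples if sum(s.values()) > 0]
    return {
        taxon: sum(s.get(taxon, 0) / sum(s.values()) for s in usable) / len(usable)
        for taxon in all_taxa(usable)
    }


if __name__ == "__main__":
    print("Occurrence, threshold 1 read:  ", frequency_of_occurrence(samples))
    print("Occurrence, threshold 10 reads:", frequency_of_occurrence(samples, min_reads=10))
    print("Relative read abundance:       ", relative_read_abundance(samples))
```

In this toy data, krill contributes very few reads yet is recorded as occurring in two of three samples at a threshold of one read, and in only one sample at a threshold of ten reads. That illustrates, on invented numbers, the threshold sensitivity and the weight given to rare items that the description mentions, while the RRA summary keeps krill at a small share.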


Created:

2018-05-31
