Data from: A fuzzy-set-theory-based approach to analyze species membership in DNA barcoding

Source: DataONE (updated 2011-04-04; indexed 2024-06-27)
Download link: https://search.dataone.org/view/null

Description:
Reliable assignation of an unknown query sequence to its correct species remains a methodological problem for the growing field of DNA barcoding. While great advances have been achieved recently, species identification from barcodes can still be unreliable if the relevant biodiversity has been insufficiently sampled. We here propose a new notion of species membership for DNA barcoding - fuzzy membership, based on fuzzy set theory - and illustrate its successful application to four real datasets (bats, fishes, butterflies and flies) with more than 5000 random simulations. Two of the datasets comprise especially dense species/population level samples. In comparison with current DNA barcoding methods, the newly proposed minimum distance (MD) plus fuzzy set approach, and another computationally simple method, “best close match”, outperform two computationally sophisticated Bayesian and BootstrapNJ methods. The new method proposed here has great power in reducing false positive species identification compared with other methods when conspecifics of the query are absent from the reference database.
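
The description contrasts distance-based identification with more elaborate methods, and stresses false positives when the query's own species is missing from the reference set. A minimal sketch of that idea, in the spirit of a "best close match" rule, is shown here. It is not the authors' implementation: the function names, the uncorrected p-distance, and the 3% threshold are all assumptions made for illustration, and the fuzzy membership function itself is not specified on this page, so it is not reproduced.

```python
from typing import List, Optional, Tuple


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Only compare sites where both sequences carry an unambiguous base.
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def assign_species(
    query: str,
    reference: List[Tuple[str, str]],
    threshold: float = 0.03,  # hypothetical cutoff, not taken from the source
) -> Tuple[str, Optional[str]]:
    """Return (status, species) for a query barcode.

    reference is a list of (species_name, aligned_sequence) pairs.
    status is "identified", "ambiguous", or "no_match".
    """
    distances = [(p_distance(query, seq), species) for species, seq in reference]
    best = min(d for d, _ in distances)
    # Refusing to assign beyond the threshold is what limits false positives
    # when no conspecific of the query is present in the reference set.
    if best > threshold:
        return ("no_match", None)
    candidates = {species for d, species in distances if d == best}
    if len(candidates) > 1:
        return ("ambiguous", None)
    return ("identified", candidates.pop())
```

With a reference of `[("sp_A", "ACGTACGTAC"), ("sp_B", "ACGTTCGTAA")]`, the query `"ACGTACGTAC"` is assigned to `sp_A`, while a distant query such as `"TTTTTTTTTT"` is left unassigned rather than forced onto its nearest neighbor.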

Created: 2011-04-04