Data from: The evolution of phylogeographic datasets

Source: DataONE. Updated: 2015-02-09. Indexed: 2024-06-27.
Download link:
https://search.dataone.org/view/null
Description:
Empirical phylogeographic studies have progressively sampled greater numbers of loci over time, in part motivated by theoretical papers showing that estimates of key demographic parameters improve as the number of loci increases. Recently, next-generation sequencing has been applied to questions about organismal history, with the promise of revolutionizing the field. However, no systematic assessment of how phylogeographic datasets have changed over time with respect to overall size and information content has been performed. Here, we quantify the changing nature of these genetic datasets over the past 20 years, focusing on papers published in Molecular Ecology. We found that the number of independent loci, the total number of alleles sampled, and the total number of single nucleotide polymorphisms (SNPs) per dataset have improved over time, with particularly dramatic increases within the past five years. Interestingly, uniparentally inherited organellar markers (e.g., animal mitochondrial and plant chloroplast DNA) continue to represent an important component of phylogeographic data. Single-species studies (cf. comparative studies) that focus on vertebrates (particularly fish and, to some extent, birds) represent the gold standard of phylogeographic data collection. Based on the current trajectory seen in our survey data, forecast modelling indicated that the median number of SNPs per dataset for studies published by the end of the year 2016 may approach ~20,000. This survey provides baseline information for understanding the evolution of phylogeographic datasets, and underscores the fact that development of analytical methods for handling very large genetic datasets will be critical for facilitating growth of the field.

Created: 2015-02-09