
Data from: A simple approach for maximizing the overlap of phylogenetic and comparative data

Source: figshare.mq.edu.au. Updated 2023-05-30; indexed 2025-01-15.
Download link:
https://figshare.mq.edu.au/articles/dataset/Data_from_A_simple_approach_for_maximizing_the_overlap_of_phylogenetic_and_comparative_data/20044796/1
Description:
Biologists are increasingly using curated, public data sets to conduct phylogenetic comparative analyses. Unfortunately, there is often a mismatch between species for which there is phylogenetic data and those for which other data are available. As a result, researchers are commonly forced to either drop species from analyses entirely or else impute the missing data. A simple strategy to improve the overlap of phylogenetic and comparative data is to swap species in the tree that lack data with ‘phylogenetically equivalent’ species that have data. While this procedure is logically straightforward, it quickly becomes very challenging to do by hand. Here, we present algorithms that use topological and taxonomic information to maximize the number of swaps without altering the structure of the phylogeny. We have implemented our method in a new R package phyndr, which will allow researchers to apply our algorithm to empirical data sets. It is relatively efficient such that taxon swaps can be quickly computed, even for large trees. To facilitate the use of taxonomic knowledge, we created a separate data package taxonlookup; it contains a curated, versioned taxonomic lookup for land plants and is interoperable with phyndr. Emerging online databases and statistical advances are making it possible for researchers to investigate evolutionary questions at unprecedented scales. However, in this effort species mismatch among data sources will increasingly be a problem; evolutionary informatics tools, such as phyndr and taxonlookup, can help alleviate this issue.

Usage Notes

Land plant taxonomic lookup table

This dataset is a stable version (version 1.0.1) of the dataset contained in the taxonlookup R package (see https://github.com/traitecoevo/taxonlookup for the most recent version). It contains a taxonomic reference table for 16,913 genera of land plants along with the number of recognized species in each genus.

File: plant_lookup.csv
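The swapping idea in the description can be illustrated with a deliberately simplified sketch. This is not the phyndr algorithm: it uses no tree topology, and the function names (`genus_of`, `propose_genus_swaps`), the `Genus_species` label format, and the example species are assumptions made for illustration. It proposes a swap only when a tip without data is the sole representative of its genus in the tree, and it assumes that genus is monophyletic, so any congener with data would sit at the same position relative to the other tips.

```python
from collections import defaultdict


def genus_of(species):
    """Return the genus part of a hypothetical 'Genus_species' label."""
    return species.split("_", 1)[0]


def propose_genus_swaps(tree_tips, data_species):
    """Map tree tips that lack data to congeners that have data.

    Simplifying assumption (not the phyndr method): a swap is proposed only
    when the tip is the sole representative of its genus in the tree and the
    genus is taken to be monophyletic.
    """
    data_species = set(data_species)

    # Group the tips of the tree by genus.
    tips_by_genus = defaultdict(list)
    for tip in tree_tips:
        tips_by_genus[genus_of(tip)].append(tip)

    # Group the species that have comparative data by genus (sorted for
    # a deterministic candidate order).
    data_by_genus = defaultdict(list)
    for sp in sorted(data_species):
        data_by_genus[genus_of(sp)].append(sp)

    swaps = {}
    for genus, tips in tips_by_genus.items():
        if len(tips) != 1:
            continue  # several congeners in the tree: topology would matter
        tip = tips[0]
        if tip in data_species:
            continue  # the tip already has data, nothing to swap
        candidates = data_by_genus.get(genus, [])
        if candidates:
            swaps[tip] = candidates
    return swaps


# Hypothetical example data.
tips = ["Quercus_alba", "Quercus_rubra", "Acer_rubrum", "Pinus_taeda"]
data = ["Acer_saccharum", "Acer_negundo", "Pinus_taeda", "Quercus_robur"]
swaps = propose_genus_swaps(tips, data)
print(swaps)
```

Here only `Acer_rubrum` receives candidates: `Pinus_taeda` already has data, and the two `Quercus` tips are skipped because choosing a replacement for either would require the topological checks that the full method performs.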

Provider:
Macquarie University