
How many characters are needed to reconstruct a phylogeny?

Source: DataONE. Updated 2025-09-26. Indexed 2025-10-04.
Download link:
https://search.dataone.org/view/sha256:d1b8fb62cbfbce29c6a6b97d39e64b341fe0261cfa98b2b9f5a016ec2b61f28e
Resource description:
Despite increased recent attention towards Bayesian phylogenetics and its applications in understanding macroevolutionary processes, it remains unclear how many discrete characters are needed to accurately estimate tree topologies in a Bayesian framework. This could be particularly relevant for morphological datasets used in phylogenetics, as they usually consist of a few dozen to a few hundred characters, orders of magnitude smaller than most molecular datasets. I designed a simulation study in the software RevBayes to explore how the number of sampled discrete characters affects the accuracy and precision of Bayesian phylogenetic estimates, under various setups differing in number of taxa, average number of state changes per character (i.e., tree length), and number of states per character. Results indicate that between 100 and 500 variable characters are necessary to reach sufficient accuracy and precision of phylogenetic estimates for as few as 20 tips. All other parameters being equal,...

# Dryad dataset

Dataset DOI: [10.5061/dryad.63xsj3vd8](10.5061/dryad.63xsj3vd8)

## Description of the data and file structure

This compressed file archive contains all data and scripts used in the simulation study, as well as scripts to check for MCMC convergence, calculate metrics of tree accuracy and precision, and plot results.

### Files and variables

#### File: SupplementaryDataForSubmission.zip

**Description:**

`data` folder: contains data files used for the simulation study

* `sim_trees` subfolder: Contains all trees used to simulate data, in Newick format. It is organized in additional subfolders depending on the number of taxa (tips) of the tree (5, 10, 20, 50, 100, or 200) and on the expected tree length (1, 3, or 10).
* `Ntaxa` subfolders: Contain all simulated datasets in Nexus format, divided by number of taxa (N = 5, 10, 20, 50, 100, or 200), and with additional subfolders depending on the expected tree length (1, 3, or 10) and on the number of states per char...
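
The folder layout described above corresponds to a grid of simulation conditions. As a rough illustration only, the following Python sketch enumerates the combinations of number of taxa and expected tree length listed in the description. The function name `design_grid` is hypothetical and not part of the archive. The number of states per character is left out because its values are truncated in this listing, and no actual folder names are assumed.

```python
from itertools import product

# Values taken from the dataset description above.
N_TAXA = (5, 10, 20, 50, 100, 200)
TREE_LENGTHS = (1, 3, 10)


def design_grid(n_taxa=N_TAXA, tree_lengths=TREE_LENGTHS):
    """Return every (number of taxa, expected tree length) pair.

    Hypothetical helper for illustration; the archive's own scripts
    may organize the conditions differently.
    """
    return list(product(n_taxa, tree_lengths))


if __name__ == "__main__":
    grid = design_grid()
    print(f"{len(grid)} taxa-by-tree-length combinations")
    for n, length in grid:
        print(f"taxa={n}, expected tree length={length}")
```

Each pair would be further subdivided by the number of states per character, so the full number of simulation conditions is a multiple of the count printed here.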
Created:
2025-09-27