
Comparing partition and mixture models with Akaike information criteria

Source: DataONE. Updated 2026-01-29. Indexed 2026-02-07.
Download link:
https://search.dataone.org/view/sha256:39cf94783050e63162984c0f8ce8f6bfaea16da29554f5b56b9ee5047b058535
Description:
Sophisticated phylogenetic models often include mixture and/or partition model components. It was recently noted that information criteria tend to favour partition models over mixture models, even in some cases where the former are misspecified and give poor topological estimation. We show that this problem arises because partition models and mixture models differ fundamentally in their probability calculations: mixture models calculate site-wise likelihoods as the marginal probability of the data, averaging over the parameter vectors that might have arisen at a site, whereas partition models calculate site likelihoods as the probability of the site pattern conditional upon a fixed parameter vector assigned to that site. These differing probability calculations lead to AIC estimates that are not comparable. We explore three generally applicable ways of correcting the issue.

# Comparing partition and mixture models with Akaike information criteria

[https://doi.org/10.5061/dryad.3xsj3txrb](https://doi.org/10.5061/dryad.3xsj3txrb)

## Description of the data and file structure

We have submitted simulated-data-sets.zip, which contains a directory, simulated-data-sets, with the simulated data sets from the paper. That directory has subdirectories named 0, 5, ..., 50, indicating the percentage of misclassified sites. Each of these has further subdirectories, 1, 2, ..., 100, indicating which of the 100 simulated data sets per setting was considered. Each of these subdirectories has three files: part1.seqfile, part2.seqfile and concat.seqfile, giving the sequence data for the first partition, the second partition and the concatenated data.

Simulation was from the Jukes-Cantor substitution model and an unrooted four-taxon tree with taxa labeled 0, 1, 2 and 3, split as 01|23. Data were simulated for two separate partitions, leading to part1.seqfile and part2.seqfile.

For ...,
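The distinction the abstract draws between the two site-likelihood calculations can be illustrated with a small Python sketch. It is not the authors' code and does not implement any of the three corrections; the function names, the toy per-site likelihoods, the equal mixture weights and the site assignment are all assumptions chosen purely for illustration.

```python
import math


def mixture_loglik(site_liks, weights):
    """Mixture model: each site contributes log(sum_k w_k * L[i][k]),
    the marginal likelihood averaged over the candidate parameter vectors."""
    return sum(
        math.log(sum(w * lik for w, lik in zip(weights, row)))
        for row in site_liks
    )


def partition_loglik(site_liks, assignment):
    """Partition model: each site contributes log(L[i][a_i]),
    the likelihood conditional on the one parameter vector assigned to it."""
    return sum(math.log(row[a]) for row, a in zip(site_liks, assignment))


def aic(loglik, n_params):
    """Standard AIC: 2 * (number of parameters) - 2 * log-likelihood."""
    return 2 * n_params - 2 * loglik


# Hypothetical per-site likelihoods under two parameter vectors (toy numbers,
# not taken from the data set): rows are sites, columns are parameter vectors.
site_liks = [
    [0.20, 0.02],
    [0.18, 0.03],
    [0.01, 0.25],
    [0.02, 0.22],
]
weights = [0.5, 0.5]        # assumed mixture weights
assignment = [0, 0, 1, 1]   # assumed partition: sites 1-2 vs sites 3-4

ll_mix = mixture_loglik(site_liks, weights)
ll_part = partition_loglik(site_liks, assignment)
print(f"mixture log-likelihood:   {ll_mix:.4f}")
print(f"partition log-likelihood: {ll_part:.4f}")
```

In this toy case every site is assigned to its best-fitting parameter vector, so each partition site likelihood is at least as large as the corresponding weighted average and the partition log-likelihood is the higher of the two. The two numbers are probabilities of different events (conditional versus marginal), which is why feeding them into `aic` unchanged does not give directly comparable values.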
Created:
2026-01-30