Data from: TreeFix: statistically informed gene tree error correction using species trees

DataONE | Updated 2012-08-30 | Indexed 2024-06-27
Download link:
https://search.dataone.org/view/null
Description:
Accurate gene tree reconstruction is a fundamental problem in phylogenetics, with many important applications. However, sequence data alone often lack enough information to confidently support one gene tree topology over many competing alternatives. Here, we present a novel framework for combining sequence data and species tree information, and we describe an implementation of this framework in TreeFix, a new phylogenetic program for improving gene tree reconstructions. Given a gene tree (preferably computed using a maximum likelihood phylogenetic program), TreeFix finds a "statistically equivalent" gene tree that minimizes a species tree based cost function. We have applied TreeFix to two clades of 12 Drosophila and 16 fungal genomes, as well as to simulated phylogenies, and show that it dramatically improves reconstructions compared to current state-of-the-art programs. Given its accuracy, speed, and simplicity, TreeFix should be applicable to a wide range of analyses and have many important implications for future investigations of gene evolution. The source code and a sample dataset are available at http://compbio.mit.edu/treefix.

Created:
2012-08-30