Data from: APPLES: Scalable distance-based phylogenetic placement with or without alignments
Source: DataCite Commons. Updated: 2026-03-11. Indexed: 2025-04-10.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.78nf7dq
Description:
Placing a new species on an existing phylogeny has increasing relevance to
several applications. Placement can be used to update phylogenies in a
scalable fashion and can help identify unknown query samples using
(meta-)barcoding, skimming, or metagenomic data. Maximum likelihood (ML)
methods of phylogenetic placement exist, but these methods are not
scalable to reference trees with many thousands of leaves, limiting their
ability to benefit from the dense taxon sampling in modern reference
libraries. They also rely on assembled sequences for the reference set and
aligned sequences for the query. Thus, ML methods cannot analyze datasets
where the reference consists of unassembled reads, a scenario relevant to
emerging applications of genome-skimming for sample identification. We
introduce APPLES, a distance-based method for phylogenetic placement.
Compared to ML, APPLES is an order of magnitude faster and more memory
efficient, and unlike ML, it is able to place on large backbone trees
(tested for up to 200,000 leaves). We show that using dense references
improves accuracy substantially so that APPLES on dense trees is more
accurate than ML on the sparser trees where ML can run. Finally, APPLES can
accurately identify samples without assembled references or aligned queries
using k-mer-based distances, a scenario that ML cannot handle. APPLES is
publicly available at github.com/balabanmetin/apples.
Provider:
Dryad
Created:
2019-10-08



