
Model trees and associated simulated nucleotide sequences for testing phylogenetic inference methods

NIAID Data Ecosystem, indexed 2026-03-12
Download link:
https://zenodo.org/record/4034643
Description:
This repository contains 142 tar.gz archive files, each containing nucleotide sequence data simulated with INDELible for testing alignment-free phylogenetic inference methods. The datasets were generated using the results (trees and model parameters) of 142 phylogenomic analyses of real-case data as models (available here). The initial sequence length was 5 Mb, and the indel rate was set to 0.01, with indel lengths drawn from [1, 50000] according to a Zipf distribution with parameter 1.5 (see the INDELible manual).

Each archive contains the following files/directories:

GTR.params.trees.tsv     a tab-delimited file summarizing the real-case GTR+Γ model parameters and the phylogenetic tree used to simulate the sequence dataset (gathered from https://zenodo.org/record/4034261)
tax.tsv                  a tab-delimited file containing the initial (col 1) and simplified (col 2) taxon names
model.nwk                a Newick-formatted file containing the initial model tree (gathered from GTR.params.trees.tsv) with simplified leaf names (following tax.tsv)
control.txt              the INDELible input file used to simulate the evolution of a sequence along the tree in model.nwk
seq/                     a directory containing the simulated sequences (one FASTA file per leaf in the tree in model.nwk)

___
Criscuolo A (2020) On the transformation of MinHash-based uncorrected distances into proper evolutionary distances for phylogenetic inference. F1000Research, 9:1309. doi:10.12688/f1000research.26930.1
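The archive layout described above can be inspected without unpacking anything to disk. The following is a minimal Python sketch, not part of the original record. The function names `read_fasta` and `summarize_archive` are hypothetical. It assumes that tax.tsv has no header row, that files are UTF-8 text, and that the FASTA files sit in a directory named seq/ (possibly under a top-level folder inside the archive). None of these details are confirmed by the description.

```python
import csv
import io
import tarfile


def read_fasta(text):
    """Parse FASTA text into a dict mapping header line -> sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {k: "".join(v) for k, v in records.items()}


def summarize_archive(path):
    """Return (taxon-name mapping, sequence length per FASTA file) for one archive.

    Assumption: tax.tsv holds the initial name in column 1 and the simplified
    name in column 2, with no header row.
    """
    taxa = {}
    lengths = {}
    with tarfile.open(path, "r:gz") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            parts = member.name.split("/")
            data = tar.extractfile(member).read().decode("utf-8")
            if parts[-1] == "tax.tsv":
                for row in csv.reader(io.StringIO(data), delimiter="\t"):
                    if len(row) >= 2:
                        taxa[row[0]] = row[1]
            elif len(parts) >= 2 and parts[-2] == "seq":
                # Sum over all records in the file (one is expected per leaf).
                total = sum(len(s) for s in read_fasta(data).values())
                lengths[parts[-1]] = total
    return taxa, lengths
```

Calling `summarize_archive` on one downloaded archive should give one length entry per leaf of the tree in model.nwk, which is a quick way to check that a download is complete. Because of indels, the lengths are expected to differ from the initial 5 Mb.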
Created:
2020-11-27