Fast and accurate bootstrap confidence limits on genome-scale phylogenies using little bootstraps
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://figshare.com/articles/dataset/Fast_and_accurate_bootstrap_confidence_limits_on_genome-scale_phylogenies_using_little_bootstraps/14130494
Resource description:
Data Description
Simulated_Data:
This directory contains a simulated DNA sequence alignment (Simulated_data.fasta)
with 446 sequences and 134,131 sites. The dataset was simulated under
autocorrelated evolutionary rates among lineages. A set of 100 genes was simulated
using a wide range of realistic biological parameters, and the Simulated_data
alignment is the concatenation of these simulated genes. The tree file
(Sim_candiate_tree.nwk) is the candidate tree for this dataset. Fig. 2(a)-(d) in
the main text were produced by analyzing this dataset.
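The stated dimensions (number of sequences and number of sites) can be checked
after download. The following is a minimal Python sketch, not part of the
deposited data or the authors' pipeline. The helper names read_fasta and
alignment_shape are hypothetical, and the parser assumes plain FASTA in which
every sequence has the same aligned length. It runs on a toy in-memory
alignment; to check a real file, pass an open file handle instead.

```python
from io import StringIO


def read_fasta(handle):
    """Parse FASTA text into a dict mapping sequence name -> sequence string.

    Hypothetical helper for illustration; sequences may span several lines.
    """
    records = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records[name] = "".join(chunks)
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def alignment_shape(records):
    """Return (number of sequences, number of sites) for an alignment."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError("sequences are empty or differ in length")
    return len(records), lengths.pop()


# Toy alignment standing in for a real FASTA file.
demo = StringIO(">taxonA\nACGT\nAC\n>taxonB\nAC-TAC\n")
print(alignment_shape(read_fasta(demo)))  # (2, 6)
```

For the alignment described above, the same check would be expected to report
446 sequences and 134,131 sites.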
Small_datasets:
This directory contains three single-gene DNA sequence datasets with 446 species
and 4,000 to 10,000 sites. The tree file (Small_data_candidate_tree.nwk) is used
as the candidate tree for the analysis. Fig. 2(e)-(f) were produced by
analyzing these datasets.
Empirical_Mammal_Dataset:
This directory contains the empirical DNA sequence dataset. The eutherian
mammal dataset consists of 447 nuclear genes for 37 mammalian species.
The concatenated sequence alignment has 1,391,742 sites. The tree
file (MAM2_candiate_tree.nwk) is the candidate tree for this empirical dataset.
Large_Simulated_Datasets:
This directory contains five simulated DNA sequence datasets, all with 446 species
and a varying number of sites (pnas_test_50k: 50,000 sites, pnas_test_100k: 100,000
sites, pnas_test_200k: 200,000 sites, pnas_test_400k: 400,000 sites, pnas_test_all:
536,534 sites). The first four datasets were generated by randomly sampling
sites from the pnas_test_all dataset. The pnas_test_all.fasta sequence alignment
was generated by concatenating four sets of 100 randomly selected genes that
were simulated using different rate variation models. Fig. 2(g) was
produced by analyzing these datasets.
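Drawing a random sample of sites from a larger alignment can be sketched as
follows. This is an illustrative Python sketch, not the authors' procedure: the
description does not say whether sites were sampled with or without
replacement, or whether column order was kept, so the code assumes sampling
without replacement with the original column order preserved. The function name
sample_sites and the seed parameter are hypothetical.

```python
import random


def sample_sites(records, n_sites, seed=0):
    """Return a sub-alignment with n_sites randomly chosen columns.

    Assumptions (not stated in the data description): columns are drawn
    without replacement and kept in their original order.
    """
    length = len(next(iter(records.values())))
    if n_sites > length:
        raise ValueError("cannot sample more sites than the alignment has")
    rng = random.Random(seed)
    # The same column indices are applied to every sequence so that
    # the result is still a valid alignment.
    columns = sorted(rng.sample(range(length), n_sites))
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in records.items()
    }


# Toy alignment standing in for pnas_test_all.fasta.
toy = {"taxonA": "ACGTACGT", "taxonB": "ACGAACGA"}
subset = sample_sites(toy, 4, seed=1)
print({name: len(seq) for name, seq in subset.items()})
```

Fixing the seed makes the subsample reproducible, which is useful when several
subset sizes need to be compared on the same footing.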
Created:
2021-02-27



