Data and scripts for: Bayesian Phylogenetic Analysis on multi-core Compute Architectures: Implementation and evaluation of BEAGLE in RevBayes with MPI
Source: DataONE. Updated 2024-07-13; indexed 2024-07-27.
Download link:
https://search.dataone.org/view/sha256:c56e108ee29e0f211df31b7d5d38c2003e46e6139cf1f3ea202398636d04b6c3
Resource description:
Phylogenies are central to many research areas in biology and commonly estimated using likelihood-based methods. Unfortunately, any likelihood-based method, including Bayesian inference, can be restrictively slow for large datasets (with many taxa and/or many sites in the sequence alignment) or complex substitution models. The primary limiting factor when using large datasets and/or complex models in probabilistic phylogenetic analyses is the likelihood calculation, which dominates the total computation time. To address this bottleneck, we incorporated the high-performance phylogenetic library BEAGLE into RevBayes, which enables multi-threading on multi-core CPUs and GPUs, as well as hardware-specific vectorized instructions for faster likelihood calculations. Our new implementation of RevBayes+BEAGLE retains the flexibility and dynamic nature that users expect from vanilla RevBayes. Additionally, we implemented a native parallelization within RevBayes without an external library using th...

The data were simulated using RevBayes.

The data can be viewed using any alignment viewer that supports Nexus files, and the scripts can be run in RevBayes.

# Bayesian Phylogenetic Analysis on multi-core Compute Architectures: Implementation and evaluation of BEAGLE in RevBayes with MPI
Scripts and example datasets for evaluating the computational speed of RevBayes when run with MPI, multi-threading, or a GPU.
**Description of the data and file structure**
* speedup-plot-mcmc-iter.pdf: Plot showing the speed-up when using different numbers of MCMC simulations.
* revbayes_beagle_benchmark.zip: This zip folder contains the RevBayes scripts to perform the MCMC runs. Each script can be run with RevBayes and needs the simulated data.
* datasets.zip: This zip folder contains two separate folders, ***DNA*** and ***AA***. Each folder contains 4 fasta files with simulated datasets for running the example analyses. The files cover 64 and 256 taxa and 1,000 and 10,000 sites. All files are in *fasta* format. In detail the files are:
* DNA/JC_t256_s10K.fasta: Alignment of DNA with 256 taxa and 10k sites simulated under the Jukes-Cantor su...
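
Before running the benchmark scripts, it can help to confirm that a downloaded alignment has the dimensions stated in the file list. The following is a minimal Python sketch, not part of the published scripts: the helper names `read_fasta` and `alignment_shape` are hypothetical, and the in-memory demo alignment stands in for a real file.

```python
import io


def read_fasta(handle):
    """Return a dict mapping taxon name -> sequence from a FASTA text stream.

    Hypothetical helper for illustration; not part of the dataset's scripts.
    Records with a repeated name overwrite earlier ones.
    """
    records = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records[name] = "".join(chunks)
            name = line[1:].split()[0]
            chunks = []
        elif name is not None:
            # Sequence lines may be wrapped over several lines.
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def alignment_shape(records):
    """Return (number of taxa, number of sites) for an aligned record set."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError("alignment is empty or sequences differ in length")
    return len(records), lengths.pop()


# Tiny in-memory alignment used only to demonstrate the helpers.
demo = io.StringIO(">t1\nACGT\nAC\n>t2\nACGTAC\n>t3\nAC-TAC\n")
n_taxa, n_sites = alignment_shape(read_fasta(demo))
print(n_taxa, n_sites)
```

To check a real file, open it with `open("DNA/JC_t256_s10K.fasta")` in place of the `io.StringIO` object; according to the description above, that file should report 256 taxa and 10,000 sites.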
Created:
2024-07-14



