
Supporting workflows for "Accumulating computational resource usage of genomic data analysis workflow to optimize cloud computing instance selection"

Mendeley Data | Updated 2024-06-25 | Indexed 2024-06-27
Download link:
http://gigadb.org/dataset/100584
Description:
Container virtualization technologies such as Docker are popular in bioinformatics because they improve the portability and reproducibility of software deployment. Together with software packaged in containers, workflow description standards such as the Common Workflow Language (CWL) make it easy to run data analyses in multiple computing environments. These technologies accelerate the use of on-demand cloud computing platforms, which can scale out according to the amount of data. However, to optimize the time and budget spent on cloud usage, users need to select an instance type that matches the resource requirements of their workflows. We developed CWL-metrics, a utility tool for cwltool, the reference implementation of CWL, which collects runtime metrics of Docker containers and workflow metadata so that the resource requirements of workflows can be analyzed. We demonstrate the analysis using seven transcriptome quantification workflows on six instance types. The results identified instance types that offer lower financial cost and faster execution time while providing the required amount of computational resources. The summary of workflow resource requirements provided by CWL-metrics can help users optimize their selection of cloud computing instances. The runtime metrics data also help users share workflows across different workflow management frameworks. A Jupyter notebook that reproduces all the figures in the manuscript is also available.
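The selection step described above can be illustrated with a minimal sketch. It is not part of CWL-metrics or the dataset: the `InstanceType` class, the `cheapest_fitting_instance` function, the `headroom` factor, and every instance name, size, and price below are hypothetical values chosen only for illustration. The sketch assumes the peak memory and CPU needs of a workflow have already been measured, and it picks the lowest-priced instance that covers them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class InstanceType:
    name: str            # hypothetical label, not a real cloud offering
    vcpus: int
    memory_gb: float
    usd_per_hour: float  # illustrative price, not a real quote


def cheapest_fitting_instance(
    instances: List[InstanceType],
    peak_memory_gb: float,
    min_vcpus: int,
    headroom: float = 1.2,  # assumed safety margin on observed peak memory
) -> Optional[InstanceType]:
    """Return the lowest-priced instance that covers the observed needs, or None."""
    required_memory = peak_memory_gb * headroom
    candidates = [
        inst
        for inst in instances
        if inst.memory_gb >= required_memory and inst.vcpus >= min_vcpus
    ]
    return min(candidates, key=lambda inst: inst.usd_per_hour, default=None)


# Hypothetical catalog; real instance types and prices differ.
catalog = [
    InstanceType("small", 2, 8.0, 0.10),
    InstanceType("medium", 4, 16.0, 0.20),
    InstanceType("large", 8, 32.0, 0.40),
]

# Hypothetical measurement: the workflow peaked at 10 GB of memory.
choice = cheapest_fitting_instance(catalog, peak_memory_gb=10.0, min_vcpus=2)
print(choice.name if choice else "no instance fits")
```

The sketch ranks candidates by hourly price only. Total cost also depends on how long the workflow runs on each instance, so a fuller comparison would multiply the hourly price by the measured execution time.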
Created:
2023-06-28