Supporting data for "Bio-Docklets: Virtualization Containers for Single-Step Execution of NGS Pipelines."

Name: Supporting data for "Bio-Docklets: Virtualization Containers for Single-Step Execution of NGS Pipelines."
Creator: GigaScience Database
Published: 2025-05-26 17:24:58
License: No description available

DataCite Commons. Updated 2025-05-26. Indexed 2025-04-15.

Download link:

http://gigadb.org/dataset/100323

Description:

Processing of Next-Generation Sequencing (NGS) data requires significant technical skills, involving installation, configuration, and execution of bioinformatics data pipelines, in addition to specialized post-analysis visualization and data mining software. To address some of these challenges, developers have leveraged virtualization containers to enable seamless deployment of pre-configured bioinformatics software and pipelines on any computational platform.

We present an approach for abstracting the complex data operations of multi-step bioinformatics pipelines for NGS data analysis. As examples, we have deployed two pipelines, for RNA-seq and ChIP-seq, pre-configured within Docker virtualization containers we call Bio-Docklets. Each Bio-Docklet exposes a single data input and output endpoint, so that from a user's perspective running a pipeline is as simple as running a single bioinformatics tool. This is achieved using a meta-script that automatically starts the Bio-Docklets and controls pipeline execution through the BioBlend software library and the Galaxy Application Programming Interface (API). The pipeline output is post-processed by integration with the Visual Omics Explorer (VOE) framework, providing interactive data visualizations that users can access through a web browser.

Our goal is to enable easy access to NGS data analysis pipelines for non-bioinformatics experts on any computing environment, whether a laboratory workstation, a university computer cluster, or a cloud service provider. Beyond end users, Bio-Docklets also enable developers to programmatically deploy and run a large number of pipeline instances for concurrent analysis of multiple datasets.
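
The meta-script pattern described above (start a container, then drive Galaxy through BioBlend) might look roughly like the following Python sketch. It is not the authors' script. The image name, mount paths, container port, history name, and the single-input workflow layout are all hypothetical, chosen only for illustration. The BioBlend calls used (`GalaxyInstance`, `histories.create_history`, `tools.upload_file`, `workflows.get_workflows`, `workflows.invoke_workflow`) are part of the BioBlend Galaxy client.

```python
import shlex


def docker_run_command(image, input_dir, output_dir, port=8080):
    """Build (but do not execute) a `docker run` command for one container.

    The container-side paths and the internal port 80 are assumptions.
    """
    return [
        "docker", "run", "--detach", "--rm",
        "--publish", f"{port}:80",
        "--volume", f"{input_dir}:/data/input:ro",
        "--volume", f"{output_dir}:/data/output",
        image,
    ]


def run_workflow(galaxy_url, api_key, workflow_name, input_path):
    """Upload one input file and invoke a named Galaxy workflow via BioBlend."""
    # Imported lazily so the command builder above works without BioBlend.
    from bioblend.galaxy import GalaxyInstance

    gi = GalaxyInstance(url=galaxy_url, key=api_key)
    history = gi.histories.create_history(name="bio-docklet-run")
    upload = gi.tools.upload_file(input_path, history["id"])
    dataset_id = upload["outputs"][0]["id"]
    workflow = gi.workflows.get_workflows(name=workflow_name)[0]
    # Assumption: the workflow takes a single input at step index "0".
    return gi.workflows.invoke_workflow(
        workflow["id"],
        inputs={"0": {"src": "hda", "id": dataset_id}},
        history_id=history["id"],
    )


if __name__ == "__main__":
    # Hypothetical image name; prints the command instead of running it.
    cmd = docker_run_command("example/rnaseq-docklet", "/tmp/in", "/tmp/out")
    print(shlex.join(cmd))
```

Because the command builder only returns an argument list, many such commands can be generated with different ports and directories and handed to `subprocess.run`, which is one way the concurrent multi-dataset use mentioned in the description could be approached.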

Provider:

GigaScience Database

Created:

2017-06-26
