Training Data for "Binning of metagenomic sequencing data" tutorial

NIAID Data Ecosystem, indexed 2026-05-01

Download link:

https://zenodo.org/record/7845137

Resource description:

Metagenomics is the study of genetic material recovered directly from environmental samples, such as soil, water, or gut contents, without the need to isolate or cultivate individual organisms. Metagenomic binning classifies the DNA sequences obtained from metagenomic sequencing into discrete groups, or bins, based on their similarity to each other. The goal is to assign the sequences to the organisms or taxonomic groups they originate from, allowing for a better understanding of the diversity and functions of the microbial communities present in the sample. This is typically achieved through computational methods that use sequence similarity, composition, and other features to group the sequences into bins.

There are two main types of metagenomic binning: reference-based and de novo. Reference-based binning aligns the sequences to a database of known genomes or reference sequences. De novo binning clusters the sequences based on similarity, without prior knowledge of the organisms or reference sequences present in the sample. Both methods have their strengths and limitations, and researchers often combine them to improve the accuracy of their binning results. Metagenomic binning is an important tool for understanding the functional potential of microbial communities in various environments and has applications in fields such as biotechnology, environmental science, and human health.

In this tutorial, we will learn how to run metagenomic binning tools and evaluate the quality of the results. To do that, we will use the MetaBAT2 algorithm and data from the study "Temporal shotgun metagenomic dissection of the coffee fermentation ecosystem". For an in-depth analysis of the structure and functions of the coffee microbiome, a temporal shotgun metagenomic study (six time points) was performed. The six samples were sequenced on an Illumina MiSeq using whole-genome sequencing. Based on the six original datasets of the coffee fermentation system, we generated mock datasets for this tutorial.
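
To make the idea of composition-based de novo binning more concrete, the following is a minimal Python sketch. It is not the MetaBAT2 algorithm and is not part of the tutorial data. It only illustrates grouping sequences by tetranucleotide (4-mer) frequency. The function names, the greedy clustering rule, the distance threshold, and the toy contigs are all assumptions chosen for illustration.

```python
from collections import Counter
from itertools import product
import math

# All 256 possible tetranucleotides over the DNA alphabet, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_profile(seq):
    """Return the normalized 4-mer frequency vector of a DNA sequence.

    4-mers containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[kmer] for kmer in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[kmer] / total for kmer in KMERS]


def euclidean(p, q):
    """Euclidean distance between two equal-length frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def greedy_bins(contigs, threshold):
    """Hypothetical greedy binning, for illustration only.

    Each contig joins the first bin whose representative profile (the
    profile of the bin's first contig) lies within `threshold`;
    otherwise it opens a new bin.
    """
    bins = []  # list of (representative_profile, [contig names])
    for name, seq in contigs.items():
        profile = tetranucleotide_profile(seq)
        for representative, members in bins:
            if euclidean(profile, representative) <= threshold:
                members.append(name)
                break
        else:
            bins.append((profile, [name]))
    return [members for _, members in bins]


# Toy contigs (made up): two AT-rich sequences and one GC-rich sequence.
toy_contigs = {
    "contig_a": "AT" * 50,
    "contig_b": "TA" * 50,
    "contig_c": "GC" * 50,
}

print(greedy_bins(toy_contigs, threshold=0.5))
```

With these toy inputs, the two AT-rich contigs end up in one bin and the GC-rich contig in another. Real binners work on much longer contigs and typically combine composition with additional signals such as read coverage across samples.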

Created:

2023-04-20
