Metagenome assembled genomes from mouse gut microbiota
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.x0k6djhq5
Description:
The source of protein in a person's diet affects their total life expectancy. However, the mechanisms by which dietary protein sources differentially impact human health and life expectancy are poorly understood. Dietary choices impact the composition and function of the intestinal microbiota, which ultimately modulates host health. This raises the possibility that health outcomes based on dietary protein sources might be driven by interactions between dietary protein and the gut microbiota. In this study, we determined the effects of seven different sources of dietary protein on the gut microbiota of male and female mice (C57BL/6J) using an integrated metagenomics-metaproteomics approach. In order of feeding, the diets were 20% soy protein, 20% casein protein, 20% brown rice protein, 40% soy protein, 20% yeast protein, 40% casein protein, 20% pea protein, 20% egg white protein, and 20% chicken bone protein. We fed each diet to a total of 10-12 mice for one week, collected a fecal sample at the end of the week, and then proceeded to the next diet. We then analyzed the fecal samples with this approach, which included generating a genome-resolved metagenome with metagenome-assembled genomes (MAGs). This data set represents all the MAGs generated in this study. To generate the MAGs, we pooled DNA extracted from feces taken from each mouse cage (four cages in total) across four different diets (20% rice, 40% soy, 20% yeast, 40% casein). The DNA was sequenced at the NCSU GSL on an Illumina NovaSeq 6000 Sequencing System, yielding between 51,152,549 and 74,618,259 paired-end reads (2 x 150 bp) per sample. We analyzed these reads using a genome-resolved metagenomic pipeline, which included read QC, assembly using metaSPAdes and MEGAHIT, binning using MetaBAT 2, and bin quality checking using CheckM.
This data set includes the metagenomic assemblies for each cage, a co-assembly, and 454 MAGs with a completion score of >30% and a contamination score of <10% if the completion score was >50%, or <5% if the completion score was between 30% and 50%. We also include a re-assembled Bacteroides thetaiotaomicron genome along with its associated reads and CheckM scores, and a BinGroupingTableWithGTDB.txt file that lists the bin identifiers, the species cluster group at 95% ANI, CheckM scores, and GTDB taxonomy information. This data set represents a group of MAGs that could be used along with other metagenomic measurements as part of a catalog of the C57BL/6J mouse microbiota.
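As an illustration of how a bin table of this kind might be used, the following minimal sketch groups bin identifiers by species cluster. The column names (`bin_id`, `species_cluster`) and the example rows are hypothetical and may not match the actual layout of BinGroupingTableWithGTDB.txt; check the file header and adjust before use.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for a tab-delimited bin table. The column names and
# rows are invented for illustration and may not match the real file.
example_table = (
    "bin_id\tspecies_cluster\tcompleteness\tcontamination\n"
    "bin_A\t1\t92.4\t1.8\n"
    "bin_B\t1\t61.0\t9.5\n"
    "bin_C\t2\t41.2\t3.0\n"
)


def bins_by_cluster(handle):
    """Map each species cluster label to the list of bin IDs assigned to it."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        groups[row["species_cluster"]].append(row["bin_id"])
    return dict(groups)


clusters = bins_by_cluster(io.StringIO(example_table))
print(clusters)
```

To run this on a real file, an open file handle could be passed in place of the `io.StringIO` object, with the column names changed to those in the file.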
Methods
We fed conventional C57BL/6J mice a series of defined diets in which the only difference was the source of dietary protein, which made up 20% or 40% of the diet. The dietary protein sources were purified proteins. Each diet was fed to 12 mice across four cages for one week, and a fecal sample was collected at the end of each week. The diets were fed in this order: standard chow, 20% soy, 20% casein, 20% rice, 40% soy, 20% yeast, 40% casein, 20% pea, 20% egg white protein, and 20% chicken bone broth. At the end of the experiment, half of the mice were fed the 20% soy diet and half the 20% casein diet again as a control. We pooled samples by cage from the 20% rice, 40% soy, 20% yeast, and 40% casein diets to generate four metagenomes, one for each cage. We extracted DNA with the QIAamp DNA Stool Mini Kit (Qiagen) and sequenced it on an Illumina NovaSeq 6000 sequencer.
We assembled raw reads using a genome-resolved metagenomics approach. We removed PhiX174 (NCBI GenBank accession CP004084.1) and mouse genome (mm10) contaminating sequences using BBSplit and removed adapters using BBDuk (BBMap, Version 38.06), parameters: mink=6, minlength=20 (Bushnell, 2014). We assembled decontaminated reads individually using metaSPAdes (v3.12.0) -k 33,55,99 (Nurk et al., 2017) and co-assembled them using MEGAHIT (v1.2.4) --k-min 31 --k-step 10 (D. Li et al., 2015). We mapped reads from all four samples to all five assemblies using BBMap, and binned the contigs using MetaBAT (v2.12.1) (Kang et al., 2019). We assessed the quality of the bins using CheckM (v1.1.3) (Parks et al., 2015) and automatically accepted medium-quality bins with a completion score greater than 50% and a contamination score less than 10% (Bowers et al., 2017). Since the purpose of the binning was to assign proteins to species, we further accepted bins that were >30% complete and <5% contaminated. We clustered the bins into species groups by 95% ANI using dRep (v2.6.2) (Olm et al., 2017, 2020) and assigned taxonomy using GTDB-Tk (v1.3.0, ref r95) (Chaumeil et al., 2020).
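The two-tier acceptance rule for bins can be sketched as a small filter. This is an illustrative reading of the thresholds stated in this description (bins >50% complete with <10% contamination, plus bins >30% complete with <5% contamination), not the authors' actual script. The `BinQuality` class, the bin names, and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinQuality:
    """CheckM-style quality estimates for one bin, as percentages (0-100)."""
    bin_id: str  # hypothetical identifier, not taken from the data set
    completeness: float
    contamination: float


def accept_bin(q: BinQuality) -> bool:
    """Return True if the bin passes either acceptance tier."""
    # Tier 1: medium quality, >50% complete and <10% contaminated.
    if q.completeness > 50 and q.contamination < 10:
        return True
    # Tier 2: lower completeness is tolerated only with stricter contamination.
    return q.completeness > 30 and q.contamination < 5


# Made-up example values, for illustration only.
example_bins = [
    BinQuality("bin_A", 92.4, 1.8),
    BinQuality("bin_B", 61.0, 9.5),
    BinQuality("bin_C", 41.2, 3.0),
    BinQuality("bin_D", 41.2, 7.5),
    BinQuality("bin_E", 22.0, 0.5),
]

accepted = [b.bin_id for b in example_bins if accept_bin(b)]
print(accepted)
```

In this sketch, a bin at 41.2% completeness is kept at 3.0% contamination but rejected at 7.5%, which reflects the stricter contamination limit applied to the less complete bins.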
We reassembled the Bacteroides thetaiotaomicron genome by mapping the reads from one of our metagenomic samples to all the contigs that we identified as belonging to B. theta using BBSplit (BBMap, Version 38.06). We then assembled the reads that mapped to B. theta into a new genome using metaSPAdes.
Created:
2025-03-25



