A catalog of genes, genomes and species of the cat (Felis catus) intestinal microbiota
Source: Recherche Data Gouv France. Updated: 2023-01-01. Indexed: 2026-04-09.
Download link:
https://entrepot.recherche.data.gouv.fr/citation?persistentId=doi:10.57745/1FIHIT
Description:
Dataset overview

This dataset provides:
- a non-redundant, high-quality catalog of 1.3 million genes
- 6,622 Metagenome-Assembled Genomes (MAGs)
- 344 Metagenomic Species Pangenomes (MSPs)

This dataset can be used to analyze shotgun sequencing data of the cat gut microbiota.

How to use this dataset

- Create a gene abundance table by aligning reads from each sample against the catalog. For this purpose, you can use Meteor or NGLess. Then, normalize raw counts by gene length.
- Taxonomic profiling: the abundance of each species can be estimated as the average abundance of its first 100 core genes. To reduce the false positive rate, consider a species present only if at least 10 of its 100 marker genes are detected.

Methods

Data sources

This dataset was constructed using six different BioProjects:
- PRJNA758898 from Ma et al. 2022. 16 samples from 16 animals.
- PRJEB9357 from Deusch et al. 2015. 88 samples from 30 animals.
- PRJEB4391 from Deusch et al. 2014. 36 samples from 18 animals.
- PRJNA944553. 30 samples from 30 animals.
- PRJNA908260 from Bai et al. 2023. 8 samples from 8 animals.
- PRJNA923753 from Ho et al. 2023. 1 sample.

Metagenomic assembly

De novo metagenomic assembly was performed on the samples listed above. First, sequencing adapters were removed and reads were trimmed with fastp. Reads that mapped to the host genome (GCF_018350175.1) with bowtie2 were removed with samtools. Finally, metagenomic assembly was performed with metaSPAdes. Contigs shorter than 1,500 bp were removed.

MAGs recovery

MAGs were generated with COMEBin (multi-coverage mode) and their quality was assessed with CheckM2. MAGs with completeness < 70%, contamination > 5%, or N50 < 5 kb were discarded. Pairwise Average Nucleotide Identity (ANI) was computed for all recovered MAGs with fastANI, and the MAGs were dereplicated at the species level (ANI cutoff = 95%).

Non-redundant gene catalog

Genes were predicted on all contigs from the metagenomic assemblies with Prodigal (parameters: -m -p meta).
Genes were pooled and clustered with cd-hit-est (parameters: -c 0.95 -aS 0.90 -G 0 -d 0 -M 0 -T 0), choosing those from the longest contigs as representatives.

MSPs recovery

Samples from multiple cohorts (those listed above plus PRJNA906124 from Lee et al. 2022) were aligned against the non-redundant gene catalog with the Meteor software suite to produce a raw gene abundance table (1.3M genes quantified in 212 samples). Then, co-abundant genes were binned into 344 Metagenomic Species Pangenomes (MSPs, i.e. gene clusters that likely belong to the same microbial species) using MSPminer.

MAGs and MSPs taxonomic annotation

Dereplicated MAGs were annotated with GTDB-Tk based on GTDB r214. Then, the taxonomic annotation of each MAG was propagated to the corresponding MSP.

Construction of the phylogenetic tree

39 universal phylogenetic marker genes were extracted from the dereplicated MAGs with fetchMGs. Then, the markers were aligned separately with MUSCLE. The resulting alignments were merged and trimmed with trimAl (parameters: -automated1). Finally, the phylogenetic tree was computed with FastTreeMP (parameters: -gamma -pseudo -spr -mlacc 3 -slownni).
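
The profiling steps described under "How to use this dataset" (length normalization, then averaging over 100 marker genes with a 10-gene detection threshold) could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, gene identifiers, and toy values are hypothetical, and it assumes that "detected" means a non-zero normalized count and that the average is taken over all 100 marker genes, including undetected ones.

```python
from statistics import mean


def normalize_by_length(raw_counts, gene_lengths):
    """Divide each raw read count by the length (bp) of its gene."""
    return {gene: count / gene_lengths[gene] for gene, count in raw_counts.items()}


def species_abundance(normalized, marker_genes, min_detected=10):
    """Mean normalized abundance over a species' marker genes.

    Returns 0.0 when fewer than `min_detected` marker genes have a
    non-zero abundance (assumed definition of "detected").
    """
    values = [normalized.get(gene, 0.0) for gene in marker_genes]
    detected = sum(1 for value in values if value > 0)
    if detected < min_detected:
        return 0.0
    return mean(values)


# Toy data (hypothetical identifiers): two species with 100 marker genes each.
markers_a = [f"spA_gene{i}" for i in range(100)]
markers_b = [f"spB_gene{i}" for i in range(100)]
gene_lengths = {gene: 1000 for gene in markers_a + markers_b}

raw_counts = {}
raw_counts.update({gene: 500 for gene in markers_a[:12]})  # 12 genes detected
raw_counts.update({gene: 500 for gene in markers_b[:5]})   # only 5 genes detected

normalized = normalize_by_length(raw_counts, gene_lengths)
abundance_a = species_abundance(normalized, markers_a)  # passes the threshold
abundance_b = species_abundance(normalized, markers_b)  # below threshold -> 0.0
```

In this toy example, species A is reported as present because 12 of its marker genes are detected, while species B is set to zero because only 5 are. In practice the counts would come from the Meteor or NGLess gene abundance table rather than from hand-written dictionaries.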
Created:
2023-01-01



