
Fecal metagenomics and metabolomics sequencing

Source: DataCite Commons. Updated: 2025-12-05. Indexed: 2026-05-05.
Download link:
https://www.scidb.cn/detail?dataSetId=ee079e98b37248a48dc0c52d426baf4b
Description:
Shotgun metagenomic sequencing was performed on an Illumina high-throughput platform using short-fragment libraries and a paired-end strategy. Raw reads were quality-filtered with fastp to remove low-quality reads, and host-derived contamination was removed by mapping the reads against the human genome with Bowtie2. Clean reads were assembled with MEGAHIT, retaining contigs of 300 bp or longer, and assembly quality was assessed with QUAST. Coding sequences were predicted with MetaGeneMark, and a non-redundant gene catalog was generated with MMseqs2 (sequence identity 90%, coverage 80%). Genes were annotated against functional databases including KEGG, eggNOG, Pfam, SwissProt, CARD, and CAZy, and taxonomic classification was performed by comparison with the NCBI NR database.

To assess functional and taxonomic diversity, we applied principal component analysis (PCA), principal coordinate analysis (PCoA), non-metric multidimensional scaling (NMDS), and unweighted pair group method with arithmetic mean (UPGMA) clustering. Differences between groups were evaluated using ANOSIM and PERMANOVA. Differential genes and taxa were identified with parametric methods (Welch's t-test, ANOVA), non-parametric methods (Wilcoxon rank-sum test, Kruskal-Wallis test), and metagenomeSeq. All statistical analyses were carried out in R using relevant packages such as vegan.

Metabolomic analysis was performed on a high-resolution mass spectrometry system (Waters ACQUITY I-Class PLUS UPLC coupled with Xevo G2-XS QTof). Chromatographic separation was performed on an HSS T3 column, with aqueous formic acid and acetonitrile as the mobile phase.
Samples were analyzed in positive and negative ionization modes with an injection volume of 1 μL. Mass spectrometric data were acquired in MSe mode with a capillary voltage of +2000/-1500 V (positive/negative), a cone voltage of 30 V, a desolvation temperature of 500 °C, and a desolvation gas flow rate of 800 L/h. Progenesis QI was used for peak detection, alignment, and annotation of the raw data. Metabolites were identified by database matching and an in-house spectral library, with a mass deviation threshold of 100 ppm.

All data were normalized to the total peak area. Within-group reproducibility was assessed using PCA and Spearman correlation. Differential metabolites were identified using orthogonal partial least squares-discriminant analysis (OPLS-DA), and the model was validated with 200 permutation tests. Variables were selected using the criteria FC > 1, P < 0.05 (Student's t-test), and variable importance in projection (VIP) > 1. Pathway enrichment analysis was performed with the KEGG, HMDB, and LIPID MAPS databases, and significance was assessed with the hypergeometric distribution test.
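
The metabolomics filtering steps described above can be illustrated with a small Python sketch. It is not the dataset authors' code (they report using Progenesis QI and R). The function names are hypothetical, the thresholds are taken literally from the description, and the enrichment p-value is assumed to be the usual one-sided upper tail of the hypergeometric distribution.

```python
from math import comb


def normalize_to_total_area(areas):
    """Scale one sample's peak areas so that they sum to 1 (total peak area normalization)."""
    total = sum(areas)
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return [a / total for a in areas]


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass deviation in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_mass_tolerance(observed_mz, theoretical_mz, tol_ppm=100.0):
    """Apply the 100 ppm mass deviation threshold stated in the description."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def is_differential(fold_change, p_value, vip):
    """Selection criteria as stated: FC > 1, P < 0.05, VIP > 1."""
    return fold_change > 1 and p_value < 0.05 and vip > 1


def hypergeom_enrichment_p(population, in_pathway, selected, hits):
    """P(X >= hits) when `selected` metabolites are drawn without replacement
    from `population`, of which `in_pathway` belong to the pathway.
    Assumption: a one-sided upper-tail test."""
    upper = min(in_pathway, selected)
    tail = sum(
        comb(in_pathway, i) * comb(population - in_pathway, selected - i)
        for i in range(hits, upper + 1)
    )
    return tail / comb(population, selected)
```

For example, `is_differential(2.0, 0.01, 1.5)` passes all three criteria, while a metabolite with a fold change of 0.5 does not, because the criteria as written keep only FC > 1.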
Provider:
Science Data Bank
Created:
2025-12-05