
Supplemental material for Baltic Sea Genome Size project

Source: DataCite Commons. Updated 2025-01-15; indexed 2025-04-16.
Download link:
https://figshare.scilifelab.se/articles/dataset/Supplemental_material_for_Baltic_Sea_Genome_Size_project/21378294/1
Description:
This repository contains the supplemental figure, material and tables for the project "Genome size variation between pelagic and benthic communities across prokaryotic taxonomy and environmental gradients in the Baltic Sea".

Microbial genome size can be used as a predictor to explain the ecology and metabolism of Bacteria and Archaea across major biomes. Despite its ecological significance, the contribution of microbial genome size to differences in the metabolic potential of benthic and pelagic prokaryotes is poorly studied. Here, we investigated how taxonomy and microbial genome size vary between benthic and pelagic habitats across environmental gradients of the brackish Baltic Sea. We also explored the relationships between that variation, the environmental heterogeneity, and microbial functions in these habitats. By analyzing Baltic metagenomes and MAGs, we observe that pelagic brackish Bacteria and Archaea have smaller genomes on average than pelagic marine and freshwater prokaryotes. Moreover, we found that prokaryotic genome sizes in Baltic sediments (3.47 Mbp) are significantly larger than in the water column (2.96 Mbp). These differences in genome size persisted in Bacteria from the phylum level to the order level. For pelagic prokaryotes, the smallest genomes coded for a higher number of module steps per Mbp than larger genomes for most functions, such as amino acid metabolism and central carbohydrate metabolism. However, we observed that nitrogen metabolism was almost absent in pelagic genomes and was mostly present in benthic genomes. Finally, we also show that Bacteria inhabiting Baltic sediments and the water column differ not only in taxonomy but also in their metabolic potential, such as the Wood-Ljungdahl pathway or the presence of different hydrogenases.

Dataset description

Figure S1. Overview of the average genome size (AGS) of metagenomes across gradients of salinity and depth.

Supplementary Material 1. Statistical analysis of the effect of abiotic variables and their interactions on the average genome size (AGS) of full Baltic microbial communities. Tables show the results of ANOVA type II analysis for both sediments and water column.

Supplementary Material 2. Overview of the presence of module steps for several metabolic categories. We also provide a list of all functional categories used, with the different module categories included and the number of module steps used in the analysis.

Supplementary Material 3. Table with all Actinobacteriota genomes used to calculate the marker genes for CheckM quality assessment. We include information on GenBank accession, lake, tribe, assembly size (Mbp), GC content (%) and reference. Two figures are included. The first is a boxplot showing the differences in completeness for the 8 phyla with the most bins in the StratfreshDB database, with completeness estimated using both default CheckM parameters and a specific set of marker genes for the phyla Patescibacteria and Actinobacteriota. Stars indicate significant differences (p < 0.05, Wilcoxon non-parametric test). The second figure shows the markers used by CheckM with default parameters for the 8 analyzed phyla.

Supplementary Table 1. Table with information on all metagenomes used in this research project, including: sample run in NCBI/ENA, estimated average genome size (Mbp), depth (m), salinity (PSU), temperature (°C), oxygen concentration (mg/L), sample material processing (µm), environment, latitude, longitude, study accession, online database, sample accession and reference.

Supplementary Table 2. Table with information on all metagenome-assembled genomes (MAGs) used in this research project, including: bin id, mOTU, completeness (%), contamination (%), GC (%), coding density (%), bin length, estimated genome size (Mbp), environment, domain, phylum, class, order, family, genus and species.

Supplementary Table 3. Table with information on all pelagic metagenome-assembled genomes (MAGs) used to compare genome size across major pelagic environments (marine, freshwater and brackish). All brackish MAGs belong to (41), while the genomic information for marine and freshwater MAGs was retrieved from a previous study (2) that compiled them from (11,42). We include bin id, completeness (%), contamination (%), bin length, estimated genome size (Mbp), ecosystem type, domain, phylum, class, order, family, genus and species.

Provider:
Stockholm University
Created:
2022-10-24