DNA-based monitoring of bacterial and protist diversity in the Baltic Sea
Source: DataCite Commons. Updated: 2026-02-13. Indexed: 2025-04-16.
Download link:
https://figshare.scilifelab.se/articles/dataset/DNA-based_monitoring_of_bacterial_and_protist_diversity_in_the_Baltic_Sea/28673273/1
Description:
Here we share the code, the sequence-processing output, and the intermediate data files for the work on bacterial and protist diversity patterns in the Baltic Sea area, based on 16S and 18S metabarcoding implemented twice for a year alongside the Swedish coastline monitoring programme. This work is available as a preprint:

Distinct bacterial and protist plankton diversity dynamics uncovered through DNA-based monitoring in the Baltic Sea area, Krzysztof T Jurdzinski, Meike AC Latz, Anders Torstensson, Sonia Brugel, Mikael Hedblom, Yue O O Hu, Markus Lindh, Agneta Andersson, Bengt Karlson, Anders F Andersson, bioRxiv 2024.08.14.607742; doi: https://doi.org/10.1101/2024.08.14.607742

Documentation files:

README.md - description of the files, including all the files within the zipped folders.
environment.yml - conda environment with the software/packages needed to run all the included scripts.
workflow.sh - a bash script defining the workflow.

Zipped folders with data processing documentation and intermediate files:

ampliseq_16S.zip - this directory includes the scripts used to run the nf-core/ampliseq pipeline on the V3-V4 16S metabarcoding samples, as well as the output files needed for downstream analysis.

ampliseq_18S.zip - same as ampliseq_16S.zip, but for the V4 18S metabarcoding.

taxa_reannotation.zip - each subdirectory contains the results of taxonomic re-annotation of the metabarcoding results and the scripts used to obtain them. Both the 2015-2017 and the 2019-2020 datasets were re-annotated with the GTDB, corrected for mislabeled sequences using SATIVA, for 16S and with PR2 version 5.0.0 for 18S. Both 16S datasets were also re-annotated using the SILVA database (version 138.1).

data_2015_2017.zip - these files correspond to the data for the samples from 2015 to 2017 (plus a storage test for some 2019 samples). This is new data, merged with the 2019-2020 dataset further down the pipeline.

merged_data.zip - this folder contains data merged across the 2015-2017 and the 2019-2020 datasets, based on the files from the folders data_2015_2017 and data_2019_2020.

GSHHG.zip - Global Self-consistent, Hierarchical, High-resolution Geography Database (GSHHG) version 2.3.7 file needed to plot maps, as downloaded from the NOAA website.

Herlemann_et_al_2016.zip - data from the transect-based study by Herlemann et al., 2016.

read_downsampling.zip - this folder includes the scripts used to rarefy raw reads and the key output files. It is all based on 16S data.

Zipped folders with key R scripts:

processing_code.zip - R scripts used for multiple steps of intermediate data table processing.
analysis_figures_code.zip - R scripts used to analyze the data and generate the figures.
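For readers unfamiliar with the term, rarefying reads (as mentioned for read_downsampling.zip) means randomly subsampling each sample to a common read depth so that samples can be compared on equal footing. The sketch below illustrates the general idea only; it is not taken from the authors' scripts, and the function name `rarefy_reads`, its parameters, and the policy of returning `None` for samples that are too shallow are all assumptions made for illustration.

```python
import random


def rarefy_reads(reads, depth, seed=0):
    """Draw `depth` reads uniformly at random without replacement.

    Hypothetical helper for illustration, not the dataset's own code.
    Returns None when the sample holds fewer than `depth` reads, since
    such a sample cannot be rarefied to that depth.
    """
    if depth < 0:
        raise ValueError("depth must be non-negative")
    if len(reads) < depth:
        return None
    rng = random.Random(seed)  # fixed seed keeps the subsample reproducible
    return rng.sample(reads, depth)


# Toy example: 1000 placeholder read identifiers rarefied to 100 reads.
reads = [f"read_{i}" for i in range(1000)]
subset = rarefy_reads(reads, depth=100, seed=42)
print(len(subset))
```

Fixing the seed makes the subsample reproducible, which matters when downstream diversity estimates are compared across runs.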
Provider:
KTH Royal Institute of Technology
Created:
2025-04-11



