Two decades of metagenomic data on soil biodiversity in Germany from samples preserved in the German Environmental Specimen Bank
Source: DataCite Commons. Updated: 2025-09-07. Indexed: 2025-09-08.
Download link:
https://figshare.com/articles/dataset/Two_decades_of_metagenomic_data_on_soil_biodiversity_in_Germany_frm_samples_preserved_in_the_German_Environmental_Specimen_Bank/30069451/1
Description:
The samples were collected as part of the routine soil sampling of the German Environmental Specimen Bank. The sampling sites comprise 11 locations in nine sampling regions in Germany, covering different ecosystem types (conurbation, agrarian, forestry and near-natural). DNA was extracted with the ZymoBIOMICS™ DNA/RNA Miniprep Kit (Zymo Research, Freiburg, Germany). Libraries were prepared and purified according to the BEST protocol. Sequencing was performed on an Illumina NovaSeq 6000 PE150 platform at Novogene (Cambridge, UK), with a sequencing depth of 2 GB per sample. Sequences were automatically trimmed and quality checked with Autotrim v0.6.1 and FastQC. Taxonomic classification was performed sequentially using Kraken2 v2.1.2. To build the reference databases, representative or reference RefSeq assemblies were downloaded from NCBI (on January 13, 2023) for the following RefSeq categories: human, bacteria, and fungi and plants (combined into one database). For invertebrates, the newly created MetaInvert database was used, which contains genomes of 232 Central European invertebrate soil species (Collins et al. 2023). Each reference dataset was used to build a Kraken2 database with the default k-mer size, k = 35. For assignment, sequences were first compared with the human database to control for potential contamination. Unassigned sequences were then compared with the bacteria database, next with the fungi and plants database, and finally with the MetaInvert database. The results of the sequential assignments were merged. Raw sequencing data are available from NCBI under BioProject PRJNA1276286. Further details are given in the preprint: https://www.biorxiv.org/content/10.1101/2025.07.21.665925v1
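The sequential assignment described above (human, then bacteria, then fungi and plants, then MetaInvert, with only unassigned reads passed on) could be sketched roughly as follows. This is an illustration, not the authors' pipeline: the database directory names, file naming scheme, thread count, and the use of Kraken2's `--unclassified-out` option to carry reads from one stage to the next are all assumptions. The function only builds the command lines and does not run them.

```python
from pathlib import Path

# Order taken from the description: human first (contamination screen),
# then bacteria, fungi + plants, and finally MetaInvert.
# The directory names themselves are hypothetical.
DB_ORDER = ["human", "bacteria", "fungi_plants", "metainvert"]


def build_cascade(sample, r1, r2, db_root, out_dir, threads=8):
    """Return one kraken2 command (as an argument list) per database stage."""
    db_root, out_dir = Path(db_root), Path(out_dir)
    commands = []
    for db in DB_ORDER:
        prefix = out_dir / f"{sample}.{db}"
        # With --paired, kraken2 replaces '#' with _1 / _2 in output names.
        unclassified = f"{prefix}.unclassified#.fq"
        commands.append([
            "kraken2",
            "--db", str(db_root / db),
            "--threads", str(threads),
            "--paired",
            "--unclassified-out", unclassified,
            "--output", f"{prefix}.kraken",
            "--report", f"{prefix}.report",
            str(r1), str(r2),
        ])
        # Reads left unassigned at this stage become input to the next one.
        r1 = unclassified.replace("#", "_1")
        r2 = unclassified.replace("#", "_2")
    return commands


if __name__ == "__main__":
    # Hypothetical sample and paths, printed for inspection only.
    for cmd in build_cascade("S1", "S1_R1.fq", "S1_R2.fq", "dbs", "out"):
        print(" ".join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` in order. The per-stage `.report` files would then be combined to obtain the merged result mentioned in the description; how the authors performed that merge is not stated here.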
Provider:
figshare
Created:
2025-09-07



