
Amplicon sequence variants from the Insect Biome Atlas project

Source: figshare.scilifelab.se. Updated 2024-11-25; indexed 2025-01-21.
Download link:
https://figshare.scilifelab.se/articles/dataset/Amplicon_sequence_variants_from_the_Insect_Biome_Atlas_project/25480681/6
Description:
General information

The Insect Biome Atlas project was supported by the Knut and Alice Wallenberg Foundation (dnr 2017.0088). The project analyzed the insect faunas of Sweden and Madagascar, and their associated microbiomes, mainly using DNA metabarcoding of Malaise trap samples collected in 2019 (Sweden) or 2019–2020 (Madagascar).

Please cite this version of the dataset as: Miraldo A, Iwaszkiewicz-Eggebrecht E, Sundh J, Lokeshwaran M, Granqvist E, Andersson AF, Lukasik P, Roslin T, Tack A, Ronquist F. 2024. Dataset of amplicon sequence variants (ASVs) from the Insect Biome Atlas Project, version 5. https://doi.org/10.17044/scilifelab.25480681

Dataset description

This dataset contains amplicon sequence variants (ASVs) generated from high-throughput sequencing of the cytochrome c oxidase subunit I (COI) gene from Malaise trap samples (lysates, homogenates and preservative ethanol) and soil and litter samples. It includes ASV sequences and abundance information (number of reads) as well as metadata files that are needed to interpret and analyse the data further. Future versions of the dataset will include additional data. NB! All ASV files include ASVs that represent biological and synthetic spike-ins.

Methods

Samples were sequenced using Illumina technology. Raw data are available at the European Nucleotide Archive (ENA) under project PRJEB61109. The raw sequence data was preprocessed using a Snakemake workflow. Preprocessed reads were then used as input to the AmpliSeq Nextflow (v.2.1.0) pipeline to generate ASVs.

Available data

Two types of files are provided: ASV files and metadata files. Files marked with 'SE' and 'MG' contain data from Sweden and Madagascar, respectively.

The file shasum.txt contains checksums for each of the files.

ASV files

ASV sequences in fasta format are found in files CO1_asv_seqs_SE.fasta.gz and CO1_asv_seqs_MG.fasta.gz. Counts of ASVs in each sample are in CO1_asv_counts_SE.tsv.gz and CO1_asv_counts.MG.tsv.gz.
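As a minimal sketch of how the downloaded files might be checked and read (this is not part of the dataset's own tooling), the Python below verifies files against shasum.txt and iterates over a gzipped FASTA file. It assumes that shasum.txt uses the usual "<hex digest>  <filename>" layout written by tools such as shasum or sha256sum, and it guesses the hash algorithm from the digest length. Both assumptions should be confirmed against the actual file. The helper names file_digest, verify_checksums and iter_fasta_gz are hypothetical.

```python
import gzip
import hashlib
from pathlib import Path

# Assumption: the digest length identifies the algorithm, since the dataset
# description does not say which one shasum.txt uses.
ALGO_BY_HEX_LENGTH = {32: "md5", 40: "sha1", 64: "sha256", 128: "sha512"}


def file_digest(path, algo):
    """Hash a file in 1 MiB chunks so large .gz files are never fully loaded."""
    h = hashlib.new(algo)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_checksums(checksum_file, data_dir="."):
    """Return {filename: True/False} for each '<digest>  <filename>' line."""
    results = {}
    for line in Path(checksum_file).read_text().splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        expected, name = parts
        name = name.strip().lstrip("*")  # '*' marks binary mode in shasum output
        algo = ALGO_BY_HEX_LENGTH.get(len(expected))
        target = Path(data_dir) / name
        if algo is None or not target.is_file():
            results[name] = False  # unknown digest type or file not downloaded
            continue
        results[name] = file_digest(target, algo) == expected.lower()
    return results


def iter_fasta_gz(path):
    """Yield (asv_id, sequence) pairs from a gzipped FASTA file."""
    header, chunks = None, []
    with gzip.open(path, "rt") as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                # Keep only the first whitespace-delimited token as the ID.
                tokens = line[1:].split()
                header, chunks = (tokens[0] if tokens else ""), []
            elif line and header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)
```

Run in the download directory, verify_checksums("shasum.txt") reports which files match their listed checksum, and sum(1 for _ in iter_fasta_gz("CO1_asv_seqs_SE.fasta.gz")) counts the sequences in the Swedish FASTA file without holding them all in memory.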
The Swedish dataset contains 821,559 ASVs in 6,169 samples. The Madagascar dataset contains 701,769 ASVs in 2,286 samples.

Metadata files

Four types of metadata files are included:

- sequencing_metadata files with information about samples that were processed in the lab and sequenced.
- samples_metadata files with information about samples that were collected in the field.
- sites_metadata files with information about sites where samples were collected.
- spike-ins metadata files with information about spike-ins added to each Malaise trap sample at the time of sample processing in the lab.

Sequencing metadata files

The two sequencing metadata files CO1_sequencing_metadata_SE.tsv and CO1_sequencing_metadata_MG.tsv contain information about samples that were sequenced. For details on the columns of these files, see the README.txt file.

Samples metadata files

Four samples_metadata files are included in this dataset with information about each sample that was collected in the field. For samples collected with Malaise traps we have two files, one for each country: samples_metadata_malaise_SE.tsv and samples_metadata_malaise_MG.tsv. See the README.txt file for details about the columns of these files.

For arthropod samples collected from litter and soil we have two files, one for each country: samples_metadata_soil_litter_SE.tsv and samples_metadata_litter_MG.tsv. Note that for Madagascar we did not collect arthropod samples from soil. Also note that for Madagascar we collected four leaf litter samples at each trap location, one in each direction from the Malaise trap (front, back, left and right), whereas for Sweden we collected only one sample at each trap location. For details on the columns of these files, see the README.txt file.

Sites metadata files

There are two files that contain information about sampling sites, one for each country: sites_metadata_SE.tsv and sites_metadata_MG.tsv.
See the README.txt file for more information.

Spike-ins metadata files

We provide three files with information about spike-ins used when processing samples in the lab: biological_spikes_taxonomy_SE.tsv and biological_spikes_taxonomy_MG.tsv contain taxonomic information on biological spike-ins, while synthetic_spikes_info.tsv has information on synthetic spike-ins. See README.txt for more information.

Other complementary data files

We present complementary data on soil chemistry collected at each sampling location in both Sweden and Madagascar, stand characteristics collected at each sampling location in Madagascar, and biomass/count data for a selected number of Malaise trap samples from the Insect Biome Atlas project (n=24) and the Swedish Insect Inventory Project (n=224).

Soil chemistry data

We provide two datasets, one for each country, on soil chemistry (soil_chemistry_SE.tsv and soil_chemistry_MG.tsv) that store information on soil nutrients from soil samples collected at the same sampling sites as the arthropod communities. Topsoil (0–20 cm) was sampled at five locations around each Malaise trap in both Sweden and Madagascar: one soil core (6 cm diameter) at the center of the trap and one soil core on each of the four “sides” of the trap, five meters away from it. The soil sample for each site is a composite of the cores from these five locations. Soil samples collected in Sweden were analysed at Eurofins in Sweden and the ones collected in Madagascar were analysed at the Laboratoire des Radioisotopes in Madagascar. As samples from each country were analysed at different laboratories, the soil nutrient variables presented in each dataset differ slightly.
See the README.txt file for more information on the columns of each of these files.

Stand characteristics data

Stand characteristics were only measured in Madagascar, because extensive data on landscape composition and vegetation structure at the sampling sites in Sweden had already been compiled as part of the National Inventory of Landscapes in Sweden (NILS) and are publicly available.

The file stand_characteristics_MG.tsv contains information on a set of stand characteristics from Madagascar related to tree density (DBH, shading, etc.). Information about the columns in this file is found in the README.txt file.

Biomass and count data

To allow an assessment of how the biomass of a Malaise trap sample translates to the number of specimens, we provide two files describing samples from Sweden for which we measured the biomass and also counted all the specimens in the sample. The first set comprises 24 samples from the IBA field campaign (biomass_count_IBA.tsv), and the second set comprises 224 samples from a separate Swedish Malaise trapping campaign (Swedish Insect Inventory Project) in 2018–2019 (biomass_count_SIIP.tsv). For the latter dataset, we provide the site and sample metadata in the same file. Details about the columns in these files are found in the README.txt file.

References

Egnér, H., Riehm, H., & Domingo, W. (1960). Untersuchungen über die chemische Bodenanalyse als Grundlage für die Beurteilung des Nährstoffzustandes der Böden. II. Chemische Extraktionsmethoden zur Phosphor- und Kaliumbestimmung. Kungliga Lantbrukshögskolans Annaler, 26, 199–215.
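One way to explore how biomass translates to specimen numbers in the biomass and count files described above is a log-log least-squares fit, sketched here in Python. This is an illustrative choice and not the authors' analysis. The function name fit_loglog is hypothetical, and the column names are supplied by the caller because the real ones are only documented in README.txt.

```python
import csv
import math


def fit_loglog(tsv_path, biomass_col, count_col):
    """Fit log(count) = intercept + slope * log(biomass) by ordinary least squares.

    biomass_col and count_col are caller-supplied column names (placeholders
    here; see README.txt for the real ones). A wrong column name raises
    KeyError. Rows with missing, non-numeric or non-positive values are
    skipped because their logarithm is undefined.
    """
    xs, ys = [], []
    with open(tsv_path, newline="") as fh:
        for row in csv.DictReader(fh, delimiter="\t"):
            try:
                biomass = float(row[biomass_col])
                count = float(row[count_col])
            except (TypeError, ValueError):
                continue  # empty cell, short row, or a label such as "NA"
            if 0 < biomass < math.inf and 0 < count < math.inf:
                xs.append(math.log(biomass))
                ys.append(math.log(count))
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two usable rows")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("biomass values are all identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, n
```

Under these assumptions, a call such as fit_loglog("biomass_count_IBA.tsv", biomass_col, count_col), with the two column names taken from README.txt, returns the slope, the intercept and the number of rows used. A slope near 1 would suggest that specimen counts scale roughly in proportion to biomass.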

Provider:
SciLifeLab