Populations genomics of deep-sea hydrothermal vent copepod Stygiopontius lauensis: from raw fasta files to filtered vcf file
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.zkh1893g2
Description:
Copepoda is the most abundant taxon at deep-sea hydrothermal vents where hard substrate is available. Despite the increasing interest in the exploitation of seafloor massive sulphides, no population genomic studies have been conducted on vent meiofauna, which are known to contribute over 50% of metazoan biodiversity at vents. To bridge this knowledge gap, restriction site-associated DNA sequencing, specifically 2b-RADseq, was used to retrieve thousands of genome-wide single nucleotide polymorphisms (SNPs) from abundant populations of the vent-obligate copepod Stygiopontius lauensis from the Lau Basin. SNPs were used to investigate population structure, demographic histories, and genotype-environment associations at a basin scale. Genetic analyses also helped to evaluate the suitability of tailored larval dispersal models and the parameterization of life history traits that better fit the population patterns observed in the genomic dataset for the target organism. Highly structured populations were observed on both spatial and temporal scales, with divergence of populations between the north, mid, and south of the basin estimated to have occurred after the creation of the major transform fault dividing the Australian and the Niuafo’ou tectonic plates (350 kya), with relatively recent secondary contact events (< 20 kya). Larval dispersal models were able to predict the high levels of structure and the highly asymmetric, low-level northward gene flow observed in the genomic data. These results differ from most studies conducted on megafauna in the region, highlighting the need to incorporate smaller size classes when considering site-prospecting for deep-sea exploitation of seafloor massive sulphides, and the creation of area-based management tools to protect areas at risk of local extinction, should mining occur.
Methods
This dataset was generated by applying a version of the original 2b-RAD protocol (Wang et al., 2012), modified for the target species, to 149 individuals of the copepod Stygiopontius lauensis. These were collected from hydrothermal vents in the Lau Basin (Southwest Pacific Ocean). The pooled library was composed of samples collected in 2016 and in 2019. Samples from 2016 were sequenced on an Illumina NextSeq 500, while those from 2019 were sequenced on an Illumina NextSeq 2000 at University Medical Centre, Utrecht (UMC). Samples were demultiplexed by UMC. Raw fastq files were further demultiplexed by specimen using a modified script that implements Cutadapt to find specimen-specific barcodes. In the same script, these files were then filtered to remove low-quality reads and duplicates, resulting in 149 fasta files used for downstream analysis.
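The per-specimen demultiplexing and duplicate removal described above can be illustrated with a minimal Python sketch. This is not the authors' Cutadapt-based script: the function names, the barcode sequences, the specimen labels, and the assumption that each barcode sits at the start of the read are all hypothetical, and the low-quality read filter is left out.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Group reads by specimen using a leading barcode (assumed position).

    `barcodes` maps barcode sequence -> specimen label (hypothetical names).
    The barcode is trimmed off; reads matching no barcode are discarded.
    """
    by_specimen = defaultdict(list)
    for seq in reads:
        for barcode, specimen in barcodes.items():
            if seq.startswith(barcode):
                by_specimen[specimen].append(seq[len(barcode):])
                break
    return dict(by_specimen)


def drop_duplicates(seqs):
    """Remove exact duplicate sequences, keeping first-occurrence order."""
    return list(dict.fromkeys(seqs))


# Toy data, invented for illustration only.
barcodes = {"ACGT": "spec_01", "TGCA": "spec_02"}
reads = ["ACGTGGGCCC", "ACGTGGGCCC", "TGCAAAATTT", "ACGTTTTAAA", "GGGGCCCCAA"]

per_specimen = {
    specimen: drop_duplicates(seqs)
    for specimen, seqs in demultiplex(reads, barcodes).items()
}
```

In this toy run the last read matches no barcode and is dropped, and the repeated read for `spec_01` is collapsed to a single copy.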
Fasta files were run through DiscoSnpRad++ using a k-mer length of 15 and an auto read depth. The resulting vcf file was then run through the DiscoSnpRad postprocessing scripts to generate a vcf file containing 1 SNP per locus. This vcf file was filtered for missing data (cutoff = 0.3) and heterozygosity (cutoff = 0.6). Individuals with missing data > 0.5 were also removed.
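As a rough illustration of the filtering thresholds above, the following Python sketch applies them to a small genotype matrix. Several points are assumptions rather than details from the dataset description: genotypes are coded 0/1/2 with `None` for missing, cutoffs are treated as maxima (values strictly above the cutoff are removed), heterozygosity is taken as the fraction of heterozygous calls among non-missing calls at a site, and individuals are filtered after sites. All function and variable names are hypothetical.

```python
def missing_fraction(calls):
    """Fraction of calls that are missing (None); 0.0 for an empty list."""
    if not calls:
        return 0.0
    return sum(c is None for c in calls) / len(calls)


def het_fraction(calls):
    """Fraction of non-missing calls that are heterozygous (coded 1)."""
    called = [c for c in calls if c is not None]
    if not called:
        return 0.0
    return sum(c == 1 for c in called) / len(called)


def filter_genotypes(matrix, samples, site_missing_max=0.3,
                     site_het_max=0.6, ind_missing_max=0.5):
    """Filter sites (rows) first, then individuals (columns)."""
    # Keep sites at or below both the missing-data and heterozygosity cutoffs.
    kept_sites = [
        row for row in matrix
        if missing_fraction(row) <= site_missing_max
        and het_fraction(row) <= site_het_max
    ]
    # Keep individuals whose missing fraction over the kept sites is acceptable.
    keep_idx = [
        j for j in range(len(samples))
        if missing_fraction([row[j] for row in kept_sites]) <= ind_missing_max
    ]
    filtered = [[row[j] for j in keep_idx] for row in kept_sites]
    return filtered, [samples[j] for j in keep_idx]


# Toy data, invented for illustration: rows are SNPs, columns are individuals.
samples = ["A", "B", "C", "D"]
matrix = [
    [0, 1, 2, None],           # kept
    [1, 1, 1, 0],              # removed: heterozygosity 0.75
    [0, None, None, None],     # removed: missing data 0.75
    [2, 0, 1, None],           # kept
    [0, 0, 2, 1],              # kept
]

filtered, kept_samples = filter_genotypes(matrix, samples)
```

Here individual `D` is missing at two of the three retained sites, so it is removed at the individual-level step. Filtering sites before individuals is a design choice of this sketch; reversing the order can change which records survive.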
Created:
2024-03-08



