Populations genomics of deep-sea hydrothermal vent copepod Stygiopontius lauensis: from raw fasta files to filtered vcf file
Source: DataONE · Updated: 2024-03-08 · Indexed: 2024-06-08
Download link:
https://search.dataone.org/view/sha256:b01b28f25e905f1c79b94d0da28fcbaf22ff4c18763b22210295c196250c11d6
Description:
Copepoda is the most abundant taxon in deep-sea hydrothermal vents, where hard substrate is available. Despite the increasing interest in seafloor massive sulphides exploitation, there have been no population genomic studies conducted on vent meiofauna, which are known to contribute over 50% to metazoan biodiversity at vents. To bridge this knowledge gap, restriction site-associated DNA sequencing, specifically 2b-RADseq, was used to retrieve thousands of genome-wide single nucleotide polymorphisms (SNPs) from abundant populations of the vent-obligate copepod Stygiopontius lauensis from the Lau Basin. SNPs were used to investigate population structure, demographic histories, and genotype-environment associations at a basin scale. Genetic analyses also helped to evaluate the suitability of tailored larval dispersal models and the parameterization of life history traits that better fit the population patterns observed in the genomic dataset for the target organism. Highly structured populatio...

This dataset was collected by applying a version of the original 2b-RAD protocols (Wang et al., 2012), modified for the target species, to 149 individuals of the copepod Stygiopontius lauensis. The specimens were collected from hydrothermal vents in the Lau Basin (Southwest Pacific Ocean). The pooled library was composed of samples collected in 2016 and in 2019. Samples from 2016 were sequenced on an Illumina NextSeq 500, while those from 2019 were sequenced on an Illumina NextSeq 2000 at University Medical Centre, Utrecht (UMC). Samples were demultiplexed by UMC. Raw FASTQ files were then further demultiplexed by specimen using a modified script that implements Cutadapt to find specimen-specific barcodes. The same script filtered out low-quality reads and duplicates, resulting in 149 FASTA files that were used for downstream analysis.
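The per-specimen demultiplexing step can be illustrated with a minimal Python sketch. It is not the authors' Cutadapt-based script: it is a pure-Python stand-in that assumes each barcode sits at the 5' end of the read and must match exactly. The barcode sequences, specimen names, and the `demultiplex` helper are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Read = Tuple[str, str]  # (read_id, sequence)


def demultiplex(reads: Iterable[Read], barcodes: Dict[str, str]) -> Dict[str, List[Read]]:
    """Assign each read to a specimen by exact 5' barcode match (an assumption).

    The matched barcode is trimmed from the read. Reads that match no
    barcode are collected under the key "unassigned".
    """
    bins: Dict[str, List[Read]] = defaultdict(list)
    for read_id, seq in reads:
        for barcode, specimen in barcodes.items():
            if seq.startswith(barcode):
                bins[specimen].append((read_id, seq[len(barcode):]))
                break
        else:
            # No barcode matched this read.
            bins["unassigned"].append((read_id, seq))
    return dict(bins)


# Hypothetical barcodes and reads, for illustration only.
barcodes = {"ACGT": "specimen_A", "TGCA": "specimen_B"}
reads = [
    ("r1", "ACGTTTGACC"),
    ("r2", "TGCAGGATCA"),
    ("r3", "GGGGTTTTAA"),
]
bins = demultiplex(reads, barcodes)
```

Here `r1` and `r2` end up in their specimen bins with the barcode removed, and `r3` is left unassigned. A real pipeline would normally tolerate sequencing errors in the barcode, which is one reason to use a dedicated tool such as Cutadapt.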
FASTA files were run through DiscoSnpRad++ using a k-mer length of 15 and an automatic read-depth setting. The resulting VCF file was then run through the ...

# Populations genomics of deep-sea hydrothermal vent Copepod Stygiopontius lauensis: from raw fasta files to filtered vcf file
[https://doi.org/10.5061/dryad.zkh1893g2](https://doi.org/10.5061/dryad.zkh1893g2)
## Description of the data and file structure
This dataset contains FASTA files that are demultiplexed by specimen and labeled as follows: specimen_ventsite_filtered/notfiltered. Filtered files have been filtered by GC content (RADs with GC higher than 50% were removed). These files are the raw files used for all analyses in the study. Additionally, a popmap.txt file provides the mapping information needed to group the FASTA files and subsequent genotypes by vent site and collection year. For population genomics, either a global alignment method such as STACKS2 or a k-mer-based approach for calling single nucleotide polymorphisms (SNPs) such as DiscoSnpRAD++ can be applied to the raw FASTA files following the instructions for each ...
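The GC-content filter described above can be sketched in Python as follows. This is an illustration of the stated rule (remove RADs with GC higher than 50%), not the authors' script. It assumes that GC content is computed over the full read length and that reads at exactly 50% are kept. The helper names and example reads are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header, sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases over the full sequence length (an assumption)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def filter_by_gc(records: Iterable[Record], max_gc: float = 0.5) -> List[Record]:
    """Keep records whose GC fraction is not higher than max_gc."""
    return [(h, s) for h, s in records if gc_fraction(s) <= max_gc]


# Hypothetical reads, for illustration only.
example = """>read1
ATATATGCAT
>read2
GGGCCCGCAT
>read3
ATGCATGC
""".splitlines()

kept = filter_by_gc(read_fasta(example))
```

In this example `read2` (80% GC) is removed, while `read1` (20%) and `read3` (exactly 50%) are kept. To process one of the dataset's files, the lines of an opened file handle could be passed to `read_fasta` in place of `example`.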
Created: 2025-07-28



