Data and analysis scripts to accompany: Coral genetic structure in the Western Indian Ocean mirrors ocean circulation and thermal stress history

Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.931zcrjxv
Resource description:
With ever-growing concern over the condition of coral reefs under warming oceans comes a need to better understand the connectivity and adaptive capacity of reef-building corals across large oceanic regions. This dataset accompanies the article by Guillaume et al. (2026) in Evolutionary Applications, in which we applied a seascape genomics approach to model (i) population connectivity and (ii) thermal adaptive potential for two keystone coral species across the Western Indian Ocean (WIO). Specifically, we sampled 345 Acropora muricata and 403 Pocillopora damicornis individuals around islands of three WIO regions (Seychelles, Mauritius, and Rodrigues) as part of a United Nations Development Programme (UNDP) project (funding information below). Genomic data were obtained from DNA extracted from sampled coral tissue and sequenced with DArT-seq. We used a bioinformatic pipeline to process and filter reads, which were then used to assess population structure and connectivity, identify putative genomic regions under thermal selection, and produce maps of adaptive potential across the WIO. This repository hosts one PDF per species that steps the user through the main analyses of the publication. These PDFs were produced with the R Markdown script provided, which will run once all the accompanying data (sample information, environmental variables, genomic data) are downloaded into one folder.

Methods

Sample design and study species: Two keystone reef-building coral species, Acropora muricata and Pocillopora damicornis, were sampled in 2022 from 15 reefs with contrasting environmental conditions around the Seychelles, Mauritius, and Rodrigues islands in the WIO. These species have distinct life history strategies, warranting independent investigations of population structure and thermal tolerance. A. muricata is predominantly a broadcast-spawning coral with synchronised reproduction from November to January, followed by larvae that generally settle within 10–14 days of fertilisation. P. damicornis has mixed reproductive modes, reproducing sexually via year-round broadcast spawning or by brooding planula larvae with rapid settlement rates. Clonal propagation is also common, with the production of asexual larvae or polyps that can disperse over large distances (>50 km). Overall, 345 A. muricata and 403 P. damicornis colonies were sampled for DNA extraction and single nucleotide polymorphism (SNP) genotyping. All analyses were performed in the R environment, following the script attached.

Environmental variables: We used 10 uncorrelated environmental and geomorphic variables to characterise the seascape at the study sites (details of variables in Environmental_variable_description.docx). We further used sSST (standard deviation of sea surface temperature), mDHW (mean degree heating week) and BackReef (proportion of backreef around the site) as predictors in genotype–environment association (GEA) analyses.

Genomic data: DNA extraction and SNP genotyping followed the methods of Selmoni et al. (2021), where extracted DNA was sent to Diversity Arrays Technology (Canberra, Australia) for quality-check screening and genotyping-by-sequencing using the DArT-sequencing method (DArT-seq). Here, we provide the metadata for the FASTQ files uploaded to GEOME and NCBI (BioProject PRJNA1277000; Metadata_DARTinfo_species.csv). The DArT-seq analytical pipeline resulted in 73,253 bi-allelic SNPs genotyped for 345 A. muricata individuals and 65,708 SNPs for 403 P. damicornis individuals (SNPS_RAW_species.rda). These SNPs were called against the chromosome-level reference genome of Acropora millepora (v2.1; GCF_013753865.1; Fuller et al., 2020) for A. muricata, and the scaffold-level reference genome assembly of P. damicornis (v1; GCF_003704095.1; Cunning et al., 2018) for P. damicornis. After filtering (detailed below), our final genotype matrix comprised 211 A. muricata individuals with 7,663 SNPs and 97 P. damicornis individuals with 1,319 SNPs (SNPS_FILT_species.rda).

SNP filtering: SNP filtering was performed for each species separately. We first removed putatively cryptic individuals, which are likely present in our sampled species, where ‘cryptic’ refers to genetically distinct groups among sets of colonies identified in situ as the same species (Grupstra et al., 2024; Riginos et al., 2024). To this end, we performed a soft filtering for loci and individual missingness (50% threshold each), then ran a principal coordinate analysis (PCoA) on Euclidean distances of the genotype matrix (an ordination-based method that can handle missing data). We used PCoA axes to visually identify clusters of genetically distinct individuals co-occurring with most of the sampled individuals. Individuals from these clusters are putatively cryptic and were removed, as recommended by standard guidelines. We then identified and removed clonal genotypes on a hard-filtered dataset (loci and individuals each pruned for 80% missingness). Clones were identified as those sharing over 95% or 90% of their genotypes for A. muricata and P. damicornis, respectively, using the gl.report.replicates function in the dartR.base package (v. 1.0.5; Mijangos et al., 2022). The choice of threshold corresponds to the separation between first-degree relatives and replicated individuals in the histogram of pairwise relatedness values per species. For each group of putative clones, we retained the individual with the least missing SNP data. We identified and removed another cluster of putatively cryptic individuals for P. damicornis as per the first cryptic filtering step. Finally, we performed a hard filtering on the genotype matrix pruned for cryptic and clonal individuals, applying a missingness threshold of 80% for loci and individuals, before excluding rare alleles with a minor allele frequency (MAF) < 5%.
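The final hard-filtering step can be illustrated with a short sketch. It is written in Python for illustration only; the published analyses use R and the attached script remains the reference. The function name `filter_genotypes`, the genotype coding (0/1/2 copies of the alternate allele, `None` for missing), the loci-then-individuals order, and the reading of the "80% missingness threshold" as a minimum call rate of 0.8 are all assumptions, not details taken from the dataset.

```python
from typing import List, Optional, Tuple

Genotype = Optional[int]  # assumed coding: 0, 1 or 2 alternate alleles; None = missing


def call_rate(values: List[Genotype]) -> float:
    """Fraction of non-missing calls (0.0 for an empty list)."""
    if not values:
        return 0.0
    return sum(v is not None for v in values) / len(values)


def filter_genotypes(
    matrix: List[List[Genotype]],
    min_call_rate: float = 0.8,  # assumed interpretation of the 80% threshold
    min_maf: float = 0.05,       # rare alleles with MAF < 5% are excluded
) -> Tuple[List[int], List[int]]:
    """Return indices of retained individuals (rows) and loci (columns)."""
    n_ind, n_loc = len(matrix), len(matrix[0])

    # 1. Drop loci genotyped in too few individuals.
    loci = [
        j for j in range(n_loc)
        if call_rate([matrix[i][j] for i in range(n_ind)]) >= min_call_rate
    ]

    # 2. Drop individuals with too few calls across the retained loci.
    inds = [
        i for i in range(n_ind)
        if call_rate([matrix[i][j] for j in loci]) >= min_call_rate
    ]

    # 3. Drop loci whose global minor allele frequency is below the threshold.
    kept_loci = []
    for j in loci:
        calls = [matrix[i][j] for i in inds if matrix[i][j] is not None]
        if not calls:
            continue
        p = sum(calls) / (2 * len(calls))  # alternate allele frequency (diploid)
        if min(p, 1 - p) >= min_maf:
            kept_loci.append(j)
    return inds, kept_loci


# Toy matrix: 5 individuals x 4 loci (hypothetical values).
toy = [
    [0, 1, None, 0],
    [1, 1, None, 0],
    [2, 0, None, 0],
    [0, 2, 1, 0],
    [None, None, None, None],
]
kept_inds, kept_loci = filter_genotypes(toy)
```

In the toy matrix, the third locus and the last individual fail the call-rate filter, and the fourth locus is monomorphic and fails the MAF filter, so four individuals and two loci are retained.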
Global MAF thresholds were applied to retain population-specific SNP variants for downstream population structure and genotype–environment analyses.

Neutral genetic structure: Neutral genetic structure was assessed for each species using two complementary methods: (i) principal coordinate analysis (PCoA) and (ii) sparse non-negative matrix factorization (sNMF). Both used a putatively neutral genomic dataset (hereafter termed ‘neutral genotype matrix’), obtained by removing outlier loci identified in a genome scan of the filtered genotype matrix with pcadapt.

Isolation-by-Distance vs Isolation-by-Resistance: Patterns of Isolation-by-Distance (IBD) and Isolation-by-Resistance (IBR) were identified between reefs by regressing linearised genetic distance (i.e., FST/(1 − FST); Rousset, 1997) against Euclidean or sea distances, respectively. IBD was calculated from the log10 of Euclidean distances obtained by converting degrees of latitude and longitude to Cartesian coordinates. IBR was calculated from the log10 of sea distances derived from ocean current data for each calendar month and annually.

Genotype–Environment Associations (GEA): We performed GEAs using multivariate redundancy analyses (RDA) at the site level, following the methods of Capblancq and Forester (2021). For each sampling site, we converted individual genotypes of the filtered dataset to allele frequencies per locus by averaging observed genotypes across individuals, ignoring missing data. We imputed allele frequencies for loci missing information at the site level using the median allele frequency from the other sites in the same region (i.e., Seychelles, Mauritius or Rodrigues). We chose explanatory environmental variables for the RDA using a bidirectional stepwise model selection procedure (ordistep), which retained mDHW (mean degree heating week) and sSST (standard deviation of sea surface temperature) for both species, as well as BackReef (proportion of backreef around the site) for P. damicornis.
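The site-level allele-frequency step described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' R code: the function names, the 0/1/2 genotype coding with `None` for missing calls, the division by two to turn a mean diploid genotype into a frequency, and the toy site names are all hypothetical.

```python
from statistics import median
from typing import Dict, List, Optional

Genotype = Optional[int]  # assumed coding: 0, 1 or 2 alternate alleles; None = missing


def site_allele_frequencies(genotypes: List[List[Genotype]]) -> List[Optional[float]]:
    """Per-locus allele frequency for one site, ignoring missing calls.

    Returns None for a locus with no observed genotype at this site.
    """
    n_loci = len(genotypes[0])
    freqs: List[Optional[float]] = []
    for j in range(n_loci):
        calls = [ind[j] for ind in genotypes if ind[j] is not None]
        freqs.append(sum(calls) / (2 * len(calls)) if calls else None)
    return freqs


def impute_by_region(
    site_freqs: Dict[str, List[Optional[float]]],
    site_region: Dict[str, str],
) -> Dict[str, List[Optional[float]]]:
    """Fill missing site-level frequencies with the median of the other
    sites in the same region; leave None if no such site has data."""
    imputed = {site: list(freqs) for site, freqs in site_freqs.items()}
    for site, freqs in site_freqs.items():
        for j, f in enumerate(freqs):
            if f is not None:
                continue
            donors = [
                site_freqs[other][j]
                for other in site_freqs
                if other != site
                and site_region[other] == site_region[site]
                and site_freqs[other][j] is not None
            ]
            if donors:
                imputed[site][j] = median(donors)
    return imputed


# Toy data: three Seychelles sites and one Mauritius site, two loci.
site_genotypes = {
    "S1": [[0, None], [2, None]],  # locus 2 entirely missing at S1
    "S2": [[1, 2], [1, 2]],
    "S3": [[0, 0], [0, 1]],
    "M1": [[2, 2]],
}
site_region = {"S1": "Seychelles", "S2": "Seychelles", "S3": "Seychelles", "M1": "Mauritius"}

raw = {site: site_allele_frequencies(g) for site, g in site_genotypes.items()}
imputed = impute_by_region(raw, site_region)
```

Here the missing second locus at the hypothetical site S1 receives the median of the S2 and S3 frequencies; the Mauritius site does not contribute because it lies in a different region.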
To detect putative genomic regions under selection, we performed a multivariate GEA using a partial RDA (pRDA) on the entire genotype matrix, with the ordistep-selected variables as predictors. Reef-level population structure was used to condition the RDA, limiting false positives while potentially reducing power to detect true outlier loci along neutral gradients.

Gene Ontology (GO) enrichment analysis: GO enrichment analyses were used to assess the putative molecular function(s) of significant outlier SNPs (i.e., q-value < 0.05) from the pRDA.

Estimating the Adaptive Seascape: We estimated the adaptive seascape across the WIO to identify reefs potentially harbouring adaptive genetic variation linked to thermal gradients. For each species, we performed an RDA to quantify genotype–environment interactions, using site-level allele frequencies as the response matrix and the ordistep-selected environmental variables as predictors. We then projected environmental conditions from all mapped WIO reefs into the RDA ordination space to derive a genetic-based index of adaptation (Adaptive Index) for each environmental pixel of the seascape. This Adaptive Index reflects how genetic variation is structured relative to thermal gradients, allowing us to map the spatial distribution of reefs where allelic compositions are most consistent with thermal adaptive potential across the WIO seascape.

Funding statement and acknowledgements

This study was funded by the Adaptation Fund (AF), which financed the “Restoring Marine Ecosystem Services by Rehabilitating Coral Reefs to Meet a Changing Climate Future” project (PIMS No. 5736), implemented by the United Nations Development Programme (UNDP) Mauritius and Seychelles, with the support of the Ministry of Agro-Industry, Food Security, Blue Economy, and Fisheries, and in the Republic of Seychelles with the support of the Ministry of Agriculture, Climate Change and Environment. Corals were sampled under permits SPA 30 and A0157.
We gratefully acknowledge the contributions of the following institutions to field sampling efforts: Albion Fisheries Research Center, Eco-Sud, Reef Conservation, Marine Conservation Society Seychelles, Mauritius Oceanographic Institute, Nature Seychelles, Seychelles National Park Authority, Rodrigues Regional Assembly and Shoals Rodrigues. We thank the UNDP Mauritius and Seychelles for their support in coordinating activities throughout the project. We also thank Katherine Prata and Zoe Meziere for discussions regarding genomic analyses. ASG acknowledges support from the Swiss National Science Foundation (SNSF; Postdoc Mobility Fellowship P500PB_230450).

The data were used in the following scientific publication: Guillaume, A.S., Joost, S., Curpen, S., Dumur Neelayy, D., Harree-Somah, L., Sadasing, O., Saponari, L., Dale, C., Barret, L., Andrews, N., Leckraz, S.K., François, R., Seetapah, V., Munusami, V., Bacha Gian, S., Jhangeer-Khan, R., Mahoune, T., Chumun, P.K., Poretti, M., Berteaux-Lecellier, V., Lecellier, G., Selmoni, O. (2026). Coral genetic structure in the Western Indian Ocean mirrors ocean circulation and thermal stress history. Evolutionary Applications. https://doi.org/10.1111/eva.70206

A.S.G. and O.S. wrote the scripts for these analyses, with final versioning and archiving by A.S.G.
Created:
2026-02-19