Landscape genomics reveals genetic signals of environmental adaptation of African wild eggplant
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.zkh1893g3
Description:
Crop wild relatives possess desirable traits that confer resilience to various environmental stresses. We applied landscape genomics, which associates environmental variables with genomic variation, to understand the genetic basis of their adaptation.
In this study, we applied landscape genomics to examine the differences in allele frequency of 15,416 Single Nucleotide Polymorphisms (SNPs) among 153 accessions of wild eggplant relatives from Africa, the principal hotspot of these wild relatives. Further, we explored the correlation between the genetic variations and the bio-climatic and soil conditions at their collection sites.
Our results showed that, after controlling for population structure, the environment has a greater impact on genetic variation in the eggplant wild relative populations than the geographical distance between collection sites. These findings indicate the relevance of the environment in shaping genetic variation in eggplant relatives over time. We also detected candidate SNPs associated with ten environmental factors. Some of these SNPs point to genes involved in pathways that help with adaptation to environmental stresses such as drought, heat, cold, salinity, pests, and diseases.
Methods
SNP dataset
We isolated genomic DNA from fresh leaves of five seedlings per accession using the FavorPrep Plant Genomic DNA Extraction Mini Kit (FAVORGEN), following the manufacturer's instructions. We then constructed the sequencing library following the approach of Elshire et al. (2011). Genomic DNA was quantified by Qubit and normalized to 100 ng in 96-well plates. We digested the DNA samples with the restriction enzyme ApeKI and ligated them with two sequencing adapters, then amplified the target DNA fragments by polymerase chain reaction to complete the library preparation. A service provider sequenced the libraries on the Illumina HiSeq X platform in a paired-end 150 bp run.
For SNP calling, we mainly followed the manual of the Stacks software (Catchen et al., 2013). In short, we filtered the raw reads by quality and demultiplexed them using the process_radtags program. We then mapped the retained reads to the eggplant reference genome (Eggplant_V4.1.fa) (Barchi et al., 2021) using the Burrows-Wheeler Aligner (BWA) version 0.7.17 (Li & Durbin, 2009). We sorted and indexed the alignments using Samtools version 1.15.1 (Li et al., 2009), after which we performed variant calling using the gstacks and populations programs in Stacks. We further filtered the SNPs and the accessions, retaining those with less than 20% missing data and SNPs with a Minor Allele Frequency (MAF) > 0.05, giving the final high-quality SNP dataset comprising 15,146 SNPs.
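The final filtering step (missing data below 20%, MAF above 0.05) can be illustrated with a minimal Python sketch. This is not the Stacks implementation used in the study. It assumes a hypothetical genotype matrix coded as alternate-allele counts (0, 1, 2) with `np.nan` for missing calls, and it assumes accessions are filtered before SNPs, an order the text does not specify. The function name `filter_genotypes` is illustrative only.

```python
import numpy as np

def filter_genotypes(geno, max_missing=0.20, min_maf=0.05):
    """Filter a genotype matrix by missingness and minor allele frequency.

    geno: 2-D array, rows = accessions, columns = SNPs, values are
    alternate-allele counts (0, 1, 2) with np.nan for missing calls.
    Returns the filtered matrix plus boolean masks for the rows and
    columns of the input that were kept.
    """
    geno = np.asarray(geno, dtype=float)

    # Assumption: drop accessions first, then evaluate SNPs on the rest.
    keep_acc = np.isnan(geno).mean(axis=1) < max_missing
    sub = geno[keep_acc]

    # Per-SNP missing rate among the retained accessions.
    keep_snp = np.isnan(sub).mean(axis=0) < max_missing

    # Alternate-allele frequency from the non-missing calls (diploid).
    n_called = (~np.isnan(sub)).sum(axis=0)
    alt_sum = np.nansum(sub, axis=0)
    alt_freq = np.divide(alt_sum, 2.0 * n_called,
                         out=np.full(sub.shape[1], np.nan),
                         where=n_called > 0)
    maf = np.minimum(alt_freq, 1.0 - alt_freq)

    # NaN comparisons are False, so SNPs with no calls are dropped.
    keep_snp &= maf > min_maf
    return sub[:, keep_snp], keep_acc, keep_snp
```

Calling `filter_genotypes(geno)` on such a matrix returns the retained genotypes together with the two masks, which can be used to subset accession and SNP identifiers in the same way.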
Environmental variable dataset
We downloaded the grids for 19 bioclimatic variables, solar radiation, wind speed, and vapor pressure from WorldClim 2.1 (Fick & Hijmans, 2017) at a resolution of 2.5 arc-minutes. The 19 bioclimatic variables were each downloaded as averages for the period 1970 to 2000. We averaged the monthly solar radiation, wind speed, and vapor pressure rasters to obtain annual rasters for the same period. We downloaded the soil data from the SoilGrids database released in 2016 (https://soilgrids.org/) through ISRIC-WDC Soils (Hengl et al., 2017) at 250-meter resolution and at a depth of 15-30 cm, approximately the depth at which eggplant roots can grow. Soil variables included nitrogen, soil organic carbon, organic carbon density, organic carbon stock, cation exchange capacity, pH, and clay, sand, and silt content. The soil rasters were aggregated to match the resolution of the climate data using the resample and extent functions of the raster package in R (Hijmans, 2023), ensuring that the two datasets are consistent in both resolution and extent. We extracted the environmental variables for each accession with the extract function of the R raster package (Hijmans, 2023), using the GIS coordinates of the sampling points, to obtain a full dataset of all climate and soil variables. For our modeling, we selected the environmental variables based on Variance Inflation Factors (VIFs), retaining variables with a VIF less than 5.
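The VIF-based selection can be sketched as follows. The text gives only the threshold (VIF less than 5), so the stepwise procedure here, which repeatedly drops the variable with the highest VIF, is an assumption and not necessarily the authors' procedure. The study itself worked in R; this Python sketch uses NumPy only, and the function names `vif` and `select_by_vif` are hypothetical. The VIF of variable j is 1 / (1 - R²), where R² comes from regressing variable j on all the others.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (non-constant columns)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # Regress column j on the remaining columns plus an intercept.
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        # Perfect collinearity gives an infinite VIF.
        out[j] = np.inf if r2 >= 1.0 - 1e-12 else 1.0 / (1.0 - r2)
    return out

def select_by_vif(X, names, threshold=5.0):
    """Assumed stepwise rule: drop the highest-VIF variable until all are below threshold."""
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        v = vif(X)
        worst = int(np.argmax(v))
        if v[worst] < threshold:
            break
        X = np.delete(X, worst, axis=1)
        del names[worst]
    return names
```

Here `X` would be the accessions-by-variables table of extracted climate and soil values, and `names` the matching variable labels. The returned list holds the variables that remain once every VIF is below the threshold.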
Created:
2023-09-18



