Evolution of the correlated genomic variation landscape across a divergence continuum in the genus Castanopsis

Indexed from the NIAID Data Ecosystem on 2026-05-02
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.kkwh70scm
Resource description:
The heterogeneous landscape of genomic variation has been well documented in population genomic studies. However, disentangling the intricate interplay of evolutionary forces influencing the genetic variation landscape over time remains challenging. In this study, we assembled a chromosome-level genome for Castanopsis eyrei and sequenced the whole genomes of 267 individuals from 12 Castanopsis species, spanning a broad divergence continuum. We found highly correlated genomic variation landscapes across these species. Furthermore, variations in genetic diversity and differentiation along the genome were strongly associated with recombination rates and gene density. These results suggest that long-term linked selection and conserved genomic features have contributed to the formation of a common genomic variation landscape. By examining how correlations between population summary statistics change throughout the species divergence continuum, we determined that background selection alone does not fully explain the observed patterns of genomic variation; the effects of recurrent selective sweeps must be considered. We further revealed that extensive gene flow has significantly influenced patterns of genomic variation in Castanopsis species. The estimated admixture proportion correlated positively with recombination rate and negatively with gene density, supporting a scenario of selection against gene flow. Additionally, putative introgression regions exhibited strong signals of positive selection, an enrichment of functional genes, and reduced genetic burdens, indicating that adaptive introgression has played a role in shaping the genomes of hybridizing species. This study provides insights into how different evolutionary forces have interacted in driving the evolution of the genomic variation landscape.

Methods

Individuals (N = 267) were collected from 12 Castanopsis species: 21 C. carlesii; 25 C. fargesii; 25 C. eyrei; 24 C. lamontii; 28 C. fabri; 19 C. hystrix; 20 C. fordii; 26 C. tibetana; 10 C. chinensis; 23 C. sclerophylla; 24 C. jucunda; and 22 C. fissa (Supplementary Table S1). For each individual, genomic DNA was extracted from silica-dried leaves using a Plant DNA Kit (Bioteke, Beijing, China) and sequenced on the Illumina NovaSeq 6000 platform (150-bp paired-end reads) with a target coverage of 30×. Raw sequencing data were cleaned using Trimmomatic v.0.38 (Bolger et al. 2014) to remove low-quality sequences. Cleaned reads were then aligned to the C. eyrei reference genome using BWA v.0.7.15 (Li and Durbin 2010), and genotypes were called using HaplotypeCaller implemented in GATK v.4.1 (DePristo et al. 2011). All individuals included in this study exhibited high mapping rates (90.26%-98.32%), and the relatively low mapping rates that did occur appeared to be individual-specific rather than species-specific (Supplementary Table S1 and Fig. S20). This suggests that divergence from the reference introduced no species-specific bias and that the effects of reference bias were likely minimal in this study.

To further minimize bias in SNP and genotype calling, SNPs that met any of the following conditions were discarded: (1) located within repetitive regions of the C. eyrei reference genome; (2) more than two alleles present; (3) sequencing depth > 100 or < 5; (4) missing rate ≥ 0.3; (5) heterozygosity rate (proportion of heterozygotes among all genotypes) > 0.5; (6) indels. Additionally, homozygous genotypes were retained only if supported by ≥ 4 reads. For heterozygous genotypes, the minor allele was required to be supported by ≥ 2 reads, and the read ratio (number of reads supporting the minor allele / number of reads supporting the major allele) was required to be > 0.1 and < 0.9.
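
The site-level and genotype-level filters described above can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: the function names `keep_site` and `keep_genotype` and their parameters are hypothetical, "sequencing depth" is assumed to be a single per-site value, and the read ratio is computed literally as worded (minor-allele reads divided by major-allele reads).

```python
from typing import Tuple


def keep_site(in_repeat: bool, n_alleles: int, is_indel: bool,
              depth: float, missing_rate: float, het_rate: float) -> bool:
    """Return True if a variant site passes all six site-level criteria.

    `depth` is assumed to be one per-site depth value; the text does not
    say whether depth is per site or per genotype.
    """
    if in_repeat:                    # (1) inside a repetitive region
        return False
    if n_alleles > 2:                # (2) more than two alleles
        return False
    if depth > 100 or depth < 5:     # (3) depth outside [5, 100]
        return False
    if missing_rate >= 0.3:          # (4) too many missing genotypes
        return False
    if het_rate > 0.5:               # (5) excess heterozygosity
        return False
    if is_indel:                     # (6) indels are discarded
        return False
    return True


def keep_genotype(gt: Tuple[int, int], ad: Tuple[int, int]) -> bool:
    """Return True if a single diploid genotype call is retained.

    gt: allele indices of the call, e.g. (0, 0) or (0, 1).
    ad: read counts supporting allele 0 and allele 1 (hypothetical layout).
    """
    a, b = gt
    if a == b:
        # Homozygous call: the called allele needs at least 4 reads.
        return ad[a] >= 4
    # Heterozygous call: compare reads for the two called alleles.
    minor, major = sorted((ad[a], ad[b]))
    if minor < 2 or major == 0:
        return False
    ratio = minor / major            # minor-allele reads / major-allele reads
    return 0.1 < ratio < 0.9
```

For example, under these assumptions a heterozygous call with 20 and 10 supporting reads has a ratio of 0.5 and is kept, whereas one with 30 and 2 reads has a ratio below 0.1 and is dropped.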
Created:
2024-07-08