
Multispecies pangenomes reveal pervasive influence of population size on evolution of structural variants

Source: DataONE. Updated: 2025-07-23. Indexed: 2025-08-16.
Download link:
https://search.dataone.org/view/sha256:062a388813d30e3b382cc0abe6a06e1efd507353c4424209768ec316b0098b31
Resource description:
Structural variants (SVs) are widespread in vertebrate genomes, yet their evolutionary dynamics remain poorly understood. Using 45 long-read de novo genome assemblies and pangenome tools, we analyze SVs within three closely related species of North American jays (Aphelocoma, scrub-jays) displaying a 60-fold range in effective population size. We find rapid evolution of genome architecture, including ~100 Mb variation in genome size driven by dynamic satellite landscapes with unexpectedly long (> 10 kb) repeat units and widespread variation in gene content, influencing gene expression. SVs exhibit slightly deleterious dynamics modulated by variant length and population size, with strong evidence of adaptive fixation only in large populations. Our results demonstrate how population size shapes the distribution of SVs and the importance of pangenomes to characterizing genomic diversity.

Forty-four genomes from three species of North American scrub jays (Aphelocoma insularis, A. woodhouseii and A. coerulescens) and one outgroup (Yucatán Jay, Cyanocorax yucatanicus) were sequenced using PacBio HiFi technology. The sequence reads were assembled into primary assemblies and two haplotype assemblies using hifiasm (Cheng et al. 2021). We used various pangenome tools, including the Pangenome Graph Builder (PGGB; Garrison et al. 2024) and minigraph (Li et al. 2020) to detect and characterize structural variants, including inversions, within and between species. We used RepeatModeler2 and RepeatMasker to annotate repetitive elements (Smit et al. 2015, Flynn et al. 2020). We conducted demographic analysis with PSMC (Li et al. 2011), bpp (Rannala et al. 2017) and other programs. We used Panacus to estimate growth curves for the pangenome graphs (Parmigiani et al. 2024), and fastDFE (Sendrowski et al. 2024) and anavar (Barton et al. 2018) to estimate the distribution o...

# Data from: Multispecies pangenomes reveal pervasive influence of population size on evolution of structural variants

[https://doi.org/10.5061/dryad.8pk0p2p01](https://doi.org/10.5061/dryad.8pk0p2p01)

## Description of the data and file structure

### Files and variables

**all_haps_repmask_nornd_cat_CS_CY.bed.gz**

**Description:** This file contains a streamlined version of the output of RepeatMasker for each haplotype in the data set, including outgroups. The file is in BED format. The [RepeatMasker outfile](https://www.repeatmasker.org/webrepeatmaskerhelp.html) was converted to BED format by the [rmsk2bed command of bedops](https://bedops.readthedocs.io/en/latest/content/reference/file-management/conversion/rmsk2bed.html). The file contains 6 columns: Reference contig of haplotype; start coordinate of repeat; end coordinate of repeat; the type of repeat; strand; and category of repeat. The contig names begin with a two letter code indicating the species: AA=Aphelocoma californic...
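The six-column layout described above can be read with a few lines of Python. The following is a minimal sketch, not the authors' code. It assumes the file is tab-delimited and uses the usual BED convention of 0-based, half-open coordinates (so repeat length is `end - start`); neither detail is stated in the description. The function names are hypothetical, and the species code is taken simply as the first two characters of the contig name, since the full code table is truncated in the listing.

```python
import gzip
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# One parsed record: contig, start, end, repeat type, strand, repeat category.
Record = Tuple[str, int, int, str, str, str]


def parse_repeat_bed(lines: Iterable[str]) -> Iterator[Record]:
    """Yield records from the 6-column repeat BED described in the README.

    Assumption: columns are tab-separated, in the order the README lists them.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and any comment lines
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip malformed rows rather than guessing at their meaning
        contig, start, end, rep_type, strand, category = fields[:6]
        yield contig, int(start), int(end), rep_type, strand, category


def repeat_bp_by_species_and_category(
    lines: Iterable[str],
) -> Dict[Tuple[str, str], int]:
    """Sum annotated repeat length per (species code, repeat category).

    Assumption: the species code is the first two characters of the contig
    name, and coordinates are half-open, so length = end - start.
    """
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    for contig, start, end, _rep_type, _strand, category in parse_repeat_bed(lines):
        species_code = contig[:2]
        totals[(species_code, category)] += end - start
    return dict(totals)


def summarize_file(path: str) -> Dict[Tuple[str, str], int]:
    """Open the gzip-compressed BED file in text mode and summarize it."""
    with gzip.open(path, "rt") as handle:
        return repeat_bp_by_species_and_category(handle)
```

Calling `summarize_file("all_haps_repmask_nornd_cat_CS_CY.bed.gz")` on a local copy would return a dictionary keyed by species code and repeat category. Overlapping annotations, if any, are counted twice in this sketch; merging intervals first would be needed for a strict base-pair total.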
Created:
2025-07-24