
Optimizing sampling design for landscape genomics

DataONE | Updated 2024-11-18 | Indexed 2025-04-26
Download link:
https://search.dataone.org/view/sha256:d4e8a8f75b65e2112ccce2792df38c203cc41982d532d38cb94711d80f36ce7d
Description:
Landscape genomic approaches for detecting genotype-environment associations (GEA), isolation by distance (IBD), and isolation by environment (IBE) have seen a dramatic increase in use, but there have been few thorough analyses of the influence of sampling strategy on their performance under realistic genomic and environmental conditions. We simulated 24,000 datasets across a range of scenarios with complex population dynamics and realistic landscape structure to evaluate the effects of the spatial distribution and number of samples on common landscape genomics methods. Our results show that common analyses are relatively robust to sampling scheme as long as sampling covers enough environmental and geographic space. We found that for detecting adaptive loci and estimating IBE, sampling schemes that were explicitly designed to increase coverage of available environmental space matched or outperformed sampling schemes that only considered geographic space. When sampling does not cover ade...

This dataset was generated from simulations run in Python version 3.9.7 (Van Rossum & Drake, 2009) using Geonomics version 1.3.9 (Terasaki Hart et al., 2021). We ran simulations varying population size, migration rate, selection strength, spatial autocorrelation, and environmental correlation, each at a "low" and "high" level. We ran 10 replications of each simulation to capture variation in results due to stochasticity. Together with three sets of simulated landscapes, this produced a total of 960 simulations (30 repetitions of each of 32 unique parametrizations). This dataset contains a compressed tarball (.tar.gz) with 960 pairs of CSV files and Variant Call Format (VCF) files with genomic data for each of the 960 simulations. A complete description of the methods used to collect and process this dataset is available in the corresponding paper (Bishop et al., 2024).
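The simulation count stated above can be checked with a few lines of Python. This is only an arithmetic sketch of the design as described (five factors at two levels, 10 replicates, three landscape sets); the variable names are illustrative and are not taken from the authors' code.

```python
from itertools import product

# Factors varied in the simulations, as listed in the description.
factors = [
    "population size",
    "migration rate",
    "selection strength",
    "spatial autocorrelation",
    "environmental correlation",
]
levels = ("low", "high")

# Every combination of low/high across the five factors.
parametrizations = list(product(levels, repeat=len(factors)))

replicates = 10       # replications per simulation
landscape_sets = 3    # sets of simulated landscapes

repetitions = replicates * landscape_sets
total = len(parametrizations) * repetitions

print(len(parametrizations), repetitions, total)
```

The three printed values correspond to the 32 unique parametrizations, the 30 repetitions of each, and the 960 simulations in total.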
The corresponding code used to create these simulations is archived on Zenodo (DOI 10.5281/zenodo.14009716).

# Data from: Optimizing sampling design for landscape genomics

[https://doi.org/10.5061/dryad.63xsj3v8s](https://doi.org/10.5061/dryad.63xsj3v8s)

This dataset contains a compressed tarball (.tar.gz) with the simulation data used in "Optimizing sampling design for landscape genomics".

## Description of the data and file structure

The tarball must first be unpacked. For example, this can be done using this bash code: `tar -xzvf LGS_simulation_archive.tar.gz`

The tarball contains 960 pairs of CSV files and Variant Call Format (VCF) files with genomic data for each of the 960 simulations. The tarball also contains CSV files ending in NONNEUTS, which provide the indices of the adaptive loci corresponding to each simulated trait.

Each file is titled as such:

`mod-K[1 or 2]_phi[50 or 100]_m[25 or 100]_H[5 or 50]_r[30 or 60]_it-[1-10]_t-6000_spp-spp_0`

The values within brackets represent the different low/high parameter levels (e.g., K1 = small population and K2 = large population) or the it...
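The naming scheme quoted above lends itself to programmatic parsing. The following is a minimal sketch, assuming file names follow that template exactly; the function name `parse_sim_name` and the regular expression are hypothetical helpers written for illustration and are not part of the dataset or the authors' code. The keys are kept as they appear in the file name, since the description shown here only spells out the meaning of K.

```python
import re

# Regular expression mirroring the file-name template in the description.
# Group names reuse the tokens from the file name itself (assumption: the
# numeric value always follows the token directly, as in the template).
NAME_RE = re.compile(
    r"mod-K(?P<K>\d+)_phi(?P<phi>\d+)_m(?P<m>\d+)_H(?P<H>\d+)"
    r"_r(?P<r>\d+)_it-(?P<it>\d+)_t-(?P<t>\d+)_spp-spp_0"
)


def parse_sim_name(name):
    """Return the numeric fields encoded in a simulation file name."""
    match = NAME_RE.search(name)
    if match is None:
        raise ValueError(f"unrecognized simulation file name: {name!r}")
    return {key: int(value) for key, value in match.groupdict().items()}


# Example name built from the low level of each parameter in the template.
example = "mod-K1_phi50_m25_H5_r30_it-1_t-6000_spp-spp_0"
print(parse_sim_name(example))
```

Because the pattern is matched with `search`, the same helper should also work on full paths or on names carrying a file extension or suffix, provided the template portion is intact.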
Created:
2024-11-19