Data from: RADseq underestimates diversity and introduces genealogical biases due to nonrandom haplotype sampling
Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2025-04-10.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.c9s0g
Description:
Reduced representation genome-sequencing approaches based on restriction
digestion are enabling large-scale marker generation and facilitating
genomic studies in a wide range of model and nonmodel systems. However,
sampling chromosomes based on restriction digestion may introduce a bias
in allele frequency estimation due to polymorphisms in restriction sites.
To explore the effects of this nonrandom sampling and its sensitivity to
different evolutionary parameters, we developed a coalescent-simulation
framework to mimic the biased recovery of chromosomes in restriction-based
short-read sequencing experiments (RADseq). We analysed simulated DNA
sequence datasets and compared known values from simulations with those
that would be estimated using a RADseq approach from the same samples. We
compare these ‘true’ and ‘estimated’ values of commonly used summary
statistics, π, θw, Tajima's D and FST. We show that loci with missing
haplotypes have estimated summary statistic values that can deviate
dramatically from true values and are also enriched for particular
genealogical histories. These biases are sensitive to nonequilibrium
demography, such as bottlenecks and population expansion. In silico
digests with 102 completely sequenced Drosophila melanogaster genomes
yielded results similar to our findings from coalescent simulations.
Though the potential of RADseq for marker discovery and trait mapping in
nonmodel systems remains undisputed, our results urge caution when
applying this technique to make population genetic inferences.
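
The bias described above can be illustrated with a minimal Python sketch. It is not the authors' simulation framework: the toy haplotypes, the `cut_site_mutated` flags, and all function names are assumptions made for illustration. It computes π, Watterson's θ and Tajima's D (using the standard Tajima 1989 constants) on a full sample, then again after dropping every haplotype whose restriction site carries a mutation, which is the nonrandom sampling a RADseq experiment would impose.

```python
from itertools import combinations
from math import sqrt, isnan

def segregating_sites(haps):
    """Number of alignment columns that are polymorphic in the sample."""
    return sum(1 for col in zip(*haps) if len(set(col)) > 1)

def nucleotide_diversity(haps):
    """pi: mean number of pairwise differences (needs at least 2 haplotypes)."""
    pairs = list(combinations(haps, 2))
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return diffs / len(pairs)

def watterson_theta(haps):
    """theta_W = S / a1, with a1 the (n-1)th harmonic number."""
    a1 = sum(1 / i for i in range(1, len(haps)))
    return segregating_sites(haps) / a1

def tajimas_d(haps):
    """Tajima's D; returns NaN when there are no segregating sites."""
    n, s = len(haps), segregating_sites(haps)
    if s == 0:
        return float("nan")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (nucleotide_diversity(haps) - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

def rad_recovered(haps, cut_site_mutated):
    """Keep only haplotypes whose restriction site is intact (hypothetical model)."""
    return [h for h, mutated in zip(haps, cut_site_mutated) if not mutated]

# Toy data (invented for illustration): '0' = ancestral, '1' = derived allele.
true_haps = ["0000", "0000", "1000", "1100", "0011", "0011"]
# Assumed: the last two haplotypes carry a mutation in the restriction site,
# so a RADseq library would fail to sequence them.
cut_site_mutated = [False, False, False, False, True, True]
est_haps = rad_recovered(true_haps, cut_site_mutated)

for label, haps in (("true", true_haps), ("estimated", est_haps)):
    print(label, len(haps), segregating_sites(haps),
          round(nucleotide_diversity(haps), 3),
          round(watterson_theta(haps), 3),
          round(tajimas_d(haps), 3))
```

In this toy sample the dropped haplotypes share a divergent lineage, so the recovered subset has fewer segregating sites and a lower π than the full sample. This is only a hand-built example of the direction of the effect, not a reproduction of the dataset's results.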
Provider:
Dryad
Created:
2013-02-18



