Data from: Next-generation sequencing for molecular ecology: a caveat regarding pooled samples
Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2025-06-15.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.cp8t9
Description:
We develop a model based on the Dirichlet-compound multinomial
distribution (CMD) and Ewens sampling formula to predict the fraction of
SNP loci that will appear fixed for alternate alleles between two pooled
samples drawn from the same underlying population. We apply this model to
next-generation sequencing (NGS) data from Baltic Sea herring recently
published by Corander et al. (Molecular Ecology, 2931–2940), and show
that there are many more fixed loci than expected in the absence of
genetic structure. However, we show through coalescent simulations that
the degree of population structure required to explain the fraction of
alternatively fixed SNPs is extraordinarily high and that the surplus of
fixed loci is more likely a consequence of limited representation of
individual gene copies in the pooled samples than it is of population
structure. Our analysis signals that the use of NGS on pooled samples to
identify divergent SNPs warrants caution. With pooled samples, it is hard
to diagnose when an NGS experiment has gone awry; especially when NGS data
on pooled samples are of low read depth with a limited number of
individuals, it may be worthwhile to temper claims of unexpected
population differentiation from pooled samples, pending verification with
more reliable methods or stricter adherence to recommended sampling
designs for pooled sequencing (e.g. Futschik & Schlötterer,
Genetics, 186, 207; Gautier et al., Molecular Ecology, 3766–3779).
Analysis of the data and diagnosis of problems is easier and more reliable
(and can be less costly) with individually barcoded samples. Consequently,
for some scenarios, individual barcoding may be preferable to pooling of
samples.
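
The effect described above, where uneven representation of individual gene copies in a pool inflates the number of loci that look fixed for alternate alleles, can be illustrated with a small Monte Carlo sketch. This is not the authors' analytical model or their code. It is a simplified simulation that assumes each pool contains `n_copies` gene copies drawn from one shared population with allele frequency `p`, that each copy's share of the reads follows a symmetric Dirichlet distribution with concentration `alpha`, and that `depth` reads are then drawn from the pool. All function and parameter names are hypothetical.

```python
import random


def pool_read_count(p, n_copies, depth, alpha, rng):
    """Return how many of `depth` reads carry allele 1 in one pooled sample."""
    # Alleles of the gene copies in the pool, drawn from a shared population.
    alleles = [1 if rng.random() < p else 0 for _ in range(n_copies)]
    # Unnormalised Dirichlet(alpha, ..., alpha) weights via gamma variates:
    # a small alpha means a few copies dominate the reads.
    weights = [rng.gammavariate(alpha, 1.0) for _ in range(n_copies)]
    # Each read picks a gene copy in proportion to its weight.
    reads = rng.choices(alleles, weights=weights, k=depth)
    return sum(reads)


def fraction_alt_fixed(p, n_copies, depth, alpha, n_loci, seed=0):
    """Estimate the fraction of loci that appear fixed for alternate
    alleles in two pools drawn from the SAME population."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_loci):
        a = pool_read_count(p, n_copies, depth, alpha, rng)
        b = pool_read_count(p, n_copies, depth, alpha, rng)
        if (a == 0 and b == depth) or (a == depth and b == 0):
            hits += 1
    return hits / n_loci


if __name__ == "__main__":
    # Illustrative settings only; they are not taken from the herring data.
    for alpha in (50.0, 0.02):
        frac = fraction_alt_fixed(p=0.5, n_copies=50, depth=10,
                                  alpha=alpha, n_loci=2000, seed=1)
        print(f"alpha={alpha}: fraction alternately fixed = {frac:.4f}")
```

With a large `alpha` every gene copy contributes about equally and alternately fixed loci should be very rare. With a small `alpha` a handful of copies supply most reads, so apparent fixed differences arise without any population structure.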
Provider:
Dryad
Created:
2013-12-05



