Data from: How to optimize the precision of allele and haplotype frequency estimates using pooled-sequencing data

Name: Data from: How to optimize the precision of allele and haplotype frequency estimates using pooled-sequencing data
Creator: Dryad
Published: 2025-06-01 02:18:00
License: No description available

Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2025-06-15.

Download link:

https://datadryad.org/dataset/doi:10.5061/dryad.cr65v

Description:

Sequencing pools of individuals, rather than sequencing each individual separately, reduces the cost of estimating allele frequencies at many loci in many populations. Theoretical and empirical studies show that sequencing pools comprising a limited number of individuals (typically fewer than 50) provides reliable allele frequency estimates, provided that the DNA pooling and DNA sequencing steps are carefully controlled. Both unequal contributions of individuals to the DNA pool and the mean and variance of sequencing depth can affect the standard error of allele frequency estimates. To our knowledge, no study has investigated the effects of these two factors separately, so there is currently no method to estimate a priori the relative importance of unequal individual DNA contributions independently of sequencing depth. We develop a new analytical model for allele frequency estimation that explicitly distinguishes these two effects. Our model shows that the DNA pooling variance in a pooled sequencing experiment depends solely on two factors: the number of individuals within the pool and the coefficient of variation of individual DNA contributions to the pool. We present a new method to experimentally estimate this coefficient of variation when planning a pooled sequencing design in which samples are pooled either before or after DNA extraction. Using this analytical and experimental framework, we provide guidelines to optimize the design of pooled sequencing experiments. Finally, we sequence replicated pools of inbred lines of the plant Medicago truncatula and show that the predictions from our model generally hold true when estimating the frequency of known multilocus haplotypes using pooled sequencing.
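
The dependence of the pooling variance on pool size and on the coefficient of variation (CV) of individual contributions can be illustrated with a minimal sketch. It is not the authors' model or code. It assumes haploid (or fully inbred) individuals sampled independently from a population with allele frequency p, and it ignores sequencing depth entirely. Under those assumptions, if individual i contributes a fraction w_i of the pooled DNA, the variance of the allele frequency in the pool is p(1 - p) times the sum of w_i squared, which equals p(1 - p)(1 + CV^2)/n when the CV is computed with the population standard deviation. The function names below are hypothetical and chosen only for illustration.

```python
import statistics


def coefficient_of_variation(contributions):
    """CV of individual DNA contributions (population standard deviation / mean)."""
    mean = statistics.fmean(contributions)
    return statistics.pstdev(contributions) / mean


def effective_pool_size(contributions):
    """Number of equally contributing individuals giving the same pooling variance.

    Equals n / (1 + CV^2), which is the same as (sum c)^2 / sum(c^2).
    """
    n = len(contributions)
    cv = coefficient_of_variation(contributions)
    return n / (1.0 + cv ** 2)


def pooling_variance(p, contributions):
    """Variance of the pool allele frequency due to sampling and unequal pooling.

    Assumption (not from the source): haploid or fully inbred individuals drawn
    independently from a population with allele frequency p; sequencing noise
    is not modeled.
    """
    return p * (1.0 - p) / effective_pool_size(contributions)


if __name__ == "__main__":
    equal = [1.0] * 20                  # 20 individuals, identical contributions
    unequal = [0.5] * 10 + [1.5] * 10   # same pool size, CV = 0.5
    for label, c in (("equal", equal), ("unequal", unequal)):
        print(label, effective_pool_size(c), pooling_variance(0.3, c))
```

With equal contributions the effective pool size equals the number of individuals. Any spread in contributions lowers it, which raises the pooling variance for the same number of pooled individuals.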

Provider:

Dryad

Created:

2017-10-02
