Data from: Composite measures of selection can improve the signal-to-noise ratio in genome scans
Source: DataCite Commons. Updated 2025-06-01; indexed 2025-04-09.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.bp11m
Description:
The growing wealth of genomic data is yielding new insights into the
genetic basis of adaptation, but it also presents the challenge of
extracting the relevant signal from multi-dimensional datasets. Different
statistical approaches vary in their power to detect selection depending
on the demographic history, type of selection, genetic architecture and
experimental design. Here, we develop and evaluate new approaches for
combining results from multiple tests, including multivariate distance
measures and methods for combining P-values. We evaluate these methods on
(i) simulated landscape genetic data analysed for differentiation outliers
and genetic-environment associations and (ii) empirical genomic data
analysed for selective sweeps within dog breeds for loci known to be
selected for during domestication. We also introduce and evaluate how
robust statistical algorithms can be used for parameter estimation in
statistical genomics. On the simulated data, many of the composite
measures performed well and had decreased variation in outcomes across
many sampling designs. On the empirical dataset, methods based on
combining P-values generally performed better, giving clearer and more
significant signals of selection that lay in closer proximity to
the known selected locus. Although robust algorithms could identify
neutral loci in our simulations, they did not universally improve power to
detect selection. Overall, a composite statistic that measured a robust
multivariate distance from rank-based P-values performed the best. We
found that composite measures of selection could improve the signal of
selection in many cases, but they were not a panacea and their power is
limited by the power of the univariate statistics they summarize. Since
genome scans are widely used, improving inference for prioritizing
candidate genes may be beneficial to medicine, agriculture, and breeding.
Our results also have application to outlier detection in high-dimensional
datasets and to combining results in meta-analyses in many disciplines.
The composite measures we evaluate are implemented in the R package
minotaur.
Provider:
Dryad
Created:
2017-03-15
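
Illustrative sketch:
The description above mentions two families of composite measures: methods for combining P-values and multivariate distances computed from rank-based P-values. The following is a minimal Python sketch of one example of each, written for illustration only. It is not the minotaur API, and every function name and the simulated data are assumptions. Fisher's method is used here as one common way of combining P-values and assumes the tests are independent. The Mahalanobis distance uses the ordinary sample covariance, not the robust estimators the description refers to.

```python
import math
import numpy as np


def rank_p_values(stat):
    """Empirical upper-tail P-values from ranks (hypothetical helper).

    The largest statistic receives the smallest P-value. Ties are broken
    arbitrarily by the sort order.
    """
    stat = np.asarray(stat, dtype=float)
    n = len(stat)
    ranks = np.empty(n)
    ranks[stat.argsort()] = np.arange(1, n + 1)  # rank 1 = smallest statistic
    return 1.0 - (ranks - 0.5) / n               # values lie strictly in (0, 1)


def fisher_combined_p(pvals):
    """Fisher's method per locus (rows) across tests (columns).

    X = -2 * sum(ln p) follows a chi-square distribution with 2k degrees of
    freedom under independence; for even degrees of freedom the survival
    function has the closed form exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    pvals = np.asarray(pvals, dtype=float)
    k = pvals.shape[1]
    half_x = -np.log(pvals).sum(axis=1)
    series = sum(half_x**i / math.factorial(i) for i in range(k))
    return np.exp(-half_x) * series


def mahalanobis_distance(x):
    """Mahalanobis distance of each row from the column means.

    Uses the plain (non-robust) sample covariance.
    """
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)
    inv_cov = np.linalg.inv(np.atleast_2d(np.cov(x, rowvar=False)))
    return np.sqrt(np.einsum("ij,jk,ik->i", centered, inv_cov, centered))


# Simulated example: 1000 loci scored by two univariate statistics.
# Locus 0 is made an outlier in both statistics; the rest are noise.
rng = np.random.default_rng(1)
stats = rng.normal(size=(1000, 2))
stats[0] = [6.0, 6.0]

# Convert each statistic to rank-based P-values, column by column.
p_rank = np.column_stack([rank_p_values(stats[:, j]) for j in range(2)])

combined = fisher_combined_p(p_rank)                # smaller = stronger signal
distance = mahalanobis_distance(-np.log10(p_rank))  # larger = stronger signal

print("top locus by combined P-value:", int(np.argmin(combined)))
print("top locus by Mahalanobis distance:", int(np.argmax(distance)))
```

Both measures summarize the same univariate statistics, so, as the description notes, neither can recover a signal that the underlying tests miss.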



