Data from: pcadapt: an R package to perform genome scans for selection based on principal component analysis
Source: DataCite Commons. Updated: 2025-05-01. Indexed: 2025-05-10.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.8290n
Description:
The R package pcadapt performs genome scans to detect genes under
selection based on population genomic data. It assumes that candidate
markers are outliers with respect to how they are related to population
structure. Because population structure is ascertained with principal
component analysis, the package is fast and works with large-scale data.
It can handle missing data and pooled sequencing data. In contrast to
population-based approaches, the package handles admixed individuals and
does not require grouping individuals into populations. Since its first
release, pcadapt has evolved in terms of both statistical approach and
software implementation. We present results obtained with robust
Mahalanobis distance, which is a new statistic for genome scans available
in the 2.0 and later versions of the package. When hierarchical population
structure occurs, Mahalanobis distance is more powerful than the
communality statistic that was implemented in the first version of the
package. Using simulated data, we compare pcadapt to other computer
programs for genome scans (BayeScan, hapflk, OutFLANK, sNMF). We find that
the proportion of false discoveries is close to the nominal false discovery
rate of 10%, with the exception of BayeScan, which generates 40% false
discoveries. We also find that the power of BayeScan is severely impacted
by the presence of admixed individuals, whereas pcadapt is not impacted.
Last, we find that pcadapt and hapflk are the most powerful in scenarios
of population divergence and range expansion. Because pcadapt handles
next-generation sequencing data, it is a valuable tool for data analysis
in molecular ecology.
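
To illustrate the kind of outlier scan described above, here is a minimal Python sketch. It is not the package's implementation. It assumes each marker has already been summarized by two scores on the first two principal components (a hypothetical input). It uses the classical mean and covariance rather than the robust estimates the description mentions. Squared distances are converted to p-values with the chi-square distribution with 2 degrees of freedom, which is only an approximation under a normal null. The description does not say which procedure controls the false discovery rate, so the Benjamini-Hochberg procedure is swapped in here as a common choice, at the 10% level. All function names are hypothetical.

```python
import math


def mahalanobis_sq_2d(scores):
    """Squared Mahalanobis distance of each (s1, s2) row from the sample mean.

    Assumption: classical mean and covariance are used here, NOT the
    robust estimates referred to in the description.
    """
    n = len(scores)
    m1 = sum(s[0] for s in scores) / n
    m2 = sum(s[1] for s in scores) / n
    # Sample covariance entries (denominator n - 1).
    s11 = sum((s[0] - m1) ** 2 for s in scores) / (n - 1)
    s22 = sum((s[1] - m2) ** 2 for s in scores) / (n - 1)
    s12 = sum((s[0] - m1) * (s[1] - m2) for s in scores) / (n - 1)
    det = s11 * s22 - s12 ** 2
    distances = []
    for a, b in scores:
        d1, d2 = a - m1, b - m2
        # d^T S^{-1} d using the closed-form inverse of a 2x2 matrix.
        distances.append(
            (s22 * d1 * d1 - 2 * s12 * d1 * d2 + s11 * d2 * d2) / det
        )
    return distances


def chi2_sf_2df(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


def benjamini_hochberg(pvalues, alpha=0.10):
    """Indices of tests rejected by the Benjamini-Hochberg step-up rule."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0  # largest rank whose p-value is under its threshold
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


def scan(scores, alpha=0.10):
    """Hypothetical end-to-end helper: distances, then p-values, then BH."""
    pvalues = [chi2_sf_2df(d) for d in mahalanobis_sq_2d(scores)]
    return benjamini_hochberg(pvalues, alpha)
```

Calling `scan(scores)` on a list of `(s1, s2)` pairs returns the indices of markers flagged at the chosen level. The restriction to two components keeps the matrix inverse and the chi-square tail probability in closed form; a general version would need a linear algebra library and a robust covariance estimator.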
Provider:
Dryad
Created:
2016-08-11



