
A Geometric Perspective on the Power of Principal Component Association Tests in Multiple Phenotype Studies

Source: DataCite Commons. Updated 2021-09-29; indexed 2024-07-27.
Download link:
https://tandf.figshare.com/articles/dataset/A_Geometric_Perspective_on_the_Power_of_Principal_Component_Association_Tests_in_Multiple_Phenotype_Studies/7015253
Description:
Joint analysis of multiple phenotypes can increase statistical power in genetic association studies. Principal component analysis (PCA), a popular dimension reduction method, has been proposed for analyzing multiple correlated phenotypes, especially when the number of phenotypes is large. It has been empirically observed that the first principal component (PC), which summarizes the largest amount of variance, can be less powerful than higher-order PCs and other commonly used methods in detecting genetic association signals. In this article, we investigate the properties of PCA-based multiple phenotype analysis from a geometric perspective by introducing a novel concept called the principal angle. A particular PC is powerful if its principal angle is 0° and powerless if its principal angle is 90°. Without prior knowledge of the true principal angle, each PC can be powerless. We propose linear, nonlinear, and data-adaptive omnibus tests that combine PCs. We demonstrate that the Wald test is a special quadratic PC-based test. We show that the omnibus PC test is robust and powerful in a wide range of scenarios. We study the properties of the proposed methods using power analysis and eigen-analysis. The subtle differences and close connections between these combined PC methods are illustrated graphically in terms of their rejection boundaries. Our proposed tests have convex acceptance regions and hence are admissible. The p-values for the proposed tests can be efficiently calculated analytically, and the proposed tests have been implemented in a publicly available R package, MPAT. We conduct simulation studies in both low- and high-dimensional settings with various signal vectors and correlation structures. We apply the proposed tests to the joint analysis of metabolic syndrome-related phenotypes with datasets collected from four international consortia to demonstrate the effectiveness of the proposed combined PC testing procedures.
Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
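
The relationship between individual PC statistics, the principal angle, and the Wald test can be illustrated with a small numerical sketch. The Python code below is not taken from the article or from the MPAT package; it is a minimal illustration under stated assumptions: the per-phenotype association Z-scores are multivariate normal with a known correlation matrix `sigma`, the k-th PC statistic is the projection of the Z-score vector onto the k-th eigenvector scaled by the square root of its eigenvalue, and the principal angle is taken to be the angle between the signal (mean) vector and that eigenvector. The function names and example values are hypothetical.

```python
import numpy as np


def pc_statistics(z, sigma):
    """Eigendecompose sigma and return standardized PC statistics.

    Assumption (not from the article): PC_k = u_k' z / sqrt(lambda_k),
    with eigenvalues sorted from largest to smallest.
    """
    eigvals, eigvecs = np.linalg.eigh(sigma)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    pcs = eigvecs.T @ z / np.sqrt(eigvals)
    return pcs, eigvals, eigvecs


def principal_angles_deg(mu, eigvecs):
    """Angle in degrees between the signal vector mu and each eigenvector.

    Assumption: this is how the 'principal angle' is interpreted here.
    The sign of an eigenvector is arbitrary, so the absolute cosine is used.
    """
    cosines = np.abs(eigvecs.T @ mu) / np.linalg.norm(mu)
    return np.degrees(np.arccos(np.clip(cosines, 0.0, 1.0)))


# Hypothetical example: three phenotypes with exchangeable correlation 0.5.
sigma = np.array([[1.0, 0.5, 0.5],
                  [0.5, 1.0, 0.5],
                  [0.5, 0.5, 1.0]])
z = np.array([2.1, -0.4, 1.3])                 # made-up Z-scores

pcs, eigvals, eigvecs = pc_statistics(z, sigma)

# The sum of squared PC statistics equals the Wald statistic z' sigma^{-1} z.
wald_from_pcs = float(np.sum(pcs ** 2))
wald_direct = float(z @ np.linalg.solve(sigma, z))

# A signal shared equally by all phenotypes is aligned with the first PC
# (angle 0 degrees) and orthogonal to the remaining PCs (angle 90 degrees).
shared_signal = np.array([1.0, 1.0, 1.0])
angles = principal_angles_deg(shared_signal, eigvecs)
```

In this sketch the two Wald computations agree up to floating-point error, which is the algebraic sense in which the Wald test is a quadratic combination of all PCs. The `angles` array shows why a single PC can be powerless: a signal orthogonal to a PC's eigenvector contributes nothing to that PC's statistic.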

Provider:
Taylor & Francis
Created:
2018-08-27