
A Geometric Perspective on the Power of Principal Component Association Tests in Multiple Phenotype Studies

DataCite Commons | Updated 2021-09-29 | Indexed 2024-08-18
Download link:
https://tandf.figshare.com/articles/dataset/A_Geometric_Perspective_on_the_Power_of_Principal_Component_Association_Tests_in_Multiple_Phenotype_Studies/7015253/4
Description:
Joint analysis of multiple phenotypes can increase statistical power in genetic association studies. Principal component analysis (PCA), a popular dimension reduction method, has been proposed for analyzing multiple correlated phenotypes, especially when the number of phenotypes is large. It has been empirically observed that the first principal component (PC), which summarizes the largest amount of variance, can be less powerful than higher-order PCs and other commonly used methods in detecting genetic association signals. In this article, we investigate the properties of PCA-based multiple phenotype analysis from a geometric perspective by introducing a novel concept called principal angle. A particular PC is powerful if its principal angle is 0° and is powerless if its principal angle is 90°. Without prior knowledge about the true principal angle, each PC can be powerless. We propose linear, nonlinear, and data-adaptive omnibus tests by combining PCs. We demonstrate that the Wald test is a special quadratic PC-based test. We show that the omnibus PC test is robust and powerful in a wide range of scenarios. We study the properties of the proposed methods using power analysis and eigen-analysis. The subtle differences and close connections between these combined PC methods are illustrated graphically in terms of their rejection boundaries. Our proposed tests have convex acceptance regions and hence are admissible. The p-values for the proposed tests can be efficiently calculated analytically and the proposed tests have been implemented in a publicly available R package MPAT. We conduct simulation studies in both low- and high-dimensional settings with various signal vectors and correlation structures. We apply the proposed tests to the joint analysis of metabolic syndrome-related phenotypes with datasets collected from four international consortia to demonstrate the effectiveness of the proposed combined PC testing procedures.
Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
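
The two geometric claims in the description (a PC is powerful at a principal angle of 0° and powerless at 90°, and the Wald test is a quadratic combination of PCs) can be illustrated with a small two-phenotype sketch. This is not the MPAT package or its API. It assumes that the principal angle is the angle between the true signal vector and a PC direction, that the two Z-scores have correlation matrix [[1, rho], [rho, 1]], and it uses hypothetical names (`pc_summary`, `wald_stat`, `z`, `beta`, `rho`) chosen only for this example.

```python
import math


def pc_summary(z, beta, rho):
    """Per-PC statistics for two phenotypes (illustrative sketch only).

    z    : observed pair of phenotype-wise Z-scores
    beta : hypothetical true mean of z (the signal vector), nonzero
    rho  : correlation between the two Z-scores, -1 < rho < 1
    """
    s = 1.0 / math.sqrt(2.0)
    # Closed-form eigen-decomposition of [[1, rho], [rho, 1]].
    eigvals = [1.0 + rho, 1.0 - rho]
    eigvecs = [(s, s), (s, -s)]

    norm_beta = math.hypot(beta[0], beta[1])
    rows = []
    for lam, u in zip(eigvals, eigvecs):
        score = u[0] * z[0] + u[1] * z[1]        # PC score u'z
        stat = score ** 2 / lam                  # 1-df chi-square statistic
        proj = u[0] * beta[0] + u[1] * beta[1]   # u'beta
        # Assumed definition: angle between beta and the PC direction,
        # folded into [0, 90] degrees because the sign of u is arbitrary.
        cos_abs = min(1.0, abs(proj) / norm_beta)
        angle = math.degrees(math.acos(cos_abs))
        ncp = proj ** 2 / lam                    # noncentrality of this PC test
        rows.append({"eigenvalue": lam, "stat": stat,
                     "angle_deg": angle, "ncp": ncp})
    return rows


def wald_stat(z, rho):
    """Wald statistic z' Sigma^{-1} z via the explicit 2x2 inverse."""
    det = 1.0 - rho ** 2
    return (z[0] ** 2 - 2.0 * rho * z[0] * z[1] + z[1] ** 2) / det


if __name__ == "__main__":
    z, beta, rho = (2.0, -1.0), (1.0, -1.0), 0.5
    rows = pc_summary(z, beta, rho)
    for r in rows:
        print(r)
    # The eigenvalue-weighted sum of squared PC scores matches the Wald statistic.
    print(sum(r["stat"] for r in rows), wald_stat(z, rho))
```

With opposite-signed effects and positive correlation, the signal in this sketch is orthogonal to the first PC, so that PC has zero noncentrality despite carrying the largest eigenvalue, while the second PC captures the whole signal. The final line checks numerically that summing the per-PC statistics reproduces the Wald statistic.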

Provider:
Taylor & Francis
Created:
2021-09-29