Signal Classification in Large-Scale Multi-Sequence Integrative Analysis Under the HMM Dependence
Taylor & Francis Group | Updated: 2024-05-01 | Indexed: 2026-04-16
Download link:
https://tandf.figshare.com/articles/dataset/Signal_Classification_in_Large-Scale_Multi-Sequence_Integrative_Analysis_Under_the_HMM_Dependence/24128553/1
Description:
The integrative analysis of multiple sequences of multiple tests has enjoyed increasing popularity in many applications, especially in large-scale genomics. In the context of large-scale multiple testing, the concept of signal classification has been developed recently for cases in which the same features are involved in several independent studies, with the goal of classifying each feature into one of several classes. This article considers the signal classification problem in a generalized compound decision-making framework, where the observed data are assumed to be generated from an underlying four-state Cartesian hidden Markov model. Two oracle procedures are proposed, one for total and one for set-specific control of misclassification rates, each maximizing the number of correct classifications. Optimal data-driven procedures are also proposed, and their asymptotic properties are derived. It is shown that signal classification can be improved significantly by taking into account the dependence structure among features, and that the proposed procedures can outperform competitors that ignore the dependence structure. The proposed methods are applied to a psychiatric genetics study to detect genetic variants that affect bipolar disorder, schizophrenia, or both.
Provided by:
Chen, Gongtao; Qiu, Peihua; Xiang, Dongdong; Li, Wendong
Created:
2023-09-12



