Class-specific Joint Feature Screening in Ultrahigh-dimensional Mixture Regression*

Name: Class-specific Joint Feature Screening in Ultrahigh-dimensional Mixture Regression*
Creator: Taylor & Francis
Published: 2025-06-01 04:07:23
License: No description available

Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2025-05-07.

Download link:

https://tandf.figshare.com/articles/dataset/Class-specific_Joint_Feature_Screening_in_Ultrahigh-dimensional_Mixture_Regression_/28473703/1

Description:

Finite mixtures of regression models are ubiquitous in the analysis of complex data. They aim to detect heterogeneity in the effects of a set of features on a response across a finite number of latent classes. When the number of features is large, fitting a mixture regression directly can be computationally infeasible and often yields models that are hard to interpret. One practical strategy is to screen out most irrelevant features before an in-depth analysis. In this paper, we propose a novel method for feature screening in ultrahigh-dimensional Gaussian finite mixtures of regressions. The new method is built upon a sparsity-restricted expectation-approximation-maximization algorithm, which simultaneously removes different sets of irrelevant features from multiple latent classes. In the screening process, joint effects between features are naturally accounted for, and class-specific screening results are produced without ad hoc steps. These properties give the new method an edge over existing screening methods. The promising performance of the method is supported by both theory and numerical examples, including a real data analysis. Supplementary materials for this article are available online.
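
To make the idea of a sparsity-restricted EM-type screening step more concrete, here is a minimal Python sketch. It is not the authors' algorithm: the description above does not specify the "approximation" step, so this sketch assumes a single gradient step on each class's weighted least-squares loss followed by hard thresholding to the s largest coefficients. The names `hard_threshold` and `sparse_em_screen`, the step-size heuristic, and all default values are hypothetical and chosen only for illustration.

```python
import numpy as np


def hard_threshold(beta, s):
    """Keep the s largest-magnitude entries of beta and zero out the rest."""
    out = np.zeros_like(beta)
    if s <= 0:
        return out
    keep = np.argsort(np.abs(beta))[-s:]
    out[keep] = beta[keep]
    return out


def sparse_em_screen(X, y, n_classes=2, s=5, n_iter=50, seed=0):
    """Illustrative sparsity-restricted EM for a Gaussian mixture of regressions.

    Returns a list with one array of retained feature indices per latent class.
    ASSUMPTION: the gradient-plus-thresholding update below is a stand-in,
    not the update rule of the published method.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape

    # Small random sparse starting values, one coefficient vector per class.
    beta = rng.normal(scale=0.1, size=(n_classes, p))
    beta = np.array([hard_threshold(b, s) for b in beta])
    sigma2 = np.full(n_classes, np.var(y))
    pi = np.full(n_classes, 1.0 / n_classes)

    # Heuristic step size (assumed): inverse of the largest eigenvalue of X'X / n.
    step = 1.0 / np.linalg.eigvalsh(X.T @ X / n)[-1]

    for _ in range(n_iter):
        # E-step: posterior class probabilities, computed in log space.
        resid = y[:, None] - X @ beta.T  # shape (n, n_classes)
        log_w = (np.log(pi)
                 - 0.5 * np.log(2.0 * np.pi * sigma2)
                 - 0.5 * resid ** 2 / sigma2)
        log_w -= log_w.max(axis=1, keepdims=True)
        w = np.exp(log_w)
        w /= w.sum(axis=1, keepdims=True)

        # Approximate M-step: one gradient step per class, then keep s features.
        for k in range(n_classes):
            nk = max(w[:, k].sum(), 1e-12)
            grad = -(X.T @ (w[:, k] * resid[:, k])) / nk
            beta[k] = hard_threshold(beta[k] - step * grad, s)
            new_resid = y - X @ beta[k]
            sigma2[k] = max((w[:, k] * new_resid ** 2).sum() / nk, 1e-8)
            pi[k] = max(nk / n, 1e-12)

    # The screening result: the features that survive in each class.
    return [np.flatnonzero(b) for b in beta]
```

Calling `sparse_em_screen(X, y, n_classes=2, s=10)` on an n-by-p design matrix `X` and response vector `y` would return two index arrays, one per latent class, each holding at most 10 retained features. Because the thresholding acts on the whole coefficient vector after a joint update, features are judged together rather than one at a time, and each class keeps its own set.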

Provider:

Taylor & Francis

Created:

2025-02-24
