Robust multivariate lasso regression with covariance estimation
Source: DataCite Commons. Updated: 2024-02-15. Indexed: 2024-07-29.
Download link:
https://tandf.figshare.com/articles/dataset/Robust_multivariate_lasso_regression_with_covariance_estimation/20736603/1
Description:
Multivariate regression with covariance estimation (MRCE) is a method that performs sparse estimation of multivariate regression coefficients while taking into account the covariance structure of the response variables. MRCE uses a penalized likelihood approach to estimate the regression coefficients and the inverse covariance matrix simultaneously, so that prediction accuracy can be significantly improved. However, traditional likelihood-based methods such as MRCE can produce very misleading results in the presence of outliers. In this work, we propose an extension of MRCE, namely robust multivariate lasso regression with covariance estimation (RMLC), to handle potential outliers in the data. By using Huber’s loss or Tukey’s biweight loss, RMLC can be resistant to outliers in the responses or in both the responses and the covariates. A novel optimization algorithm that incorporates a two-fold accelerated proximal gradient (APG) algorithm is developed to solve RMLC efficiently. We also demonstrate that the proposed RMLC enjoys the oracle property. Our simulation study shows that RMLC produces very reliable results for both the regression coefficients and the correlation structure of the responses, even if the data are contaminated. An analysis of real hyperspectral data further demonstrates the utility of RMLC.
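The description names Huber's loss and Tukey's biweight loss as the source of RMLC's robustness. As a rough illustration of why these losses limit the influence of outliers, the sketch below evaluates both in their standard textbook forms on a few residuals. It is not the authors' implementation: the function names are hypothetical, and the tuning constants `delta=1.345` and `c=4.685` are conventional defaults for standardized residuals, assumed here rather than taken from the paper.

```python
import numpy as np


def huber_loss(r, delta=1.345):
    """Huber loss: quadratic for |r| <= delta, linear beyond it.

    delta=1.345 is a conventional default (an assumption, not from the paper).
    """
    r = np.asarray(r, dtype=float)
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r**2, delta * (a - 0.5 * delta))


def tukey_biweight_loss(r, c=4.685):
    """Tukey's biweight loss: bounded, constant at c**2 / 6 once |r| >= c.

    c=4.685 is a conventional default (an assumption, not from the paper).
    """
    r = np.asarray(r, dtype=float)
    # Clipping |r|/c to [0, 1] makes the loss flat for |r| >= c.
    u = np.clip(np.abs(r) / c, 0.0, 1.0)
    return (c**2 / 6.0) * (1.0 - (1.0 - u**2) ** 3)


if __name__ == "__main__":
    # Three ordinary residuals and one gross outlier.
    residuals = np.array([0.1, -0.5, 1.0, 10.0])
    print("squared/2:", 0.5 * residuals**2)
    print("Huber    :", huber_loss(residuals))
    print("Tukey    :", tukey_biweight_loss(residuals))
```

For the outlying residual, the squared loss grows quadratically, the Huber loss grows only linearly, and the Tukey loss is capped. This matches the general intuition behind replacing a likelihood-based criterion with a robust one.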
Provider:
Taylor & Francis
Created:
2022-08-30



