Robust covariance estimation and explainable outlier detection for matrix-valued data
Taylor & Francis Group | Updated: 2025-05-12 | Indexed: 2026-04-16
Download link:
https://tandf.figshare.com/articles/dataset/Robust_covariance_estimation_and_explainable_outlier_detection_for_matrix-valued_data/28582137/1
Description:
This work introduces the Matrix Minimum Covariance Determinant (MMCD) method, a novel robust location and covariance estimation procedure designed for data that are naturally represented in the form of a matrix. Unlike standard robust multivariate estimators, which would only be applicable after a vectorization of the matrix-variate samples leading to high-dimensional datasets, the MMCD estimators account for the matrix-variate data structure and consistently estimate the mean matrix, as well as the rowwise and columnwise covariance matrices in the class of matrix-variate elliptical distributions. Additionally, we show that the MMCD estimators are matrix affine equivariant and achieve a higher breakdown point than the maximal achievable one by any multivariate, affine equivariant location/covariance estimator when applied to the vectorized data. An efficient algorithm with convergence guarantees is proposed and implemented. As a result, robust Mahalanobis distances based on MMCD estimators offer a reliable tool for outlier detection. Additionally, we extend the concept of Shapley values for outlier explanation to the matrix-variate setting, enabling the decomposition of the squared Mahalanobis distances into contributions of the rows, columns, or individual cells of matrix-valued observations. Notably, both the theoretical guarantees and simulations show that the MMCD estimators outperform robust estimators based on vectorized observations, offering better computational efficiency and improved robustness. Moreover, real-world data examples demonstrate the practical relevance of the MMCD estimators and the resulting robust Shapley values.
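To make the distance and its decomposition concrete, the following is a minimal sketch, not the authors' implementation. It assumes the mean matrix and the rowwise and columnwise covariance matrices are already given (here they are simulated, not MMCD estimates), and it uses the standard matrix-variate squared Mahalanobis distance tr(Σ_col⁻¹ (X − M)ᵀ Σ_row⁻¹ (X − M)). The cellwise split shown is one additive decomposition of that distance; whether it coincides exactly with the Shapley values defined in the paper is an assumption. The function name and all variable names are hypothetical.

```python
import numpy as np


def matrix_mahalanobis_sq(X, M, sigma_row, sigma_col):
    """Squared matrix Mahalanobis distance and an additive cellwise split.

    sigma_row is p x p, sigma_col is q x q, X and M are p x q.
    Both covariance matrices are assumed symmetric positive definite.
    """
    D = X - M
    # A = sigma_row^{-1} D, computed without forming an explicit inverse.
    A = np.linalg.solve(sigma_row, D)
    # B = sigma_row^{-1} D sigma_col^{-1} (uses symmetry of sigma_col).
    B = np.linalg.solve(sigma_col, A.T).T
    # Elementwise product: the cells sum to tr(sigma_col^{-1} D^T sigma_row^{-1} D).
    cells = D * B
    return cells.sum(), cells


# Simulated inputs (assumptions for illustration only).
rng = np.random.default_rng(0)
p, q = 3, 4
R = rng.normal(size=(p, p))
C = rng.normal(size=(q, q))
sigma_row = R @ R.T + p * np.eye(p)
sigma_col = C @ C.T + q * np.eye(q)
M = np.zeros((p, q))
X = rng.normal(size=(p, q))

md2, cells = matrix_mahalanobis_sq(X, M, sigma_row, sigma_col)
row_contrib = cells.sum(axis=1)  # one value per row of X
col_contrib = cells.sum(axis=0)  # one value per column of X

# Cross-check against the vectorized form with Kronecker covariance:
# cov(vec(X)) = sigma_col (x) sigma_row for column-major vec.
d = (X - M).flatten(order="F")
md2_vec = d @ np.linalg.solve(np.kron(sigma_col, sigma_row), d)
```

The cross-check at the end illustrates why the matrix form is cheaper: it solves one p × p and one q × q system, while the vectorized form works with a pq × pq covariance matrix.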
Provided by:
Filzmoser, Peter; Mayrhofer, Marcus; Radojičić, Una
Created:
2025-03-12



