
Inference for Low-Rank Models Without Estimating the Rank

Taylor & Francis Group; updated 2025-10-20; indexed 2026-04-16
Download link:
https://tandf.figshare.com/articles/dataset/Inference_for_Low-rank_Models_without_Estimating_the_Rank/29723099/2
Description:
This article studies inference on linear functionals of high-dimensional low-rank matrices. While most existing inference methods require consistent estimation of the true rank, our procedure is robust to rank misspecification, making it a promising approach in applications where rank estimation can be unreliable. We estimate the low-rank spaces using pre-specified weighting matrices, known as diversified projections. A novel statistical insight is that, contrary to the conventional wisdom that overfitting mainly introduces additional variance, the over-estimated low-rank space also gives rise to a non-negligible bias due to an implicit ridge-type regularization. We develop a new inference procedure and show that the central limit theorem holds as long as the pre-specified rank is no smaller than the true rank. In one of our applications, we study multiple testing with incomplete data in the presence of confounding factors and show that our method remains valid as long as the number of controlled confounding factors is at least as large as the true number, even when no confounding factors are present. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
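The projection idea in the description can be illustrated with a small simulation. The sketch below is not the authors' procedure: it shows only the space-estimation step (projecting the data onto directions built from a pre-specified weighting matrix) and omits the inference and the correction for the ridge-type bias. The function names, the simulated design, and the choice of weights (noisy proxies for the loadings plus pure-noise columns) are all assumptions made for illustration.

```python
import numpy as np


def diversified_projection_fit(Y, W):
    """Fit an N x T matrix by projecting onto a space built from weights W.

    Y : (N, T) noisy observation of a low-rank matrix.
    W : (N, R) pre-specified weights, chosen without using the noise in Y.
    """
    N = Y.shape[0]
    F_hat = Y.T @ W / N  # (T, R) weighted cross-sectional averages
    # Least-squares projection of each row of Y onto the columns of F_hat.
    coef, *_ = np.linalg.lstsq(F_hat, Y.T, rcond=None)  # (R, N)
    return (F_hat @ coef).T  # (N, T) fitted matrix


def simulate_errors(seed=0, N=200, T=200, true_rank=2, ranks=(1, 2, 4)):
    """Relative Frobenius error of the fit for several working ranks R."""
    rng = np.random.default_rng(seed)
    Lam = rng.standard_normal((N, true_rank))  # loadings
    F = rng.standard_normal((T, true_rank))    # factors
    M = Lam @ F.T                              # true low-rank matrix
    Y = M + 0.5 * rng.standard_normal((N, T))  # noisy observation

    # Hypothetical weights: the first `true_rank` columns are noisy proxies
    # for the loadings, the remaining columns carry no signal at all.
    max_rank = max(ranks)
    W_full = rng.standard_normal((N, max_rank))
    W_full[:, :true_rank] += Lam

    errors = {}
    for R in ranks:
        M_hat = diversified_projection_fit(Y, W_full[:, :R])
        errors[R] = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
    return errors


errs = simulate_errors()
for R, err in errs.items():
    print(f"working rank R={R}: relative error {err:.3f}")
```

In this toy design the fit should degrade sharply when the working rank is below the true rank of 2 and remain accurate when it is equal or larger, which matches the condition stated in the description that the pre-specified rank be no smaller than the true rank.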

Provided by:
Kwon, Hyukjun; Liao, Yuan; Choi, Jungjun
Created:
2025-10-20