Data thinning for Poisson factor models and its applications
Taylor & Francis Group. Updated 2025-10-20; indexed 2026-04-16
Download link:
https://tandf.figshare.com/articles/dataset/Data_thinning_for_Poisson_factor_models_and_its_applications/29912242/1
Description:
The Poisson factor model is a powerful tool for dimension reduction and visualization of large-scale count datasets across diverse domains. Despite the availability of efficient algorithms for estimating factors and loadings, existing methods either require prior knowledge of the number of factors, or resort to ad hoc criteria for its determination. This paper proposes a novel data-driven criterion called Information Criterion via Data Thinning (ICDT), leveraging the thinning property of the Poisson distribution. Unlike traditional data splitting, data thinning partitions the count matrix into training and validation sets while preserving both the distribution and the underlying data structure. Interestingly, the validation error can be decomposed into the training error plus a covariance penalty. A simple estimator of the covariance penalty is obtained, leading to the development of ICDT. The selection consistency of ICDT is derived when both the sample size and the number of variables diverge to infinity. The proposed methodology is extended to dimension reduction in regression by incorporating the response inversely into the Poisson factor model. Extensive simulated examples and two real data applications are used to evaluate the performance of ICDT and compare it with existing criteria.
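The thinning step described above can be illustrated with a minimal sketch. It relies on a standard fact: if X ~ Poisson(λ) and X_train | X ~ Binomial(X, ε), then X_train and X − X_train are independent Poisson variables with means ελ and (1 − ε)λ. The function name `thin_poisson_counts`, the thinning proportion `eps`, and the simulated factor model are assumptions made for illustration; this is not the authors' code and does not implement ICDT itself.

```python
import numpy as np


def thin_poisson_counts(counts, eps=0.5, seed=0):
    """Split a count matrix into two matrices by binomial thinning.

    Hypothetical helper for illustration: each entry x is split into
    train ~ Binomial(x, eps) and valid = x - train.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=np.int64)
    train = rng.binomial(counts, eps)  # element-wise binomial draw
    valid = counts - train             # remainder goes to validation
    return train, valid


# Simulate a small Poisson factor model (all sizes are illustrative assumptions).
rng = np.random.default_rng(1)
n, p, q = 200, 50, 3                          # samples, variables, factors
factors = rng.normal(scale=0.5, size=(n, q))
loadings = rng.normal(scale=0.5, size=(p, q))
rates = np.exp(factors @ loadings.T)          # log-linear Poisson means
X = rng.poisson(rates)

X_train, X_valid = thin_poisson_counts(X, eps=0.5)
```

Both `X_train` and `X_valid` keep the full n × p shape of the original matrix, which is the sense in which thinning preserves the data structure, unlike splitting by rows or entries.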
Provided by:
Zhao, Hongyu; Xu, Peirong; Wang, Zhijing; Wang, Tao
Created:
2025-08-14



