A Deep Dynamic Latent Block Model for Co-clustering of Zero-Inflated Data Matrices
Source: DataCite Commons. Updated 2024-02-23; indexed 2024-08-26.
Download link:
https://tandf.figshare.com/articles/dataset/A_Deep_Dynamic_Latent_Block_Model_for_Co-clustering_of_Zero-Inflated_Data_Matrices/25282360/1
Description:
The simultaneous clustering of observations and features of data sets (known as co-clustering) has recently emerged as a central machine learning application to summarize massive data sets. However, most existing models focus on continuous and dense data in stationary scenarios, where cluster assignments do not evolve over time. This work introduces a novel latent block model for the dynamic co-clustering of data matrices with high sparsity. To properly model this type of data, we assume that the observations follow a time- and block-dependent mixture of zero-inflated distributions, thus combining stochastic processes with time-varying sparsity modeling. To detect abrupt changes in the dynamics of both cluster memberships and data sparsity, the mixing and sparsity proportions are modeled through systems of ordinary differential equations. The inference relies on an original variational procedure whose maximization step trains fully connected neural networks in order to solve the dynamical systems. Numerical experiments on simulated and real-world data sets demonstrate the effectiveness of the proposed methodology in the context of count data.
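As an illustration of what a zero-inflated distribution for count data looks like, here is a minimal sketch of a zero-inflated Poisson probability mass function. The choice of a Poisson component is an assumption made for this example; the description above does not name the count distribution. The function name zip_pmf and the parameters pi (sparsity proportion) and lam (Poisson rate) are hypothetical, and in the model described above such parameters would additionally depend on time and on the block.

```python
import math


def zip_pmf(x: int, pi: float, lam: float) -> float:
    """Zero-inflated Poisson pmf (illustrative assumption, not the authors' code).

    With probability pi the value is a structural zero; otherwise it is
    drawn from a Poisson distribution with rate lam.
    """
    if x < 0:
        return 0.0
    poisson = math.exp(-lam) * lam**x / math.factorial(x)
    # The point mass at zero only contributes when x == 0.
    return pi * (1.0 if x == 0 else 0.0) + (1.0 - pi) * poisson
```

Raising pi puts more probability on exact zeros without changing the shape of the nonzero counts, which is how a sparsity proportion can be modeled separately from the count intensity.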
Provider:
Taylor & Francis
Created:
2024-02-23



