
Deep Discrete Encoders: Identifiable Deep Generative Models for Rich Data with Discrete Latent Layers

Source: DataCite Commons. Updated: 2026-02-27. Indexed: 2026-04-25.
Download link:
https://tandf.figshare.com/articles/dataset/Deep_Discrete_Encoders_Identifiable_Deep_Generative_Models_for_Rich_Data_with_Discrete_Latent_Layers/30764568/1
Description:
In the era of generative AI, deep generative models (DGMs) with latent representations have gained tremendous popularity. Despite their impressive empirical performance, the statistical properties of these models remain underexplored. DGMs are often overparameterized, non-identifiable, and uninterpretable black boxes, raising serious concerns when deploying them in high-stakes applications. Motivated by this, we propose interpretable deep generative models for rich data types with discrete latent layers, called Deep Discrete Encoders (DDEs). A DDE is a directed graphical model with multiple binary latent layers. Theoretically, we propose transparent identifiability conditions for DDEs, which imply progressively smaller sizes of the latent layers as they go deeper. Identifiability ensures consistent parameter estimation and inspires an interpretable design of the deep architecture. Computationally, we propose a scalable estimation pipeline of a layerwise nonlinear spectral initialization followed by a penalized stochastic approximation EM algorithm. This procedure can efficiently estimate models with exponentially many latent components. Extensive simulation studies for high-dimensional data and deep architectures validate our theoretical results and demonstrate the excellent performance of our algorithms. We apply DDEs to three diverse real datasets with different data types to perform hierarchical topic modeling, image representation learning, and response time modeling in educational testing. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
Provider:
Taylor & Francis
Created:
2025-12-02