Graph Neural NMF Enhanced by Optimal Transport: Short Text Topic Modeling with Pretrained Language Models and Nonparametric Baye

Name: Graph Neural NMF Enhanced by Optimal Transport: Short Text Topic Modeling with Pretrained Language Models and Nonparametric Baye
Creator: Bin Zhao
License: Not specified

Source: IEEE; indexed 2026-04-17

Download link:

https://ieee-dataport.org/documents/graph-neural-nmf-enhanced-optimal-transport-short-text-topic-modeling-pretrained-language

Description:

Short text topic modeling is difficult because short texts have sparse vocabularies and little context; fragmented expressions on social platforms and in search logs make latent semantic mining harder still. Existing models, such as LDA and single-layer nonnegative matrix factorization (NMF), perform well on long texts but lack stability and semantic depth on short texts. To address short text sparsity, the difficulty of presetting the number of topics, and semantic bias, this paper proposes a Graph Neural NMF model enhanced by optimal transport (GN-DBNMF). The model combines a heterogeneous graph, a graph neural network encoder, a Beta process decoder, and Wasserstein distance regularization to achieve adaptive topic number selection, semantic alignment, and fair topic discovery. It uses word embeddings generated by pretrained language models to construct a heterogeneous graph over words, short texts, and long texts, propagates high-order semantics through graph neural networks, and applies multilayer nonnegative matrix factorization with Beta process priors to learn interpretable topic distributions. Optimal transport constraints are added to keep the topic distribution consistent with the embedding space. The authors report that, in extensive experiments on both Chinese and English short text datasets, the method significantly outperforms existing models in classification accuracy, NMI, topic coherence, and perplexity, which supports its advantages and interpretability in semantically sparse scenarios.
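The optimal transport regularization mentioned in the description can be illustrated with a minimal sketch. This is not the GN-DBNMF implementation: it only shows how an entropy-regularized transport cost (Sinkhorn iterations) between two word distributions can be computed when the ground cost comes from embedding distances. The function name `sinkhorn_cost`, the parameters `reg` and `n_iters`, and the toy three-word embeddings are all assumptions made for illustration.

```python
import numpy as np

def sinkhorn_cost(a, b, cost, reg=0.5, n_iters=200):
    """Entropy-regularized OT between histograms a and b (hypothetical helper).

    a, b : strictly positive 1-D arrays that each sum to 1
    cost : pairwise ground-cost matrix, shape (len(a), len(b))
    """
    K = np.exp(-cost / reg)                 # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                   # rescale to match column marginals
        u = a / (K @ v)                     # rescale to match row marginals
    plan = u[:, None] * K * v[None, :]      # approximate transport plan
    return float(np.sum(plan * cost)), plan

# Toy vocabulary of three words with made-up 2-D embeddings (assumption).
emb = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
diff = emb[:, None, :] - emb[None, :, :]
cost = np.sum(diff ** 2, axis=-1)           # squared Euclidean ground cost

topic_a = np.array([0.7, 0.2, 0.1])         # word distribution of one topic
topic_b = np.array([0.1, 0.2, 0.7])         # a distribution with shifted mass

cost_same, _ = sinkhorn_cost(topic_a, topic_a, cost)
cost_diff, plan = sinkhorn_cost(topic_a, topic_b, cost)
print(cost_same, cost_diff)
```

In a setup like this, a distribution whose mass sits on different regions of the embedding space incurs a larger transport cost, so a penalty of this kind can pull learned topic-word distributions toward the embedding geometry. How the paper weights or differentiates through such a term is not stated in the description.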

Provider:

Bin Zhao
