
Graph Neural NMF Enhanced by Optimal Transport: Short Text Topic Modeling with Pretrained Language Models and Nonparametric Baye

Source: IEEE (indexed 2026-04-17)
Download link:
https://ieee-dataport.org/documents/graph-neural-nmf-enhanced-optimal-transport-short-text-topic-modeling-pretrained-language
Resource description:
Short text topic modeling is difficult because short texts have sparse vocabularies and little context; fragmented expressions on social platforms and in search logs make latent semantic mining harder still. Existing models such as LDA and single-layer nonnegative matrix factorization (NMF) perform well on long texts but lack stability and semantic depth on short texts. To address short text sparsity, the difficulty of presetting the number of topics, and semantic bias, this paper proposes a Graph Neural NMF model enhanced by optimal transport (GN-DBNMF). The model combines a heterogeneous graph, a graph neural network encoder, a Beta process decoder, and Wasserstein distance regularization to achieve adaptive selection of the number of topics, semantic alignment, and fair topic discovery. It uses word embeddings generated by pretrained language models to construct a word-text-long text heterogeneous graph, propagates high-order semantics through the graph neural network, and applies multilayer nonnegative matrix factorization with Beta process priors to learn interpretable topic distributions. Optimal transport constraints keep the topic distribution consistent with the embedding space. According to the authors, extensive experiments on both Chinese and English short text datasets show that the method significantly outperforms existing models in classification accuracy, NMI, topic coherence, and perplexity, which supports its advantages and interpretability in semantically sparse scenarios.
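For orientation, the single-layer NMF baseline that the description contrasts against can be sketched in a few lines. This is only an illustration of plain NMF with the standard multiplicative updates for the Frobenius objective; it is not the GN-DBNMF model, and it includes no graph encoder, Beta process prior, or optimal transport term. The toy vocabulary `VOCAB`, the matrix `V`, and the helper names are assumptions made up for this example.

```python
import random

# Hypothetical toy data: rows are words, columns are four short documents.
VOCAB = ["goal", "match", "team", "stock", "market", "price"]
V = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0]))
    ) ** 0.5


def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factor V (words x docs) into W (words x topics) and H (topics x docs)."""
    rng = random.Random(seed)
    n_words, n_docs = len(v), len(v[0])
    # Strictly positive random initialisation.
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_words)]
    h = [[rng.random() + 0.1 for _ in range(n_docs)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise.
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [
            [h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_docs)]
            for k in range(n_topics)
        ]
        # W <- W * (V H^T) / (W H H^T), element-wise.
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [
            [w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
            for i in range(n_words)
        ]
    return w, h


if __name__ == "__main__":
    W, H = nmf(V, n_topics=2)
    print("reconstruction error:", frobenius_error(V, W, H))
    for k in range(2):
        # Show the three highest-weighted words for each learned topic.
        top = sorted(range(len(VOCAB)), key=lambda i: -W[i][k])[:3]
        print("topic", k, [VOCAB[i] for i in top])
```

Because each document here holds only two words, the matrix is extremely sparse, which is the setting the description identifies as problematic for single-layer factorization.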
Provider:
Bin Zhao