Target-Entity Sentiment Classification with Image-Text Multimodal Entity Alignment
Source: China Scientific Data. Updated 2026-03-16; indexed 2026-04-25.
Download link:
https://www.sciengine.com/AA/doi/10.19678/j.issn.1000-3428.0070147
Resource description:
With the growing popularity of social media, Multimodal Sentiment Classification (MSC) has received widespread attention in recent years. Target-oriented Multimodal Sentiment Classification (TMSC), an important task in multimodal sentiment analysis, aims to predict the sentiment polarity of a referred entity by combining information from multiple modalities, such as text and images. Although many modeling methods have been proposed for this task, they still cannot align entities accurately between text and images, which directly limits model accuracy on the target task. To address this problem, this study proposes a target-entity sentiment classification model with Image-Text Multimodal Entity Alignment (ITMEA). The model first uses Adjective-Noun Pairs (ANPs) extracted from an image to build sentiment auxiliary information, so that the key sentiment information of the target entity in the image is expressed more directly. In parallel, it uses the multimodal Large Language Model (LLM) LLaMA-Adapter V2 to generate feature description information, achieving accurate intermodal alignment of the target entity. In the intermodal feature fusion stage, the model also constructs a gating mechanism that dynamically controls the input of information other than text, preventing irrelevant information from introducing additional interference. Experimental results on two Twitter benchmark datasets, Twitter-2015 and Twitter-2017, show that ITMEA improves accuracy by approximately 1.00 and 0.57 percentage points, respectively, over the best-performing compared baseline, validating the effectiveness of the proposed methods.
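The gating idea mentioned in the description can be illustrated with a minimal sketch. The abstract does not give the gate's formula, so everything here is an assumption made for illustration: the function name `gated_fusion`, the sigmoid gate computed from the concatenated text and non-text features, and the residual form `text + gate * aux`. It is not the ITMEA implementation.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(text_vec, aux_vec, weights, bias):
    """Fuse a text feature vector with a non-text feature vector via a gate.

    Hypothetical formulation (not taken from the source):
        gate  = sigmoid(W @ [text; aux] + b)
        fused = text + gate * aux
    text_vec and aux_vec are lists of floats with the same length d.
    weights holds d rows, each of length 2 * d; bias has length d.
    """
    concat = text_vec + aux_vec  # list concatenation: [text; aux]
    gate = [
        sigmoid(sum(w * x for w, x in zip(row, concat)) + b)
        for row, b in zip(weights, bias)
    ]
    # A gate value near 0 suppresses the non-text signal in that dimension;
    # a value near 1 lets it through unchanged.
    fused = [t + g * a for t, g, a in zip(text_vec, gate, aux_vec)]
    return fused, gate


if __name__ == "__main__":
    text = [1.0, 2.0]
    aux = [4.0, -2.0]
    zero_weights = [[0.0] * 4, [0.0] * 4]
    # A strongly negative bias closes the gate, so the output stays close to text.
    fused, gate = gated_fusion(text, aux, zero_weights, [-50.0, -50.0])
    print(fused, gate)
```

In a trained model the weights and bias would be learned, so the gate would open only for image-derived features that are relevant to the target entity.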
Created:
2026-03-16



