
Multi-modal multi-task internet sentiment recognition model based on mid-to-back fusion strategy

Source: Figshare. Updated: 2025-12-15. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/_b_Multi-modal_multi-task_b_b_internet_sentiment_recognition_model_based_on_mid-to-back_fusion_strategy_b_/30885701
Description:
Previous studies in multimodal emotion recognition have not adequately addressed the disparity between unimodal expression and multimodal perception in emotion labeling, nor have they optimized multimodal fusion strategies. This paper introduces a multimodal multi-task emotion recognition model based on a mid-to-late fusion strategy. The model uses BERT, Wav2Vec 2.0, and CLIP to extract features from the text, audio, and visual modalities, respectively. These features are further refined through convolutional layers and self-attention mechanisms. Independent unimodal emotion labels are used to train unimodal models, which optimizes the unimodal emotional representations. Multimodal representations are integrated using a multi-head attention mechanism, and disparities between unimodal and multimodal presentations are captured through a multi-task learning strategy. Experiments on the CH-SIMS Chinese dataset demonstrate that the proposed model significantly outperforms baseline models, with a 3% increase in accuracy and a 1.74% increase in F1 score. Despite these gains, the model still has room for improvement in handling dataset diversity, the complexity of real-world applications, non-standard language expressions, and cross-cultural emotion understanding. The proposed model not only improves the effectiveness of multimodal fusion but also offers a new perspective for the development of multimodal emotion recognition technology.
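The fusion and multi-task steps described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names (`attention_fuse`, `multitask_loss`), the identity query/key/value projections, and the `unimodal_weight` value are all assumptions made for illustration. A real model would use learned projections and encoder outputs from BERT, Wav2Vec 2.0, and CLIP rather than hand-written vectors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(tokens, num_heads):
    """Multi-head self-attention over one token per modality.

    Assumption: identity query/key/value projections, so each head
    simply attends over its own slice of the feature vectors.
    """
    dim = len(tokens[0])
    if dim % num_heads != 0:
        raise ValueError("feature size must be divisible by num_heads")
    head_dim = dim // num_heads
    fused = [[] for _ in tokens]
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        slices = [t[lo:hi] for t in tokens]
        for i, query in enumerate(slices):
            # Scaled dot-product scores of this modality against all modalities.
            scores = [
                sum(q * k for q, k in zip(query, key)) / math.sqrt(head_dim)
                for key in slices
            ]
            weights = softmax(scores)
            # Weighted mix of the value slices, concatenated across heads.
            mixed = [
                sum(w * value[d] for w, value in zip(weights, slices))
                for d in range(head_dim)
            ]
            fused[i].extend(mixed)
    return fused


def multitask_loss(multimodal_loss, unimodal_losses, unimodal_weight=0.5):
    """Joint objective: multimodal loss plus weighted unimodal losses.

    Assumption: a single shared weight for all unimodal tasks; the
    weighting scheme used in the paper is not stated in this summary.
    """
    return multimodal_loss + unimodal_weight * sum(unimodal_losses.values())


# Hypothetical 4-dimensional features for text, audio, and vision.
text, audio, vision = [0.1, 0.3, -0.2, 0.5], [0.0, 0.4, 0.1, -0.1], [0.2, -0.3, 0.6, 0.0]
fused = attention_fuse([text, audio, vision], num_heads=2)
loss = multitask_loss(0.8, {"text": 0.3, "audio": 0.5, "vision": 0.4})
```

Each fused token keeps the original feature size, so it can feed a shared classifier head, while the per-modality losses keep the unimodal representations tied to their own labels.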
Created:
2025-12-15