Dual-Teacher Multimodal Fusion for Depression Detection
Source: Science Data Bank. Updated 2025-09-02; indexed 2026-04-23.
Download link:
https://www.scidb.cn/detail?dataSetId=6f2db2d87fb942acbc9d30f3f4c06f13
Resource description:
Depression affects over 264 million people worldwide and is a leading cause of disability. Existing machine learning approaches for analyzing behavioral data (including electronic health records, wearable devices, and social media) remain constrained by unimodal paradigms. Language models often ignore paralinguistic biomarkers such as speech prosody, while audio-based methods overlook semantic context, so cross-modal interactions go unexploited. Here, we introduce a Teacher-Student Architecture-based Multimodal Fusion Network (TSA-MFN) for depression classification. Two specialized teacher models extract high-confidence text semantics and audio biomarkers, and a student network integrates them through multi-head attention. Fusion weights are computed dynamically from learnable similarity matrices, and a hybrid loss function balances knowledge transfer against classification optimization, enabling phased teacher-student synergy during training. TSA-MFN achieves a 99.1% F1-score on the DAIC-WOZ dataset, surpassing unimodal baselines by 18.7% and conventional fusion models by 12.4%, with ablation studies confirming the contribution of each component. By combining dual-teacher guidance with adaptive similarity-based fusion, the framework offers an approach to interpretable and balanced multimodal learning in mental health analytics.
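The description gives no equations for the fusion weights or the hybrid loss, so the following is only a minimal sketch of how such components are commonly written, not the authors' implementation. It assumes a bilinear similarity score passed through a sigmoid for the two modality weights, and a standard distillation-style loss (temperature-softened KL divergence averaged over both teachers, mixed with cross-entropy). The names `fusion_weights`, `fuse`, `hybrid_loss`, `sim_matrix`, `alpha`, and `temperature` are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def fusion_weights(text_feat, audio_feat, sim_matrix):
    """Return (w_text, w_audio) from a bilinear similarity s = t^T W a.

    Assumption: the similarity is squashed with a sigmoid so the two
    weights are positive and sum to 1. sim_matrix (W) would be learned.
    """
    s = sum(t * sim_matrix[i][j] * a
            for i, t in enumerate(text_feat)
            for j, a in enumerate(audio_feat))
    if s >= 0:
        w_text = 1.0 / (1.0 + math.exp(-s))
    else:  # avoid overflow in exp for very negative s
        e = math.exp(s)
        w_text = e / (1.0 + e)
    return w_text, 1.0 - w_text


def fuse(text_feat, audio_feat, sim_matrix):
    """Weighted sum of two equal-length modality feature vectors."""
    w_text, w_audio = fusion_weights(text_feat, audio_feat, sim_matrix)
    return [w_text * t + w_audio * a for t, a in zip(text_feat, audio_feat)]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps))
               for pi, qi in zip(p, q))


def hybrid_loss(student_logits, text_teacher_logits, audio_teacher_logits,
                label, alpha=0.5, temperature=2.0):
    """alpha * distillation term + (1 - alpha) * cross-entropy, one sample.

    Assumption: both teachers contribute equally to the distillation term.
    """
    student_soft = softmax(student_logits, temperature)
    kd = 0.5 * (
        kl_divergence(softmax(text_teacher_logits, temperature), student_soft)
        + kl_divergence(softmax(audio_teacher_logits, temperature), student_soft)
    )
    ce = -math.log(softmax(student_logits)[label] + 1e-12)
    # temperature**2 rescales the softened term, as is usual in distillation
    return alpha * temperature ** 2 * kd + (1.0 - alpha) * ce
```

In this sketch, a phased schedule could be approximated by changing `alpha` over training, for example starting near 1 to emphasize the teachers and lowering it later; the description does not say how the phases are actually defined.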
Provider:
School of Information and Intelligence Engineering, University of Sanya, Sanya, Hainan, 572000, China
Created:
2025-08-27



