UML: uncertainty-aware and mutual learning for noise-robust cross-lingual cross-modal retrieval

China Scientific Data (中国科学数据) · Updated 2026-01-15 · Indexed 2026-04-25

Download link:

https://www.sciengine.com/AA/doi/10.1007/s11432-024-4696-2

Resource description:

Cross-lingual cross-modal retrieval (CCR) has recently emerged as a significant research area, focusing on aligning visual content with non-English captions without relying on human-annotated non-English cross-modal data pairs. Most CCR methods extend existing English-only datasets to other languages via machine translation (MT) to establish correspondence between vision and non-English text. Regrettably, these cheaply collected datasets inevitably contain numerous mismatched vision and non-English data pairs, a.k.a. noisy correspondence (NC). The presence of NC renders the supervision information unreliable, leading to a significant decline in retrieval performance. Furthermore, most existing methods attempt to improve alignment between visual and non-English representations by combining information from multiple views. However, these approaches often overlook the need for consistency across these views and capture view-specific, task-irrelevant information, which exacerbates bias in the optimization direction. To address these issues, we propose an uncertainty-aware and mutual learning (UML) framework, which integrates a novel dual-view uncertainty-aware learning (DUL) paradigm and an efficient adaptive mutual learning (AML) loss. The DUL models alignment uncertainty to assess and mitigate the effects of NC. Specifically, it employs evidential deep learning to obtain accurate cross-modal alignment uncertainty, which is then combined with labels softened by Fisher information to impose appropriate penalties for retrieval. To mitigate the bias exacerbation problem, we derive the AML loss, which aims to ensure effective aggregation between all modalities of a clean pair while separating the non-English representation of a noisy pair from its visual and English representations. Our UML consistently outperforms previous methods in supervised, domain generalization, and robustness settings across three challenging benchmarks.
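
As a rough illustration of the evidential deep learning idea mentioned in the description, the sketch below turns one query's similarity scores against K candidates into non-negative evidence, Dirichlet parameters, and a scalar uncertainty, following the standard subjective-logic formulation (alpha = evidence + 1, uncertainty = K / sum(alpha)). It is not the paper's DUL implementation: the function name `evidential_uncertainty`, the softplus evidence mapping, and the `scale` parameter are assumptions made for this example, and the Fisher-information label softening is not covered.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def evidential_uncertainty(similarities, scale=10.0):
    """Map K similarity scores to per-candidate belief masses and one uncertainty.

    Hypothetical helper: the evidence function (softplus) and `scale`
    are illustrative assumptions, not taken from the UML paper.
    """
    k = len(similarities)
    # Non-negative evidence for each candidate match.
    evidence = [softplus(scale * s) for s in similarities]
    # Dirichlet concentration parameters.
    alpha = [e + 1.0 for e in evidence]
    strength = sum(alpha)
    # Belief masses and uncertainty sum to 1 by construction.
    belief = [e / strength for e in evidence]
    uncertainty = k / strength
    return belief, uncertainty


# One candidate clearly matches: most mass goes to it, uncertainty is lower.
clean_belief, clean_u = evidential_uncertainty([0.9, 0.05, 0.0, -0.1])
# No candidate stands out, as might happen for a mistranslated caption.
noisy_belief, noisy_u = evidential_uncertainty([0.0, 0.0, 0.0, 0.0])

print(f"clean pair uncertainty: {clean_u:.3f}")
print(f"noisy pair uncertainty: {noisy_u:.3f}")
```

In this toy setup the query with no clear match receives the higher uncertainty, which is the kind of signal a noise-robust loss could use to down-weight a suspect pair.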

Created:

2025-12-03