Statistical robustness analysis.
Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://figshare.com/articles/dataset/Statistical_robustness_analysis_/30333155
Description:
Vision-language pre-training (VLP) methods have significantly advanced cross-modal tasks in recent years. However, image-text retrieval still faces two critical challenges that limit its accuracy: deficient inter-modal matching and deficient intra-modal fine-grained localization. To address these challenges, we propose a novel dual-stage training framework. In the first stage, we employ Soft Label Distillation (SLD) to align the contrastive relationships between images and texts by mitigating the overfitting caused by hard labels. In the second stage, we introduce Spatial Text Prompt (STP), which incorporates spatial prompt information to enhance the model's visual grounding capabilities and thereby achieve more precise fine-grained alignment. Extensive experiments on standard datasets show that our method outperforms state-of-the-art approaches in image-text retrieval. The code and supplementary files can be found at https://github.com/Leon001211/DSSLP.
Created:
2025-10-10



