Ablation study of different modules in the model.
Source: Figshare. Updated: 2026-01-23. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/_p_Ablation_study_of_different_modules_in_the_model_p_/31137499
Description:
Accurate segmentation of salient objects is crucial for computer vision applications such as image editing, autonomous driving, and object detection. Although the use of depth information (RGB-D) in saliency detection is attracting growing attention, its broad application is limited by the dependence on depth sensors and by the difficulty of integrating RGB and depth information effectively. To address these issues, we propose a salient object segmentation method that combines the Segment Anything Model (SAM), depth information, and a cross-modal attention mechanism. The approach uses SAM for robust feature extraction and a pre-trained depth estimation network to capture geometric information. A cross-modal attention mechanism dynamically fuses the RGB and depth features, which improves the handling of diverse scenes. Lightweight LoRA fine-tuning with frozen pre-trained weights keeps the method computationally efficient without compromising precision. A UNet decoder refines the segmentation output and preserves target boundary details in high-resolution outputs. Experiments on five challenging benchmark datasets validate the method: results show significant improvements over existing methods on key evaluation metrics, including MaxF, MAE, and S-measure. The method is particularly strong and robust on complex backgrounds, small targets, and scenes with multiple salient objects. This work advances the use of depth-guided RGB in salient object segmentation and offers new insights into overcoming depth sensor dependency. It also opens new pathways for the effective fusion of cross-modal information, contributing to the broader development and diversification of related technologies and their applications.
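For readers comparing rows of the ablation table, two of the metrics named in the description (MAE and MaxF) can be sketched as below. This is a minimal illustration and not the authors' evaluation code. The function names `mae` and `max_f_measure`, the 256-step threshold sweep, and the weighting `beta_sq = 0.3` (the value conventionally used in salient object detection) are assumptions, since the description does not specify them. S-measure is omitted because its definition is considerably longer.

```python
from typing import Sequence

# A saliency map is assumed to be a 2-D grid of values in [0, 1].
Map = Sequence[Sequence[float]]


def mae(pred: Map, gt: Map) -> float:
    """Mean absolute error between a predicted map and the ground-truth mask."""
    total = 0.0
    count = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            total += abs(p - g)
            count += 1
    return total / count


def max_f_measure(pred: Map, gt: Map, beta_sq: float = 0.3, steps: int = 256) -> float:
    """Largest F-beta score over a sweep of binarization thresholds.

    beta_sq and steps are assumed defaults, not values taken from the source.
    """
    flat_pred = [p for row in pred for p in row]
    flat_gt = [g >= 0.5 for row in gt for g in row]
    positives = sum(flat_gt)
    best = 0.0
    for i in range(steps):
        threshold = i / (steps - 1)
        predicted = sum(1 for p in flat_pred if p >= threshold)
        tp = sum(1 for p, g in zip(flat_pred, flat_gt) if p >= threshold and g)
        if tp == 0:
            # Precision and recall are both zero (or undefined); skip this threshold.
            continue
        precision = tp / predicted
        recall = tp / positives
        f = (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
        best = max(best, f)
    return best


# Toy 2x2 example (hypothetical values, for illustration only).
example_pred = [[0.9, 0.2], [0.6, 0.1]]
example_gt = [[1.0, 0.0], [1.0, 0.0]]
example_mae = mae(example_pred, example_gt)
example_max_f = max_f_measure(example_pred, example_gt)
```

On the toy maps above, `example_mae` is approximately 0.2 and `example_max_f` is 1.0, because a threshold between 0.2 and 0.6 separates the two foreground pixels perfectly. Lower MAE and higher MaxF indicate better segmentation.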
Created:
2026-01-23



