Task prior attention network for multi-task learning of dense prediction
Source: 中国科学数据 · Updated: 2026-01-04 · Indexed: 2026-04-25
Download link:
https://www.sciengine.com/AA/doi/10.1007/s11432-023-4648-7
Description:
Transformer-based methods have become popular for a variety of visual perception tasks because attention gives them better global modeling. However, a plain transformer-based architecture is known to lack inductive biases, which impedes performance in multi-task learning (MTL) of dense prediction because the architecture cannot capture task-relevant prior information. To this end, we propose the task prior attention network (TPANet), which introduces task-relevant prior information into the whole architecture. TPANet consists of three tailored modules: a task prior extractor, an adaptive task mixing module, and a cross attention module. First, the task prior extractor uses convolution to introduce task-relevant prior information with inductive biases for each task, and simultaneously adapts that information to the downstream module. Second, for efficient task interaction, the adaptive task mixing module combines spatial and channel mixing to capture interactions between tasks. Third, the cross attention module queries task-specific feature maps with the task-relevant prior information via query-based attention. Our method is compatible with different backbones. With a Swin-L backbone, TPANet surpasses the previous state of the art by a large margin of +4.6 mIoU on NYUD-v2 and by +0.8 mIoU on the PASCAL-Context dataset, demonstrating the potential of our method as a robust MTL model.
Created:
2025-10-28
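
The description mentions a cross attention module that queries task-specific feature maps with task-relevant prior information. As a rough illustration of query-based attention in general, and not the authors' implementation, the sketch below computes single-head scaled dot-product cross attention in plain Python. It assumes, for illustration only, that the prior tokens act as queries and the feature tokens act as keys and values; the abstract does not specify this mapping. The names `cross_attention`, `prior_tokens`, and `feature_tokens` are hypothetical, and learned projections, multiple heads, and normalization layers are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross attention, no learned projections.

    queries: list of n_q vectors (assumed here: task-prior tokens)
    keys, values: lists of n_kv vectors (assumed here: feature tokens)
    Returns n_q vectors, each a convex combination of the value vectors.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    out_dim = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by 1/sqrt(dim).
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(out_dim)]
        outputs.append(out)
    return outputs


# Toy data (hypothetical): 2 prior tokens attend over 3 feature tokens, dim 4.
prior_tokens = [[1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0]]
feature_tokens = [[4.0, 0.0, 0.0, 0.0],
                  [0.0, 4.0, 0.0, 0.0],
                  [0.0, 0.0, 4.0, 0.0]]
attended = cross_attention(prior_tokens, feature_tokens, feature_tokens)
```

Each output row is a weighted average of the feature tokens, with the largest weight on the feature token most aligned with the corresponding prior token. A real module would typically add learned query, key, and value projections and operate on batched tensors.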



