Cross-Domain Data Alignment and Multi-Task Driven Language-Image Pre-Training for Radiology Tasks

Name: Cross-Domain Data Alignment and Multi-Task Driven Language-Image Pre-Training for Radiology Tasks
Creator: IEEE DataPort
Published: 2024-07-15 03:20:01
License: No description available

DataCite Commons | Updated: 2024-07-15 | Indexed: 2025-04-16

Download link:

https://ieee-dataport.org/documents/cross-domain-data-alignment-and-multi-task-driven-language-image-pre-training-radiology

Description:

Medical Vision-Language Pre-training (Medical-VLP) has advanced rapidly by learning representations from radiology images paired with reports. Nevertheless, two issues still restrict its development: the scarcity of parallel image-report pairs and the monotony of pre-training tasks. To address them, we propose the Multi-Grained Cross-Domain Report Searching (CDRS) strategy and the Multi-Task Driven Language-Image Pre-Training (MLIP) framework. CDRS pairs report-less radiology images with suitable reports through collaborative alignment training that bridges cross-domain data at three levels: in-domain cross-modal, out-of-domain self-supervised, and patch-wise. MLIP uses a unified model to compute multiple training objectives with minimal overhead by sharing the same computational graph. It integrates the single-encoder, dual-encoder, and encoder-decoder paradigms, so the decoupled model can perform both discriminative and generative tasks, such as report generation and medical visual question answering. Extensive experiments and analyses demonstrate that our method achieves state-of-the-art performance across multiple discriminative and generative medical downstream tasks.
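The description does not specify how report searching is implemented. As a rough illustration of the general idea only (assigning a report to a report-less image by similarity in a shared embedding space), the sketch below does a single-granularity nearest-neighbour search with cosine similarity. It is not the authors' CDRS procedure: the function names `cosine` and `search_reports`, the `min_similarity` threshold, and the toy embedding vectors are all hypothetical, and a real system would obtain embeddings from trained image and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def search_reports(image_embeddings, report_embeddings, min_similarity=0.0):
    """For each image embedding, return (index of best report, similarity).

    Hypothetical helper: an image whose best similarity does not exceed
    min_similarity is left unmatched and reported as (None, min_similarity).
    """
    matches = []
    for image_vec in image_embeddings:
        best_index, best_score = None, min_similarity
        for index, report_vec in enumerate(report_embeddings):
            score = cosine(image_vec, report_vec)
            if score > best_score:
                best_index, best_score = index, score
        matches.append((best_index, best_score))
    return matches


# Toy, hand-written embeddings (assumed; stand-ins for encoder outputs).
images = [[1.0, 0.0], [0.0, 2.0], [-1.0, -1.0]]
reports = [[0.9, 0.1], [0.1, 0.9]]
matches = search_reports(images, reports)
```

In this toy run the first two images are assigned to the first and second report respectively, while the third image has negative similarity to both reports and stays unmatched. The abstract describes alignment at several levels (cross-modal, self-supervised, patch-wise); this sketch covers none of that training and only shows the final lookup step under the stated assumptions.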

Provider:

IEEE DataPort

Created:

2024-07-15
