RadVLM model
Source: DataCite Commons. Updated 2025-10-08; indexed 2026-05-04.
Download link:
https://physionet.org/content/radvlm-model/
Resource description:
We present RadVLM, a compact (7B) multitask conversational foundation model
designed for chest X-ray (CXR) interpretation. Its development relies on the curation of a
large-scale instruction dataset comprising over 1 million image-instruction
pairs containing both single-turn tasks - such as report generation,
abnormality classification, and visual grounding - and multi-turn, multi-task
conversational interactions. Our experiments show that RadVLM, fine-tuned on
this instruction dataset, achieves state-of-the-art performance in
conversational capabilities and visual grounding while remaining competitive
in other radiology tasks (report generation, classification). Ablation studies
further highlight the benefit of joint training across multiple tasks,
particularly for scenarios with limited annotated data. Together, these
findings point to the potential of the RadVLM model as a clinically relevant
AI assistant, providing structured CXR interpretation and conversational
capabilities to support more effective and accessible diagnostic workflows.
Provider:
PhysioNet
Created:
2025-10-06



