MedVH: Towards Systematic Evaluation of Hallucination for Large Vision Language Models in the Medical Context
Source: DataCite Commons | Updated: 2025-12-10 | Indexed: 2026-05-04
Download link:
https://physionet.org/content/medvh/1.0.1/
Description:
Large Vision Language Models (LVLMs) have recently achieved superior
performance in various tasks on natural image and text data, inspiring a
large number of studies on LVLM fine-tuning and training. Despite these
advancements, there has been scant research on the robustness of these models
against hallucination when they are fine-tuned on smaller datasets. In this
study, we introduce a new benchmark dataset, the Medical Visual Hallucination
evaluation benchmark (MedVH), to evaluate hallucination in domain-specific
LVLMs. MedVH comprises six tasks that evaluate hallucinations in LVLMs within
the medical context, covering comprehensive understanding of textual and
visual input as well as long textual response generation.
Provider:
PhysioNet
Created:
2025-12-02



