Medical-CXR-VQA dataset: A Large-Scale LLM-Enhanced Medical Dataset for Visual Question Answering on Chest X-Ray Images

Source: DataCite Commons. Updated: 2025-01-21. Indexed: 2025-04-16.
Download link:
https://physionet.org/content/medical-cxr-vqa-dataset/
Description:
Medical Visual Question Answering (VQA) is an important task for medical multimodal Large Language Models (LLMs). It aims to answer clinically relevant questions about input medical images. This technique has the potential to improve the efficiency of medical professionals while relieving the burden on public health systems, particularly in resource-poor countries. However, existing medical VQA datasets are small and contain only simple questions (equivalent to classification tasks) that lack semantic reasoning and clinical knowledge. Our previous work proposed a clinical knowledge-driven image difference VQA benchmark using a rule-based approach. However, at the same large-scale breadth of information coverage, the rule-based approach shows an 85% error rate on extracted labels. We trained an LLM method to extract labels with 62% increased accuracy. We also comprehensively evaluated our labels with 2 clinical experts on 100 samples to help us fine-tune the LLM. Based on the trained LLM, we propose a large-scale medical VQA dataset, Medical-CXR-VQA, which is derived from the MIMIC-CXR dataset and comprises 780,014 question-answer pairs categorized into six types: abnormality (190,525 pairs), location (104,680 pairs), type (69,486 pairs), level (111,715 pairs), view (92,048 pairs), and presence (211,560 pairs).
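As a quick sanity check on the figures quoted in the description, the following minimal Python sketch sums the six per-type counts and reports each type's share of the total. The counts are copied from the description. The names `PAIR_COUNTS`, `STATED_TOTAL`, and `summarize` are illustrative only and are not part of the dataset or of any PhysioNet tooling.

```python
# Question-answer pair counts per question type, as stated in the description.
PAIR_COUNTS = {
    "abnormality": 190_525,
    "location": 104_680,
    "type": 69_486,
    "level": 111_715,
    "view": 92_048,
    "presence": 211_560,
}

# Total number of pairs stated in the description.
STATED_TOTAL = 780_014


def summarize(counts):
    """Return the summed total and each type's fraction of that total."""
    total = sum(counts.values())
    shares = {name: n / total for name, n in counts.items()}
    return total, shares


total, shares = summarize(PAIR_COUNTS)
print(f"sum of per-type counts: {total:,} (stated: {STATED_TOTAL:,})")
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:12s} {PAIR_COUNTS[name]:>8,} {share:6.1%}")
```

The per-type counts add up to the stated total, so the six types partition the dataset with no pair counted under two types.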
Provider:
PhysioNet
Created:
2025-01-14