WorldMedQA-V
Source: ModelScope community (updated 2025-12-05, indexed 2025-11-15)
Download link:
https://modelscope.cn/datasets/evalscope/WorldMedQA-V
Resource description:
# WorldMedQA-V: A Multilingual, Multimodal Medical Examination Dataset
<img src="src/logo.png" alt="logo" width="200"/>
## Overview
**WorldMedQA-V** is a multilingual and multimodal benchmarking dataset designed to evaluate vision-language models (VLMs) in healthcare contexts. The dataset includes medical examination questions from four countries—Brazil, Israel, Japan, and Spain—in both their original languages and English translations. Each multiple-choice question is paired with a corresponding medical image, enabling the evaluation of VLMs on multimodal data.
**Key Features:**
- **Multilingual:** Supports local languages (Portuguese, Hebrew, Japanese, and Spanish) as well as English translations.
- **Multimodal:** Each question is accompanied by a medical image, allowing for a comprehensive assessment of VLMs' performance on both textual and visual inputs.
- **Clinically Validated:** All questions and answers have been reviewed and validated by native-speaking clinicians from the respective countries.
## Dataset Details
- **Number of Questions:** 568
- **Countries Covered:** Brazil, Israel, Japan, Spain
- **Languages:** Portuguese, Hebrew, Japanese, Spanish, and English
- **Types of Data:** Multiple-choice questions with medical images
- **Evaluation:** Performance of models in both local languages and English, with and without medical images
The dataset aims to bridge the gap between real-world healthcare settings and AI evaluations, fostering more equitable, effective, and representative applications.
## Data Structure
The dataset is provided in TSV format, with the following structure:
- **ID**: Unique identifier for each question.
- **Question**: The medical multiple-choice question in the local language.
- **Options**: List of possible answers (A-D).
- **Correct Answer**: The correct answer's label.
- **Image Path**: Path to the corresponding medical image (if applicable).
- **Language**: The language of the question (original or English translation).
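As a rough illustration of the fields listed above, the following sketch reads rows of that shape with Python's standard `csv` module. The header spellings, the `|`-separated encoding of the options, and the sample rows are all assumptions made for the example; the actual TSV files may name and encode these columns differently.

```python
import csv
import io
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Question:
    id: str
    question: str
    options: Dict[str, str]      # label -> answer text
    correct_answer: str
    image_path: Optional[str]    # None when the row has no image
    language: str


def parse_options(raw: str) -> Dict[str, str]:
    """Split 'A: x|B: y' into {'A': 'x', 'B': 'y'} (assumed encoding)."""
    options = {}
    for part in raw.split("|"):
        label, _, text = part.partition(":")
        options[label.strip()] = text.strip()
    return options


def read_questions(handle) -> List[Question]:
    """Read tab-separated rows whose header matches the assumed field names."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        Question(
            id=row["ID"],
            question=row["Question"],
            options=parse_options(row["Options"]),
            correct_answer=row["Correct Answer"].strip(),
            image_path=row["Image Path"] or None,
            language=row["Language"],
        )
        for row in reader
    ]


# Placeholder rows for illustration only; they are not taken from the dataset.
SAMPLE_TSV = (
    "ID\tQuestion\tOptions\tCorrect Answer\tImage Path\tLanguage\n"
    "q1\tPlaceholder question\tA: first|B: second|C: third|D: fourth\tB\timages/q1.png\tEnglish\n"
    "q2\tPlaceholder question\tA: um|B: dois|C: tres|D: quatro\tD\t\tPortuguese\n"
)

questions = read_questions(io.StringIO(SAMPLE_TSV))
for q in questions:
    print(q.id, q.language, q.correct_answer, q.image_path)
```

Mapping an empty **Image Path** to `None` keeps the "if applicable" case explicit, so downstream code can decide whether to run a text-only evaluation for that row.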
### Example from Brazil:
- **Question**: Um paciente do sexo masculino, 55 anos de idade, tabagista 60 maços/ano... [see below for the full medical question]
- **Options**:
  - A: Aspergilose pulmonar
  - B: Carcinoma pulmonar
  - C: Tuberculose cavitária
  - D: Bronquiectasia com infecção
- **Correct Answer**: B
<img src="src/example.png" alt="example" width="800"/>
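To make the example above concrete, here is a minimal sketch of how a question and its options might be turned into a text prompt for a VLM. The function name `build_prompt` and the instruction wording are hypothetical and are not the prompt template used by the dataset authors; the question text is kept truncated exactly as it appears above.

```python
from typing import Dict


def build_prompt(question: str, options: Dict[str, str], with_image: bool = True) -> str:
    """Format one multiple-choice item as plain text (hypothetical template)."""
    lines = []
    if with_image:
        # The image itself is passed to the model separately; this line only
        # tells the model that one is attached.
        lines.append("Refer to the attached image.")
    lines.append(question)
    for label in sorted(options):
        lines.append(f"{label}. {options[label]}")
    lines.append("Answer with a single letter (A, B, C, or D).")
    return "\n".join(lines)


prompt = build_prompt(
    "Um paciente do sexo masculino, 55 anos de idade, tabagista 60 maços/ano...",
    {
        "A": "Aspergilose pulmonar",
        "B": "Carcinoma pulmonar",
        "C": "Tuberculose cavitária",
        "D": "Bronquiectasia com infecção",
    },
)
print(prompt)
```

Passing `with_image=False` drops the image hint, which is one simple way to produce the text-only variant of the same question.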
### Model evaluation results:
<img src="src/results.png" alt="results" width="800"/>
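Results of this kind are typically reported as accuracy per language and per image condition. The sketch below shows one way to compute such a breakdown from a list of predictions; the record keys (`language`, `with_image`, `predicted`, `correct`) are assumptions for the example, and the toy records are placeholders, not results from the benchmark.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_by_condition(records: Iterable[dict]) -> Dict[Tuple[str, bool], float]:
    """Return accuracy keyed by (language, with_image)."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for record in records:
        key = (record["language"], record["with_image"])
        totals[key] += 1
        hits[key] += int(record["predicted"] == record["correct"])
    return {key: hits[key] / totals[key] for key in totals}


# Toy records for illustration only; these are not real model outputs.
records = [
    {"language": "Portuguese", "with_image": True, "predicted": "B", "correct": "B"},
    {"language": "Portuguese", "with_image": True, "predicted": "A", "correct": "C"},
    {"language": "English", "with_image": True, "predicted": "B", "correct": "B"},
    {"language": "English", "with_image": False, "predicted": "D", "correct": "B"},
]

scores = accuracy_by_condition(records)
for (language, with_image), accuracy in sorted(scores.items()):
    print(f"{language:<12} image={with_image!s:<5} accuracy={accuracy:.2f}")
```

Keeping the image condition in the key makes it easy to compare the same language with and without images side by side.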
## Download and Usage
The dataset can be downloaded from the [Hugging Face datasets page](https://huggingface.co/datasets/WorldMedQA/V). All code for handling and evaluating the dataset is available in the following repositories:
- **Dataset Code**: [WorldMedQA GitHub repository](https://github.com/WorldMedQA/V)
- **Evaluation Code**: [VLMEvalKit GitHub repository](https://github.com/WorldMedQA/VLMEvalKit/tree/main)
**Where and how to start?**: [Google Colab Demo](https://colab.research.google.com/drive/16bw_7_sUTajNRZFunRNo3wqnL_tQWk6O)
## Citation
Please cite this dataset using our arXiv preprint:
```bibtex
@misc{WorldMedQA-V2024,
title={WorldMedQA-V: a multilingual, multimodal medical examination dataset for multimodal language models evaluation},
author={João Matos and Shan Chen and Siena Placino and Yingya Li and Juan Carlos Climent Pardo and Daphna Idan and Takeshi Tohyama and David Restrepo and Luis F. Nakayama and Jose M. M. Pascual-Leone and Guergana Savova and Hugo Aerts and Leo A. Celi and A. Ian Wong and Danielle S. Bitterman and Jack Gallifant},
year={2024},
eprint={2410.12722},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2410.12722},
}
```
Provider: maas
Created: 2025-11-07



