Multilingual Pathology Visual Question Answering Dataset

Name: Multilingual Pathology Visual Question Answering Dataset
Creator: Femi Godslove, Julius
License: No description available

Indexed from IEEE on 2026-04-17

Download link:

https://ieee-dataport.org/documents/multilingual-pathology-vusual-question-answering-dataset

Description:

The Pathology Visual Question Answering (PathVQA) dataset is a collection of 4,998 pathology images paired with 32,799 question-answer pairs. Derived from widely used pathology textbooks and digital libraries, PathVQA includes both open-ended and binary (yes/no) questions about pathology images. The original English version was obtained from the Hugging Face dataset library. To extend it to other languages, we translated the dataset into French, Hindi, and Yoruba using Neural Machine Translation (NMT) and back-translation, which was used to check consistency across the additional language versions. The dataset is intended as a resource for developing machine learning models for medical and pathology-related visual question answering tasks.
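
The back-translation step mentioned above can be illustrated with a minimal sketch. This is not the authors' pipeline: the word-level dictionaries below are hypothetical stand-ins for a real NMT system, and the token-overlap F1 score and the 0.8 threshold are assumptions chosen only for illustration. The idea is to translate a question into the target language, translate it back to English, and flag pairs whose round trip drifts too far from the original.

```python
from collections import Counter
from typing import Callable, Tuple

# Hypothetical toy dictionaries standing in for real NMT models (assumption).
EN_FR = {"is": "est", "this": "ceci", "a": "un", "tumor": "tumeur"}
FR_EN = {fr: en for en, fr in EN_FR.items()}


def toy_translate(text: str, table: dict) -> str:
    """Word-by-word lookup; unknown words pass through unchanged."""
    return " ".join(table.get(word, word) for word in text.lower().split())


def token_f1(reference: str, candidate: str) -> float:
    """Token-level F1 overlap between two strings (case-insensitive)."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    # Multiset intersection counts shared tokens with multiplicity.
    overlap = sum((Counter(ref) & Counter(cand)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def round_trip_check(
    text: str,
    forward: Callable[[str], str],
    backward: Callable[[str], str],
    threshold: float = 0.8,  # illustrative cutoff, not from the source
) -> Tuple[str, float, bool]:
    """Translate, back-translate, and score agreement with the original."""
    translated = forward(text)
    restored = backward(translated)
    score = token_f1(text, restored)
    return translated, score, score >= threshold


if __name__ == "__main__":
    question = "is this a tumor"
    result = round_trip_check(
        question,
        forward=lambda s: toy_translate(s, EN_FR),
        backward=lambda s: toy_translate(s, FR_EN),
    )
    print(result)
```

In practice the two callables would wrap a real translation model, and a semantic similarity measure may be preferable to token overlap for languages whose back-translations vary in wording.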

Provided by:

Femi Godslove, Julius
