Kvasir-VQA-x1

Name: Kvasir-VQA-x1
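Creator: Oslo Metropolitan University (Norway), Simula Metropolitan Center for Digital Engineering (Norway), Simula Research Laboratory (Norway)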
Published: 2025-06-12 01:31:38
License: not specified

Updated on arXiv: 2025-06-12. Indexed: 2025-11-28

Download link:

https://hf-mirror.com/datasets/SimulaMet/Kvasir-VQA-x1

Resource description:

Kvasir-VQA-x1 is a large-scale medical visual question answering (VQA) dataset for gastrointestinal endoscopy. It builds on the original Kvasir-VQA dataset and adds 159,549 new question-answer pairs designed to test deeper medical reasoning. The questions are generated with large language models (LLMs) and stratified by complexity so that a model's reasoning ability can be assessed at each level. To prepare models for real-world clinical scenarios, the dataset also introduces a range of visual augmentations that simulate common imaging artifacts. Its structure supports two main evaluation tracks: one for standard VQA performance and one for robustness to these visual perturbations. By providing a more challenging and clinically relevant benchmark, Kvasir-VQA-x1 aims to accelerate the development of more reliable and effective multimodal AI systems for clinical settings. The dataset is fully accessible and follows the FAIR data principles, making it a valuable resource for the broader research community.
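
The two evaluation tracks and the complexity stratification described above could be scored along the following lines. This is only a minimal sketch under stated assumptions: the record fields (`track`, `complexity`, `answer`, `prediction`), the track names `original` and `transformed`, and the toy records are all hypothetical and do not reflect the dataset's actual schema. Exact-match accuracy is used purely for illustration; the benchmark's answers may well call for a different metric.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def accuracy_by_group(records, key):
    """Exact-match accuracy, grouped by the value of records[i][key]."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for record in records:
        group = record[key]
        totals[group] += 1
        if normalize(record["prediction"]) == normalize(record["answer"]):
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


# Hypothetical toy records; field names and values are assumptions, not real data.
records = [
    {"track": "original", "complexity": 1, "answer": "Polyp", "prediction": "polyp"},
    {"track": "original", "complexity": 2, "answer": "no instrument visible",
     "prediction": "No  instrument visible"},
    {"track": "transformed", "complexity": 1, "answer": "Polyp", "prediction": "polyp"},
    {"track": "transformed", "complexity": 2, "answer": "no instrument visible",
     "prediction": "forceps visible"},
]

by_track = accuracy_by_group(records, "track")
by_complexity = accuracy_by_group(records, "complexity")

# Robustness gap: how much accuracy is lost when the images are perturbed.
robustness_drop = by_track["original"] - by_track["transformed"]

print(by_track)
print(by_complexity)
print(robustness_drop)
```

Reporting accuracy per complexity level next to the gap between tracks keeps the two questions separate: how well a model reasons, and how much of that survives imaging artifacts.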

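Provided by:

Oslo Metropolitan University (Norway), Simula Metropolitan Center for Digital Engineering (Norway), Simula Research Laboratory (Norway)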

Created:

2025-06-12
