EndoVis-17-VQLA
Source: arXiv. Indexed: 2025-09-30.
Download link:
https://github.com/lalithjets/surgical_vqa
Resource overview:
This is an external validation dataset built from the 2017 MICCAI Endoscopic Vision Challenge. It provides question-answer-bounding box annotations for selected frames. To test how well models generalize, the selected frames are also annotated with common instruments and their interactions. The dataset contains 97 frames and 472 question-answer pairs. The task is visual question localized-answering: answering a question about a frame and localizing the answer in the image.
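As a rough illustration of how a question-answer-bounding box annotation could be represented in code, the sketch below defines a hypothetical record type and an intersection-over-union helper of the kind commonly used to compare a predicted box with a labelled one. The class name `VQLASample`, its field names, the `(x1, y1, x2, y2)` box convention, and the placeholder values are assumptions for illustration only, not the dataset's actual file format or official metric. Only the counts of 97 frames and 472 question-answer pairs come from the description above.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed box convention: (x1, y1, x2, y2) corner coordinates in pixels.
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class VQLASample:
    """Hypothetical container for one question-answer-bounding box annotation."""
    frame_id: str
    question: str
    answer: str
    bbox: Box


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Counts stated in the dataset description.
NUM_FRAMES = 97
NUM_QA_PAIRS = 472
AVG_PAIRS_PER_FRAME = NUM_QA_PAIRS / NUM_FRAMES  # roughly 4.87 pairs per frame

# Placeholder values, not real annotations from the dataset.
sample = VQLASample(
    frame_id="frame_placeholder",
    question="(placeholder question)",
    answer="(placeholder answer)",
    bbox=(10.0, 20.0, 110.0, 120.0),
)
predicted_box: Box = (60.0, 20.0, 160.0, 120.0)
iou = box_iou(sample.bbox, predicted_box)
```

Here the two boxes overlap on half of their width, so the overlap area is one third of the union. A real loader would need to follow whatever annotation layout the repository linked above actually uses.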
Provided by:
Publicly accessible with official code implementation
Collected summary
Dataset introduction

Background and challenges
Background overview
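EndoVis-17-VQLA is a surgical visual question answering dataset with answers in both classification and sentence form. It is intended for answering questions about surgical scenes with vision-text Transformer models. The dataset is derived from the MICCAI Endoscopic Vision Challenge 2017 data and supports a range of training and evaluation tasks.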
The content above was collected and summarized by 遇见数据集.



