Open-Ended PitVQA
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/HRL-Mike/PitVQA-Plus
Resource description:
This dataset contains 101,803 frames from 25 surgical procedure videos and 745,972 question-answer pairs covering key surgical elements. It covers various aspects of pituitary surgery, including tool detection and localization. Its tasks focus primarily on visual question answering (VQA).



