Surgery-R1-54k
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/FiFi-HAO467/Surgery-R1
Resource description:
Surgery-R1-54k is a dataset of paired visual question answering (VQA), localization question answering, and chain-of-thought (CoT) data, intended to improve reasoning in surgical scene understanding. It is built on the EndoVis-18-VQLA and EndoVis-17-VQLA datasets, generated with the qwen2.5-vl-72b-instruct API, and reviewed by experts for accuracy. The dataset contains 12,255 complete chain-of-thought samples, 33,342 VQA pairs, and 8,902 localization question answering pairs. Its target task is visual question localized-answering in surgery (Surgical-VQLA).
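The "54k" in the name appears to follow from the three subset sizes quoted in the description. The following minimal sketch adds them up and reports each subset's share. It assumes the three subsets are counted separately (the description does not say whether they overlap), and the dictionary keys are hypothetical labels chosen for illustration, not field names from the repository.

```python
# Subset sizes as quoted in the dataset description.
# The key names are illustrative labels, not official identifiers.
SUBSET_COUNTS = {
    "chain_of_thought": 12_255,
    "vqa_pairs": 33_342,
    "localization_qa_pairs": 8_902,
}


def summarize(counts):
    """Return the total sample count and each subset's fraction of it."""
    total = sum(counts.values())
    shares = {name: n / total for name, n in counts.items()}
    return total, shares


# Assumption: the subsets are disjoint, so the total is a plain sum.
total, shares = summarize(SUBSET_COUNTS)
print(f"total samples: {total}")  # 54499, i.e. roughly 54k
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Under that assumption, VQA pairs make up about three fifths of the data, with chain-of-thought samples and localization pairs sharing the remainder.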



