Instruction-Following Dataset
Source: arXiv. Indexed: 2025-09-30
Download link:
https://pjlab-adg.github.io/LeapVAD/
Description:
This dataset was built for supervised fine-tuning of vision-language models (VLMs). It combines data from Rank2Tell, DriveLM, and CARLA and focuses on key-object attributes and multi-view dialogue. The dialogues consist of unique question-answer pairs built on a standardized reference format for key objects, covering semantic, spatial, motion, and importance attributes. The dataset contains 5,000 multi-view summary samples and 2,000 multi-frame summary samples.
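To make the question-answer structure more concrete, here is a minimal Python sketch of how one such pair might be represented and how attribute coverage could be checked. The listing does not specify the actual file schema or the reference format, so the `QAPair` fields, the `<obj_1>` reference string, and the sample questions are all hypothetical. Only the four attribute groups come from the description.

```python
from collections import Counter
from dataclasses import dataclass

# The four attribute groups named in the dataset description.
ATTRIBUTE_GROUPS = ("semantic", "spatial", "motion", "importance")


@dataclass
class QAPair:
    """Hypothetical record layout; the real dataset schema may differ."""
    object_ref: str  # assumed key-object reference string, e.g. "<obj_1>"
    attribute: str   # one of ATTRIBUTE_GROUPS
    question: str
    answer: str

    def __post_init__(self):
        # Reject attribute labels outside the four described groups.
        if self.attribute not in ATTRIBUTE_GROUPS:
            raise ValueError(f"unknown attribute group: {self.attribute}")


def attribute_coverage(pairs):
    """Count QA pairs per attribute group, including groups with zero pairs."""
    counts = Counter(p.attribute for p in pairs)
    return {group: counts.get(group, 0) for group in ATTRIBUTE_GROUPS}


# Made-up example content, for illustration only.
sample = [
    QAPair("<obj_1>", "semantic", "What is <obj_1>?", "A pedestrian."),
    QAPair("<obj_1>", "motion", "Is <obj_1> moving?", "Yes, crossing the road."),
]
print(attribute_coverage(sample))
```

A per-group count like this is one simple way to check that a fine-tuning subset still covers all four attribute types after filtering.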



