Outside-Knowledge Conversational Video (OKCV) Dataset
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/c-patsch/OKCV
Description:
OKCV is a video-based conversational dataset that extends open-ended, knowledge-based visual question answering by adding a temporal dimension to both the input and output modalities. Participants must draw on external knowledge to produce correct, coherent conversations. Models must process the input video, track the evolving conversational state, and retrieve and integrate external information. The dataset contains 2,017 videos and 5,986 manually annotated conversations comprising 40,954 alternating dialogue turns. Its core task is conversational video understanding and question answering.



