VidComposition
Source: arXiv. Indexed 2025-09-30.
Download link:
https://yunlong10.github.io/VidComposition/
Description:
VidComposition is a benchmark designed to evaluate how well multimodal large language models (MLLMs) understand video composition. It comprises carefully selected and edited videos with film-grade annotations. The videos cover a range of compositional elements, including camera movement, camera angle, shot size, narrative structure, character actions, and emotions. The benchmark contains 982 videos and 1,706 multiple-choice questions, all focused on video composition understanding.



