VidComposition

Source: arXiv, indexed 2025-09-30

Download link:

https://yunlong10.github.io/VidComposition/

Description:

This dataset is a benchmark designed to evaluate how well MLLMs (multimodal large language models) understand video composition. It comprises carefully selected and edited videos with film-level annotations. The videos cover a range of compositional elements, including camera movement, camera angle, shot size, narrative structure, character actions, and emotions. The dataset contains 982 videos and 1,706 multiple-choice questions, all focused on video composition understanding.
