MovieNet Movie Dataset
Source: National Dataset Management Service Platform (国家数据集管理服务平台). Updated 2026-04-28; indexed 2026-04-29.
Download link:
https://www.ndsms.cn/dataRetrieval/datasetDetail/?id=2e97b21e68461c491b84b30dbde77df2
Resource description:
This dataset is intended for R&D teams building large models for long-video understanding, developers of intelligent film and television content analysis systems, and multimodal algorithm research institutions. It targets two bottlenecks in long-video understanding: coarse temporal annotation granularity and a shortage of multimodally aligned data. The dataset contains 1,100 movies and provides multimodal data such as trailers, stills, and plot descriptions. Annotations cover characters with bounding boxes and identities, 42K scene boundaries, 2.5K aligned description sentences, 65K place/action tags, and 92K movie style tags. Unlike traditional datasets that focus on short-video action recognition, it covers the complete narrative structure of full-length films and supports multi-granularity modeling from individual shots to entire movies.
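As a rough illustration of what "multi-granularity" means here, the sketch below groups an ordered list of shots into scenes using scene-boundary indices. The `Shot` and `Scene` classes, the `group_shots_into_scenes` helper, and the convention that a boundary is the index of the first shot of a new scene are all assumptions made for this example; the listing does not describe the dataset's actual file format or schema.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Shot:
    index: int  # position of the shot within the movie
    tags: list[str] = field(default_factory=list)  # e.g. place/action tags


@dataclass
class Scene:
    index: int
    shots: list[Shot]


def group_shots_into_scenes(shots, boundaries):
    """Split an ordered shot list into scenes.

    `boundaries` holds the indices of shots that start a new scene
    (hypothetical convention; the first scene always starts at shot 0).
    """
    starts = sorted(set(boundaries) - {0})
    scenes = {}
    for shot in shots:
        # Number of scene starts at or before this shot = its scene index.
        scene_idx = bisect_right(starts, shot.index)
        scenes.setdefault(scene_idx, []).append(shot)
    return [Scene(i, group) for i, group in sorted(scenes.items())]
```

With this layout, a movie is simply the list of scenes, so a model can read the same annotations at shot, scene, or whole-film level.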
Provider:
上海库帕思科技有限公司
Created:
2026-04-27
Collected summary
Dataset introduction

Background and challenges
Background overview
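The dataset is designed to support long-video understanding tasks and targets two bottlenecks: coarse temporal annotation granularity and a shortage of multimodally aligned data. It contains multimodal data for 1,100 movies, such as trailers, stills, and plot descriptions, along with annotations covering characters, scene boundaries, aligned sentences, place/action tags, and style tags. Compared with traditional short-video datasets, it covers the complete narrative structure of films and supports multi-granularity modeling from individual shots to entire movies.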
The content above was collected and summarized by 遇见数据集.



