Open-Sora Video Dataset
收藏arXiv2025-09-30 收录
下载链接:
https://github.com/hpcaitech/Open-Sora
下载链接
链接失效反馈官方服务:
资源简介:
该数据集是一个经过精心筛选的视频片段集合,通过分层数据筛选系统进行处理,以提高视频片段的质量和适用性,使其更适合用于训练视频生成模型。此外,该数据集采用了一套分层的筛选系统来进行数据整理,包括审美评分、动作评分和清晰度评估,确保了用于训练的高质量视频片段。该数据集包含了时长从2秒到8秒不等的短视频片段,其任务旨在支持视频生成模型的训练。
This dataset is a carefully curated collection of video clips, processed using a hierarchical data filtering system to improve the quality and suitability of the segments for video generation model training. Furthermore, this hierarchical filtering pipeline incorporates aesthetic scoring, action evaluation, and clarity assessment to guarantee that only high-quality video clips are utilized for training purposes. The dataset comprises short video clips with durations ranging from 2 to 8 seconds, and its core objective is to support the training of video generation models.
提供机构:
Open-Sora Team



