SynCamVideo-Dataset

Name: SynCamVideo-Dataset
Creator: maas
Published: 2025-12-05 16:49:16
License: no description provided

Source: ModelScope community. Updated: 2025-12-05. Indexed: 2025-11-22.

Download link:

https://modelscope.cn/datasets/KwaiVGI/SynCamVideo-Dataset

Resource description:

[Github](https://github.com/KwaiVGI/SynCamMaster) [Project Page](https://jianhongbai.github.io/SynCamMaster/) [Paper](https://arxiv.org/abs/2412.07760)

## 📷 Dataset: SynCamVideo Dataset

- __[2025.04.15]__: Released a new version of the SynCamVideo Dataset with improved quality and greater diversity.
- __[2025.04.15]__: Please also check our [MultiCamVideo](https://huggingface.co/datasets/KwaiVGI/MultiCamVideo-Dataset) Dataset.

### 1. Dataset Introduction

**TL;DR:** The SynCamVideo Dataset is a multi-camera synchronized video dataset rendered using Unreal Engine 5. It includes synchronized multi-camera videos and their corresponding camera poses. The SynCamVideo Dataset can be valuable in fields such as camera-controlled video generation, synchronized video production, and 3D/4D reconstruction. All cameras in the SynCamVideo Dataset are stationary. If you require footage with moving cameras, please explore our [MultiCamVideo](https://huggingface.co/datasets/KwaiVGI/MultiCamVideo-Dataset) Dataset.

<div align="center"> <video controls autoplay style="width: 70%;" src="https://cdn-uploads.huggingface.co/production/uploads/6530bf50f145530101ec03a2/qEUQstpMa3-6UjbG_0ytq.mp4"></video> </div>

The SynCamVideo Dataset consists of 3.4K different dynamic scenes, each captured by 10 cameras, for a total of 34K videos. Each dynamic scene is composed of four elements: {3D environment, character, animation, camera}. Specifically, we use an animation to drive the character and place the animated character within the 3D environment. Time-synchronized cameras are then set up to render the multi-camera video data.
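
As a rough illustration of the four-element composition described above, the following Python sketch models a dynamic scene as a record holding one environment, one character, one animation, and 10 static cameras. The cameras are placed on a hemisphere around the character, which is the placement rule stated in the Camera paragraph of this section. All names (`DynamicScene`, `Camera`, `compose_scene`, `sample_hemisphere_point`), the radius of 5.0, and the uniform-area sampling are assumptions made for illustration. The actual data was rendered in Unreal Engine 5, and its exact sampling procedure is not documented here.

```python
import math
import random
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Camera:
    # Static camera position relative to the character (units are hypothetical).
    position: Tuple[float, float, float]


@dataclass(frozen=True)
class DynamicScene:
    # The four elements named in the dataset description.
    environment: str
    character: str
    animation: str
    cameras: Tuple[Camera, ...]


def sample_hemisphere_point(radius: float, rng: random.Random) -> Tuple[float, float, float]:
    """Sample a point uniformly (by area) on the upper hemisphere of a sphere.

    Drawing z uniformly in [0, 1) and the azimuth uniformly in [0, 2*pi)
    gives an area-uniform distribution on the unit hemisphere.
    """
    z = rng.random()
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r_xy = math.sqrt(1.0 - z * z)
    return (radius * r_xy * math.cos(phi), radius * r_xy * math.sin(phi), radius * z)


def compose_scene(
    environments: Sequence[str],
    characters: Sequence[str],
    animations: Sequence[str],
    num_cameras: int = 10,      # 10 cameras per scene, as stated in the description
    radius: float = 5.0,        # assumed value, not taken from the dataset
    seed: Optional[int] = None,
) -> DynamicScene:
    """Pick one asset of each kind and place static cameras around the character."""
    rng = random.Random(seed)
    cameras = tuple(
        Camera(sample_hemisphere_point(radius, rng)) for _ in range(num_cameras)
    )
    return DynamicScene(
        environment=rng.choice(environments),
        character=rng.choice(characters),
        animation=rng.choice(animations),
        cameras=cameras,
    )
```

Calling `compose_scene(["cafe"], ["character_a"], ["waving"], seed=0)` returns one scene with 10 cameras, each lying on the upper half of a sphere of radius 5.0 centered on the character. The asset names in that call are placeholders.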
<p align="center"> <img src="https://github.com/user-attachments/assets/107c9607-e99b-4493-b715-3e194fcb3933" alt="Example Image" width="70%"> </p>

**3D Environment:** We collect 37 high-quality 3D environment assets from [Fab](https://www.fab.com). To minimize the domain gap between rendered data and real-world videos, we primarily select visually realistic 3D scenes, while choosing a few stylized or surreal 3D scenes as a supplement. To ensure data diversity, the selected scenes cover a variety of indoor and outdoor settings, such as city streets, shopping malls, cafes, office rooms, and the countryside.

**Character:** We collect 66 different human 3D models as characters from [Fab](https://www.fab.com) and [Mixamo](https://www.mixamo.com).

**Animation:** We collect 93 different animations from [Fab](https://www.fab.com) and [Mixamo](https://www.mixamo.com), including common actions such as waving, dancing, and cheering. We use these animations to drive the collected characters and create diverse data through various combinations.

**Camera:** To enhance the diversity of the dataset, each camera is randomly sampled on a hemispherical surface centered around the character.

### 2. Statistics and Configurations

Dataset Statistics:

| Number of Dynamic Scenes | Cameras per Scene | Total Videos |
|:------------------------:|:-----------------:|:------------:|
| 3,400                    | 10                | 34,000       |

Video Configurations:

| Resolution | Frame Count | FPS |
|:----------:|:-----------:|:---:|
| 1280x1280  | 81          | 15  |

Note: You can center-crop the videos to match the aspect ratio your video generation model expects, such as 16:9, 9:16, 4:3, or 3:4.

Camera Configurations:

| Focal Length | Aperture | Sensor Height | Sensor Width |
|:------------:|:--------:|:-------------:|:------------:|
| 24mm         | 5.0      | 23.76mm       | 23.76mm      |

### 3. File Structure

```
SynCamVideo-Dataset
├── train
│   └── f24_aperture5
│       ├── scene1    # one dynamic scene
│       │   ├── videos
│       │   │   ├── cam01.mp4    # synchronized 81-frame videos at 1280x1280 resolution
│       │   │   ├── cam02.mp4
│       │   │   ├── ...
│       │   │   └── cam10.mp4
│       │   └── cameras
│       │       └── camera_extrinsics.json    # 81-frame camera extrinsics of the 10 cameras
│       ├── ...
│       └── scene3400
└── val
    └── basic
        ├── videos
        │   ├── cam01.mp4    # example videos corresponding to the validation cameras
        │   ├── cam02.mp4
        │   ├── ...
        │   └── cam10.mp4
        └── cameras
            └── camera_extrinsics.json    # 10 cameras for validation
```

### 4. Useful scripts

- Data Extraction

```bash
tar -xzvf SynCamVideo-Dataset.tar.gz
```

- Camera Visualization

```bash
python vis_cam.py
```

<p align="center"> <img src="https://cdn-uploads.huggingface.co/production/uploads/6530bf50f145530101ec03a2/3WCWS0Axlnu5MyOBqMoVC.png" alt="Example Image" width="40%"> </p>

## Acknowledgments

We thank Jinwen Cao, Yisong Guo, Haowen Ji, Jichao Wang, and Yi Wang from Kuaishou Technology for their invaluable help in constructing the SynCamVideo-Dataset.

## 🌟 Citation

Please cite our paper if you find our dataset helpful.

```
@misc{bai2024syncammaster,
      title={SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints},
      author={Jianhong Bai and Menghan Xia and Xintao Wang and Ziyang Yuan and Xiao Fu and Zuozhu Liu and Haoji Hu and Pengfei Wan and Di Zhang},
      year={2024},
      eprint={2412.07760},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2412.07760},
}
```

## Contact

[Jianhong Bai](https://jianhongbai.github.io/)
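
The video and camera configuration values and the directory layout listed in this README can be turned into a few small helpers. The Python sketch below derives approximate pinhole intrinsics from the 24mm focal length and 23.76mm sensor, computes a center-crop box for a target aspect ratio, and builds the expected file paths for one training scene. The function names are hypothetical. The intrinsics assume an ideal pinhole camera with square pixels and a centered principal point, which this README does not confirm. The schema of `camera_extrinsics.json` is not documented here, so the sketch only locates the file and does not parse it.

```python
import math
from pathlib import Path
from typing import Dict, List, Tuple

# Values taken from the configuration tables in this README.
FOCAL_LENGTH_MM = 24.0
SENSOR_SIZE_MM = 23.76   # sensor width and height are equal
RESOLUTION_PX = 1280     # videos are 1280x1280
NUM_CAMERAS = 10


def pinhole_intrinsics(
    focal_mm: float = FOCAL_LENGTH_MM,
    sensor_mm: float = SENSOR_SIZE_MM,
    resolution_px: int = RESOLUTION_PX,
) -> Dict[str, float]:
    """Approximate intrinsics under an ideal pinhole model (an assumption)."""
    focal_px = focal_mm / sensor_mm * resolution_px
    fov_deg = math.degrees(2.0 * math.atan(sensor_mm / (2.0 * focal_mm)))
    center = resolution_px / 2.0  # assumes a centered principal point
    return {"fx": focal_px, "fy": focal_px, "cx": center, "cy": center, "fov_deg": fov_deg}


def center_crop_box(width: int, height: int, aspect_w: int, aspect_h: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered crop with ratio aspect_w:aspect_h."""
    if width * aspect_h > height * aspect_w:
        # Source is wider than the target ratio: keep full height, trim width.
        crop_h = height
        crop_w = height * aspect_w // aspect_h
    else:
        # Source is taller than (or equal to) the target ratio: keep full width, trim height.
        crop_w = width
        crop_h = width * aspect_h // aspect_w
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


def scene_files(root: str, scene_index: int) -> Tuple[List[Path], Path]:
    """Build the expected video and extrinsics paths for one training scene."""
    scene_dir = Path(root) / "train" / "f24_aperture5" / f"scene{scene_index}"
    videos = [scene_dir / "videos" / f"cam{i:02d}.mp4" for i in range(1, NUM_CAMERAS + 1)]
    extrinsics = scene_dir / "cameras" / "camera_extrinsics.json"
    return videos, extrinsics
```

For example, `center_crop_box(1280, 1280, 16, 9)` gives a 1280x720 region centered vertically in the frame, and `scene_files("SynCamVideo-Dataset", 1)` lists `cam01.mp4` through `cam10.mp4` together with the scene's extrinsics file. The paths are only constructed, so check that they exist after extraction before reading them.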


Provider:

maas

Created:

2025-09-03
