
SynCamVideo-Dataset

ModelScope community · Updated 2025-12-05 · Indexed 2025-11-22
Download link:
https://modelscope.cn/datasets/KwaiVGI/SynCamVideo-Dataset
Resource overview:
[Github](https://github.com/KwaiVGI/SynCamMaster) [Project Page](https://jianhongbai.github.io/SynCamMaster/) [Paper](https://arxiv.org/abs/2412.07760)

## 📷 Dataset: SynCamVideo Dataset

- __[2025.04.15]__: Released a new version of the SynCamVideo Dataset with improved quality and greater diversity.
- __[2025.04.15]__: Please also check our [MultiCamVideo](https://huggingface.co/datasets/KwaiVGI/MultiCamVideo-Dataset) Dataset.

### 1. Dataset Introduction

**TL;DR:** The SynCamVideo Dataset is a multi-camera synchronized video dataset rendered using Unreal Engine 5. It includes synchronized multi-camera videos and their corresponding camera poses. The SynCamVideo Dataset can be valuable in fields such as camera-controlled video generation, synchronized video production, and 3D/4D reconstruction. The cameras in the SynCamVideo Dataset are stationary. If you require footage with moving cameras, please explore our [MultiCamVideo](https://huggingface.co/datasets/KwaiVGI/MultiCamVideo-Dataset) Dataset.

<div align="center"> <video controls autoplay style="width: 70%;" src="https://cdn-uploads.huggingface.co/production/uploads/6530bf50f145530101ec03a2/qEUQstpMa3-6UjbG_0ytq.mp4"></video> </div>

The dataset consists of 3.4K different dynamic scenes, each captured by 10 cameras, for a total of 34K videos. Each dynamic scene is composed of four elements: {3D environment, character, animation, camera}. Specifically, we use an animation to drive the character and place the animated character within the 3D environment. Time-synchronized cameras are then set up to render the multi-camera video data.

<p align="center"> <img src="https://github.com/user-attachments/assets/107c9607-e99b-4493-b715-3e194fcb3933" alt="Example Image" width="70%"> </p>

**3D Environment:** We collect 37 high-quality 3D environment assets from [Fab](https://www.fab.com). To minimize the domain gap between rendered data and real-world videos, we primarily select visually realistic 3D scenes and supplement them with a few stylized or surreal ones. To ensure data diversity, the selected scenes cover a variety of indoor and outdoor settings, such as city streets, shopping malls, cafes, office rooms, and the countryside.

**Character:** We collect 66 different human 3D models as characters from [Fab](https://www.fab.com) and [Mixamo](https://www.mixamo.com).

**Animation:** We collect 93 different animations from [Fab](https://www.fab.com) and [Mixamo](https://www.mixamo.com), including common actions such as waving, dancing, and cheering. We use these animations to drive the collected characters and create diverse data through various combinations.

**Camera:** To enhance the diversity of the dataset, each camera is randomly sampled on a hemispherical surface centered on the character.

### 2. Statistics and Configurations

Dataset Statistics:

| Number of Dynamic Scenes | Cameras per Scene | Total Videos |
|:------------------------:|:-----------------:|:------------:|
| 3,400                    | 10                | 34,000       |

Video Configurations:

| Resolution  | Number of Frames | FPS |
|:-----------:|:----------------:|:---:|
| 1280x1280   | 81               | 15  |

Note: You can center-crop the videos to the aspect ratio your video generation model expects, such as 16:9, 9:16, 4:3, or 3:4.

Camera Configurations:

| Focal Length | Aperture | Sensor Height | Sensor Width |
|:------------:|:--------:|:-------------:|:------------:|
| 24mm         | 5.0      | 23.76mm       | 23.76mm      |

### 3. File Structure

```
SynCamVideo-Dataset
├── train
│   └── f24_aperture5
│       ├── scene1    # one dynamic scene
│       │   ├── videos
│       │   │   ├── cam01.mp4    # synchronized 81-frame videos at 1280x1280 resolution
│       │   │   ├── cam02.mp4
│       │   │   ├── ...
│       │   │   └── cam10.mp4
│       │   └── cameras
│       │       └── camera_extrinsics.json    # 81-frame camera extrinsics of the 10 cameras
│       ├── ...
│       └── scene3400
└── val
    └── basic
        ├── videos
        │   ├── cam01.mp4    # example videos corresponding to the validation cameras
        │   ├── cam02.mp4
        │   ├── ...
        │   └── cam10.mp4
        └── cameras
            └── camera_extrinsics.json    # 10 cameras for validation
```

### 4. Useful Scripts

- Data Extraction

```bash
tar -xzvf SynCamVideo-Dataset.tar.gz
```

- Camera Visualization

```bash
python vis_cam.py
```

<p align="center"> <img src="https://cdn-uploads.huggingface.co/production/uploads/6530bf50f145530101ec03a2/3WCWS0Axlnu5MyOBqMoVC.png" alt="Example Image" width="40%"> </p>

## Acknowledgments

We thank Jinwen Cao, Yisong Guo, Haowen Ji, Jichao Wang, and Yi Wang from Kuaishou Technology for their invaluable help in constructing the SynCamVideo-Dataset.

## 🌟 Citation

Please cite our paper if you find our dataset helpful.

```
@misc{bai2024syncammaster,
      title={SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints},
      author={Jianhong Bai and Menghan Xia and Xintao Wang and Ziyang Yuan and Xiao Fu and Zuozhu Liu and Haoji Hu and Pengfei Wan and Di Zhang},
      year={2024},
      eprint={2412.07760},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2412.07760},
}
```

## Contact

[Jianhong Bai](https://jianhongbai.github.io/)
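The **Camera** paragraph in the introduction says each camera is sampled at random on a hemisphere centered on the character, but it does not give the radius, the sampling distribution, or the axis convention. The following is only a sketch of one common way to do this (area-uniform sampling on an upper hemisphere). The function name `sample_hemisphere_point`, the z-up convention, and the radius value are assumptions for illustration, not the authors' rendering code.

```python
import math
import random


def sample_hemisphere_point(center, radius, rng):
    """Sample a point uniformly (by area) on the upper hemisphere around `center`.

    Assumes a z-up coordinate system; the dataset card does not state one.
    """
    # Picking the height fraction uniformly in [0, 1) gives an area-uniform
    # distribution on the sphere cap (Archimedes' hat-box theorem).
    u = rng.random()
    phi = rng.uniform(0.0, 2.0 * math.pi)  # azimuth around the vertical axis
    r_xy = math.sqrt(1.0 - u * u)          # horizontal radius at that height
    x = center[0] + radius * r_xy * math.cos(phi)
    y = center[1] + radius * r_xy * math.sin(phi)
    z = center[2] + radius * u
    return (x, y, z)


# Hypothetical usage: 10 camera positions around a character at the origin.
rng = random.Random(0)
cameras = [sample_hemisphere_point((0.0, 0.0, 0.0), 5.0, rng) for _ in range(10)]
```

Each sampled position would still need an orientation (for example, looking at the character) before it is a full camera pose; that step is omitted because the source does not describe it.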
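The note in Section 2 suggests center-cropping the 1280x1280 videos to the aspect ratio a model expects. As a minimal sketch, the crop rectangle can be computed as below; the helper name `center_crop_box` is hypothetical, and the resulting box can be passed to whichever video or image library you use.

```python
def center_crop_box(width, height, aspect_w, aspect_h):
    """Return (left, top, right, bottom) of the largest centered crop
    with aspect ratio aspect_w:aspect_h inside a width x height frame."""
    # Compare ratios by integer cross-multiplication to avoid float rounding.
    if width * aspect_h > height * aspect_w:
        # Frame is wider than the target: keep full height, trim the sides.
        crop_h = height
        crop_w = height * aspect_w // aspect_h
    else:
        # Frame is taller than (or equal to) the target: keep full width.
        crop_w = width
        crop_h = width * aspect_h // aspect_w
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)


print(center_crop_box(1280, 1280, 16, 9))  # (0, 280, 1280, 1000)
print(center_crop_box(1280, 1280, 3, 4))   # (160, 0, 1120, 1280)
```

Cropping leaves the focal length in pixels unchanged but shifts the principal point by `(left, top)`, which matters if you pair the cropped frames with camera intrinsics.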
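The camera configuration table in Section 2 lists the focal length and sensor size in millimetres, while many pipelines expect intrinsics in pixels. The sketch below applies the standard pinhole conversion. It assumes the full 23.76mm sensor maps onto the full 1280x1280 frame, the principal point sits at the image center, and there is no lens distortion; none of these is stated on the dataset card, and `pinhole_intrinsics` is a hypothetical helper name.

```python
import math


def pinhole_intrinsics(focal_mm, sensor_w_mm, sensor_h_mm, width_px, height_px):
    """Build a 3x3 pinhole intrinsics matrix (row-major nested lists)."""
    fx = focal_mm / sensor_w_mm * width_px   # focal length in pixels (x)
    fy = focal_mm / sensor_h_mm * height_px  # focal length in pixels (y)
    cx = width_px / 2.0                      # assumed principal point: center
    cy = height_px / 2.0
    return [[fx, 0.0, cx],
            [0.0, fy, cy],
            [0.0, 0.0, 1.0]]


K = pinhole_intrinsics(24.0, 23.76, 23.76, 1280, 1280)
print(round(K[0][0], 2))  # 1292.93

# Horizontal field of view implied by the same numbers, in degrees.
fov_deg = math.degrees(2.0 * math.atan(23.76 / (2.0 * 24.0)))
```

Because the sensor and the frame are both square, `fx` equals `fy` and the horizontal and vertical fields of view are the same.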
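Given the directory layout in Section 3, a small index of the training scenes can be built with the standard library. This is a sketch only: `list_scene_videos` is a hypothetical helper, it assumes the archive has been extracted so that the tree matches Section 3, and it does not parse `camera_extrinsics.json` because the card does not document that file's schema.

```python
from pathlib import Path


def list_scene_videos(root, split="train", subset="f24_aperture5"):
    """Map each scene name to its camera videos and extrinsics file path."""
    base = Path(root) / split / subset
    index = {}
    # Sorting is lexicographic, so "scene10" comes before "scene2".
    for scene_dir in sorted(base.glob("scene*")):
        if not scene_dir.is_dir():
            continue
        index[scene_dir.name] = {
            "videos": sorted((scene_dir / "videos").glob("cam*.mp4")),
            "extrinsics": scene_dir / "cameras" / "camera_extrinsics.json",
        }
    return index


# Hypothetical usage, with the dataset extracted into the current directory:
# index = list_scene_videos("SynCamVideo-Dataset")
# print(len(index), len(index["scene1"]["videos"]))
```

The `val` split has a different layout (`val/basic/videos` with no per-scene folders), so it needs its own path handling.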

Provider: maas
Created: 2025-09-03