Wan-Syn_77x448x832_600k

Name: Wan-Syn_77x448x832_600k
Creator: maas
Published: 2025-11-19 17:27:40
License: No description available

Source: ModelScope community. Updated 2025-11-19; indexed 2025-09-13.

Download link:

https://modelscope.cn/datasets/lhn526/Wan-Syn_77x448x832_600k

Resource description:

# FastVideo Synthetic Wan2.1 480P dataset

<p align="center">
  <img src="https://raw.githubusercontent.com/hao-ai-lab/FastVideo/main/assets/logo.png" width="200"/>
</p>
<div>
  <div align="center">
    <a href="https://github.com/hao-ai-lab/FastVideo" target="_blank">FastVideo Team</a>&emsp;
  </div>
  <div align="center">
    <a href="https://arxiv.org/pdf/2505.13389">Paper</a> |
    <a href="https://hao-ai-lab.github.io/FastVideo">Project Page</a> |
    <a href="https://github.com/hao-ai-lab/FastVideo">Github</a>
  </div>
</div>

## Dataset Overview

This dataset contains the synthetic video data presented in the paper [VSA: Faster Video Diffusion with Trainable Sparse Attention](https://arxiv.org/pdf/2505.13389). It is part of the larger FastVideo project, which provides a unified post-training and inference framework for accelerated video generation.

- The prompts were randomly sampled from the [Vchitect_T2V_DataVerse](https://huggingface.co/datasets/Vchitect/Vchitect_T2V_DataVerse) dataset.
- Each sample was generated with the **Wan2.1-T2V-14B-Diffusers** model, and its latents were stored.
- Each latent sample corresponds to **77 frames**, with each frame sized **448×832**.
- The dataset includes all preprocessed latents required for both **Text-to-Video (T2V)** and **Image-to-Video (I2V)** tasks (latents after VAE and CLIP).
- The dataset is fully compatible with the [FastVideo](https://github.com/hao-ai-lab/FastVideo) repository and can be loaded and used directly, without any additional preprocessing.

## Sample Usage

To generate a video with a model trained on, or compatible with, this dataset, you can use the `fastvideo` library.

First, install the library:

```bash
pip install fastvideo
```

Then, use the `VideoGenerator` to generate videos:

```python
from fastvideo import VideoGenerator


def main():
    # Create a video generator with a pre-trained model
    generator = VideoGenerator.from_pretrained(
        "FastVideo/FastWan2.1-T2V-1.3B-Diffusers",  # Example model, refer to FastVideo Hub for others
        num_gpus=1,  # Adjust based on your hardware
    )

    # Define a prompt for your video
    prompt = "A curious raccoon peers through a vibrant field of yellow sunflowers, its eyes wide with interest."

    # Generate the video
    video = generator.generate_video(
        prompt,
        return_frames=True,  # Also return frames from this call (defaults to False)
        output_path="my_videos/",  # Controls where videos are saved
        save_video=True,
    )


if __name__ == '__main__':
    main()
```

## Citation

If you use the FastVideo Synthetic Wan2.1 dataset in your research, please cite our related papers:

```bibtex
@software{fastvideo2024,
  title  = {FastVideo: A Unified Framework for Accelerated Video Generation},
  author = {The FastVideo Team},
  url    = {https://github.com/hao-ai-lab/FastVideo},
  month  = apr,
  year   = {2024},
}

@article{zhang2025vsa,
  title={VSA: Faster Video Diffusion with Trainable Sparse Attention},
  author={Zhang, Peiyuan and Huang, Haofeng and Chen, Yongqi and Lin, Will and Liu, Zhengzhong and Stoica, Ion and Xing, Eric and Zhang, Hao},
  journal={arXiv preprint arXiv:2505.13389},
  year={2025}
}

@article{zhang2025fast,
  title={Fast video generation with sliding tile attention},
  author={Zhang, Peiyuan and Chen, Yongqi and Su, Runlong and Ding, Hangliang and Stoica, Ion and Liu, Zhengzhong and Zhang, Hao},
  journal={arXiv preprint arXiv:2502.04507},
  year={2025}
}
```
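
## Latent shape sketch

The Dataset Overview gives the video dimensions behind each latent (77 frames at 448×832) but not the latent tensor shape itself. The sketch below estimates that shape. It assumes compression factors commonly reported for the Wan2.1 VAE (4× temporal with the first frame kept, 8× spatial, 16 latent channels); none of these values, nor the helper name `latent_shape`, come from this dataset card, so verify them against the actual files before relying on them.

```python
from typing import Tuple


def latent_shape(
    num_frames: int,
    height: int,
    width: int,
    temporal_stride: int = 4,   # assumed Wan2.1 VAE temporal compression
    spatial_stride: int = 8,    # assumed Wan2.1 VAE spatial compression
    latent_channels: int = 16,  # assumed Wan2.1 VAE latent channel count
) -> Tuple[int, int, int, int]:
    """Estimate the (C, T, H, W) latent shape for a video of the given size.

    Hypothetical helper: assumes a causal video VAE that encodes the first
    frame on its own and then one latent frame per `temporal_stride` frames.
    """
    if (num_frames - 1) % temporal_stride != 0:
        raise ValueError("num_frames must equal temporal_stride * k + 1")
    if height % spatial_stride or width % spatial_stride:
        raise ValueError("height and width must be divisible by spatial_stride")

    latent_frames = (num_frames - 1) // temporal_stride + 1
    return (
        latent_channels,
        latent_frames,
        height // spatial_stride,
        width // spatial_stride,
    )


if __name__ == "__main__":
    # Dimensions taken from the dataset name: 77 x 448 x 832
    print(latent_shape(77, 448, 832))  # (16, 20, 56, 104) under the assumptions above
```

If the stored tensors have a different shape, the assumed strides or channel count do not match the VAE that was actually used, and the defaults should be adjusted.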


Provider: maas

Created: 2025-09-08
