
Wan-Syn_77x448x832_600k

ModelScope community, updated 2025-11-19, indexed 2025-09-13
Download link:
https://modelscope.cn/datasets/lhn526/Wan-Syn_77x448x832_600k
Description:
# FastVideo Synthetic Wan2.1 480P dataset

<p align="center">
  <img src="https://raw.githubusercontent.com/hao-ai-lab/FastVideo/main/assets/logo.png" width="200"/>
</p>
<div>
  <div align="center">
    <a href="https://github.com/hao-ai-lab/FastVideo" target="_blank">FastVideo Team</a>&emsp;
  </div>
  <div align="center">
    <a href="https://arxiv.org/pdf/2505.13389">Paper</a> |
    <a href="https://hao-ai-lab.github.io/FastVideo">Project Page</a> |
    <a href="https://github.com/hao-ai-lab/FastVideo">GitHub</a>
  </div>
</div>

## Dataset Overview

This dataset contains synthetic video data presented in the paper [VSA: Faster Video Diffusion with Trainable Sparse Attention](https://arxiv.org/pdf/2505.13389). It is part of the larger FastVideo project, which provides a unified post-training and inference framework for accelerated video generation.

- The prompts were randomly sampled from the [Vchitect_T2V_DataVerse](https://huggingface.co/datasets/Vchitect/Vchitect_T2V_DataVerse) dataset.
- Each sample was generated with the **Wan2.1-T2V-14B-Diffusers** model, and its latents were stored.
- Each latent sample corresponds to a video of **77 frames**, with each frame sized **448×832**.
- The dataset includes all preprocessed latents required for both **Text-to-Video (T2V)** and **Image-to-Video (I2V)** tasks (latents after VAE and CLIP).
- The dataset is fully compatible with the [FastVideo](https://github.com/hao-ai-lab/FastVideo) repository and can be loaded and used directly, without any additional preprocessing.

## Sample Usage

To generate a video using models trained on or compatible with this dataset, you can use the `fastvideo` library.

First, install the library:

```bash
pip install fastvideo
```

Then, use the `VideoGenerator` to generate videos:

```python
from fastvideo import VideoGenerator


def main():
    # Create a video generator with a pre-trained model
    generator = VideoGenerator.from_pretrained(
        "FastVideo/FastWan2.1-T2V-1.3B-Diffusers",  # Example model, refer to FastVideo Hub for others
        num_gpus=1,  # Adjust based on your hardware
    )

    # Define a prompt for your video
    prompt = "A curious raccoon peers through a vibrant field of yellow sunflowers, its eyes wide with interest."

    # Generate the video
    video = generator.generate_video(
        prompt,
        return_frames=True,  # Also return frames from this call (defaults to False)
        output_path="my_videos/",  # Controls where videos are saved
        save_video=True,
    )


if __name__ == '__main__':
    main()
```

## Citation

If you use the FastVideo Synthetic Wan2.1 dataset in your research, please cite our related papers:

```bibtex
@software{fastvideo2024,
  title  = {FastVideo: A Unified Framework for Accelerated Video Generation},
  author = {The FastVideo Team},
  url    = {https://github.com/hao-ai-lab/FastVideo},
  month  = apr,
  year   = {2024},
}

@article{zhang2025vsa,
  title   = {VSA: Faster Video Diffusion with Trainable Sparse Attention},
  author  = {Zhang, Peiyuan and Huang, Haofeng and Chen, Yongqi and Lin, Will and Liu, Zhengzhong and Stoica, Ion and Xing, Eric and Zhang, Hao},
  journal = {arXiv preprint arXiv:2505.13389},
  year    = {2025}
}

@article{zhang2025fast,
  title   = {Fast video generation with sliding tile attention},
  author  = {Zhang, Peiyuan and Chen, Yongqi and Su, Runlong and Ding, Hangliang and Stoica, Ion and Liu, Zhengzhong and Zhang, Hao},
  journal = {arXiv preprint arXiv:2502.04507},
  year    = {2025}
}
```
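
As a rough aid for reasoning about the stored latents described in the Dataset Overview, the sketch below estimates the latent tensor shape for a 77-frame, 448×832 video. It is not taken from the dataset card: the helper `latent_shape` is hypothetical, and the compression factors (4× temporal with the first frame kept, 8× spatial) and the 16 latent channels are assumptions based on how the Wan2.1 VAE is commonly described. The tensors actually stored in the dataset may be laid out differently, so check a loaded sample before relying on these numbers.

```python
def latent_shape(num_frames, height, width,
                 temporal_stride=4, spatial_stride=8, latent_channels=16):
    """Estimate (channels, frames, height, width) of a video latent.

    Assumes a causal video VAE that keeps the first frame and compresses
    the remaining frames by `temporal_stride`, and compresses each spatial
    dimension by `spatial_stride`. All default values are assumptions.
    """
    if (num_frames - 1) % temporal_stride != 0:
        raise ValueError("num_frames must be of the form k * temporal_stride + 1")
    if height % spatial_stride != 0 or width % spatial_stride != 0:
        raise ValueError("height and width must be divisible by spatial_stride")

    latent_frames = (num_frames - 1) // temporal_stride + 1
    return (
        latent_channels,
        latent_frames,
        height // spatial_stride,
        width // spatial_stride,
    )


# Dimensions taken from the dataset name: 77 frames at 448x832
print(latent_shape(77, 448, 832))  # (16, 20, 56, 104)
```

Under these assumptions, each sample would hold 20 latent frames of 56×104 positions rather than 77 full-resolution frames.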

Provider:
maas
Created:
2025-09-08