Wan-Syn_77x448x832_600k
Source: ModelScope community. Updated 2025-11-19; indexed 2025-09-13.
Download link: https://modelscope.cn/datasets/lhn526/Wan-Syn_77x448x832_600k
# FastVideo Synthetic Wan2.1 480P dataset
<p align="center">
<img src="https://raw.githubusercontent.com/hao-ai-lab/FastVideo/main/assets/logo.png" width="200"/>
</p>
<div>
<div align="center">
<a href="https://github.com/hao-ai-lab/FastVideo" target="_blank">FastVideo Team</a> 
</div>
<div align="center">
<a href="https://arxiv.org/pdf/2505.13389">Paper</a> |
<a href="https://hao-ai-lab.github.io/FastVideo">Project Page</a> |
<a href="https://github.com/hao-ai-lab/FastVideo">Github</a>
</div>
</div>
## Dataset Overview
This dataset contains synthetic video data presented in the paper [VSA: Faster Video Diffusion with Trainable Sparse Attention](https://arxiv.org/pdf/2505.13389). It is part of the larger FastVideo project, which provides a unified post-training and inference framework for accelerated video generation.
- The prompts were randomly sampled from the [Vchitect_T2V_DataVerse](https://huggingface.co/datasets/Vchitect/Vchitect_T2V_DataVerse) dataset.
- Each sample was generated with the **Wan2.1-T2V-14B-Diffusers** model, and its latents were stored.
- Each latent sample corresponds to a video of **77 frames** at a resolution of **448×832**.
- It includes all preprocessed latents required for both **Text-to-Video (T2V)** and **Image-to-Video (I2V)** tasks (latents after VAE and CLIP).
- The dataset is fully compatible with the [FastVideo](https://github.com/hao-ai-lab/FastVideo) repository and can be directly loaded and used without any additional preprocessing.
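As a rough illustration of what "77 frames at 448×832" means in latent space, the sketch below computes the expected latent tensor shape. The compression factors (4× temporal, 8× spatial) and the 16 latent channels are assumptions based on the commonly reported Wan2.1 VAE configuration; they are not stated in this dataset card, and `latent_shape` is a hypothetical helper, not part of FastVideo.

```python
from typing import Tuple

# Assumed Wan2.1 VAE settings (not stated in this dataset card).
TEMPORAL_STRIDE = 4   # assumed temporal compression factor
SPATIAL_STRIDE = 8    # assumed spatial compression factor
LATENT_CHANNELS = 16  # assumed number of latent channels


def latent_shape(num_frames: int, height: int, width: int) -> Tuple[int, int, int, int]:
    """Return (channels, latent_frames, latent_height, latent_width).

    Assumes a causal video VAE that keeps the first frame and compresses
    every following group of TEMPORAL_STRIDE frames into one latent frame.
    """
    if (num_frames - 1) % TEMPORAL_STRIDE != 0:
        raise ValueError("num_frames must have the form TEMPORAL_STRIDE * k + 1")
    if height % SPATIAL_STRIDE != 0 or width % SPATIAL_STRIDE != 0:
        raise ValueError("height and width must be multiples of SPATIAL_STRIDE")
    latent_frames = (num_frames - 1) // TEMPORAL_STRIDE + 1
    return (
        LATENT_CHANNELS,
        latent_frames,
        height // SPATIAL_STRIDE,
        width // SPATIAL_STRIDE,
    )


if __name__ == "__main__":
    # Shape implied by the dataset name (77x448x832) under the assumptions above.
    print(latent_shape(77, 448, 832))
```

Under these assumptions, a 77-frame 448×832 sample maps to a latent of shape `(16, 20, 56, 104)`. Check the stored tensors themselves before relying on these numbers.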
## Sample Usage
To generate a video using models trained on or compatible with this dataset, you can use the `fastvideo` library. First, install the library:
```bash
pip install fastvideo
```
Then, use the `VideoGenerator` to generate videos:
```python
from fastvideo import VideoGenerator


def main():
    # Create a video generator with a pre-trained model
    generator = VideoGenerator.from_pretrained(
        "FastVideo/FastWan2.1-T2V-1.3B-Diffusers",  # Example model, refer to FastVideo Hub for others
        num_gpus=1,  # Adjust based on your hardware
    )

    # Define a prompt for your video
    prompt = "A curious raccoon peers through a vibrant field of yellow sunflowers, its eyes wide with interest."

    # Generate the video
    video = generator.generate_video(
        prompt,
        return_frames=True,  # Also return frames from this call (defaults to False)
        output_path="my_videos/",  # Controls where videos are saved
        save_video=True,
    )


if __name__ == '__main__':
    main()
```
## Citation
If you use the FastVideo Synthetic Wan2.1 dataset in your research, please cite our related papers:
```bibtex
@software{fastvideo2024,
title = {FastVideo: A Unified Framework for Accelerated Video Generation},
author = {The FastVideo Team},
url = {https://github.com/hao-ai-lab/FastVideo},
month = apr,
year = {2024},
}
@article{zhang2025vsa,
title={VSA: Faster Video Diffusion with Trainable Sparse Attention},
author={Zhang, Peiyuan and Huang, Haofeng and Chen, Yongqi and Lin, Will and Liu, Zhengzhong and Stoica, Ion and Xing, Eric and Zhang, Hao},
journal={arXiv preprint arXiv:2505.13389},
year={2025}
}
@article{zhang2025fast,
title={Fast video generation with sliding tile attention},
author={Zhang, Peiyuan and Chen, Yongqi and Su, Runlong and Ding, Hangliang and Stoica, Ion and Liu, Zhengzhong and Zhang, Hao},
journal={arXiv preprint arXiv:2502.04507},
year={2025}
}
```
Provider: maas
Created: 2025-09-08



