Vchitect_T2V_DataVerse

Name: Vchitect_T2V_DataVerse
Creator: maas
Published: 2025-12-05 16:41:09
License: no description provided

ModelScope community; updated 2025-12-05; indexed 2025-07-19

Download link:

https://modelscope.cn/datasets/Vchitect/Vchitect_T2V_DataVerse


Resource overview:

# Vchitect-T2V-Dataverse

<div align="center">
<a href='https://vchitect.intern-ai.org.cn/' target='_blank'>Vchitect Team<sup>1</sup></a>&emsp;
</div>
<div align="center">
<sup>1</sup>Shanghai Artificial Intelligence Laboratory&emsp;
</div>
<div align="center">
<a href="https://arxiv.org/abs/2501.08453">Paper</a> |
<a href="https://vchitect.intern-ai.org.cn/">Project Page</a>
</div>

## Data Overview

The Vchitect-T2V-Dataverse is the core dataset used to train our text-to-video diffusion model, Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models. It comprises 14 million high-quality videos collected from the Internet, each paired with a detailed textual caption. This large-scale dataset enables the model to learn rich video-text alignments and to generate temporally coherent video content from textual prompts.

For more technical details, data processing procedures, and model training strategies, please refer to our paper.

## BibTex

```
@article{fan2025vchitect,
  title={Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models},
  author={Fan, Weichen and Si, Chenyang and Song, Junhao and Yang, Zhenyu and He, Yinan and Zhuo, Long and Huang, Ziqi and Dong, Ziyue and He, Jingwen and Pan, Dongwei and others},
  journal={arXiv preprint arXiv:2501.08453},
  year={2025}
}

@article{si2025RepVideo,
  title={RepVideo: Rethinking Cross-Layer Representation for Video Generation},
  author={Si, Chenyang and Fan, Weichen and Lv, Zhengyao and Huang, Ziqi and Qiao, Yu and Liu, Ziwei},
  journal={arXiv 2501.08994},
  year={2025}
}
```

## Disclaimer

We disclaim responsibility for user-generated content. The model was not trained to realistically represent people or events, so using it to generate such content is beyond the model's capabilities. It is prohibited to use the model to generate pornographic, violent, or gory content, or content that is demeaning or harmful to people or their environment, culture, religion, etc. Users are solely liable for their actions. The project contributors are not legally affiliated with, nor accountable for, users' behaviors. Use the generative model responsibly, adhering to ethical and legal standards.


Provider:

maas

Created:

2025-07-07

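Collected summary

Dataset introduction

Background and challenges

Background overview

Vchitect-T2V-Dataverse is a large-scale text-to-video dataset containing 14 million high-quality videos with detailed text captions, used to train the Vchitect-2.0 video diffusion model. The dataset is 7.25 TB in size and supports video-text alignment learning for generating temporally coherent video content. It is released under the Apache 2.0 license and was updated on July 12, 2025.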
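For anyone planning storage or a download, the two figures in the summary (14 million videos, 7.25 TB) allow a rough back-of-the-envelope estimate. The sketch below is only illustrative: it assumes the 7.25 TB figure covers all 14 million videos, that "TB" means decimal terabytes unless stated otherwise, and that the link speed is a hypothetical value supplied by the reader. The function names are made up for this example and are not part of any ModelScope API.

```python
# Figures quoted in the summary above.
NUM_VIDEOS = 14_000_000
TOTAL_TB = 7.25


def average_video_size_mb(total_tb: float, num_videos: int, binary: bool = False) -> float:
    """Mean size per video, in MB (decimal units) or MiB (binary units)."""
    unit = 1024 if binary else 1000
    total_bytes = total_tb * unit ** 4
    return total_bytes / num_videos / unit ** 2


def download_hours(total_tb: float, link_gbit_per_s: float) -> float:
    """Idealized transfer time in hours for a decimal-TB payload.

    link_gbit_per_s is a hypothetical sustained throughput; real
    downloads are slower because of protocol overhead and throttling.
    """
    total_bits = total_tb * 1000 ** 4 * 8
    seconds = total_bits / (link_gbit_per_s * 1e9)
    return seconds / 3600


if __name__ == "__main__":
    print(f"average size: {average_video_size_mb(TOTAL_TB, NUM_VIDEOS):.3f} MB per video")
    print(f"transfer time at 1 Gbit/s: {download_hours(TOTAL_TB, 1.0):.1f} hours")
```

Under these assumptions the mean comes out at roughly half a megabyte per video, which is small for video files. That may indicate that the 7.25 TB figure describes compressed or partial data, so treat the estimate as a lower bound when provisioning disk space.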

The content above was collected and summarized by 遇见数据集.
