VIVID-10M

ModelScope community, updated 2025-12-05, indexed 2025-09-06
Download link:
https://modelscope.cn/datasets/KwaiVGI/VIVID-10M
Resource overview:
# VIVID-10M

[\[project page\]](https://klingteam.github.io/VIVID/) | [\[Paper\]](https://huggingface.co/papers/2411.15260) | [\[arXiv\]](https://arxiv.org/abs/2411.15260)

VIVID-10M is the first large-scale hybrid image-video local editing dataset aimed at reducing data construction and model training costs. It comprises 9.7M samples that encompass a wide range of video editing tasks.

## Data Index

The data index is stored in four `.csv` files:

```bash
vivid-image-change.csv
vivid-image-remove.csv
vivid-video-change.csv
vivid-video-remove.csv
```

The VIVID-Video splits contain the following columns:

```bash
local_caption,      # caption of masked object
source_video_path,  # ground-truth video path
crop_video_path,    # cropped video path (needs to be synthesized)
mask_path,          # masked video path
editing_mode        # change or remove
```

The VIVID-Image splits contain the following columns:

```bash
local_caption,      # caption of masked object
source_image_path,  # ground-truth image path
crop_image_path,    # cropped image path (needs to be synthesized)
mask_path,          # masked image path
editing_mode        # change or remove
```

## Get started

1. Download all files from this repository.
2. Merge the split files.

```bash
cat vivid-video.tar.part-* > vivid-video.tar
cat vivid-image.tar.part-* > vivid-image.tar
```

3. Expand the `.tar` files.

```bash
tar -xvf vivid-video.tar
tar -xvf vivid-image.tar
```

4. (Optional) Synthesize the cropped data.

```bash
python get_crop_data.py
```
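Once the archives are extracted, the index files can be sanity-checked before training. The following is a minimal sketch, not part of the official repository. It assumes that each `.csv` file has a header row using exactly the column names listed above, and that the paths in the index are relative to the extraction directory. Neither assumption is confirmed by the README. The helper names `load_index`, `count_modes`, and `missing_files` are hypothetical.

```python
import csv
from collections import Counter
from pathlib import Path

# Column names as listed in the README. That the CSV files carry a header row
# with exactly these names is an assumption.
VIDEO_COLUMNS = ["local_caption", "source_video_path", "crop_video_path",
                 "mask_path", "editing_mode"]
IMAGE_COLUMNS = ["local_caption", "source_image_path", "crop_image_path",
                 "mask_path", "editing_mode"]


def load_index(csv_path, expected_columns):
    """Read one index file and return its rows as a list of dicts."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        header = reader.fieldnames or []
        missing = [c for c in expected_columns if c not in header]
        if missing:
            raise ValueError(f"{csv_path}: missing columns {missing}")
        return list(reader)


def count_modes(rows):
    """Count rows per editing_mode value (expected: change or remove)."""
    return Counter(row["editing_mode"] for row in rows)


def missing_files(rows, root, path_columns):
    """Return the indexed paths that do not exist under root.

    Assumes index paths are relative to root (hypothetical layout).
    """
    root = Path(root)
    return [row[col] for row in rows for col in path_columns
            if not (root / row[col]).exists()]


if __name__ == "__main__":
    root = Path(".")  # hypothetical: directory where the archives were extracted
    for name in ("vivid-video-change.csv", "vivid-video-remove.csv"):
        rows = load_index(root / name, VIDEO_COLUMNS)
        print(name, len(rows), dict(count_modes(rows)))
        # The crop paths are skipped: they only exist after the optional step 4.
        absent = missing_files(rows, root, ["source_video_path", "mask_path"])
        print("missing files:", len(absent))
```

The sketch loads each index fully into memory for simplicity. With millions of rows, iterating over the `csv.DictReader` directly instead of calling `list()` would keep memory use lower.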

Provider:
maas
Created:
2025-09-03