VIVID-10M
Source: ModelScope community (魔搭社区). Updated 2025-12-05; indexed 2025-09-06.
Download link: https://modelscope.cn/datasets/KwaiVGI/VIVID-10M
# VIVID-10M
[\[Project Page\]](https://klingteam.github.io/VIVID/) | [\[Paper\]](https://huggingface.co/papers/2411.15260) | [\[arXiv\]](https://arxiv.org/abs/2411.15260)
VIVID-10M is the first large-scale hybrid image-video local editing dataset aimed at reducing data construction and model training costs, comprising 9.7M samples that encompass a wide range of video editing tasks.
## Data Index
The data index is stored in four `.csv` files:
``` bash
vivid-image-change.csv
vivid-image-remove.csv
vivid-video-change.csv
vivid-video-remove.csv
```
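As a quick sanity check after downloading, the four index files listed above can be enumerated and their rows counted. The following is a minimal sketch, not part of the dataset's own tooling: the helper name `summarize_index_files` and the `index_dir` argument are hypothetical, and the code assumes (the source does not confirm this) that each CSV begins with a header row.

```python
import csv
from pathlib import Path

# File names as listed in the data index above.
INDEX_FILES = [
    "vivid-image-change.csv",
    "vivid-image-remove.csv",
    "vivid-video-change.csv",
    "vivid-video-remove.csv",
]


def summarize_index_files(index_dir):
    """Return {file name: row count}, with None for files that are missing.

    Assumption: each CSV starts with a header row, which is not counted.
    """
    summary = {}
    for name in INDEX_FILES:
        path = Path(index_dir) / name
        if not path.is_file():
            summary[name] = None
            continue
        with path.open(newline="", encoding="utf-8") as f:
            reader = csv.reader(f)
            next(reader, None)  # skip the assumed header row
            summary[name] = sum(1 for _ in reader)
    return summary
```

Calling `summarize_index_files(".")` from the directory that holds the CSV files returns a dictionary that makes a missing or unexpectedly empty index file easy to spot before any training code runs.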
The VIVID-Video splits contain the following columns:
``` bash
local_caption, # caption of masked object
source_video_path, # ground-truth video path
crop_video_path, # cropped video path (needs to be synthesized)
mask_path, # masked video path
editing_mode # change or remove
```
The VIVID-Image splits contain the following columns:
``` bash
local_caption, # caption of masked object
source_image_path, # ground-truth image path
crop_image_path, # cropped image path (needs to be synthesized)
mask_path, # masked image path
editing_mode # change or remove
```
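The column lists above can be turned into a small reader that checks an index file before its rows are used. This is a sketch under stated assumptions rather than the dataset's official loader: the function name `read_index` and the modality labels `"video"` and `"image"` are hypothetical, and the code assumes each CSV has a header row whose names match the documented columns.

```python
import csv
from pathlib import Path

# Column names taken from the lists above; the "video" / "image" keys are
# labels chosen for this sketch only.
EXPECTED_COLUMNS = {
    "video": ["local_caption", "source_video_path", "crop_video_path",
              "mask_path", "editing_mode"],
    "image": ["local_caption", "source_image_path", "crop_image_path",
              "mask_path", "editing_mode"],
}


def read_index(csv_path, modality):
    """Yield one dict per row after checking the expected columns exist.

    Assumption: the file has a header row matching the documented names.
    Because this is a generator, the check runs when iteration starts.
    """
    expected = EXPECTED_COLUMNS[modality]
    with Path(csv_path).open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        header = reader.fieldnames or []
        missing = [c for c in expected if c not in header]
        if missing:
            raise ValueError(f"{csv_path}: missing columns {missing}")
        for row in reader:
            # Keep only the documented columns, in documented order.
            yield {c: row[c] for c in expected}
```

For example, `for row in read_index("vivid-video-remove.csv", "video")` iterates over the removal samples for video. If the real files turn out to have no header row, the column names would need to be passed to `csv.DictReader` explicitly instead.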
## Get started
1. Download all files from this repository.
2. Merge the split archive parts.
```bash
cat vivid-video.tar.part-* > vivid-video.tar
cat vivid-image.tar.part-* > vivid-image.tar
```
3. Extract the `.tar` files.
```bash
tar -xvf vivid-video.tar
tar -xvf vivid-image.tar
```
4. (Optional) Synthesize cropped data.
``` bash
python get_crop_data.py
```
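On systems without `cat` and `tar` (or when the steps need to run from a script), the merge and extract steps above can be approximated with the Python standard library. The sketch below is an illustrative equivalent, not the authors' tooling: `merge_parts` and `extract_tar` are hypothetical helper names, and the parts are assumed to sort correctly in lexicographic order, which is what the shell glob in step 2 relies on as well.

```python
import shutil
import tarfile
from pathlib import Path


def merge_parts(prefix, output_path, directory="."):
    """Concatenate `<prefix>*` files in sorted order, like `cat prefix* > output`."""
    parts = sorted(p for p in Path(directory).glob(prefix + "*") if p.is_file())
    if not parts:
        raise FileNotFoundError(f"no files matching {prefix}* in {directory}")
    with open(output_path, "wb") as out:
        for part in parts:
            with part.open("rb") as src:
                # Streams in chunks, so parts larger than memory are fine.
                shutil.copyfileobj(src, out)
    return parts


def extract_tar(tar_path, dest="."):
    """Extract an archive into `dest`, roughly like `tar -xf`."""
    with tarfile.open(tar_path) as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions offer a safer extraction filter.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
```

A typical call sequence would be `merge_parts("vivid-video.tar.part-", "vivid-video.tar")` followed by `extract_tar("vivid-video.tar")`, and likewise for the image archive.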
Provider: maas
Created: 2025-09-03