VideoGrain-dataset
Source: ModelScope community (updated 2025-12-18, indexed 2025-03-22)
Download link:
https://modelscope.cn/datasets/AI-ModelScope/VideoGrain-dataset
Description:
# VideoGrain: Modulating Space-Time Attention for Multi-Grained Video Editing (ICLR 2025)
[Github](https://github.com/knightyxp/VideoGrain) (⭐ Star our GitHub )
[Project Page](https://knightyxp.github.io/VideoGrain_project_page)
[ArXiv](https://arxiv.org/abs/2502.17258)
[Youtube Video](https://www.youtube.com/watch?v=XEM4Pex7F9E)
[HuggingFace Daily Papers Top1](https://huggingface.co/papers/2502.17258)
If you find this dataset helpful, please consider leaving a star ⭐️⭐️⭐️ and citing our paper.
<p align="center">
<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/6486df66373f79a52913e017/ZQnogrOMFhy1mcTuxSQ62.mp4"></video>
</p>
# Summary
This is the dataset introduced in our paper [VideoGrain: Modulating Space-Time Attention for Multi-Grained Video Editing](https://arxiv.org/abs/2502.17258) (ICLR 2025).
VideoGrain is a zero-shot method for class-level, instance-level, and part-level video editing.
- **Multi-grained Video Editing**
  - class-level: editing objects within the same class (previous SOTA methods were limited to this level)
  - instance-level: editing each individual instance into a distinct object
  - part-level: adding new objects or modifying existing attributes at the part level
- **Training-Free**
  - does not require any training or fine-tuning
- **One-Prompt Multi-Region Control and In-Depth Analysis of Cross-/Self-Attention**
  - modulating cross-attention for multi-region control (visualizations are available)
  - modulating self-attention for feature decoupling (clustering results are available)
# Directory
```
data/
├── 2_cars
│ ├── 2_cars # original video frames
│ └── layout_masks # layout mask subfolders (e.g., bg, left, right)
├── 2_cats
│ ├── 2_cats
│ └── layout_masks
├── 2_monkeys
├── badminton
├── boxer-punching
├── car
├── cat_flower
├── man_text_message
├── run_two_man
├── soap-box
├── spin-ball
├── tennis
└── wolf
```
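Once the files are on disk, the layout above can be indexed with a few lines of Python. The following is a minimal sketch, not part of the official tooling. It assumes that every video folder follows the pattern shown for `2_cars` (a frames subfolder named after the video, plus `layout_masks` with one subfolder per region). The function name `index_videograin` is hypothetical, and no file extension is assumed because the card does not state one.

```python
from pathlib import Path


def index_videograin(root):
    """Map each video name to its frame files and per-region mask files.

    Assumption: <root>/<video>/<video>/ holds the frames and
    <root>/<video>/layout_masks/<region>/ holds the masks, as shown
    for 2_cars in the directory tree. Missing folders yield empty entries.
    """
    index = {}
    for video_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        frames_dir = video_dir / video_dir.name   # e.g. data/2_cars/2_cars
        masks_dir = video_dir / "layout_masks"    # e.g. data/2_cars/layout_masks

        frames = []
        if frames_dir.is_dir():
            frames = sorted(f for f in frames_dir.iterdir() if f.is_file())

        regions = {}
        if masks_dir.is_dir():
            for region_dir in sorted(p for p in masks_dir.iterdir() if p.is_dir()):
                # Region folder names such as bg, left, right become the keys.
                regions[region_dir.name] = sorted(
                    f for f in region_dir.iterdir() if f.is_file()
                )

        index[video_dir.name] = {"frames": frames, "masks": regions}
    return index
```

Calling `index_videograin("data")` returns a dictionary keyed by video name, so `index["2_cars"]["masks"]` would list the mask files per region if the folder matches the assumed layout.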
# Download
### Automatic
First, install the [datasets](https://huggingface.co/docs/datasets/v1.15.1/installation.html) library:
```
pip install datasets
```
The dataset can then be downloaded automatically with:
```python
from datasets import load_dataset

# Downloads the dataset from the Hugging Face Hub and caches it locally.
dataset = load_dataset("XiangpengYang/VideoGrain-dataset")
```
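This card does not state which splits or columns the loaded object exposes, so it can help to inspect it before use. The helper below is a small sketch under that assumption. `describe_splits` is a hypothetical name, and it relies only on the `num_rows` and `column_names` attributes that `datasets.Dataset` objects provide.

```python
def describe_splits(dataset_dict):
    """Summarize a DatasetDict-like mapping.

    Returns {split_name: (num_rows, column_names)}. Works with any mapping
    whose values expose `num_rows` and `column_names`, which is the case
    for the splits returned by `datasets.load_dataset`.
    """
    summary = {}
    for split_name, split in dataset_dict.items():
        summary[split_name] = (split.num_rows, list(split.column_names))
    return summary
```

With `dataset` loaded as shown above, `describe_splits(dataset)` gives one entry per split, which makes it easy to check what was actually downloaded before writing any processing code.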
# License
This dataset is licensed under the [CC BY-NC 4.0 license](https://creativecommons.org/licenses/by-nc/4.0/deed.en).
# Citation
```
@article{yang2025videograin,
title={VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing},
author={Yang, Xiangpeng and Zhu, Linchao and Fan, Hehe and Yang, Yi},
journal={arXiv preprint arXiv:2502.17258},
year={2025}
}
```
# Contact
If you have any questions, feel free to contact Xiangpeng Yang (knightyxp@gmail.com).
Provider: maas
Created: 2025-03-21



