VideoGrain-dataset

Name: VideoGrain-dataset
Creator: maas
Published: 2025-12-18 16:27:39
License: No description provided

ModelScope Community. Updated: 2025-12-18. Indexed: 2025-03-22.

Download link:

https://modelscope.cn/datasets/AI-ModelScope/VideoGrain-dataset

Resource overview:

# VideoGrain: Modulating Space-Time Attention for Multi-Grained Video Editing (ICLR 2025)

[Github](https://github.com/knightyxp/VideoGrain) (⭐ Star our GitHub) [Project Page](https://knightyxp.github.io/VideoGrain_project_page) [ArXiv](https://arxiv.org/abs/2502.17258) [Youtube Video](https://www.youtube.com/watch?v=XEM4Pex7F9E) [HuggingFace Daily Papers Top1](https://huggingface.co/papers/2502.17258)

If you think this dataset is helpful, please feel free to leave a star⭐️⭐️⭐️ and cite our paper:

<p align="center"> <video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/6486df66373f79a52913e017/ZQnogrOMFhy1mcTuxSQ62.mp4"></video> </p>

# Summary

This is the dataset proposed in our paper [VideoGrain: Modulating Space-Time Attention for Multi-Grained Video Editing](https://arxiv.org/abs/2502.17258) (ICLR 2025).

VideoGrain is a zero-shot method for class-level, instance-level, and part-level video editing.

- **Multi-grained Video Editing**
  - class-level: Editing objects within the same class (previous SOTA limited to this level)
  - instance-level: Editing each individual instance into a distinct object
  - part-level: Adding new objects or modifying existing attributes at the part level
- **Training-Free**
  - Does not require any training/fine-tuning
- **One-Prompt Multi-region Control & Deep investigations about cross/self attn**
  - modulating cross-attn for multi-region control (visualizations available)
  - modulating self-attn for feature decoupling (clustering results available)

# Directory

```
data/
├── 2_cars
│   ├── 2_cars         # original video frames
│   └── layout_masks   # layout mask subfolders (e.g., bg, left, right)
├── 2_cats
│   ├── 2_cats
│   └── layout_masks
├── 2_monkeys
├── badminton
├── boxer-punching
├── car
├── cat_flower
├── man_text_message
├── run_two_man
├── soap-box
├── spin-ball
├── tennis
└── wolf
```

# Download

### Automatic

Install the [datasets](https://huggingface.co/docs/datasets/v1.15.1/installation.html) library first:

```
pip install datasets
```

Then the dataset can be downloaded automatically with:

```python
from datasets import load_dataset

# Fetches the dataset from the Hugging Face Hub and caches it locally.
dataset = load_dataset("XiangpengYang/VideoGrain-dataset")
```

# License

This dataset is licensed under the [CC BY-NC 4.0 license](https://creativecommons.org/licenses/by-nc/4.0/deed.en).

# Citation

```
@article{yang2025videograin,
  title={VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing},
  author={Yang, Xiangpeng and Zhu, Linchao and Fan, Hehe and Yang, Yi},
  journal={arXiv preprint arXiv:2502.17258},
  year={2025}
}
```

# Contact

If you have any questions, feel free to contact Xiangpeng Yang (knightyxp@gmail.com).
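For working with a local copy of the files, the layout listed under Directory can be traversed with a small helper. The following is only a sketch: the function name `index_scenes` is hypothetical, and it assumes that every scene folder follows the pattern shown for `2_cars` and `2_cats` (frames in `data/<scene>/<scene>/`, masks in `data/<scene>/layout_masks/<region>/`). The listing does not confirm this for the other scenes, so missing subfolders are tolerated and simply yield empty results.

```python
from pathlib import Path


def index_scenes(data_root):
    """Map each scene folder to its frame files and layout-mask regions.

    Assumption (not confirmed for every scene): frames live in
    <scene>/<scene>/ and masks in <scene>/layout_masks/<region>/.
    """
    index = {}
    for scene_dir in sorted(Path(data_root).iterdir()):
        if not scene_dir.is_dir():
            continue  # skip stray files at the top level

        # Frames folder shares the scene's name, e.g. data/2_cars/2_cars
        frames_dir = scene_dir / scene_dir.name
        frames = []
        if frames_dir.is_dir():
            frames = sorted(p for p in frames_dir.iterdir() if p.is_file())

        # One subfolder per layout region, e.g. bg, left, right
        masks = {}
        masks_dir = scene_dir / "layout_masks"
        if masks_dir.is_dir():
            for region_dir in sorted(masks_dir.iterdir()):
                if region_dir.is_dir():
                    masks[region_dir.name] = sorted(
                        p for p in region_dir.iterdir() if p.is_file()
                    )

        index[scene_dir.name] = {"frames": frames, "masks": masks}
    return index
```

Calling `index_scenes("data")` on a downloaded copy would return a dictionary keyed by scene name (for example `"2_cars"`), each with a sorted list of frame paths and a per-region dictionary of mask paths. Sorting by filename is a design choice that keeps frames in order only if the files are named with zero-padded indices, which is also an assumption here.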


Provider: maas

Created: 2025-03-21
