
VideoGrain-dataset

ModelScope community · updated 2025-12-18 · indexed 2025-03-22
Download link:
https://modelscope.cn/datasets/AI-ModelScope/VideoGrain-dataset
Description:
# VideoGrain: Modulating Space-Time Attention for Multi-Grained Video Editing (ICLR 2025)

[GitHub](https://github.com/knightyxp/VideoGrain) (⭐ Star our GitHub)
[Project Page](https://knightyxp.github.io/VideoGrain_project_page)
[ArXiv](https://arxiv.org/abs/2502.17258)
[Youtube Video](https://www.youtube.com/watch?v=XEM4Pex7F9E)
[HuggingFace Daily Papers Top1](https://huggingface.co/papers/2502.17258)

If you find this dataset helpful, please leave a star ⭐️⭐️⭐️ and cite our paper.

<p align="center">
<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/6486df66373f79a52913e017/ZQnogrOMFhy1mcTuxSQ62.mp4"></video>
</p>

# Summary

This is the dataset proposed in our paper [VideoGrain: Modulating Space-Time Attention for Multi-Grained Video Editing](https://arxiv.org/abs/2502.17258) (ICLR 2025).

VideoGrain is a zero-shot method for class-level, instance-level, and part-level video editing.

- **Multi-grained Video Editing**
  - class-level: editing objects within the same class (previous state-of-the-art methods were limited to this level)
  - instance-level: editing each individual instance into a distinct object
  - part-level: adding new objects or modifying existing attributes at the part level
- **Training-Free**
  - Does not require any training or fine-tuning
- **One-Prompt Multi-Region Control & In-Depth Analysis of Cross- and Self-Attention**
  - modulating cross-attention for multi-region control (visualizations are available)
  - modulating self-attention for feature decoupling (clustering results are available)

# Directory

```
data/
├── 2_cars
│   ├── 2_cars          # original video frames
│   └── layout_masks    # layout mask subfolders (e.g., bg, left, right)
├── 2_cats
│   ├── 2_cats
│   └── layout_masks
├── 2_monkeys
├── badminton
├── boxer-punching
├── car
├── cat_flower
├── man_text_message
├── run_two_man
├── soap-box
├── spin-ball
├── tennis
└── wolf
```

# Download

### Automatic

First install the [datasets](https://huggingface.co/docs/datasets/v1.15.1/installation.html) library:

```
pip install datasets
```

The dataset can then be downloaded automatically with:

```python
import numpy as np  # imported in the original snippet; not needed for the download itself
from datasets import load_dataset

# Fetch the dataset from the Hugging Face Hub (cached locally after the first call)
dataset = load_dataset("XiangpengYang/VideoGrain-dataset")
```

# License

This dataset is licensed under the [CC BY-NC 4.0 license](https://creativecommons.org/licenses/by-nc/4.0/deed.en).

# Citation

```
@article{yang2025videograin,
  title={VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing},
  author={Yang, Xiangpeng and Zhu, Linchao and Fan, Hehe and Yang, Yi},
  journal={arXiv preprint arXiv:2502.17258},
  year={2025}
}
```

# Contact

If you have any questions, feel free to contact Xiangpeng Yang (knightyxp@gmail.com).
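# Browsing a local copy (illustrative sketch)

Once the files are on disk, the layout in the Directory section above can be walked with the standard library alone. The helper below is a minimal sketch and not part of the VideoGrain code base: the name `index_videograin` is hypothetical, and it assumes every video folder follows the pattern shown for `2_cars` and `2_cats` (a frame folder named after the video plus a `layout_masks` folder with one subfolder per region). The tree does not spell this out for the other videos, so missing folders are reported rather than treated as errors.

```python
from pathlib import Path


def index_videograin(root):
    """Map each video name to its frame folder and layout-mask region folders.

    Assumed layout (taken from the Directory section, not verified per video):
      <root>/<video>/<video>/               original video frames
      <root>/<video>/layout_masks/<region>  one folder per region (e.g. bg, left, right)
    """
    index = {}
    for video_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        frames_dir = video_dir / video_dir.name
        masks_dir = video_dir / "layout_masks"

        # Collect region subfolders only if a layout_masks folder is present.
        regions = {}
        if masks_dir.is_dir():
            regions = {m.name: m for m in sorted(masks_dir.iterdir()) if m.is_dir()}

        index[video_dir.name] = {
            # None signals that the expected frame folder was not found.
            "frames": frames_dir if frames_dir.is_dir() else None,
            "masks": regions,
        }
    return index
```

Calling `index_videograin("data")` returns a dictionary keyed by video name, which makes it easy to check which videos ship with masks before running an editing pipeline on them.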

Provider:
maas
Created:
2025-03-21