AnyEdit
Hosted on the ModelScope community. Updated 2025-12-04; indexed 2025-05-24.
Download link: https://modelscope.cn/datasets/AI-ModelScope/AnyEdit
[Paper](https://arxiv.org/abs/2411.15738)
[Dataset](https://huggingface.co/datasets/Bin1117/AnyEdit)
[AnySD](https://huggingface.co/WeiChow/AnySD)
[Project page](https://dcd-anyedit.github.io/)
Celebrate! The data alignment issue in AnyEdit has been resolved by re-uploading the dataset. The filter in the dataset viewer is still not working, even though the dataset covers 25 edit types. For a quick look, browse the validation split. To view or download the data for a specific editing type, see the [anyedit-split](https://huggingface.co/datasets/Bin1117/anyedit-split) dataset.
# Dataset Card for AnyEdit-Dataset
Instruction-based image editing aims to modify specific image elements with natural language instructions. However, current models in this domain often struggle to accurately execute complex user instructions, as they are trained on low-quality data with limited editing types. We present **AnyEdit**, a comprehensive multi-modal instruction editing dataset, comprising **2.5 million high-quality editing pairs** spanning **25 editing types and five domains**.
## Dataset Description
- **Homepage:** https://dcd-anyedit.github.io/
- **Repository:** https://github.com/DCDmllm/AnyEdit
- **Point of Contact:** [Qifan Yu](mailto:yuqifan@zju.edu.cn)
## Dataset Details
### Dataset Description
We comprehensively categorize image editing tasks into 5 groups based on different editing capabilities:
(a) Local Editing, which focuses on region-based editing (green area);
(b) Global Editing, which focuses on rendering changes across the full image (yellow area);
(c) Camera Move Editing, which focuses on changing the viewpoint rather than the scene (gray area);
(d) Implicit Editing, which requires commonsense knowledge to complete complex edits (orange area);
(e) Visual Editing, which takes additional visual inputs to meet the requirements of multi-modal editing (blue area).
- **Curated by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
### Dataset Sources [optional]
<!-- Provide the basic links for the dataset. -->
- **Repository:** https://github.com/DCDmllm/AnyEdit
- **Paper:** https://arxiv.org/abs/2411.15738
- **Demo:** [More Information Needed]
Where to send questions or comments about the dataset: https://github.com/DCDmllm/AnyEdit/issues
## Intended use
Primary intended uses: AnyEdit is intended primarily for research on text-to-image generation and instruction-based image editing.
Primary intended users: The dataset is intended primarily for researchers and hobbyists in computer vision, image generation, image processing, and AIGC.
## Dataset Structure
### Instruction Format
```
{
    "edit": "change the airplane to green",  # the editing instruction
    "edited object": "airplane",  # the edited region; only set for local editing, otherwise None
    "input": "a small airplane sits stationary on a piece of concrete.",  # the caption of the original image
    "output": "A green small airplane sits stationary on a piece of concrete.",  # the caption of the edited image
    "edit_type": "color_alter",  # the editing type
    "visual_input": "None",  # the reference image for visual-input instructions, otherwise None
    "image_file": "coco/train2014/COCO_train2014_000000521165.jpg",  # the file of the original image
    "edited_file": "anyedit_datasets/color_alter/xxxxx.jpg"  # the file of the edited image
}
```
### Dataset File Structure
To prevent potential data leakage, the test set is distributed separately; please check our repo for information on obtaining it.
We provide the test split only as a zip file, so that foundation models crawling the web do not pick up the test set as training data. Please download the test set [here](https://drive.google.com/file/d/1V-Z4agWoTMzAYkRJQ1BNz0-i79eAVWt4/view?usp=sharing).
```
├── anyedit_datasets
│ ├── train (~2.5M)
│ │ ├── remove
│ │ ├── background_change
│ │ ├── rotation_change
│ │ ├── visual_material_transfer
│ │ └── ...
│ ├── validation (5000)
│ ├── anyedit-test (1250)
```
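If the files are available locally in the layout shown above, a per-type count can serve as a quick sanity check. The sketch below assumes one subdirectory per editing type inside a split directory; the function name `count_files_per_edit_type` is hypothetical, and the demo runs on a throwaway directory rather than the real data.

```python
import tempfile
from pathlib import Path


def count_files_per_edit_type(split_dir):
    """Return {edit_type: number_of_files} for one split directory."""
    counts = {}
    for type_dir in sorted(Path(split_dir).iterdir()):
        if type_dir.is_dir():
            # Count regular files at any depth below the edit-type folder.
            counts[type_dir.name] = sum(
                1 for p in type_dir.rglob("*") if p.is_file()
            )
    return counts


# Demo on a temporary directory that mimics the assumed layout.
with tempfile.TemporaryDirectory() as tmp:
    train_dir = Path(tmp) / "anyedit_datasets" / "train"
    for edit_type, n_files in [("remove", 2), ("background_change", 1)]:
        (train_dir / edit_type).mkdir(parents=True)
        for i in range(n_files):
            (train_dir / edit_type / f"{i}.jpg").touch()
    demo_counts = count_files_per_edit_type(train_dir)

print(demo_counts)
```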
### How to use AnyEdit
We provide an example to show how to use this data.
```python
from datasets import load_dataset
from PIL import Image
# Load the dataset
ds = load_dataset("Bin1117/AnyEdit")
# Print the total number of samples and show the first sample
print(f"Total number of samples: {len(ds['train'])}")
print("First sample in the dataset:", ds['train'][0])
# Retrieve the first sample's data
data_dict = ds['train'][0]
# Save the input image (image_file)
input_img = data_dict['image_file']
input_img.save('input_image.jpg')
print("Saved input image as 'input_image.jpg'.")
# Save the edited image (edited_file)
output_img = data_dict['edited_file']
output_img.save('edited_image.jpg')
print("Saved output image as 'edited_image.jpg'.")
# Save the visual images for visual editing (visual_input)
if data_dict['visual_input'] is not None:
    visual_img = data_dict['visual_input']
    visual_img.save('visual_input.jpg')
```
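Because the train split holds about 2.5M samples, it may be convenient to pull only a few samples of one editing type instead of indexing the full split. The following is a minimal sketch, assuming each sample is a dict with an `edit_type` key as in the instruction format; `take_by_edit_type` is a hypothetical helper, and the toy list stands in for real data.

```python
from itertools import islice


def take_by_edit_type(samples, edit_type, limit):
    """Return up to `limit` samples whose 'edit_type' equals `edit_type`.

    `samples` can be any iterable of dicts. The generator is lazy, so
    iteration stops as soon as `limit` matches have been collected.
    """
    matching = (s for s in samples if s.get("edit_type") == edit_type)
    return list(islice(matching, limit))


# Toy stand-in for the dataset (assumption: real samples carry more fields).
toy_samples = [
    {"edit": "remove the dog", "edit_type": "remove"},
    {"edit": "change the airplane to green", "edit_type": "color_alter"},
    {"edit": "remove the car", "edit_type": "remove"},
    {"edit": "remove the tree", "edit_type": "remove"},
]

picked = take_by_edit_type(toy_samples, "remove", 2)
print([s["edit"] for s in picked])
```

The same helper should also accept a streaming dataset, for example one obtained with `load_dataset("Bin1117/AnyEdit", split="train", streaming=True)`, since that object is iterated sample by sample. Whether streaming works well for this particular repository has not been verified here.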
## Bibtex citation
```bibtex
@article{yu2024anyedit,
title={AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea},
author={Yu, Qifan and Chow, Wei and Yue, Zhongqi and Pan, Kaihang and Wu, Yang and Wan, Xiaoyang and Li, Juncheng and Tang, Siliang and Zhang, Hanwang and Zhuang, Yueting},
journal={arXiv preprint arXiv:2411.15738},
year={2024}
}
```
Provider: maas
Created: 2025-05-19



