OmniEdit-Filtered-1.2M
Source: ModelScope community. Updated 2026-01-02; indexed 2024-12-14.
Download link:
https://modelscope.cn/datasets/TIGER-Lab/OmniEdit-Filtered-1.2M
Resource description:
## OmniEdit
In this paper, we present OMNI-EDIT, an omnipotent editor that handles seven different image editing tasks seamlessly at any aspect ratio. Our contributions are fourfold, including: (1) OMNI-EDIT is trained with supervision from seven different specialist models to ensure task coverage; (2) we use importance sampling based on scores provided by large multimodal models (such as GPT-4o), instead of CLIP-score, to improve data quality.
[📃Paper](https://tiger-ai-lab.github.io/OmniEdit/) | [🌐Website](https://tiger-ai-lab.github.io/OmniEdit/) | [💻Github](https://github.com/TIGER-AI-Lab/OmniEdit) | [📚Dataset](https://huggingface.co/datasets/TIGER-Lab/OmniEdit-Filtered-1.2M)
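
The score-based importance sampling mentioned in point (2) can be illustrated with a minimal sketch. It assumes each candidate pair already carries a numeric quality score from a multimodal rater and draws candidates with probability proportional to that score. The function name, the proportional weighting, and the sample data are assumptions for illustration, not necessarily the authors' procedure.

```python
import random


def sample_by_score(candidates, scores, k, seed=0):
    """Draw k candidates with probability proportional to their score.

    Hypothetical helper: the proportional weighting is an assumption,
    not the sampling scheme used to build the dataset.
    """
    if len(candidates) != len(scores):
        raise ValueError("candidates and scores must have the same length")
    if any(s < 0 for s in scores) or sum(scores) <= 0:
        raise ValueError("scores must be non-negative with a positive sum")
    rng = random.Random(seed)
    # random.choices samples with replacement using the given weights.
    return rng.choices(candidates, weights=scores, k=k)


# Made-up candidate ids and rater scores on an assumed 0-10 scale.
candidates = ["pair_a", "pair_b", "pair_c"]
scores = [9.0, 0.0, 1.0]
picked = sample_by_score(candidates, scores, k=1000)
```

In this toy run, the zero-scored candidate is never drawn and the high-scored one dominates the sample, which is the intended effect of weighting by rater score.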
## Dataset Columns
The dataset contains the following columns:
- `src`, `edited_img`: the source image and the edited image.
- `edited_prompt_list`: the short and long editing instructions.
- `task`: the editing task, one of seven categories such as addition, removal, background, environment, and style.
- `sc_score_1` and `sc_score_2`: the semantic consistency scores assigned by our quality rater.
- `pq_score`: the perceptual quality score assigned by our quality rater.
- `o_score`: the overall score, which is a weighted average of the sc and pq scores.
- `*_reasoning`: the rationale for assigning these scores.
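
To make the score columns concrete, the sketch below builds a mock record with the documented column names and recomputes an overall score as a weighted average of semantic consistency and perceptual quality. The weight, the averaging of the two sc scores, and every value are assumptions; this card does not state the actual weighting.

```python
def overall_score(sc_scores, pq_score, sc_weight=0.5):
    """Weighted average of mean semantic consistency and perceptual quality.

    sc_weight is a hypothetical parameter; the real weights are not given.
    """
    if not 0.0 <= sc_weight <= 1.0:
        raise ValueError("sc_weight must be in [0, 1]")
    sc_mean = sum(sc_scores) / len(sc_scores)
    return sc_weight * sc_mean + (1.0 - sc_weight) * pq_score


# Mock record using the documented column names; all values are invented.
record = {
    "src": "source.png",         # placeholder standing in for the source image
    "edited_img": "edited.png",  # placeholder standing in for the edited image
    "edited_prompt_list": [
        "Add a hat.",
        "Add a red wool hat to the man on the left.",
    ],
    "task": "addition",
    "sc_score_1": 8.0,
    "sc_score_2": 6.0,
    "pq_score": 9.0,
}

score = overall_score(
    [record["sc_score_1"], record["sc_score_2"]], record["pq_score"]
)
```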
## Data Pipeline
We synthesize the large-scale dataset through specialist distillation. Our synthesis pipeline is depicted below:
<p align="center">
<img src="synthesis.png" width="800">
</p>
Our released version contains 1.2M pairs covering seven different skills: addition, swapping, removal, attribute modification, background change, environment change, and style transfer. The dataset has been filtered with VIEScore.
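
As a rough illustration of score-based filtering, the sketch below keeps only records whose overall score reaches a threshold and then counts the survivors per task. The threshold, the assumed 0-10 scale, and the mock records are all assumptions; the criteria actually applied with VIEScore are not stated here.

```python
from collections import Counter


def filter_by_score(records, min_o_score):
    """Keep records whose o_score is at least min_o_score."""
    return [r for r in records if r["o_score"] >= min_o_score]


# Invented records; only the column names come from the dataset card.
records = [
    {"task": "addition", "o_score": 9.1},
    {"task": "removal", "o_score": 4.2},
    {"task": "style", "o_score": 8.5},
    {"task": "addition", "o_score": 7.9},
]

kept = filter_by_score(records, min_o_score=7.5)
# Count how many surviving records belong to each task.
per_task = Counter(r["task"] for r in kept)
```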
## Comparison with Others
Compared with the other image editing datasets shown below, ours offers the most diverse and highest-quality editing pairs, with images at arbitrary resolutions.
<p align="center">
<img src="comparison.png" width="800">
</p>
## Citation
If you find our paper useful, please cite it as follows:
```
@article{wei2024omniedit,
title={OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision},
author={Wei, Cong and Xiong, Zheyang and Ren, Weiming and Du, Xinrun and Zhang, Ge and Chen, Wenhu},
journal={arXiv preprint arXiv:2411.07199},
year={2024}
}
```
Provider:
maas
Created:
2025-02-03



