123123chen/WiseEdit-Benchmark

Name: 123123chen/WiseEdit-Benchmark
Creator: 123123chen
Published: 2025-12-09 07:13:24
License: No description provided

Source: Hugging Face · Updated: 2025-12-09 · Indexed: 2025-12-20

Download link:

https://hf-mirror.com/datasets/123123chen/WiseEdit-Benchmark

Resource description:

---
license: apache-2.0
task_categories:
- image-to-image
language:
- en
pretty_name: WiseEdit
size_categories:
- 1K<n<10K
---

<div align="center">
<h1 style="font-size: 2.0em; margin-bottom: 0.15em;">
WiseEdit: Benchmarking Cognition- and Creativity-Informed Image Editing
</h1>

Kaihang Pan<sup>1*</sup> · Weile Chen<sup>1*</sup> · Haiyi Qiu<sup>1*</sup> · Qifan Yu<sup>1</sup> · Wendong Bu<sup>1</sup> · Zehan Wang<sup>1</sup>
Yun Zhu<sup>2</sup> · Juncheng Li<sup>1</sup> · Siliang Tang<sup>1</sup>

<sup>1</sup>Zhejiang University     <sup>2</sup>Shanghai Artificial Intelligence Laboratory

<sup>*</sup>Equal contribution.

<a href="https://www.arxiv.org/abs/2512.00387">
<img src="https://img.shields.io/badge/arXiv-2512.00387-b31b1b.svg" alt="arXiv">
</a>
<a href="https://qnancy.github.io/wiseedit_project_page/">
<img src="https://img.shields.io/badge/Project-Page-b3.svg" alt="Project Page">
</a>
<a href="https://github.com/beepkh/WiseEdit">
<img src="https://img.shields.io/badge/GitHub-Code-181717?logo=github" alt="Code">
</a>
</div>

## 🌍 Introduction

WiseEdit is a knowledge-intensive benchmark for cognition- and creativity-informed image editing. It decomposes instruction-based editing into three stages, **Awareness**, **Interpretation**, and **Imagination**, and provides **1,220 bilingual test cases** together with a GPT-4o–based automatic evaluation pipeline. Using WiseEdit, we benchmark **22 state-of-the-art image editing models** and reveal clear limitations in knowledge-based reasoning and compositional creativity.

<img src="figures/intro.png" width="100%">

## 💡 Dataset Overview

WiseEdit is built around **task depth** and **knowledge breadth**.

<img src="figures/wiseedit-intro.png" width="90%">

### Task Depth – Four Task Types

WiseEdit includes:

- **Awareness Task** – Focus on *where* to edit; no explicit spatial coordinates are given; requires comparative reasoning, reference matching, or fine-grained perception.
- **Interpretation Task** – Focus on *how* to edit at the perception level; instructions often encode **implicit intent**, demanding world knowledge.
- **Imagination Task** – Focus on subject-driven creative generation; requires complex composition and identity-preserving transformations.
- **WiseEdit-Complex** – Combines Awareness + Interpretation + Imagination; multi-image, multi-step reasoning with conditional logic and compositional generation.

### Knowledge Breadth – Three Knowledge Types

WiseEdit organizes cases by **knowledge type**:

- **Declarative Knowledge** – “knowing what”; facts, concepts, perceptual cues.
- **Procedural Knowledge** – “knowing how”; multi-step skills or procedures.
- **Metacognitive Knowledge** – “knowing about knowing”; when and how to apply declarative / procedural knowledge; conditional reasoning, rule stacking, etc.

These are grounded in **Cultural Common Sense**, **Natural Sciences**, and **Spatio-Temporal Logic**, stressing culturally appropriate, physically consistent, and logically coherent edits.

## ⭐ Evaluation Protocol

We adopt a **VLM-based automatic evaluation pipeline**:

- **Backbone evaluator**: GPT-4o.
- **Metrics (1–10 → linearly mapped to 0–100)**:
  - **IF** – Instruction Following
  - **DP** – Detail Preserving
  - **VQ** – Visual Quality
  - **KF** – Knowledge Fidelity (for knowledge-informed cases)
  - **CF** – Creative Fusion (for imagination / complex cases)

The **overall score** is:

`AVG = (IF + DP + VQ + α·KF + β·CF) / (3 + α + β)`

where α is 1 when KF is applicable and β is 1 when CF is applicable; each is 0 otherwise. Our user study shows strong correlation between this protocol and human ratings.
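The overall-score formula above can be sketched in a few lines of Python. This is only an illustration and not the released evaluation code: the function name `wiseedit_avg` and the convention of passing `None` for a metric that does not apply are assumptions made here, and the inputs are taken to be scores already mapped to the 0–100 scale.

```python
from typing import Optional


def wiseedit_avg(
    if_score: float,
    dp: float,
    vq: float,
    kf: Optional[float] = None,
    cf: Optional[float] = None,
) -> float:
    """Compute AVG = (IF + DP + VQ + α·KF + β·CF) / (3 + α + β).

    Hypothetical helper: `kf` / `cf` are None when Knowledge Fidelity /
    Creative Fusion do not apply to a case, which sets α / β to 0.
    """
    alpha = 1 if kf is not None else 0  # α: KF applicable?
    beta = 1 if cf is not None else 0   # β: CF applicable?
    total = if_score + dp + vq
    if kf is not None:
        total += kf
    if cf is not None:
        total += cf
    return total / (3 + alpha + beta)


# Case with only the three base metrics.
print(wiseedit_avg(80, 70, 90))                # 80.0
# Knowledge-informed case: KF joins the average.
print(wiseedit_avg(80, 70, 90, kf=60))         # 75.0
# Case where both KF and CF apply.
print(wiseedit_avg(80, 70, 90, kf=60, cf=50))  # 70.0
```

Because the denominator grows with each applicable metric, every case is averaged over only the metrics that were actually scored, so scores stay on the same 0–100 scale across task types.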
## 📊 Code & Results

Our evaluation code is released on GitHub:

- **WiseEdit**: https://github.com/beepkh/WiseEdit

All our model evaluation results are also released at:

- **WiseEdit-Results**: https://huggingface.co/datasets/midbee/WiseEdit-Results

## ✍️ Citation

If you find WiseEdit helpful, please cite:

```bibtex
@article{pan2025wiseedit,
  title={WiseEdit: Benchmarking Cognition- and Creativity-Informed Image Editing},
  author={Pan, Kaihang and Chen, Weile and Qiu, Haiyi and Yu, Qifan and Bu, Wendong and Wang, Zehan and Zhu, Yun and Li, Juncheng and Tang, Siliang},
  journal={arXiv preprint arXiv:2512.00387},
  year={2025}
}
```

Provided by:

123123chen
