123123chen/WiseEdit-Benchmark

Source: Hugging Face · Updated 2025-12-09 · Indexed 2025-12-20
Download link:
https://hf-mirror.com/datasets/123123chen/WiseEdit-Benchmark
Resource description:
---
license: apache-2.0
task_categories:
- image-to-image
language:
- en
pretty_name: WiseEdit
size_categories:
- 1K<n<10K
---

<div align="center">
  <h1 style="font-size: 2.0em; margin-bottom: 0.15em;">
    WiseEdit: Benchmarking Cognition- and Creativity-Informed Image Editing
  </h1>
  <p style="font-size: 1.05em; margin: 0.2em 0 0.6em 0;">
    <strong>
      Kaihang Pan<sup>1</sup>* · Weile Chen<sup>1</sup>* · Haiyi Qiu<sup>1</sup>* · Qifan Yu<sup>1</sup> · Wendong Bu<sup>1</sup> · Zehan Wang<sup>1</sup><br>
      Yun Zhu<sup>2</sup> · Juncheng Li<sup>1</sup> · Siliang Tang<sup>1</sup>
    </strong>
  </p>
  <p style="font-size: 0.9em; margin: 0;">
    <sup>1</sup>Zhejiang University &nbsp;&nbsp;&nbsp; <sup>2</sup>Shanghai Artificial Intelligence Laboratory
  </p>
  <p style="font-size: 0.85em; margin-top: 0.2em;">
    <em>*Equal contribution.</em>
  </p>
  <p style="margin-top: 0.8em;">
    <a href="https://www.arxiv.org/abs/2512.00387">
      <img src="https://img.shields.io/badge/arXiv-2512.00387-b31b1b.svg" alt="arXiv">
    </a>
    <a href="https://qnancy.github.io/wiseedit_project_page/">
      <img src="https://img.shields.io/badge/Project-Page-b3.svg" alt="Project Page">
    </a>
    <a href="https://github.com/beepkh/WiseEdit">
      <img src="https://img.shields.io/badge/GitHub-Code-181717?logo=github" alt="Code">
    </a>
  </p>
</div>

## 🌍 Introduction

WiseEdit is a knowledge-intensive benchmark for cognition- and creativity-informed image editing. It decomposes instruction-based editing into three stages, **Awareness**, **Interpretation**, and **Imagination**, and provides **1,220 bilingual test cases** together with a GPT-4o–based automatic evaluation pipeline. Using WiseEdit, we benchmark **22 state-of-the-art image editing models** and reveal clear limitations in knowledge-based reasoning and compositional creativity.

<p align="center">
  <img src="figures/intro.png" width="100%">
</p>

## 💡 Dataset Overview

WiseEdit is built around **task depth** and **knowledge breadth**.
<p align="center">
  <img src="figures/wiseedit-intro.png" width="90%">
</p>

### Task Depth – Four Task Types

WiseEdit includes:

- **Awareness Task** – Focus on *where* to edit; no explicit spatial coordinates are given; requires comparative reasoning, reference matching, or fine-grained perception.
- **Interpretation Task** – Focus on *how* to edit at the perception level; instructions often encode **implicit intent**, demanding world knowledge.
- **Imagination Task** – Focus on subject-driven creative generation; requires complex composition and identity-preserving transformations.
- **WiseEdit-Complex** – Combines Awareness + Interpretation + Imagination; multi-image, multi-step reasoning with conditional logic and compositional generation.

### Knowledge Breadth – Three Knowledge Types

WiseEdit organizes cases by **knowledge type**:

- **Declarative Knowledge** – “knowing what”; facts, concepts, perceptual cues.
- **Procedural Knowledge** – “knowing how”; multi-step skills or procedures.
- **Metacognitive Knowledge** – “knowing about knowing”; when and how to apply declarative / procedural knowledge; conditional reasoning, rule stacking, etc.

These are grounded in **Cultural Common Sense**, **Natural Sciences**, and **Spatio-Temporal Logic**, stressing culturally appropriate, physically consistent, and logically coherent edits.

## ⭐ Evaluation Protocol

We adopt a **VLM-based automatic evaluation pipeline**:

- **Backbone evaluator**: GPT-4o.
- **Metrics (1–10 → linearly mapped to 0–100)**:
  - **IF** – Instruction Following
  - **DP** – Detail Preserving
  - **VQ** – Visual Quality
  - **KF** – Knowledge Fidelity (for knowledge-informed cases)
  - **CF** – Creative Fusion (for imagination / complex cases)

The **overall score** is:

`AVG = (IF + DP + VQ + α·KF + β·CF) / (3 + α + β)`

where α and β are 1 only when KF / CF are applicable.

Our user study shows strong correlation between this protocol and human ratings.
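
The overall-score formula above can be sketched in Python as follows. This is a minimal illustration, not the released evaluation code: the function name `overall_score` and its parameters are hypothetical, inputs are assumed to be already on the 0–100 scale, and α and β are assumed to be 0 when KF / CF do not apply (the text above states only that they are 1 when applicable).

```python
from typing import Optional


def overall_score(
    if_score: float,
    dp_score: float,
    vq_score: float,
    kf_score: Optional[float] = None,
    cf_score: Optional[float] = None,
) -> float:
    """Compute AVG = (IF + DP + VQ + α·KF + β·CF) / (3 + α + β).

    Hypothetical helper: pass None for KF or CF when that metric does
    not apply to a test case. All scores are assumed to be on 0–100.
    """
    # Assumption: α (β) is 1 when KF (CF) is applicable, otherwise 0.
    alpha = 0 if kf_score is None else 1
    beta = 0 if cf_score is None else 1

    total = if_score + dp_score + vq_score
    if alpha:
        total += kf_score
    if beta:
        total += cf_score
    return total / (3 + alpha + beta)


if __name__ == "__main__":
    # A case with only the three base metrics.
    print(overall_score(80, 70, 90))
    # A knowledge-informed case that also has a KF score.
    print(overall_score(80, 70, 90, kf_score=60))
```

Using `None` to mark an inapplicable metric keeps the denominator in step with the number of metrics actually averaged, so a case without KF or CF is not penalized for the missing terms.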
## 📊 Code & Results

Our evaluation code is released on GitHub:

- **WiseEdit**: https://github.com/beepkh/WiseEdit

All our model evaluation results are also released at:

- **WiseEdit-Results**: https://huggingface.co/datasets/midbee/WiseEdit-Results

## ✍️ Citation

If you find WiseEdit helpful, please cite:

```bibtex
@article{pan2025wiseedit,
  title={WiseEdit: Benchmarking Cognition- and Creativity-Informed Image Editing},
  author={Pan, Kaihang and Chen, Weile and Qiu, Haiyi and Yu, Qifan and Bu, Wendong and Wang, Zehan and Zhu, Yun and Li, Juncheng and Tang, Siliang},
  journal={arXiv preprint arXiv:2512.00387},
  year={2025}
}
```
Provided by:
123123chen