GenSearcher/KnowGen-Bench

Name: GenSearcher/KnowGen-Bench
Creator: GenSearcher
Published: 2026-04-11 07:58:35
License: No description available

Hugging Face | Updated 2026-04-11 | Indexed 2026-05-10

Download link:

https://hf-mirror.com/datasets/GenSearcher/KnowGen-Bench

Resource overview:

---
configs:
- config_name: default
  data_files:
  - split: test
    path: "KnowGen-Bench.json"
task_categories:
- text-to-image
---

# KnowGen Benchmark

[**Project Page**](https://gen-searcher.vercel.app/) | [**Paper**](https://arxiv.org/abs/2603.28767) | [**Code**](https://github.com/tulerfeng/Gen-Searcher)

This repository contains the KnowGen benchmark data for [Gen-Searcher: Reinforcing Agentic Search for Image Generation](https://arxiv.org/abs/2603.28767).

# 👀 Intro

<div align="center">
  <img src="https://github.com/tulerfeng/Gen-Searcher/blob/main/assets/teaser.jpg?raw=true" alt="Gen-Searcher Overview" width="80%">
</div>

We introduce **Gen-Searcher**, the first attempt to train a multimodal **deep research agent** for image generation that requires complex real-world knowledge. Gen-Searcher can **search the web, browse evidence, reason over multiple sources, and search visual references** before generation, enabling more accurate and up-to-date image synthesis in real-world scenarios.

We build two dedicated training datasets, **Gen-Searcher-SFT-10k** and **Gen-Searcher-RL-6k**, and one new benchmark, **KnowGen**, for search-grounded image generation.

Gen-Searcher achieves significant improvements, delivering **15+ point gains on the KnowGen and WISE benchmarks**. It also demonstrates **strong transferability** to various image generators.

All code, models, data, and benchmark are fully released.

## 🔍 KnowGen-Bench

The KnowGen benchmark covers around 20 diverse categories in real-world scenarios.

<div align="center">
  <img src="https://github.com/tulerfeng/Gen-Searcher/blob/main/assets/bench.jpg?raw=true" alt="KnowGen Benchmark Categories" width="80%">
</div>

## 🏆 Performance

Our method delivers consistent gains across backbones, improving Qwen-Image by around **16 points** on KnowGen. It also shows strong transferability, generalizing to Seedream 4.5 and Nano Banana Pro with no additional training, yielding about 16-point and 3-point improvements, respectively.

<div align="center">
  <img src="https://github.com/tulerfeng/Gen-Searcher/blob/main/assets/performance.jpg?raw=true" alt="Performance Graph" width="85%">
</div>

## 📐 KnowGen Bench Evaluation

To evaluate your model on the KnowGen benchmark, use the evaluation scripts provided in the GitHub repository:

```bash
cd KnowGen_Eval
bash gpt_eval_knowgen.sh
```

Ensure that your results are organized in the following format for evaluation:

```json
[
  {
    "id": 3260,
    "success": true,
    "prompt": "xxxxx",
    "meta": {
      "category": "Biology",
      "difficulty": "easy"
    },
    "output_path": "./images/output_3260.png",
    "gt_image": "./gt_image/answer_3260.png"
  }
]
```

For ground truth images, you may download `gt_image_part1.zip` and unzip it, or directly download the `gt_image` folder.

## Citation

If you find this work or dataset helpful, please consider citing:

```bibtex
@article{feng2026gen,
  title={Gen-Searcher: Reinforcing Agentic Search for Image Generation},
  author={Feng, Kaituo and Zhang, Manyuan and Chen, Shuang and Lin, Yunlong and Fan, Kaixuan and Jiang, Yilei and Li, Hongyu and Zheng, Dian and Wang, Chenyang and Yue, Xiangyu},
  journal={arXiv preprint arXiv:2603.28767},
  year={2026}
}
```
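
As a rough aid for the results format described under KnowGen Bench Evaluation, the following Python sketch checks that a results list has the fields shown in the example entry before it is handed to the evaluation script. It is not part of the Gen-Searcher repository. The function names `validate_results` and `validate_results_file` are hypothetical, and the type checks are an assumption inferred from the single example entry (for instance, that `id` is an integer and `meta` carries `category` and `difficulty`); the official scripts may accept or require something different.

```python
import json

# Field names are taken from the example entry in the dataset card.
# The expected types are an assumption inferred from that one example.
REQUIRED_FIELDS = {
    "id": int,
    "success": bool,
    "prompt": str,
    "meta": dict,
    "output_path": str,
    "gt_image": str,
}
REQUIRED_META_FIELDS = ("category", "difficulty")


def validate_results(entries):
    """Return a list of problems; an empty list means the shape looks right."""
    if not isinstance(entries, list):
        return ["top-level JSON value must be a list of result objects"]

    problems = []
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {index}: expected a JSON object")
            continue
        for field, expected_type in REQUIRED_FIELDS.items():
            if field not in entry:
                problems.append(f"entry {index}: missing field '{field}'")
            elif not isinstance(entry[field], expected_type):
                problems.append(
                    f"entry {index}: field '{field}' should be "
                    f"{expected_type.__name__}"
                )
        meta = entry.get("meta")
        if isinstance(meta, dict):
            for key in REQUIRED_META_FIELDS:
                if key not in meta:
                    problems.append(f"entry {index}: meta is missing '{key}'")
    return problems


def validate_results_file(path):
    """Load a results JSON file (hypothetical path) and validate its shape."""
    with open(path, encoding="utf-8") as handle:
        return validate_results(json.load(handle))
```

Calling `validate_results_file("results.json")` (the file name here is only a placeholder) returns a list of messages that can be printed or asserted to be empty. The check covers structure only; it does not confirm that the image paths exist or that `id` values match the benchmark.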

Provider:

GenSearcher