TainU/IV-Edit
Source: Hugging Face. Updated 2025-12-19; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/TainU/IV-Edit
Description:
This repository contains the **IV-Edit (Instruction-Visual Editing)** Benchmark and the training data used for the **RePlan** framework. The dataset is designed to address the challenge of **Instruction-Visual Complexity (IV-Complexity)** in instruction-based image editing, where intricate instructions interact with cluttered or ambiguous visual scenes. While existing datasets often feature salient objects and direct commands, IV-Edit emphasizes cluttered scenes and instructions that require fine-grained visual understanding, complex reasoning, and precise region-level control. The dataset is organized into three splits: `test` (the official IV-Edit Benchmark, consisting of ~800 manually verified instruction-image pairs focusing on diverse, complex scenes), `train` (the training data used to fine-tune the RePlan VLM planner), and `dev` (a validation set for model development and hyperparameter tuning). These samples are derived from open-source datasets (such as COCO, LISA, and TextAtlas) and filtered to meet IV-Complexity standards.
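The three splits described above could be loaded along the following lines. This is a minimal sketch, not an official loader: it assumes the repository can be read by the Hugging Face `datasets` library's `load_dataset` and that the split names match the ones in the description. The helper `load_iv_edit` and its `mirror` parameter are hypothetical names introduced here for illustration, and the record fields are not documented in this listing, so the sketch does not assume any.

```python
import os
from typing import Optional

REPO_ID = "TainU/IV-Edit"           # repository name taken from this listing
SPLITS = ("test", "train", "dev")   # split names as given in the description


def load_iv_edit(split: str, mirror: Optional[str] = None):
    """Load one split of the dataset (hypothetical helper, see assumptions above)."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    if mirror:
        # HF_ENDPOINT only takes effect if it is set before huggingface_hub
        # is first imported in the current process.
        os.environ["HF_ENDPOINT"] = mirror
    # Imported lazily so the split check above works without the library installed.
    from datasets import load_dataset
    return load_dataset(REPO_ID, split=split)
```

For example, `load_iv_edit("test", mirror="https://hf-mirror.com")` would request the benchmark split through the mirror host used in the download link, assuming that mirror serves this repository.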
Provider:
TainU



