TainU/IV-Edit

Source: Hugging Face. Updated 2025-12-19; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/TainU/IV-Edit
Description:
This repository contains the **IV-Edit (Instruction-Visual Editing)** Benchmark and the training data used for the **RePlan** framework. The dataset is designed to address the challenge of **Instruction-Visual Complexity (IV-Complexity)** in instruction-based image editing, where intricate instructions interact with cluttered or ambiguous visual scenes. While existing datasets often feature salient objects and direct commands, IV-Edit emphasizes cluttered scenes and instructions that require fine-grained visual understanding, complex reasoning, and precise region-level control. The dataset is organized into three splits: `test` (the official IV-Edit Benchmark, consisting of ~800 manually verified instruction-image pairs focusing on diverse, complex scenes), `train` (the training data used to fine-tune the RePlan VLM planner), and `dev` (a validation set for model development and hyperparameter tuning). These samples are derived from open-source datasets (such as COCO, LISA, and TextAtlas) and filtered to meet IV-Complexity standards.
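The three splits described above could be loaded with the Hugging Face `datasets` library roughly as follows. This is a minimal sketch, not an official loader: it assumes the repository can be read directly by `load_dataset` under the split names `test`, `train`, and `dev`, and the helper name `load_iv_edit` is hypothetical. The column names of each split are not stated in this listing, so the sketch does not rely on any.

```python
from typing import Any

REPO_ID = "TainU/IV-Edit"           # repository name taken from the listing
SPLITS = ("test", "train", "dev")   # split names as given in the description


def load_iv_edit(split: str = "test") -> Any:
    """Load one split of IV-Edit (hypothetical helper, not part of the dataset).

    Assumes the repository is loadable by `datasets.load_dataset` with the
    split names listed in SPLITS.
    """
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so that the split check works even when the
    # `datasets` package is not installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)
```

Calling `load_iv_edit("test")` would then fetch the benchmark split, provided network access and an installed `datasets` package; printing the returned object is a quick way to inspect its actual columns before writing evaluation code against them.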
Provided by:
TainU