timbrooks/instructpix2pix-clip-filtered

Name: timbrooks/instructpix2pix-clip-filtered
Creator: timbrooks
Published: 2023-03-02 11:19:16
License: No description available

Source: Hugging Face. Updated: 2023-03-02. Indexed: 2024-03-04.

Download link:

https://hf-mirror.com/datasets/timbrooks/instructpix2pix-clip-filtered


Resource overview:

---
dataset_info:
  features:
  - name: original_prompt
    dtype: string
  - name: original_image
    dtype: image
  - name: edit_prompt
    dtype: string
  - name: edited_prompt
    dtype: string
  - name: edited_image
    dtype: image
  splits:
  - name: train
    num_bytes: 130930966429.88
    num_examples: 313010
  download_size: 63067247926
  dataset_size: 130930966429.88
language:
- en
size_categories:
- 100K<n<1M
---

# Dataset Card for InstructPix2Pix CLIP-filtered

## Table of Contents
- [Table of Contents](#table-of-contents)
- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
  - [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards)
  - [Languages](#languages)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
  - [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
  - [Curation Rationale](#curation-rationale)
  - [Source Data](#source-data)
  - [Annotations](#annotations)
  - [Personal and Sensitive Information](#personal-and-sensitive-information)
- [Considerations for Using the Data](#considerations-for-using-the-data)
  - [Social Impact of Dataset](#social-impact-of-dataset)
  - [Discussion of Biases](#discussion-of-biases)
  - [Other Known Limitations](#other-known-limitations)
- [Additional Information](#additional-information)
  - [Dataset Curators](#dataset-curators)
  - [Licensing Information](#licensing-information)
  - [Citation Information](#citation-information)
  - [Contributions](#contributions)

## Dataset Description

- **Homepage:** https://www.timothybrooks.com/instruct-pix2pix
- **Repository:** https://github.com/timothybrooks/instruct-pix2pix
- **Paper:** https://arxiv.org/abs/2211.09800

## Dataset Summary

The dataset can be used to train models to follow edit instructions. Edit instructions are available in the `edit_prompt`. `original_image` can be used with the `edit_prompt` and `edited_image` denotes the image after applying the `edit_prompt` on the `original_image`.

Refer to the [GitHub repository](https://github.com/timothybrooks/instruct-pix2pix) to know more about how this dataset can be used to train a model that can follow instructions.

### Supported Tasks and Leaderboards

[More Information Needed]

### Languages

The text descriptions are in English.

## Dataset Structure

### Data Instances

[More Information Needed]

### Data Fields

[More Information Needed]

### Data Splits

[More Information Needed]

## Dataset Creation

### Curation Rationale

[More Information Needed]

### Source Data

#### Initial Data Collection and Normalization

[More Information Needed]

#### Who are the source language producers?

[More Information Needed]

### Annotations

#### Annotation process

[More Information Needed]

#### Who are the annotators?

[More Information Needed]

### Personal and Sensitive Information

[More Information Needed]

## Considerations for Using the Data

### Social Impact of Dataset

[More Information Needed]

### Discussion of Biases

[More Information Needed]

### Other Known Limitations

[More Information Needed]

## Additional Information

### Dataset Curators

[More Information Needed]

### Licensing Information

The license for this dataset is a custom license. Refer to the licensing file to know more.

### Citation Information

[More Information Needed]

### Contributions

Thanks to [@sayakpaul](https://github.com/sayakpaul) for contributing this dataset.


Provider:

timbrooks

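Summary of original information

Dataset overview

Dataset name

InstructPix2Pix CLIP-filtered

Dataset description

This dataset is used to train models to follow edit instructions. Edit instructions are stored in the edit_prompt field, the original_image field contains the original image, and the edited_image field holds the image obtained after edit_prompt has been applied.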

Dataset structure

Data fields

original_prompt: string
original_image: image
edit_prompt: string
edited_prompt: string
edited_image: image

Data splits

train: 313010 examples, 130930966429.88 bytes in total

Dataset size

Download size: 63067247926 bytes
Total dataset size: 130930966429.88 bytes
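
For a sense of scale, the byte counts above can be converted into more readable units. The following is a small sketch that uses only the numbers listed here; the helper name `summarize_sizes` is hypothetical, and the per-example average is simply the total divided by the example count, not a statistic published by the dataset authors.

```python
# Figures copied from the dataset metadata above.
NUM_EXAMPLES = 313_010
DOWNLOAD_BYTES = 63_067_247_926
DATASET_BYTES = 130_930_966_429.88


def summarize_sizes(num_examples, download_bytes, dataset_bytes):
    """Convert raw byte counts to GB/GiB and derive simple averages."""
    gb = 1000 ** 3   # decimal gigabyte
    gib = 1024 ** 3  # binary gibibyte
    return {
        "download_gb": download_bytes / gb,
        "download_gib": download_bytes / gib,
        "dataset_gb": dataset_bytes / gb,
        "dataset_gib": dataset_bytes / gib,
        # Mean size of one example (two images plus three strings).
        "avg_bytes_per_example": dataset_bytes / num_examples,
        # Size of the download relative to the prepared dataset.
        "download_to_dataset_ratio": download_bytes / dataset_bytes,
    }


for key, value in summarize_sizes(NUM_EXAMPLES, DOWNLOAD_BYTES, DATASET_BYTES).items():
    print(f"{key}: {value:,.2f}")
```

In round numbers this is about 63 GB to download and about 131 GB once prepared, which is worth checking against available disk space before fetching the full train split.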

Language

English

Size category

100K<n<1M

License information

Custom license; see the license file for details.

Contributors

@sayakpaul

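Collected summary

Dataset introduction

Construction

In image editing and generation, high-quality instruction datasets are key to getting models to handle semantically complex edits. InstructPix2Pix CLIP-filtered pairs each original image with an edit instruction and the corresponding edited image. Its construction starts from text descriptions, uses pretrained models to generate the edit instructions and the corresponding image pairs, and then applies CLIP-based filtering so that instructions stay semantically consistent with the image content, yielding a structured set of training samples.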
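
To make the CLIP filtering step above more concrete, the sketch below scores one candidate pair from precomputed embeddings. It is an illustration under stated assumptions, not the authors' code: the function names, the choice of three checks (image-to-image similarity, image-to-caption similarity, and agreement between the direction of change in image space and in text space), and the threshold values are all illustrative. In practice the vectors would come from a CLIP image encoder and text encoder, which are not included here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def subtract(a, b):
    """Element-wise difference a - b."""
    return [x - y for x, y in zip(a, b)]


def keep_pair(img_before, img_after, txt_before, txt_after,
              min_image_image=0.75, min_image_text=0.2, min_directional=0.2):
    """Return True if a candidate edit pair passes all similarity checks.

    Threshold defaults are illustrative assumptions, not values taken
    from this page.
    """
    # The edited image should still resemble the original image.
    image_image = cosine(img_before, img_after)
    # Each image should match its own caption.
    image_text_before = cosine(img_before, txt_before)
    image_text_after = cosine(img_after, txt_after)
    # The change between images should point the same way as the
    # change between captions.
    directional = cosine(subtract(img_after, img_before),
                         subtract(txt_after, txt_before))
    return (image_image >= min_image_image
            and image_text_before >= min_image_text
            and image_text_after >= min_image_text
            and directional >= min_directional)
```

A pair is kept only when every check passes, so a single weak signal (for example, an edited image that drifted too far from the original) is enough to drop it.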

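Usage

In computer vision research, the dataset is mainly used to train generative models that edit images by following natural-language instructions. A typical setup feeds the original image and the edit instruction as input and uses the edited image as the supervision target, training end to end. Common uses include fine-tuning diffusion models or training new architectures for controllable, text-guided image modification. The data ships as a single train split, so any validation or evaluation subset has to be held out by the user.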
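
The input/target arrangement described above can be sketched as follows. This is a minimal example, assuming the Hugging Face `datasets` package (and Pillow, for image decoding) is installed and that the Hub is reachable; the helper names `to_training_triple` and `iter_training_triples` are hypothetical. Streaming is used so that records can be inspected without first downloading the full archive.

```python
def to_training_triple(record):
    """Map one dataset record to (input image, instruction, target image)."""
    return (
        record["original_image"],
        record["edit_prompt"],
        record["edited_image"],
    )


def iter_training_triples(limit=None):
    """Stream records from the Hugging Face Hub and yield training triples."""
    # Imported lazily so to_training_triple works without `datasets`.
    from datasets import load_dataset

    stream = load_dataset(
        "timbrooks/instructpix2pix-clip-filtered",
        split="train",
        streaming=True,
    )
    for index, record in enumerate(stream):
        if limit is not None and index >= limit:
            break
        yield to_training_triple(record)


# Example usage (requires network access and the `datasets` package):
#   for image, instruction, target in iter_training_triples(limit=2):
#       print(instruction, image.size, target.size)
```

The `original_prompt` and `edited_prompt` fields are ignored here because the setup described above conditions only on the image and the instruction; a different training recipe may want them as well.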

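Background and challenges

Background overview

Amid the rapid growth of generative AI, image editing is shifting from traditional pixel-level manipulation toward semantic editing driven by natural-language instructions. The InstructPix2Pix model and its dataset, proposed by Tim Brooks and colleagues in 2022, mark an important step in this area. The dataset targets a core research question: how a model can accurately understand and carry out open-ended text instructions to edit an input image in a semantically appropriate way. It is built with large pretrained models, which synthesize the instruction-image pairs, and it provides a key training resource for instruction-guided image editing and for controllable image generation more broadly.

Current challenges

The dataset addresses the core challenge of instruction-guided image editing: getting a generative model to understand and carry out complex, open-domain natural-language edit instructions accurately and consistently while keeping the image content plausible and realistic. On the construction side, the main difficulty is synthesizing and filtering high-quality training data, that is, automatically generating a large and diverse set of (original image, edit instruction, target image) triples while maintaining semantic consistency and visual quality. In addition, filtering with models such as CLIP makes data quality depend heavily on how well those models align text and image semantics.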

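Common scenarios

Classic use cases

In generative AI, InstructPix2Pix CLIP-filtered provides key support for image editing tasks. Because each record pairs an original image with an edit instruction and the resulting edited image, a model can learn how to modify an image according to a natural-language instruction. The classic use case is training instruction-driven image editing models, for example turning a summer landscape into a snowy winter scene or adding decoration in a particular style to a portrait. This kind of text-instructed editing has helped advance controllable content generation.

Academic problems addressed

The dataset helps with the coupled problem of instruction understanding and image editing in generative AI. Earlier approaches often required complex manual operations or domain-specific knowledge, whereas this dataset gives models a large set of instruction-image pairs from which to learn semantic alignment and visual transformation. This supports research on cross-modal understanding, helping models parse human intent more accurately and perform fine-grained edits, and it provides a data foundation for work on interpretability and controllability in image generation.

Practical applications

In practice, the dataset can support the development of a range of creative and design tools. In digital art, artists can adjust the style of a piece with a short text instruction; in e-commerce, product images can have their background or color changed automatically to match user needs; in education, it can be used to produce customized visual teaching material. Such applications can make content production more efficient and lower the technical barrier to professional image processing.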

Recent research on the dataset