X2I-mm-instruction
ModelScope community | Updated 2025-12-04 | Indexed 2025-04-05
Download link:
https://modelscope.cn/datasets/AI-ModelScope/X2I-mm-instruction
Resource description:
# X2I Dataset
* Project Page: [https://vectorspacelab.github.io/OmniGen/](https://vectorspacelab.github.io/OmniGen/)
* Github: [https://github.com/VectorSpaceLab/OmniGen](https://github.com/VectorSpaceLab/OmniGen)
* Paper: [https://arxiv.org/abs/2409.11340](https://arxiv.org/abs/2409.11340)
* Model: [https://huggingface.co/Shitao/OmniGen-v1](https://huggingface.co/Shitao/OmniGen-v1)
To achieve robust multi-task processing capabilities, **OmniGen** must be trained on large-scale, diverse datasets. However, no readily available dataset exists yet for unified image generation. We have therefore curated, for the **first time**, a large-scale **unified image generation** dataset in a unified format, which we call the **X2I dataset**, meaning **"anything to image"**.
| Task | Dataset |
| :-------- | :-------- |
| Multi-modal Instruction| [X2I-mm-instruction](https://huggingface.co/datasets/yzwang/X2I-mm-instruction) |
| Subject-driven Editing | [X2I-subject-driven](https://huggingface.co/datasets/yzwang/X2I-subject-driven) |
| In-context Learning | [X2I-in-context-learning](https://huggingface.co/datasets/yzwang/X2I-in-context-learning) |
| Computer Vision | [X2I-computer-vision](https://huggingface.co/datasets/yzwang/X2I-computer-vision) |
| Text to Image Generation| [X2I-text-to-image](https://huggingface.co/datasets/yzwang/X2I-text-to-image) |
## X2I-mm-instruction
- **FashionTryOn**
A fashion virtual try-on dataset with 41,004 samples.
```bash
# meta file: fashiontryon.jsonl
cd fashiontryon
tar -xzvf fashiontryon.tar.gz
```
- **HR-VITON**
A fashion virtual try-on dataset with 13,679 samples.
```bash
# meta file: hr-viton.jsonl
cd hr-viton
tar -xzvf hr-viton.tar.gz
```
- **MagicBrush**
An image editing dataset with 8,807 samples.
```bash
# meta file: magicbrush.jsonl
cd magicbrush
tar -xzvf magicbrush.tar.gz
```
- **InstructPix2Pix**
An image editing dataset with 1,000,032 samples.
```bash
# meta file: pix2pix.jsonl
cd pix2pix
# the archive is split into parts; concatenate them and extract from the stream
cat images.tar.gz.* | tar -xzvf -
```
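For environments without `cat` and `tar` (for example, a plain Python setup on Windows), the same "concatenate the parts, then extract" step can be sketched in Python. This is a minimal sketch, not part of the dataset's tooling: `ChainedReader` and `extract_split_archive` are hypothetical helper names, and the code assumes the part suffixes sort lexicographically in the correct order, as the shell glob in the command above does.

```python
import glob
import os
import tarfile


class ChainedReader:
    """Expose several files, read back to back, as one binary stream."""

    def __init__(self, paths):
        self._paths = list(paths)
        self._index = 0
        self._current = None

    def read(self, size=-1):
        if size == 0:
            return b""
        chunks = []
        remaining = size
        while self._index < len(self._paths):
            if self._current is None:
                self._current = open(self._paths[self._index], "rb")
            data = self._current.read(remaining)
            if data:
                chunks.append(data)
                if size > 0:
                    remaining -= len(data)
                    if remaining == 0:
                        break
            else:
                # Current part is exhausted: move on to the next one.
                self._current.close()
                self._current = None
                self._index += 1
        return b"".join(chunks)

    def close(self):
        if self._current is not None:
            self._current.close()
            self._current = None


def extract_split_archive(pattern, dest):
    """Extract a gzip-compressed tar archive stored as split parts.

    `pattern` is a glob such as "images.tar.gz.*"; parts are joined in
    sorted order (assumption: suffixes sort in the right order).
    """
    parts = sorted(glob.glob(pattern))
    if not parts:
        raise FileNotFoundError(f"no archive parts match {pattern!r}")
    os.makedirs(dest, exist_ok=True)
    stream = ChainedReader(parts)
    try:
        # "r|gz" reads the archive as a forward-only gzip stream,
        # which is all a chained reader can support.
        with tarfile.open(fileobj=stream, mode="r|gz") as archive:
            archive.extractall(dest)
    finally:
        stream.close()
    return parts
```

Under these assumptions, calling `extract_split_archive("pix2pix/images.tar.gz.*", "pix2pix")` from the dataset root should be roughly equivalent to the shell command above. Only extract archives from sources you trust, since `extractall` writes whatever paths the archive contains.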
- **SomethingSomethingv2**
A human actions dataset with 168,913 samples.
```bash
# meta file: ssv2.jsonl
cd ssv2
tar -xzvf ssv2.tar.gz
```
- **StyleBooth**
A style transfer dataset with 11,325 and 14,766 samples in its two meta files.
```bash
# meta files: stylebooth-1.jsonl and stylebooth-2.jsonl
cd stylebooth
tar -xzvf stylebooth.tar.gz
```
- [MultiGen](https://github.com/salesforce/UniControl)
- [SeedEdit-Openimages](https://huggingface.co/datasets/AILab-CVC/SEED-Data-Edit-Part1-Openimages)
- [SeedEdit-Unsplash](https://huggingface.co/datasets/AILab-CVC/SEED-Data-Edit-Part1-Unsplash)
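
After extraction, a quick sanity check is to count the records in each meta file and compare them with the sample counts listed above. The sketch below assumes each `.jsonl` meta file holds one JSON object per line and that the listed sample counts correspond to those records; neither is confirmed by this README. The record fields are not documented here, so the code does not rely on any key names. `iter_records` and `check_count` are hypothetical helper names.

```python
import json
from pathlib import Path

# Sample counts as listed in this README, keyed by meta file name.
EXPECTED_COUNTS = {
    "fashiontryon.jsonl": 41_004,
    "hr-viton.jsonl": 13_679,
    "magicbrush.jsonl": 8_807,
    "pix2pix.jsonl": 1_000_032,
    "ssv2.jsonl": 168_913,
    "stylebooth-1.jsonl": 11_325,
    "stylebooth-2.jsonl": 14_766,
}


def iter_records(path):
    """Yield one parsed JSON value per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def check_count(path):
    """Return (found, expected) record counts for a known meta file."""
    expected = EXPECTED_COUNTS[Path(path).name]
    found = sum(1 for _ in iter_records(path))
    return found, expected
```

For example, `check_count("magicbrush.jsonl")` returns a pair whose two values should match if the download and the assumptions above hold. Printing `next(iter_records(path))` is a simple way to inspect the actual record layout.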
Provider: maas
Created: 2025-04-02



