VIGC-InstData
Source: ModelScope community (updated 2024-12-12, indexed 2024-06-01)
Download link:
https://modelscope.cn/datasets/Shanghai_AI_Laboratory/VIGC-InstData
# VIGC-InstData
## Release Notes
### 2023.9.7
First version of VIGC-InstData:
- The dataset was generated by the VIGC-7B model, which uses MiniGPT-4 pretrained weights and was fine-tuned on the LLaVA-150K dataset.
- We release three kinds of data: conversation, detail description, and complex reasoning (the same as the LLaVA-150K dataset).
- The images used to generate the dataset come from COCO and Objects365.
## Data Example
1. example in `vigc_coco_conv.json`
```
{
"image_id": 473109,
"image": "train2017/000000473109.jpg",
"instruction": "Generate a question based on the content of the given image and then answer it.",
"question": "What is the man wearing in the image?",
"answer": "The man in the image is wearing a blue shirt and a bow tie."
}
```
2. example in `vigc_coco_detail.json`
```
{
"image_id": 473109,
"image": "train2017/000000473109.jpg",
"instruction": "Generate a question to describe the image content in detail and then answer it.",
"question": "Can you describe the main features of this image for me?",
"answer": "The image features a man wearing a blue shirt and a bow tie, looking at the camera ..."
}
```
3. example in `vigc_coco_complex.json`
```
{
"image_id": 473109,
"image": "train2017/000000473109.jpg",
"instruction": "Based on the given image, generate an in-depth reasoning question and then answer it.",
"question": "What could be the reason behind the man's choice of attire?",
"answer": "The man's choice of attire, which includes a blue shirt and a bow tie, suggests that ..."
}
```
4. example in `vigc_obj365_detail.json`
```
{
"image_id": 420917,
"image": "train/patch8/objects365_v1_00420917.jpg",
"instruction": "Generate a question to describe the image content in detail and then answer it.",
"question": "Analyze the image in a comprehensive and detailed manner.",
"answer": "The image depicts an empty restaurant with long wooden tables and chairs. The tables ..."
}
```
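The four examples above share the same five fields. A minimal sketch of reading such records in Python follows. It assumes each annotation file is a single JSON array of objects like those shown (the top-level layout is not stated here), and `load_records` and `REQUIRED_FIELDS` are hypothetical names introduced only for illustration.

```python
import json
from pathlib import Path

# Fields that appear in every example record shown above.
REQUIRED_FIELDS = ("image_id", "image", "instruction", "question", "answer")


def load_records(path):
    """Load annotation records, assuming the file holds one JSON array."""
    with open(path, encoding="utf-8") as f:
        records = json.load(f)
    if not isinstance(records, list):
        raise ValueError(f"expected a JSON array in {path}")
    # Check that every record carries the five fields from the examples.
    for i, rec in enumerate(records):
        missing = [k for k in REQUIRED_FIELDS if k not in rec]
        if missing:
            raise ValueError(f"record {i} is missing fields: {missing}")
    return records


if __name__ == "__main__":
    # Hypothetical local path; adjust to wherever the file was downloaded.
    path = Path("vigc_coco_conv.json")
    if path.exists():
        records = load_records(path)
        print(len(records), "records loaded")
        print(records[0]["question"], "->", records[0]["answer"])
```

If the files turn out to use a different layout (for example, one JSON object per line), only the body of `load_records` would need to change.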
## Prepare Images
1. COCO
Download the original images from the [official website](https://cocodataset.org/#home), then arrange them as follows:
```
<path to coco>
├── train2017
│ ├── ...
│ ├── 000000473109.jpg
│ └── ...
├── val2017/
└── ...
```
2. Objects365
Download the original images from the [official website](http://www.objects365.org/overview.html), decompress them, and arrange them as follows:
```
<path to objects365>
├── train
│ ├── ...
│ ├── patch8
│ │ ├── objects365_v1_00420917.jpg
│ │ └── ...
│ └── ...
├── val
└── ...
```
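Given these layouts, the `image` field of each record appears to be a path relative to the corresponding dataset root. The sketch below resolves it under that assumption and reports records whose image file is absent. `resolve_image` and `missing_images` are hypothetical helpers, and the root directory is a placeholder.

```python
from pathlib import Path


def resolve_image(root, record):
    """Join a dataset root with the record's relative `image` path."""
    return Path(root) / record["image"]


def missing_images(root, records):
    """Return the relative paths of records whose image file is absent."""
    return [r["image"] for r in records if not resolve_image(root, r).is_file()]


if __name__ == "__main__":
    # Placeholder root; replace with the actual <path to coco>.
    coco_root = Path("/data/coco")
    record = {"image_id": 473109, "image": "train2017/000000473109.jpg"}
    print(resolve_image(coco_root, record))
    print("missing:", missing_images(coco_root, [record]))
```

Running `missing_images` once per annotation file before training is a cheap way to catch an incomplete download or a misplaced directory.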
## Citation
```
@misc{wang2024vigc,
title={VIGC: Visual Instruction Generation and Correction},
author={Bin Wang and Fan Wu and Xiao Han and Jiahui Peng and Huaping Zhong and Pan Zhang and Xiaoyi Dong and Weijia Li and Wei Li and Jiaqi Wang and Conghui He},
year={2024},
eprint={2308.12714},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
Provider: maas
Created: 2024-05-28



