DataEngine-InstData

Name: DataEngine-InstData
Creator: maas
Published: 2025-09-03 16:54:58
License: No description provided

ModelScope community · Updated 2025-09-03 · Indexed 2024-06-01

Download link:

https://modelscope.cn/datasets/Shanghai_AI_Laboratory/DataEngine-InstData


Resource description:

# DataEngine-InstData

## Introduction

DataEngine-InstData is a visual question answering (VQA) dataset generated with GPT-4 from Visual Genome images. It is built with the iterative method proposed in MLLM-DataEngine, which first locates a model's weaknesses and then generates data that targets them. The aim is to produce high-quality, targeted SFT data that strengthens specific capabilities of multimodal large language models (MLLMs).

## Data Format

Each record generated by MLLM-DataEngine contains a clear, concise instruction and the corresponding answer. Each instruction-answer pair is also reformatted as a multiple-choice question. The generated data is organized in the following format:

```json
[
    {
        "instruction": "Where is the man wearing a black backpack positioned in the picture?",
        "answer": "The man wearing a black backpack is located at the left side of the image, roughly in the middle between top and bottom",
        "short_answer": "Left middle",
        "options": ["Top right", "Bottom right", "Bottom left", "Left middle"],
        "choice_answer": "D",
        "image": "vg/VG_100K_2/2404787.jpg",
        "qtype": 4
    },
    ...
]
```

- `instruction`: a clear, concise instruction
- `answer`: the direct answer to the instruction
- `short_answer`: the short answer to the instruction
- `options`: four options for the instruction, exactly one of which is correct
- `choice_answer`: the letter of the correct option
- `image`: Visual Genome image path
- `qtype`: question type in SEED-Bench, one of the following nine:

```json
{
    1: 'Scene Understanding',
    2: 'Instance Identity',
    3: 'Instance Attributes',
    4: 'Instance Location',
    5: 'Instances Counting',
    6: 'Spatial Relation',
    7: 'Instance Interaction',
    8: 'Visual Reasoning',
    9: 'Text Understanding'
}
```

# Citation

```
@misc{zhao2023mllmdataengine,
    title={MLLM-DataEngine: An Iterative Refinement Approach for MLLM},
    author={Zhiyuan Zhao and Linke Ouyang and Bin Wang and Siyuan Huang and Pan Zhang and Xiaoyi Dong and Jiaqi Wang and Conghui He},
    year={2023},
    eprint={2308.13566},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```
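As a hedged illustration of the record format described under Data Format, the following Python sketch parses one record, checks it against the documented fields, and renders it as a lettered multiple-choice prompt. The helper names (`check_record`, `format_prompt`, `correct_option`) are hypothetical and not part of any released tooling. The key name `choice_answer` follows the field list, and the mapping of letters A to D onto `options` in order is inferred from the example record, where "D" corresponds to the fourth option; both should be verified against the actual files. The inline JSON string stands in for the real data file, whose name is not stated here.

```python
import json

# Question-type ids and names, copied from the dataset card.
QTYPE_NAMES = {
    1: "Scene Understanding",
    2: "Instance Identity",
    3: "Instance Attributes",
    4: "Instance Location",
    5: "Instances Counting",
    6: "Spatial Relation",
    7: "Instance Interaction",
    8: "Visual Reasoning",
    9: "Text Understanding",
}

# Field names taken from the card's field list (assumed to match the files).
REQUIRED_KEYS = {
    "instruction", "answer", "short_answer",
    "options", "choice_answer", "image", "qtype",
}

# Assumption: option letters map onto `options` in list order.
LETTERS = "ABCD"


def check_record(record):
    """Return a list of problems found in one record (empty if none)."""
    missing = REQUIRED_KEYS - record.keys()
    if missing:
        return [f"missing keys: {sorted(missing)}"]
    problems = []
    if len(record["options"]) != 4:
        problems.append("expected exactly four options")
    if record["qtype"] not in QTYPE_NAMES:
        problems.append(f"unknown qtype: {record['qtype']}")
    letter = record["choice_answer"]
    if len(letter) != 1 or letter not in LETTERS:
        problems.append(f"unexpected choice_answer: {letter!r}")
    return problems


def format_prompt(record):
    """Render the instruction followed by lettered options, one per line."""
    lines = [record["instruction"]]
    for letter, option in zip(LETTERS, record["options"]):
        lines.append(f"{letter}. {option}")
    return "\n".join(lines)


def correct_option(record):
    """Return the option text selected by `choice_answer`."""
    return record["options"][LETTERS.index(record["choice_answer"])]


# One record in the documented layout, standing in for the real data file.
SAMPLE_JSON = """
[
    {
        "instruction": "Where is the man wearing a black backpack positioned in the picture?",
        "answer": "The man wearing a black backpack is located at the left side of the image, roughly in the middle between top and bottom",
        "short_answer": "Left middle",
        "options": ["Top right", "Bottom right", "Bottom left", "Left middle"],
        "choice_answer": "D",
        "image": "vg/VG_100K_2/2404787.jpg",
        "qtype": 4
    }
]
"""

records = json.loads(SAMPLE_JSON)
first = records[0]
print(check_record(first))            # []
print(format_prompt(first))
print(correct_option(first))          # Left middle
print(QTYPE_NAMES[first["qtype"]])    # Instance Location
```

Returning a list of problems instead of raising on the first one makes it easier to scan a whole file and report every malformed record in a single pass.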


Provider:

maas

Created:

2024-05-28
