
DataEngine-InstData

Source: ModelScope community. Updated 2025-09-03. Indexed 2024-06-01.
Download link:
https://modelscope.cn/datasets/Shanghai_AI_Laboratory/DataEngine-InstData
Description:
# DataEngine-InstData

## Introduction

DataEngine-InstData is a visual question answering (VQA) dataset generated by GPT-4 from Visual Genome images. It is built with the iterative process proposed in MLLM-DataEngine, which first locates a model's weaknesses and then generates data targeted at them. The aim is to produce high-quality SFT data that strengthens specific capabilities of MLLMs.

## Data Format

Each record generated by MLLM-DataEngine contains a clear, concise instruction and the corresponding answer. Each instruction-answer pair is also recast as a multiple-choice question. The generated data is organized in the following format:

```json
[
  {
    "instruction": "Where is the man wearing a black backpack positioned in the picture?",
    "answer": "The man wearing a black backpack is located at the left side of the image, roughly in the middle between top and bottom",
    "short_answer": "Left middle",
    "options": ["Top right", "Bottom right", "Bottom left", "Left middle"],
    "choice_answer": "D",
    "image": "vg/VG_100K_2/2404787.jpg",
    "qtype": 4
  },
  ...
]
```

- `instruction`: a clear, concise instruction
- `answer`: the direct answer to the instruction
- `short_answer`: a short answer to the instruction
- `options`: four options for the instruction, exactly one of which is correct
- `choice_answer`: the letter of the correct option
- `image`: Visual Genome image path
- `qtype`: question type, one of the nine SEED-Bench categories listed below

```python
{
    1: 'Scene Understanding',
    2: 'Instance Identity',
    3: 'Instance Attributes',
    4: 'Instance Location',
    5: 'Instances Counting',
    6: 'Spatial Relation',
    7: 'Instance Interaction',
    8: 'Visual Reasoning',
    9: 'Text Understanding'
}
```

# Citation

```
@misc{zhao2023mllmdataengine,
  title={MLLM-DataEngine: An Iterative Refinement Approach for MLLM},
  author={Zhiyuan Zhao and Linke Ouyang and Bin Wang and Siyuan Huang and Pan Zhang and Xiaoyi Dong and Jiaqi Wang and Conghui He},
  year={2023},
  eprint={2308.13566},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
```
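The record layout described under Data Format can be consumed roughly as in the sketch below. It is a minimal illustration, not code from the dataset authors: the helper names `check_record`, `format_prompt`, and `correct_option` are hypothetical, the sample is a trimmed copy of the card's example record, and the answer key is assumed to be spelled `choice_answer` as in the field list (verify the spelling against the actual files).

```python
import json

# Question-type ids as listed in the dataset card.
QTYPES = {
    1: "Scene Understanding",
    2: "Instance Identity",
    3: "Instance Attributes",
    4: "Instance Location",
    5: "Instances Counting",
    6: "Spatial Relation",
    7: "Instance Interaction",
    8: "Visual Reasoning",
    9: "Text Understanding",
}

# Assumption: the key is spelled as in the card's field list.
ANSWER_KEY = "choice_answer"
LETTERS = "ABCD"


def check_record(rec):
    """Raise ValueError if the multiple-choice fields are inconsistent."""
    if len(rec["options"]) != len(LETTERS):
        raise ValueError("expected exactly four options")
    if rec[ANSWER_KEY] not in tuple(LETTERS):
        raise ValueError("answer must be one of A, B, C, D")
    if rec["qtype"] not in QTYPES:
        raise ValueError("unknown question type")


def format_prompt(rec):
    """Render the instruction followed by lettered options, one per line."""
    lines = [rec["instruction"]]
    lines += [f"{letter}. {option}" for letter, option in zip(LETTERS, rec["options"])]
    return "\n".join(lines)


def correct_option(rec):
    """Return the text of the option that the answer letter points at."""
    return rec["options"][LETTERS.index(rec[ANSWER_KEY])]


# Trimmed copy of the example record from the card (hypothetical in-memory input).
raw = """
[
  {
    "instruction": "Where is the man wearing a black backpack positioned in the picture?",
    "options": ["Top right", "Bottom right", "Bottom left", "Left middle"],
    "choice_answer": "D",
    "image": "vg/VG_100K_2/2404787.jpg",
    "qtype": 4
  }
]
"""

records = json.loads(raw)
for rec in records:
    check_record(rec)
    print(QTYPES[rec["qtype"]])
    print(format_prompt(rec))
    print("Answer:", rec[ANSWER_KEY], "=", correct_option(rec))
```

Checking each record before use catches mismatches between the answer letter and the option list early, which matters for generated data where a field may be malformed.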

Provider:
maas
Created:
2024-05-28