ViP-LLaVA-Instruct
Source: ModelScope community. Updated 2025-11-26; indexed 2024-05-15.
Download link:
https://modelscope.cn/datasets/thomas/ViP-LLaVA-Instruct
Resource description:
# ViP-LLaVA Instruct Dataset Card
## Dataset details
**Dataset type:**
ViP-LLaVA Instruct is a mixture of LLaVA-1.5 instruction data and region-level visual prompting data.
It is constructed for visual instruction tuning and for building large multimodal models with GPT-4-level regional understanding capability.
Specifically, we use 1.2M samples for stage 2 finetuning and 26K samples for the optional stage 3 finetuning.
**Dataset date:**
ViP-LLaVA Instruct was collected in November 2023, using a mixture of academic datasets and GPT-4/GPT-4V-instructed data.
**Paper or resources for more information:**
https://vip-llava.github.io/
**License:**
Apache-2.0. Use of the dataset must also comply with OpenAI's terms of use: https://openai.com/policies/terms-of-use
**Where to send questions or comments about the model:**
https://github.com/mu-cai/ViP-LLaVA/issues
## Intended use
**Primary intended uses:**
The primary use of ViP-LLaVA is research on large multimodal models and chatbots.
**Primary intended users:**
The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.
Provider:
maas
Created:
2024-02-27



