llava-instruct-v1_5-en-subset-358k
ModelScope community. Updated 2025-11-27; indexed 2025-11-29.
Download link:
https://modelscope.cn/datasets/llm-jp/llava-instruct-v1_5-en-subset-358k
Description:
## Dataset Card for llava-instruct-v1_5-en-subset-358k
### Dataset details
This dataset is a subset of the [LLaVA-1.5 Instruction Data](https://huggingface.co/datasets/liuhaotian/LLaVA-Instruct-150K/blob/main/llava_v1_5_mix665k.json). The subset was used to train [llm-jp-3-vila-14b](https://huggingface.co/llm-jp/llm-jp-3-vila-14b).
It draws on the following source datasets, with approximate image counts for each:
| Dataset | Images |
|:---|---:|
|LLaVA | 158K |
|[VQAv2](https://visualqa.org/) | 53K |
|[GQA](https://cs.stanford.edu/people/dorarad/gqa/index.html) | 46K |
|[OCRVQA](https://ocr-vqa.github.io/) | 80K |
|[TextVQA](https://textvqa.org/dataset/) | 22K |
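
As a quick sanity check on the table above, the following minimal sketch totals the per-source image counts and prints each source's share. The dictionary values are copied from the table; the names `counts_k` and `summarize` are illustrative and not part of any dataset tooling.

```python
# Approximate image counts in thousands, copied from the table above.
counts_k = {"LLaVA": 158, "VQAv2": 53, "GQA": 46, "OCRVQA": 80, "TextVQA": 22}


def summarize(counts):
    """Return the total count and each source's fraction of that total."""
    total = sum(counts.values())
    shares = {name: n / total for name, n in counts.items()}
    return total, shares


total_k, shares = summarize(counts_k)
print(f"total: about {total_k}K images")
# List sources from largest to smallest share.
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:8s} {share:6.1%}")
```

The rounded rows sum to 359K, one thousand more than the 358k in the dataset name. The card does not explain the gap; it is presumably an effect of rounding each row to the nearest thousand.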
### License
Creative Commons Attribution 4.0 License. Use of the dataset should also abide by [the OpenAI terms of use](https://openai.com/policies/terms-of-use).
Provider: maas
Created: 2025-11-25



