opencsg/llava-instruct-zh-600k

Name: opencsg/llava-instruct-zh-600k
Creator: opencsg
Published: 2025-07-22 04:55:57
License: No description available

Hugging Face | Updated 2025-07-22 | Indexed 2025-08-09

Download link:

https://hf-mirror.com/datasets/opencsg/llava-instruct-zh-600k

Description:

This dataset is intended for fine-tuning Chinese vision-language models on question answering and dialogue grounded in a single image. Its images were crawled from Chinese websites and most contain Chinese characters, which makes the dataset suited to training vision-language models for Chinese-language scenarios. It covers three task types: daily conversation (247,431 samples), complex reasoning (194,646 samples), and image description (199,791 samples). Each sample is paired with a prompt that guides the generation of dialogue matching the task's requirements.
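The per-category counts quoted in the description can be tallied to see the overall size and the balance between task types. The following is a minimal sketch that uses only the three figures stated above; the dictionary keys and the helper name `category_shares` are illustrative and are not field names taken from the dataset itself.

```python
# Per-category sample counts, copied from the dataset description.
# The key names are illustrative labels, not the dataset's own identifiers.
counts = {
    "daily conversation": 247_431,
    "complex reasoning": 194_646,
    "image description": 199_791,
}


def category_shares(counts):
    """Return each category's fraction of the total sample count."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


total = sum(counts.values())
shares = category_shares(counts)

print(f"total samples: {total:,}")
for name, share in shares.items():
    print(f"{name}: {counts[name]:,} ({share:.1%})")
```

The stated counts sum to 641,868 samples, somewhat more than the "600k" in the dataset name, with daily conversation the largest of the three categories.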

Provider:

opencsg
