Sterzhang/PVIT-3M
收藏Hugging Face2024-11-02 更新2024-12-14 收录
下载链接:
https://hf-mirror.com/datasets/Sterzhang/PVIT-3M
下载链接
链接失效反馈官方服务:
资源简介:
PVIT-3M数据集是一个专门为个性化视觉指令调优任务设计的数据集,包含300万对图像-文本对。该数据集旨在提升多模态大语言模型(MLLMs)在个性化视觉输入下的响应生成能力,使其更贴合个体用户的需求和偏好。数据集中的图像被组织在40个独立的文件夹中,每个文件夹包含不同类型的图像数据。JSON文件结构详细记录了每个对话实例的图像路径和对话内容。
The PVIT-3M dataset is specifically designed for tuning Multi-modal Large Language Models (MLLMs) in the context of personalized visual instruction tasks. This dataset consists of 3 million image-text pairs, organized into 40 folders, and includes structured data in a JSON file format. The JSON structure includes fields for image paths, conversations, and types. The dataset aims to enhance the ability of MLLMs to generate responses based on personalized visual inputs, making them more adaptable to individual user needs.
提供机构:
Sterzhang



