Salesforce/ProVision-10M
Hugging Face | Updated 2025-02-03 | Indexed 2024-12-14
Download link:
https://hf-mirror.com/datasets/Salesforce/ProVision-10M
Resource summary:
ProVision-10M is a programmatically generated instruction dataset for training multimodal language models. Its instruction data is synthesized from scene graphs by data generators (Python programs) rather than by proprietary models. It covers both single-image and multi-image instruction data, in short-answer and multiple-choice formats, and is divided into splits that each have a specific number of examples and one of these two formats. The underlying images come from Visual Genome and DataComp; they are not included in the dataset and must be downloaded from their original sources. The dataset is intended to demonstrate the potential of programmatically synthesized instruction data for training multimodal language models. It is released under a CC-BY-NC-4.0 license and is intended for research purposes only.
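To make the idea of synthesizing instruction data from scene graphs concrete, here is a minimal toy sketch in Python. It is not the generator code used to build ProVision-10M: the scene-graph layout, the function names (`short_answer_items`, `multiple_choice_item`), and the question templates are all assumptions made up for illustration. It only shows how a program can turn a scene graph into short-answer and multiple-choice items without calling any model.

```python
import random
import string

# Toy scene graph. This structure is hypothetical, not the dataset's schema.
SCENE_GRAPH = {
    "objects": {
        "o1": {"name": "dog", "attributes": {"color": "brown"}},
        "o2": {"name": "sofa", "attributes": {"color": "red"}},
    },
    # (subject id, predicate, object id)
    "relations": [("o1", "sitting on", "o2")],
}


def short_answer_items(graph):
    """Create one short-answer question per object attribute."""
    items = []
    for obj in graph["objects"].values():
        for attr_name, attr_value in obj["attributes"].items():
            items.append({
                "question": f"What is the {attr_name} of the {obj['name']}?",
                "answer": attr_value,
            })
    return items


def multiple_choice_item(graph, relation, distractors, rng):
    """Create a multiple-choice question from one relation triple.

    `distractors` are wrong option names supplied by the caller;
    `rng` is a random.Random instance used to shuffle the options.
    """
    subj_id, predicate, obj_id = relation
    subject = graph["objects"][subj_id]["name"]
    answer = graph["objects"][obj_id]["name"]

    options = [answer] + list(distractors)
    rng.shuffle(options)

    letters = string.ascii_uppercase
    option_lines = [f"{letters[i]}. {opt}" for i, opt in enumerate(options)]
    question = f"What is the {subject} {predicate}?\n" + "\n".join(option_lines)
    return {"question": question, "answer": letters[options.index(answer)]}


if __name__ == "__main__":
    for item in short_answer_items(SCENE_GRAPH):
        print(item)
    mc = multiple_choice_item(
        SCENE_GRAPH,
        SCENE_GRAPH["relations"][0],
        distractors=["table", "bed", "car"],
        rng=random.Random(0),
    )
    print(mc["question"])
    print("Answer:", mc["answer"])
```

Because every question and answer is derived directly from the graph, the answers are correct by construction, which is the property that makes this style of generation attractive as an alternative to querying a proprietary model.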
Provided by:
Salesforce



