Leopard-Instruct

Name: Leopard-Instruct
Creator: maas
Published: 2025-12-04 16:17:52
License: not specified

ModelScope community. Updated 2025-12-04; indexed 2024-11-16.

Download link:

https://modelscope.cn/datasets/AI-ModelScope/Leopard-Instruct


Resource overview:

# Leopard-Instruct

[Paper](https://arxiv.org/abs/2410.01744) | [GitHub](https://github.com/tencent-ailab/Leopard) | [Models-LLaVA](https://huggingface.co/wyu1/Leopard-LLaVA) | [Models-Idefics2](https://huggingface.co/wyu1/Leopard-Idefics2)

## Summary

Leopard-Instruct is a large instruction-tuning dataset comprising 925K instances, 739K of which are specifically designed for text-rich, multi-image scenarios. It has been used to train **Leopard-LLaVA** [\[checkpoint\]](https://huggingface.co/wyu1/Leopard-LLaVA) and **Leopard-Idefics2** [\[checkpoint\]](https://huggingface.co/wyu1/Leopard-Idefics2).

## Loading the dataset

- To load the dataset without automatically downloading and processing the images (run the following code with `datasets==2.18.0`):

```python
import datasets

# "webvision" is the name of one subset (config) of the dataset.
dataset = datasets.load_dataset("wyu1/Leopard-Instruct", "webvision")

# print(dataset['train'][0]['images'], dataset['train'][0]['texts'])
```

- To load all the subsets:

```python
from datasets import get_dataset_config_names, load_dataset

# get_dataset_config_names requires the dataset path as its argument.
config_dataset = {}
for config_name in get_dataset_config_names("wyu1/Leopard-Instruct"):
    config_dataset[config_name] = load_dataset("wyu1/Leopard-Instruct", config_name)
```

## Citation

```
@article{jia2024leopard,
  title={LEOPARD: A Vision Language Model For Text-Rich Multi-Image Tasks},
  author={Jia, Mengzhao and Yu, Wenhao and Ma, Kaixin and Fang, Tianqing and Zhang, Zhihan and Ouyang, Siru and Zhang, Hongming and Jiang, Meng and Yu, Dong},
  journal={arXiv preprint arXiv:2410.01744},
  year={2024}
}
```
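When only some subsets are needed, or when one subset fails to download, the loop in the loading instructions above can be wrapped in a small helper. The following is a minimal sketch, not part of the Leopard release: `load_subsets` and its parameters are hypothetical names, and the listing and loading functions are passed in as arguments so the helper itself does not import `datasets`.

```python
from typing import Callable, Dict, Iterable, Optional


def load_subsets(
    repo_id: str,
    list_configs: Callable[[str], Iterable[str]],
    load_one: Callable[[str, str], object],
    only: Optional[Iterable[str]] = None,
) -> Dict[str, object]:
    """Load subsets of repo_id, skipping any subset that raises.

    Hypothetical helper: list_configs and load_one are expected to behave
    like datasets.get_dataset_config_names and datasets.load_dataset.
    """
    wanted = set(only) if only is not None else None
    loaded: Dict[str, object] = {}
    for name in list_configs(repo_id):
        # Skip subsets that were not requested.
        if wanted is not None and name not in wanted:
            continue
        try:
            loaded[name] = load_one(repo_id, name)
        except Exception as err:
            # Report the failure and continue with the remaining subsets.
            print(f"skipped {name}: {err}")
    return loaded
```

With the `datasets` package installed, a call would look like `load_subsets("wyu1/Leopard-Instruct", get_dataset_config_names, load_dataset, only=["webvision"])`. Catching every exception is a deliberately coarse choice for a sketch; a stricter version would catch only download or parsing errors.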


Provider:

maas

Created:

2024-11-06
