Leopard-Instruct
Source: ModelScope community. Updated 2025-12-04, indexed 2024-11-16.
Download link:
https://modelscope.cn/datasets/AI-ModelScope/Leopard-Instruct
Description:
# Leopard-Instruct
[Paper](https://arxiv.org/abs/2410.01744) | [Github](https://github.com/tencent-ailab/Leopard) | [Models-LLaVA](https://huggingface.co/wyu1/Leopard-LLaVA) | [Models-Idefics2](https://huggingface.co/wyu1/Leopard-Idefics2)
## Summary
Leopard-Instruct is a large instruction-tuning dataset comprising 925K instances, 739K of which are specifically designed for text-rich, multi-image scenarios. It has been used to train **Leopard-LLaVA** [\[checkpoint\]](https://huggingface.co/wyu1/Leopard-LLaVA) and **Leopard-Idefics2** [\[checkpoint\]](https://huggingface.co/wyu1/Leopard-Idefics2).
## Loading dataset
- To load the dataset without automatically downloading and processing the images (run the following code with `datasets==2.18.0`):
```python
import datasets
dataset = datasets.load_dataset("wyu1/Leopard-Instruct", "webvision")
# print(dataset['train'][0]['images'], dataset['train'][0]['texts'])
```
- To load all subsets of the dataset:
```python
from datasets import get_dataset_config_names, load_dataset

repo_id = "wyu1/Leopard-Instruct"

# Map each subset (config) name to its loaded dataset.
config_dataset = {}
for config_name in get_dataset_config_names(repo_id):
    config_dataset[config_name] = load_dataset(repo_id, config_name)
```
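Loading every subset in one loop stops at the first subset that fails to download. The following is a minimal sketch of a more forgiving variant, not part of the official Leopard tooling: it records failures instead of aborting, and warns when the installed `datasets` version differs from the one pinned above. The helper names `installed_datasets_version`, `load_subsets`, and `load_all_leopard_subsets` are hypothetical, as is the skip-and-report policy.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Callable, Dict, Iterable, Optional, Tuple

REPO_ID = "wyu1/Leopard-Instruct"
PINNED_DATASETS_VERSION = "2.18.0"  # the version this README asks for


def installed_datasets_version() -> Optional[str]:
    """Return the installed `datasets` version, or None if it is not installed."""
    try:
        return version("datasets")
    except PackageNotFoundError:
        return None


def load_subsets(
    repo_id: str,
    config_names: Iterable[str],
    loader: Callable[[str, str], object],
) -> Tuple[Dict[str, object], Dict[str, Exception]]:
    """Load each subset with `loader`, recording failures instead of stopping."""
    loaded: Dict[str, object] = {}
    failed: Dict[str, Exception] = {}
    for name in config_names:
        try:
            loaded[name] = loader(repo_id, name)
        except Exception as exc:  # assumed policy: skip the subset and report it later
            failed[name] = exc
    return loaded, failed


def load_all_leopard_subsets():
    """Real usage: needs network access and the `datasets` package."""
    # Imported lazily so the helpers above work without `datasets` installed.
    from datasets import get_dataset_config_names, load_dataset

    if installed_datasets_version() != PINNED_DATASETS_VERSION:
        print(f"Warning: this README was written for datasets=={PINNED_DATASETS_VERSION}")
    return load_subsets(REPO_ID, get_dataset_config_names(REPO_ID), load_dataset)
```

Calling `loaded, failed = load_all_leopard_subsets()` returns the subsets that loaded and a dictionary of the exceptions raised for the rest, so failed subsets can be retried individually.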
## Citation
```
@article{jia2024leopard,
title={LEOPARD: A Vision Language Model For Text-Rich Multi-Image Tasks},
author={Jia, Mengzhao and Yu, Wenhao and Ma, Kaixin and Fang, Tianqing and Zhang, Zhihan and Ouyang, Siru and Zhang, Hongming and Jiang, Meng and Yu, Dong},
journal={arXiv preprint arXiv:2410.01744},
year={2024}
}
```
Provider: maas
Created: 2024-11-06



