
Leopard-Instruct

ModelScope Community | Updated: 2025-12-04 | Indexed: 2024-11-16
Download link:
https://modelscope.cn/datasets/AI-ModelScope/Leopard-Instruct
Resource description:
# Leopard-Instruct

[Paper](https://arxiv.org/abs/2410.01744) | [Github](https://github.com/tencent-ailab/Leopard) | [Models-LLaVA](https://huggingface.co/wyu1/Leopard-LLaVA) | [Models-Idefics2](https://huggingface.co/wyu1/Leopard-Idefics2)

## Summary

Leopard-Instruct is a large instruction-tuning dataset comprising 925K instances, of which 739K are specifically designed for text-rich, multi-image scenarios. It has been used to train **Leopard-LLaVA** [\[checkpoint\]](https://huggingface.co/wyu1/Leopard-LLaVA) and **Leopard-Idefics2** [\[checkpoint\]](https://huggingface.co/wyu1/Leopard-Idefics2).

## Loading the dataset

- To load the dataset without automatically downloading and processing the images (please run the following code with `datasets==2.18.0`):

```python
import datasets

# Load a single subset ("webvision") of the dataset.
dataset = datasets.load_dataset("wyu1/Leopard-Instruct", "webvision")

# print(dataset['train'][0]['images'], dataset['train'][0]['texts'])
```

- To load all the subsets:

```python
from datasets import get_dataset_config_names, load_dataset

# get_dataset_config_names requires the dataset repository ID.
config_dataset = {}
for config_name in get_dataset_config_names("wyu1/Leopard-Instruct"):
    config_dataset[config_name] = load_dataset("wyu1/Leopard-Instruct", config_name)
```

## Citation

```
@article{jia2024leopard,
  title={LEOPARD: A Vision Language Model For Text-Rich Multi-Image Tasks},
  author={Jia, Mengzhao and Yu, Wenhao and Ma, Kaixin and Fang, Tianqing and Zhang, Zhihan and Ouyang, Siru and Zhang, Hongming and Jiang, Meng and Yu, Dong},
  journal={arXiv preprint arXiv:2410.01744},
  year={2024}
}
```
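The subset-loading loop in the README can be wrapped in a small helper so that only the subsets you need are fetched. The following is a minimal sketch, not part of the official repository: the function name `load_subsets` and its `loader`/`lister` parameters are hypothetical, introduced here so the helper can be tried with stand-in functions before downloading anything. By default it falls back to `datasets.load_dataset` and `datasets.get_dataset_config_names`.

```python
from typing import Callable, Dict, Iterable, List, Optional

# Repository ID as given in the README.
REPO_ID = "wyu1/Leopard-Instruct"


def load_subsets(
    config_names: Optional[Iterable[str]] = None,
    loader: Optional[Callable[[str, str], object]] = None,
    lister: Optional[Callable[[str], List[str]]] = None,
) -> Dict[str, object]:
    """Load the named subsets (or all subsets) into a dict keyed by config name.

    `loader` and `lister` are hypothetical injection points for this sketch;
    when omitted, the corresponding functions from `datasets` are used.
    """
    need_library = loader is None or (config_names is None and lister is None)
    if need_library:
        # Imported lazily so the helper can be exercised without the library.
        import datasets

        loader = loader or datasets.load_dataset
        lister = lister or datasets.get_dataset_config_names

    if config_names is None:
        # No explicit selection: ask the hub for every available subset name.
        config_names = lister(REPO_ID)

    return {name: loader(REPO_ID, name) for name in config_names}
```

With the `datasets` library installed (the README pins `datasets==2.18.0`), `load_subsets(["webvision"])` would load only the `webvision` subset, while `load_subsets()` would iterate over every configuration, which may take considerable time and disk space.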

Provider:
maas
Created:
2024-11-06