Mantis-Instruct

Name: Mantis-Instruct
Creator: maas
Published: 2026-05-07 21:26:24
License: no description available

ModelScope community. Updated 2026-05-07; indexed 2024-06-08.

Download link:

https://modelscope.cn/datasets/swift/Mantis-Instruct


Resource overview:

# Mantis-Instruct

[Paper](https://arxiv.org/abs/2405.01483) | [Website](https://tiger-ai-lab.github.io/Mantis/) | [Github](https://github.com/TIGER-AI-Lab/Mantis) | [Models](https://huggingface.co/collections/TIGER-Lab/mantis-6619b0834594c878cdb1d6e4) | [Demo](https://huggingface.co/spaces/TIGER-Lab/Mantis)

## Summary

Mantis-Instruct is a fully text-image interleaved multimodal instruction tuning dataset. It contains 721K examples from 14 subsets and covers multi-image skills including co-reference, reasoning, comparison, and temporal understanding. **It has been used to train the Mantis model families.**

- Mantis-Instruct has a total of **721K instances**, drawn from **14 subsets** that together cover all of the multi-image skills.
- Of the 14 subsets, 10 come from existing datasets: for example, NLVR2 and IconQA for the reasoning skill; DreamSim and Birds-to-Words for the comparison skill; NExT-QA and STAR for temporal understanding.
- We additionally curate four new datasets: LLaVA-665k-multi and LRV-multi to cover the co-reference skill, and Contrast-Caption and Multi-VQA to broaden the reasoning skill. Multi-VQA is generated by prompting GPT-4.

![Mantis-Instruct Statistics](https://github.com/TIGER-AI-Lab/Mantis/blob/gh-pages/images/miqa_stat.png?raw=true)

## Loading dataset

- To load the dataset without automatically downloading and processing the images:

  ```python
  import datasets

  # revision is 'main' by default
  dataset = datasets.load_dataset("TIGER-Lab/Mantis-Instruct", "multi_vqa")
  # dataset['train'][0]['images']: image paths relative to the text file;
  # change them to valid paths on your local machine.
  ```

  In this case, you need to manually download the image zips for each subset from the [`script`](https://huggingface.co/datasets/TIGER-Lab/Mantis-Instruct/tree/script) branch of this repo, and prepend the directory of the extracted images to each path.

- To load the dataset and automatically download and process the images (**please run the following code with datasets==2.18.0**):

  ```python
  import datasets

  dataset = datasets.load_dataset("TIGER-Lab/Mantis-Instruct", "multi_vqa", revision="script")
  # dataset['train'][0]['images']: processed, valid absolute paths of the
  # downloaded images on your local machine.
  ```

- To load all the subsets, without automatic image downloading:

  ```python
  from datasets import get_dataset_config_names, load_dataset

  repo_id = "TIGER-Lab/Mantis-Instruct"
  config_dataset = {}
  # get_dataset_config_names requires the dataset repo id as its argument
  for config_name in get_dataset_config_names(repo_id):
      config_dataset[config_name] = load_dataset(repo_id, config_name)
  ```

- To load all the subsets, with automatic image downloading:

  ```python
  from datasets import get_dataset_config_names, load_dataset

  repo_id = "TIGER-Lab/Mantis-Instruct"
  config_dataset = {}
  # Config names are read from the same 'script' revision that is loaded
  for config_name in get_dataset_config_names(repo_id, revision="script"):
      config_dataset[config_name] = load_dataset(repo_id, config_name, revision="script")
  ```

## Citation

```
@article{Jiang2024MANTISIM,
  title={MANTIS: Interleaved Multi-Image Instruction Tuning},
  author={Dongfu Jiang and Xuan He and Huaye Zeng and Cong Wei and Max W.F. Ku and Qian Liu and Wenhu Chen},
  journal={Transactions on Machine Learning Research},
  year={2024},
  volume={2024},
  url={https://openreview.net/forum?id=skLtdUVaJa}
}
```
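For the manual-download route described under "Loading dataset", the relative image paths have to be joined onto the directory where the image zips were extracted. The following is a minimal sketch of that step, not part of the dataset's own tooling. The helper name `prepend_image_root` and the directory `/data/mantis/multi_vqa` are hypothetical, and the layout of the `images` field (plain path strings, or dicts carrying a `path` key) is an assumption, so inspect `dataset['train'][0]['images']` before relying on it.

```python
import os


def prepend_image_root(example, image_root):
    """Return a copy of `example` with each image path joined onto `image_root`.

    Assumption: each entry of example["images"] is either a relative path
    string or a dict with a "path" key. Other keys are passed through as-is.
    """
    resolved = []
    for image in example["images"]:
        if isinstance(image, dict):
            # Copy the dict so the original example is left unmodified.
            resolved.append({**image, "path": os.path.join(image_root, image["path"])})
        else:
            resolved.append(os.path.join(image_root, image))
    return {**example, "images": resolved}


# Hypothetical usage with a dataset loaded from the 'main' revision:
#   dataset = dataset.map(lambda ex: prepend_image_root(ex, "/data/mantis/multi_vqa"))
```

The helper returns a new dict instead of mutating its input, which fits the way `Dataset.map` expects a function that returns the updated example.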


Provider: maas

Created: 2024-06-05
