
Mantis-Instruct

ModelScope community · Updated 2026-05-07 · Indexed 2024-06-08
Download link:
https://modelscope.cn/datasets/swift/Mantis-Instruct
# Mantis-Instruct

[Paper](https://arxiv.org/abs/2405.01483) | [Website](https://tiger-ai-lab.github.io/Mantis/) | [Github](https://github.com/TIGER-AI-Lab/Mantis) | [Models](https://huggingface.co/collections/TIGER-Lab/mantis-6619b0834594c878cdb1d6e4) | [Demo](https://huggingface.co/spaces/TIGER-Lab/Mantis)

## Summary

Mantis-Instruct is a fully text-image interleaved multimodal instruction tuning dataset. It contains 721K examples from 14 subsets and covers multi-image skills including co-reference, reasoning, comparison, and temporal understanding. **It has been used to train the Mantis model family.**

- Mantis-Instruct has a total of **721K instances**, consisting of **14 subsets** that cover all the multi-image skills.
- Among the 14 subsets, 10 come from existing datasets: for example, NLVR2, IconQA, etc. for the reasoning skill; DreamSim, Birds-to-Words, etc. for the comparison skill; NExT-QA and STAR for temporal understanding.
- We additionally curate four new datasets: LLaVA-665k-multi and LRV-multi to cover the co-reference skill, and Contrast-Caption and Multi-VQA to broaden the reasoning skill. Multi-VQA is generated by prompting GPT-4.

![Mantis-Instruct Statistics](https://github.com/TIGER-AI-Lab/Mantis/blob/gh-pages/images/miqa_stat.png?raw=true)

## Loading dataset

- To load the dataset without automatically downloading and processing the images:

```python
import datasets

# revision is 'main' by default
dataset = datasets.load_dataset("TIGER-Lab/Mantis-Instruct", "multi_vqa")
# dataset['train'][0]['images']: image paths relative to the text file;
# change them to valid paths on your local machine.
```

In this case, you need to manually download the image zips for each subset from the [`script`](https://huggingface.co/datasets/TIGER-Lab/Mantis-Instruct/tree/script) branch of this repo, and prepend the image directory to the paths.

- To load the dataset and automatically download and process the images (**please run the following code with datasets==2.18.0**):

```python
import datasets

dataset = datasets.load_dataset("TIGER-Lab/Mantis-Instruct", "multi_vqa", revision="script")
# dataset['train'][0]['images']: processed absolute paths of the downloaded images on your local machine
```

- To load all the subsets:

```python
from datasets import get_dataset_config_names, load_dataset

config_dataset = {}
# get_dataset_config_names requires the dataset path as its first argument
for config_name in get_dataset_config_names("TIGER-Lab/Mantis-Instruct"):
    config_dataset[config_name] = load_dataset("TIGER-Lab/Mantis-Instruct", config_name)
```

- To load all the subsets and automatically download the images:

```python
from datasets import get_dataset_config_names, load_dataset

config_dataset = {}
# get_dataset_config_names requires the dataset path as its first argument
for config_name in get_dataset_config_names("TIGER-Lab/Mantis-Instruct"):
    config_dataset[config_name] = load_dataset("TIGER-Lab/Mantis-Instruct", config_name, revision="script")
```

## Citation

```
@article{Jiang2024MANTISIM,
  title={MANTIS: Interleaved Multi-Image Instruction Tuning},
  author={Dongfu Jiang and Xuan He and Huaye Zeng and Cong Wei and Max W.F. Ku and Qian Liu and Wenhu Chen},
  journal={Transactions on Machine Learning Research},
  year={2024},
  volume={2024},
  url={https://openreview.net/forum?id=skLtdUVaJa}
}
```
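
## Resolving image paths after a manual download (sketch)

When the dataset is loaded from the default `main` revision, the loading instructions say the image paths are relative and that the image directory has to be prepended by hand. The following is a minimal sketch of that step, not the authors' code. It assumes the relative paths are available as plain strings and that the zips were extracted under a single local directory; the function name `resolve_image_paths` and the directory `/data/mantis/multi_vqa` are hypothetical.

```python
from pathlib import Path


def resolve_image_paths(relative_paths, image_root):
    """Prepend image_root to each relative path and return absolute path strings.

    Hypothetical helper: `relative_paths` is assumed to be a list of strings
    and `image_root` the directory where the image zips were extracted.
    """
    root = Path(image_root).expanduser()
    return [str((root / rel).resolve()) for rel in relative_paths]


# Example with made-up relative paths (not taken from the dataset).
paths = resolve_image_paths(
    ["images/example_0.jpg", "images/example_1.jpg"],
    "/data/mantis/multi_vqa",
)
```

If the entries of `images` turn out to be records rather than plain strings, extract the path field first and pass the resulting list to the helper.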

Provider:
maas
Created:
2024-06-05