UniMM-Chat

Name: UniMM-Chat
Creator: maas
Published: 2025-11-20 15:09:34
License: No description available

ModelScope community. Updated 2025-11-20; indexed 2024-06-08.

Download link:

https://modelscope.cn/datasets/thomas/UniMM-Chat

Description:

# Dataset Card for UniMM-Chat

## Dataset Summary

The UniMM-Chat dataset is an **open-source, knowledge-intensive, multi-round multimodal dialogue dataset** powered by GPT-3.5, which consists of **1.1M diverse instructions**.

UniMM-Chat leverages **complementary annotations from different VL datasets** and employs GPT-3.5 to generate multi-turn dialogues corresponding to each image, resulting in **117,238 dialogues**, with an average of **9.89 turns per dialogue**.

<img src="https://cdn-uploads.huggingface.co/production/uploads/6566e0c493e30c8a60048eb3/HQlP6gRsIq9E2czvmunca.png" alt="fig1" width="60%"/>

**A diverse set of instructions**:

<img src="https://cdn-uploads.huggingface.co/production/uploads/6566e0c493e30c8a60048eb3/8gmR9FWnCjDIs8IQ7ZxpU.png" alt="fig1" width="30%"/>

**Resulting in superior performance in image understanding and reasoning**:

<img src="https://cdn-uploads.huggingface.co/production/uploads/6566e0c493e30c8a60048eb3/YZceD395gErU7FiVVBljE.png" alt="fig1" width="40%"/>

## Related Sources

- Paper: https://arxiv.org/abs/2310.00653
- Models Trained on UniMM-Chat: 🥞[Muffin](https://github.com/thunlp/muffin), 🏆[RLHF-V](https://rlhf-v.github.io)

## Usage

```python
from datasets import load_dataset

# Download the dataset from the Hugging Face Hub (cached locally after the first call).
data = load_dataset("Yirany/UniMM-Chat")
```

## Citation

```
@article{yu2023reformulating,
  title={Reformulating vision-language foundation models and datasets towards universal multimodal assistants},
  author={Yu, Tianyu and Hu, Jinyi and Yao, Yuan and Zhang, Haoye and Zhao, Yue and Wang, Chongyi and Wang, Shan and Pan, Yinxv and Xue, Jiao and Li, Dahai and others},
  journal={arXiv preprint arXiv:2310.00653},
  year={2023}
}
```
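The headline figures in the dataset summary above can be cross-checked with a little arithmetic. The sketch below multiplies the reported dialogue count by the reported average turns per dialogue. It assumes each turn contributes one instruction, which the card does not state explicitly. The helper names `estimate_total_turns` and `average_turns` are hypothetical, and `average_turns` assumes each dialogue is available as a list of turn records, which may not match the dataset's actual schema.

```python
from statistics import mean

# Figures quoted in the dataset card.
NUM_DIALOGUES = 117_238
AVG_TURNS_PER_DIALOGUE = 9.89


def estimate_total_turns(num_dialogues: int, avg_turns: float) -> float:
    """Estimate the total number of turns from a dialogue count and a mean."""
    return num_dialogues * avg_turns


def average_turns(dialogues: list[list[dict]]) -> float:
    """Mean number of turns per dialogue.

    Assumption: each dialogue is a list of turn records. The record layout
    is hypothetical; only the length of each list is used here.
    """
    return mean(len(turns) for turns in dialogues)


if __name__ == "__main__":
    total = estimate_total_turns(NUM_DIALOGUES, AVG_TURNS_PER_DIALOGUE)
    print(f"Estimated turns: {total / 1e6:.2f}M")
```

Under that one-instruction-per-turn assumption, the product comes to roughly 1.16M, which is in line with the "1.1M diverse instructions" quoted in the summary. Once the real records are loaded, `average_turns` could be applied to them to compare against the reported 9.89, after adapting it to the actual field layout.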

Provider: maas

Created: 2024-05-26
