MMD

Name: MMD
Creator: maas
Published: 2025-08-23 09:41:07
License: No description provided

Source: ModelScope community. Updated 2025-08-23; indexed 2025-03-08.

Download link:

https://modelscope.cn/datasets/RealmSky/MMD


Description:

### 📢 News

- [06/13/2024] 🚀 We release our MMDU benchmark and MMDU-45k instruct tuning data to Hugging Face.

### 💎 MMDU Benchmark

To evaluate the multi-image multi-turn dialogue capabilities of existing models, we have developed the MMDU Benchmark. Our benchmark comprises **110 high-quality multi-image multi-turn dialogues with more than 1600 questions**, each accompanied by detailed long-form answers. Previous benchmarks typically involved only single images or a small number of images, with fewer rounds of questions and short-form answers. MMDU, by contrast, significantly increases the number of images, the number of question-and-answer rounds, and the in-context length of the Q&A. The questions in MMDU **involve 2 to 20 images**, with an average image and text token length of **8.2k tokens** and a maximum image and text length reaching **18k tokens**, presenting significant challenges to existing multimodal large models.

### 🎆 MMDU-45k Instruct Tuning Dataset

In MMDU-45k, we construct a total of **45k instruct tuning conversations**. Each sample in the MMDU-45k dataset features an ultra-long context, with an average image and text token length of **5k** and a maximum image and text token length of **17k tokens**. Each dialogue contains an average of **9 turns of Q&A**, with a maximum of **27 turns**. Additionally, each sample includes content from **2-5 images**. The dataset is constructed in a well-designed format that scales well: existing dialogues can be combined to generate more, and longer, multi-image multi-turn dialogues. **The image-text length and the number of turns in MMDU-45k significantly surpass those of all existing instruct tuning datasets.** This greatly improves a model's capabilities in multi-image recognition and understanding, as well as its ability to handle long-context dialogues.
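The description says longer dialogues can be produced by combining existing ones. The following is only a minimal sketch of that idea, not the authors' pipeline: the `Dialogue` record layout, its field names, and the `combine` helper are all hypothetical, and the real MMDU-45k schema may differ. In particular, real data that refers to images by position would also need those references re-indexed after merging, which this sketch does not attempt.

```python
from dataclasses import dataclass, field


@dataclass
class Dialogue:
    """Hypothetical record layout; the real MMDU-45k schema may differ."""
    images: list[str]                                            # image paths or identifiers
    turns: list[tuple[str, str]] = field(default_factory=list)   # (question, answer) pairs


def combine(dialogues: list[Dialogue]) -> Dialogue:
    """Concatenate several dialogues into one longer multi-image dialogue.

    Images are pooled in order of first appearance (duplicates dropped),
    and Q&A turns are appended in the order the dialogues are given.
    """
    images: list[str] = []
    turns: list[tuple[str, str]] = []
    for dialogue in dialogues:
        for image in dialogue.images:
            if image not in images:
                images.append(image)
        turns.extend(dialogue.turns)
    return Dialogue(images=images, turns=turns)


# Two made-up dialogues that share one image.
a = Dialogue(images=["cat.jpg", "dog.jpg"], turns=[("Q1", "A1"), ("Q2", "A2")])
b = Dialogue(images=["dog.jpg", "bird.jpg"], turns=[("Q3", "A3")])
merged = combine([a, b])
print(len(merged.images), len(merged.turns))
```

Merging this way increases both the image count and the number of turns per sample, which is the scaling property the description refers to.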
License: Attribution-NonCommercial 4.0 International. Use of the data should also abide by the policy of OpenAI: https://openai.com/policies/terms-of-use

For more information, please refer to our 💻[Github](https://github.com/Liuziyu77/MMDU/), 🏠[Homepage](https://liuziyu77.github.io/MMDU/), or 📖[Paper](https://arxiv.org/abs/2406.11833).


Provider: maas

Created: 2025-03-06
