MMEB-eval

Name: MMEB-eval
Creator: maas
Published: 2026-05-15 20:58:36
License: No description available

ModelScope community. Updated 2026-05-15. Indexed 2025-02-08.

Download link:

https://modelscope.cn/datasets/TIGER-Lab/MMEB-eval


Resource description:

# Massive Multimodal Embedding Benchmark

We compile a large set of evaluation tasks to understand the capabilities of multimodal embedding models. This benchmark covers 4 meta tasks and 36 datasets meticulously selected for evaluation. The dataset is published in our paper [VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks](https://arxiv.org/abs/2410.05160).

## Dataset Usage

For each dataset, we provide 1000 examples for evaluation. Each example contains a query and a set of candidate targets. Both the query and the targets can be any combination of image and text. The first entry in the candidate list is the ground-truth target.

## Statistics

The statistics of all the datasets are shown below:

<img width="900" alt="abs" src="statistics.png">

## Per-dataset Results

The performance of different embedding models is listed below:

<img width="900" alt="abs" src="leaderboard.png">

## Submission

We will set up a formal leaderboard soon. To add your results to the leaderboard, please email us at ziyanjiang528@gmail.com.

## Cite Us

```
@article{jiang2024vlm2vec,
  title={VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks},
  author={Jiang, Ziyan and Meng, Rui and Yang, Xinyi and Yavuz, Semih and Zhou, Yingbo and Chen, Wenhu},
  journal={arXiv preprint arXiv:2410.05160},
  year={2024}
}
```
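As a rough illustration of the layout described under Dataset Usage, the sketch below scores one query against its candidate list and checks whether the first candidate (the ground-truth target) ranks highest. It assumes embeddings have already been computed as plain lists of floats and that cosine similarity is the ranking function. The names `cosine`, `hit_at_1`, `precision_at_1`, and the toy vectors are hypothetical and not part of the dataset or of any official evaluation script.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def hit_at_1(query_vec, candidate_vecs):
    """True if candidate 0 (assumed to be the ground truth) has the highest score.

    Ties are resolved in favor of the lowest index, which slightly favors
    the ground truth; a stricter evaluation might count ties as misses.
    """
    scores = [cosine(query_vec, c) for c in candidate_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best == 0


def precision_at_1(examples):
    """Fraction of (query, candidates) pairs whose ground truth is ranked first."""
    if not examples:
        raise ValueError("examples must not be empty")
    hits = sum(hit_at_1(q, cands) for q, cands in examples)
    return hits / len(examples)


# Toy 2-dimensional embeddings (made up for illustration only).
toy_examples = [
    # Ground truth at index 0 is the closest candidate: counted as a hit.
    ([1.0, 0.0], [[0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]),
    # Candidate at index 1 is closer than the ground truth: counted as a miss.
    ([0.0, 1.0], [[1.0, 0.0], [0.1, 0.9], [0.5, 0.5]]),
]

print(precision_at_1(toy_examples))
```

With a real model, each query and candidate would first be encoded into a vector. The scoring step stays the same regardless of whether the inputs were images, text, or both.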


Provider:

maas

Created:

2025-02-03

Collected summary

Dataset introduction