MERLIM

Name: MERLIM
Creator: King Abdullah University of Science and Technology (KAUST)
Published: 2023-12-04 00:39:36
License: Not specified

Source: arXiv; updated 2023-12-04; indexed 2024-06-21

Download link:

https://github.com/ojedaf/MERLIM

Overview:

MERLIM is a multimodal evaluation benchmark developed by King Abdullah University of Science and Technology (KAUST) for assessing large image-language models. It comprises over 279,000 image-question pairs and is mainly used to detect cross-modal "hallucination" events, in which the language output refers to visual concepts that are absent from, or unrelated to, the image. The benchmark uses edited images to check whether a model's predictions are grounded in valid visual evidence, and in this way evaluates performance on fundamental computer vision tasks. Its application areas are object recognition, instance counting, and understanding relationships between objects, with the aim of addressing the limitations of current models' zero-shot learning capabilities.
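The edited-image check described above can be illustrated with a minimal Python sketch. It is not MERLIM's evaluation code. The record layout (`EditedPair`), the function names, the assumption that each edit removes a single object category, and the assumption that model answers have already been parsed into sets of object names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditedPair:
    """One original/edited image pair (hypothetical record layout)."""
    image_id: str
    removed_object: str  # category assumed to be erased in the edited image


def is_ungrounded(pair, objects_original, objects_edited):
    """True when the model names the removed object both before and after the edit."""
    named_before = pair.removed_object in objects_original
    named_after = pair.removed_object in objects_edited
    return named_before and named_after


def ungrounded_rate(pairs, predictions):
    """Fraction of eligible pairs whose prediction survives the edit.

    `predictions` maps image_id -> (objects named on original, objects named on edited).
    A pair is eligible only if the model named the object on the original image.
    """
    eligible = 0
    flagged = 0
    for pair in pairs:
        objects_original, objects_edited = predictions[pair.image_id]
        if pair.removed_object not in objects_original:
            continue  # nothing to verify: the object was never predicted
        eligible += 1
        if is_ungrounded(pair, objects_original, objects_edited):
            flagged += 1
    return flagged / eligible if eligible else 0.0


# Toy data, invented for illustration only.
pairs = [EditedPair("a", "dog"), EditedPair("b", "cat")]
predictions = {
    "a": ({"dog", "person"}, {"dog", "person"}),  # still says "dog" after removal
    "b": ({"cat"}, {"sofa"}),                     # answer changes with the image
}
print(ungrounded_rate(pairs, predictions))  # 0.5
```

In this sketch, a prediction that stays the same after its supporting object is removed is treated as lacking visual grounding; a real evaluation would also need to parse free-form model answers, which is omitted here.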

Provided by:

King Abdullah University of Science and Technology (KAUST)

Created:

2023-12-04
