MERLIM

Source: arXiv | Updated: 2023-12-04 | Indexed: 2024-06-21
Download link:
https://github.com/ojedaf/MERLIM
Description:
MERLIM is a multimodal evaluation benchmark developed at King Abdullah University of Science and Technology (KAUST) for evaluating large image-language models. It comprises more than 279,000 image-question pairs and focuses on detecting cross-modal "hallucination" events, in which the language output refers to visual concepts that are absent from, or irrelevant to, the image. The benchmark uses edited images to check whether a model's predictions are grounded in actual visual content, and in this way assesses performance on fundamental computer vision tasks. MERLIM covers object recognition, instance counting, and understanding of relationships between objects, with the aim of addressing the limitations of current models' zero-shot capabilities.
Provided by:
King Abdullah University of Science and Technology (KAUST)
Created:
2023-12-04