s-u-do/mmerror-benchmark

Name: s-u-do/mmerror-benchmark
Creator: s-u-do
Published: 2026-04-22 10:39:17
License: not specified

Hugging Face. Updated 2026-04-22. Indexed 2026-04-26.

Download link:

https://hf-mirror.com/datasets/s-u-do/mmerror-benchmark

Description:

MMErroR Benchmark is a multimodal error analysis dataset organized as paired image and JSON annotation files. It is released with the ACL 2026 paper "MMErroR: A Benchmark for Erroneous Reasoning in Vision-Language Models". The dataset contains a total of 1,997 examples, each consisting of a PNG image and a JSON file matched by question_id. The JSON file includes fields such as question_id, question, correct_answer, error_reason, label, domain, and subdomain. The dataset is designed for analyzing erroneous reasoning in vision-language models.
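
The pairing described above can be sketched in Python for a local copy of the files. This is a minimal illustration, not the dataset's official loader: the helper names `load_pairs` and `missing_fields` are hypothetical, and the sketch assumes each image is named `<question_id>.png`, which the description does not confirm. Only the field names come from the description.

```python
import json
from pathlib import Path

# Field names listed in the dataset description.
EXPECTED_FIELDS = (
    "question_id", "question", "correct_answer",
    "error_reason", "label", "domain", "subdomain",
)


def load_pairs(root):
    """Pair each JSON annotation with a PNG image via question_id.

    Assumption: the image for a record is named "<question_id>.png" and may
    sit anywhere under `root`. Adjust the lookup if the real layout differs.
    """
    root = Path(root)
    # Index every PNG under root by its filename stem.
    images = {p.stem: p for p in root.rglob("*.png")}
    pairs, unmatched = [], []
    for json_path in sorted(root.rglob("*.json")):
        record = json.loads(json_path.read_text(encoding="utf-8"))
        if not isinstance(record, dict):
            # Not a single-record annotation file; leave it for manual review.
            unmatched.append(json_path)
            continue
        # Fall back to the JSON filename if the field is absent.
        qid = str(record.get("question_id", json_path.stem))
        image_path = images.get(qid)
        if image_path is None:
            unmatched.append(json_path)
        else:
            pairs.append((image_path, record))
    return pairs, unmatched


def missing_fields(record):
    """Return the expected fields that a record lacks."""
    return [name for name in EXPECTED_FIELDS if name not in record]
```

If the naming assumption holds, `len(pairs)` on a complete download should equal the 1,997 examples stated in the description, and `unmatched` should be empty. A non-empty `unmatched` list is a sign that the lookup needs adapting to the actual file layout.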

Provided by:

s-u-do
