
s-u-do/mmerror-benchmark

Source: Hugging Face. Updated: 2026-04-22. Indexed: 2026-04-26.
Download link:
https://hf-mirror.com/datasets/s-u-do/mmerror-benchmark
Description:
MMErroR Benchmark is a multimodal error-analysis dataset organized as paired image and JSON annotation files. It is released alongside the ACL 2026 paper "MMErroR: A Benchmark for Erroneous Reasoning in Vision-Language Models". The dataset contains 1,997 examples in total; each example consists of one PNG image and one JSON file, matched by question_id. Each JSON file includes fields such as question_id, question, correct_answer, error_reason, label, domain, and subdomain. The dataset is intended for analyzing erroneous reasoning in vision-language models.
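
The pairing described above (one PNG and one JSON file per example, matched by question_id) can be sketched in Python as follows. This is a minimal illustration, not the dataset's official loader: it assumes the files have already been downloaded to a local directory and that each PNG's file stem equals the question_id stored in its JSON annotation. The actual file naming is not stated in the description, and `pair_samples` and `missing_fields` are hypothetical helper names.

```python
import json
from pathlib import Path

# Field names listed in the dataset description (the list may not be exhaustive).
EXPECTED_FIELDS = {
    "question_id", "question", "correct_answer",
    "error_reason", "label", "domain", "subdomain",
}


def pair_samples(root):
    """Return {question_id: (image_path, annotation)} for a local copy.

    Assumption: each PNG's file stem equals the question_id stored in
    its JSON annotation; the real naming scheme may differ.
    """
    root = Path(root)
    # Index every PNG under root by its file stem.
    images = {p.stem: p for p in root.rglob("*.png")}
    pairs = {}
    for json_path in sorted(root.rglob("*.json")):
        with open(json_path, encoding="utf-8") as f:
            record = json.load(f)
        if not isinstance(record, dict) or "question_id" not in record:
            continue  # not an annotation file
        qid = str(record["question_id"])
        image = images.get(qid)
        if image is not None:
            pairs[qid] = (image, record)
    return pairs


def missing_fields(record):
    """Return the documented fields absent from one annotation, sorted."""
    return sorted(EXPECTED_FIELDS - set(record))
```

If the local copy is complete and the naming assumption holds, `len(pair_samples(path))` should equal the 1,997 examples stated in the description; a smaller count suggests missing files or a different naming scheme.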
Provider:
s-u-do