MM-Hallu/RAH-Bench

Name: MM-Hallu/RAH-Bench
Creator: MM-Hallu
Published: 2026-04-30 05:13:35
License: not specified

Source: Hugging Face. Updated 2026-04-30; indexed 2026-05-03.

Download link:

https://hf-mirror.com/datasets/MM-Hallu/RAH-Bench

Description:

RAH-Bench is a benchmark dataset for evaluating object hallucination in vision-language models (VLMs). It consists of 3,000 binary yes/no questions about COCO val2017 images, categorized by hallucination type (such as attribute, category, and relation). Each question comes with its image, a unique question ID, a COCO image ID, the question text, a ground-truth label ("yes" or "no"), and a hallucination category. The evaluation metrics are accuracy, precision, recall, and F1 score. The data originates from a 2023 arXiv paper.
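The four metrics listed above can be computed from a model's yes/no answers roughly as follows. This is a minimal sketch, not the benchmark's official scorer: the `Record` field names are hypothetical (the actual dataset columns may be named differently), and treating "yes" as the positive class is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical field names; the real dataset columns may differ.
    question_id: str
    category: str    # hallucination category, e.g. "attribute"
    label: str       # ground truth: "yes" or "no"
    prediction: str  # model answer, normalized to "yes" or "no"


def binary_metrics(records):
    """Accuracy, precision, recall, and F1 with "yes" as the positive class (assumed)."""
    tp = sum(r.label == "yes" and r.prediction == "yes" for r in records)
    fp = sum(r.label == "no" and r.prediction == "yes" for r in records)
    fn = sum(r.label == "yes" and r.prediction == "no" for r in records)
    tn = sum(r.label == "no" and r.prediction == "no" for r in records)
    total = len(records)
    # Answers that are neither "yes" nor "no" count as incorrect for accuracy.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def metrics_by_category(records):
    """Group records by hallucination category and score each group separately."""
    groups = defaultdict(list)
    for r in records:
        groups[r.category].append(r)
    return {category: binary_metrics(group) for category, group in groups.items()}
```

Calling `binary_metrics` on the full list of records gives overall scores, while `metrics_by_category` reports the same four numbers per hallucination type, which can show whether a model struggles more with one type than another.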

Provided by:

MM-Hallu
