MMInstruction/VL-RewardBench

Name: MMInstruction/VL-RewardBench
Creator: MMInstruction
Published: 2025-05-19 10:50:44
License: No description available

Hugging Face (updated 2025-05-19, indexed 2025-04-12)

Download link:

https://hf-mirror.com/datasets/MMInstruction/VL-RewardBench

Resource overview:

VL-RewardBench is a comprehensive benchmark for evaluating vision-language generative reward models (VL-GenRMs) on visual perception, hallucination detection, and reasoning tasks. It contains 1,250 high-quality examples curated specifically to probe model limitations. The examples are multimodal queries spanning three key domains: general multimodal queries from real users, visual hallucination detection tasks, and multimodal knowledge and mathematical reasoning.
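
Because the benchmark covers three domains, results for a reward model are naturally summarized per domain as well as overall. The following is a minimal sketch of that bookkeeping, not the benchmark's official evaluation script. The record fields (`domain`, `preferred`, `judged`), the domain labels, and the sample values are all hypothetical and do not reflect the dataset's actual schema.

```python
from collections import defaultdict

# Hypothetical records: field names, domain labels, and values are
# illustrative assumptions, not the dataset's actual schema.
# "preferred" is the index of the response humans preferred;
# "judged" is the index the reward model picked.
records = [
    {"domain": "general", "preferred": 0, "judged": 0},
    {"domain": "general", "preferred": 1, "judged": 0},
    {"domain": "hallucination", "preferred": 1, "judged": 1},
    {"domain": "reasoning", "preferred": 0, "judged": 0},
    {"domain": "reasoning", "preferred": 1, "judged": 1},
]


def accuracy_by_domain(rows):
    """Fraction of rows per domain where the judge picked the preferred response."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for row in rows:
        totals[row["domain"]] += 1
        hits[row["domain"]] += int(row["judged"] == row["preferred"])
    return {domain: hits[domain] / totals[domain] for domain in totals}


def macro_average(per_domain):
    """Unweighted mean of the per-domain accuracies."""
    return sum(per_domain.values()) / len(per_domain)


per_domain = accuracy_by_domain(records)
# Overall accuracy weights every example equally, regardless of domain.
overall = sum(r["judged"] == r["preferred"] for r in records) / len(records)
macro = macro_average(per_domain)

print(per_domain, overall, macro)
```

Reporting both numbers is a design choice: overall accuracy is dominated by whichever domain has the most examples, while the macro average gives each domain equal weight.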

Provided by:

MMInstruction
