VisuLogic/VisuLogic
Source: Hugging Face. Updated 2025-07-09; indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/VisuLogic/VisuLogic
Overview:
VisuLogic is a benchmark for evaluating visual reasoning in multimodal large language models (MLLMs). It is a challenging, vision-centric benchmark that combines visual perception with logical reasoning, with the following features: 1,000 carefully curated questions spanning 6 domains and 24 subcategories; vision-centric reasoning tasks that require genuine multimodal understanding; and human-aligned evaluation, in which human accuracy exceeds 50.0% while state-of-the-art MLLMs score below 30%.
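The accuracy figures quoted above are plain fractions of correctly answered questions. As a minimal sketch of how such a score could be computed, the snippet below assumes each question has an ID and a single reference label, and that a model's output has already been reduced to one label per question. The `accuracy` helper, the question IDs, and the labels are all hypothetical and are not taken from the VisuLogic evaluation code.

```python
from typing import Mapping


def accuracy(predictions: Mapping[str, str], answers: Mapping[str, str]) -> float:
    """Return the fraction of questions whose predicted label matches the reference.

    Questions missing from `predictions` count as wrong. Labels are compared
    case-insensitively after stripping surrounding whitespace.
    """
    if not answers:
        raise ValueError("answers must not be empty")
    correct = sum(
        1
        for qid, gold in answers.items()
        if predictions.get(qid, "").strip().upper() == gold.strip().upper()
    )
    return correct / len(answers)


# Hypothetical toy data, not drawn from the dataset.
answers = {"q1": "A", "q2": "C", "q3": "B", "q4": "D"}
predictions = {"q1": "a", "q2": "B", "q3": "B"}  # q4 left unanswered

score = accuracy(predictions, answers)
print(f"accuracy = {score:.1%}")
```

Counting unanswered questions as wrong keeps the denominator fixed at the full question set, so scores from different models remain comparable.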
Provided by:
VisuLogic



