cjerzak/MultimodalMathBenchmarks

Name: cjerzak/MultimodalMathBenchmarks
Creator: cjerzak
Published: 2026-04-22 03:06:13
License: No description available

Hugging Face | Updated 2026-04-22 | Indexed 2026-04-26

Download link:

https://hf-mirror.com/datasets/cjerzak/MultimodalMathBenchmarks

Description:

This dataset is designed to evaluate the arithmetic capabilities of multimodal LLMs. It consists of three main components: 1) SharedMultimodalGrid.csv: 10,000 shared multiplication problems presented in text, image, and audio formats, split into train, validation, and test sets; 2) HDSv2.csv: 1,000 heuristic-disagreement problems for fingerprinting and probe-style evaluation; 3) Trapsv2.csv: 30 adversarial trap problems targeting heuristic-specific failures. The dataset supports multimodal inputs (text, images, audio) and includes detailed metadata and evaluation metrics.
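
As a quick sanity check on the train/validation/test partition described above, the following minimal sketch counts rows per split using only the Python standard library. The listing does not document the CSV schema, so the column name `split`, the helper `count_by_split`, and the sample rows are all assumptions made for illustration; adjust them to the actual header of SharedMultimodalGrid.csv.

```python
import csv
import io
from collections import Counter

# Hypothetical column name: the listing does not document the CSV schema.
SPLIT_COLUMN = "split"


def count_by_split(csv_file, split_column=SPLIT_COLUMN):
    """Count rows per split value in an open CSV file object."""
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or split_column not in reader.fieldnames:
        raise KeyError(
            f"column {split_column!r} not found; available: {reader.fieldnames}"
        )
    return Counter(row[split_column] for row in reader)


# Tiny in-memory stand-in for SharedMultimodalGrid.csv (made-up rows).
sample = io.StringIO(
    "a,b,split\n"
    "12,34,train\n"
    "56,78,train\n"
    "9,11,validation\n"
    "23,45,test\n"
)
counts = count_by_split(sample)
print(dict(counts))
```

To run this against a downloaded copy, open the file with `open("SharedMultimodalGrid.csv", newline="")` and pass the file object to `count_by_split`. If the real header uses a different name for the split column, the `KeyError` message lists the available columns.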

Provider:

cjerzak
