MATH-V

Source: arXiv; indexed 2025-09-30

Download link:

https://mathvision-cuhk.github.io/

Overview:

MATH-V is a curated benchmark for evaluating the multimodal mathematical reasoning of foundation models, with an emphasis on problems posed in a visual context. It spans 16 topics and 5 difficulty levels, and its questions were collected by hand from 19 competitions, with priority given to those that require visual input. The dataset contains 3,040 questions in total, 304 of which form the test set.
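Because the questions are organized by topic and difficulty level, results on a benchmark like this are naturally reported per group. The following is a minimal sketch of such a breakdown, not the benchmark's official evaluation code. The record fields (`topic`, `level`, `answer`, `prediction`), the exact-match scoring rule, and the toy records are all assumptions made for illustration; only the 3,040 and 304 counts come from the description above.

```python
from collections import defaultdict


def accuracy_by(records, key):
    """Return {group: fraction of correct predictions}, grouping records by `key`."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for record in records:
        group = record[key]
        totals[group] += 1
        # Exact string match is an assumed scoring rule, chosen for simplicity.
        correct[group] += int(record["prediction"] == record["answer"])
    return {group: correct[group] / totals[group] for group in totals}


# Toy records invented for illustration; the field names are hypothetical.
records = [
    {"topic": "algebra", "level": 1, "answer": "A", "prediction": "A"},
    {"topic": "algebra", "level": 3, "answer": "12", "prediction": "10"},
    {"topic": "solid geometry", "level": 3, "answer": "C", "prediction": "C"},
    {"topic": "solid geometry", "level": 5, "answer": "7", "prediction": "7"},
]

by_level = accuracy_by(records, "level")
by_topic = accuracy_by(records, "topic")

# Share of the full dataset that the test set represents (counts from the description).
test_fraction = 304 / 3040
```

Grouping by a key name keeps the same function usable for both the topic and the difficulty-level breakdown.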

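Summary

Background and Challenges

Background Overview

MATH-V is a multimodal mathematical reasoning benchmark of 3,040 questions spanning 16 topics and 5 difficulty levels. It focuses on math problems set in a visual context and is intended to measure how well foundation models perform on such tasks.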

The content above was collected and summarized by 遇见数据集.