EMMA
Hosted on ModelScope (魔搭社区). Updated 2025-12-03; indexed 2025-07-26.
Download link:
https://modelscope.cn/datasets/lmms-lab/EMMA
## Dataset Description
**EMMA (Enhanced MultiModal reAsoning)** is a benchmark targeting organic multimodal reasoning across mathematics, physics, chemistry, and coding.
EMMA tasks demand advanced cross-modal reasoning: they cannot be solved by reasoning independently in each modality, which makes EMMA a more demanding test suite for the reasoning capabilities of multimodal large language models (MLLMs).
EMMA is composed of 2,788 problems, of which 1,796 are newly constructed, across four domains. Within each subject, we further provide fine-grained labels for each question based on the specific skills it measures.
<p align="center">
<img src="https://huggingface.co/datasets/luckychao/EMMA/resolve/main/emma_composition.jpg" width="30%"> <br>
</p>
## Paper Information
- Paper: https://www.arxiv.org/abs/2501.05444
- Code: https://github.com/hychaochao/EMMA
- Project: https://emma-benchmark.github.io/
## Data Format
The dataset is provided in JSONL format, and each record contains the following attributes:
```
{
    "pid": [string] Problem ID, e.g., "math_1",
    "question": [string] The question text,
    "options": [list] Choice options for multiple-choice problems. For free-form problems, this could be a 'none' value,
    "answer": [string] The correct answer for the problem,
    "image_1": [image],
    "image_2": [image],
    "image_3": [image],
    "image_4": [image],
    "image_5": [image],
    "solution": [string] The detailed thinking steps required to solve the problem,
    "subject": [string] The subject of data, e.g., "Math", "Physics"...,
    "task": [string] The task of the problem, e.g., "Code Choose Vis",
    "category": [string] The category of the problem, e.g., "2D Transformation",
    "source": [string] The original source dataset of the data, e.g., "math-vista". For handmade data, this could be "Newly annotated",
    "type": [string] Types of questions, e.g., "Multiple Choice", "Open-ended",
    "context": [string] Background knowledge required for the question. For problems without context, this could be a 'none' value,
}
```
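Because `options` and `context` may hold a 'none' value and not every problem necessarily uses all five image slots, it can help to normalize records before use. The following is a minimal sketch, not the authors' loader. It assumes that a missing value may appear as `null`, an empty value, or the literal string `"none"` (the card does not say which), and the sample records, the `physics_1` ID, and the image placeholder string are made up for illustration.

```python
import json
from typing import Any, Dict, List

# image_1 .. image_5, matching the attribute names listed above
IMAGE_KEYS = [f"image_{i}" for i in range(1, 6)]


def is_missing(value: Any) -> bool:
    """Assumption: None, empty values, and the string 'none' all mean 'absent'."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip().lower() in {"", "none"}
    if isinstance(value, (list, tuple, dict)):
        return len(value) == 0
    return False


def normalize_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a small, uniform view of one problem record."""
    options = record.get("options")
    context = record.get("context")
    return {
        "pid": record["pid"],
        "subject": record.get("subject"),
        "type": record.get("type"),
        # Free-form problems end up with an empty option list.
        "options": list(options) if isinstance(options, (list, tuple)) else [],
        "context": None if is_missing(context) else context,
        # Names of the image slots that are actually populated.
        "image_keys": [k for k in IMAGE_KEYS if not is_missing(record.get(k))],
    }


def load_jsonl(text: str) -> List[Dict[str, Any]]:
    """Parse JSONL text (one JSON object per line), skipping blank lines."""
    return [normalize_record(json.loads(line))
            for line in text.splitlines() if line.strip()]


# Made-up sample records; real images are replaced by a placeholder string.
SAMPLE = "\n".join([
    json.dumps({"pid": "math_1", "question": "...", "options": ["A", "B", "C", "D"],
                "answer": "B", "image_1": "<image placeholder>", "image_2": None,
                "subject": "Math", "type": "Multiple Choice", "context": "none"}),
    json.dumps({"pid": "physics_1", "question": "...", "options": "none",
                "answer": "...", "image_1": "<image placeholder>",
                "subject": "Physics", "type": "Open-ended", "context": None}),
])

records = load_jsonl(SAMPLE)
multiple_choice = [r["pid"] for r in records if r["type"] == "Multiple Choice"]
```

After normalization, downstream code can check `if record["options"]:` rather than testing for several possible spellings of a missing value.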
## Citation
```
@misc{hao2025mllmsreasonmultimodalityemma,
title={Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchmark},
author={Yunzhuo Hao and Jiawei Gu and Huichen Will Wang and Linjie Li and Zhengyuan Yang and Lijuan Wang and Yu Cheng},
year={2025},
eprint={2501.05444},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2501.05444},
}
```
Provider: maas
Created: 2025-07-25



