AIME25-CoT-CN
Source: ModelScope community. Updated: 2025-11-30. Indexed: 2025-09-20.
Download link:
https://modelscope.cn/datasets/AI-ModelScope/AIME25-CoT-CN
Resource description:
# Sci-Bench-AIME25'
This repo is a branch of Sci Bench made by the IPF team. It mainly contains AIME 25' solutions with multi-modal CoT and diverse solving paths.
## 📚 Cite
If you use the **Sci-Bench-AIME25 (IPF/AIME25-CoT-CN)** dataset in your research, please cite:
```bibtex
@dataset{zhang2025scibench_aime25,
title = {{Sci-Bench-AIME25}: A Multi-Modal Chain-of-Thought Dataset for Advanced Tool-Intergrated Mathematical Reasoning},
author = {Zhang, Haoxiang and Wang, Siyuan and Fang, Xueji and Zou, Xinkai and Lyu, Tiange and Cao, Siyuan and Huang, Jingyuan and Xing, Jie},
year = {2025},
publisher = {Hugging Face},
doi = {10.5281/zenodo.17112481},
url = {https://huggingface.co/datasets/IPF/AIME25-CoT-CN},
note = {Also available on Zenodo}
}
```
-- -- --
# Brief intro
## 💻 Overview
A brief template and the final report will be posted on [Isaac's Blog](https://isaacghx.github.io/2025/08/08/solutions/Hand-made-Solution-and-CoT-for-AIME25/).
The Markdown template can be found in `data/I_2`.
## ❓ Why do we do this?
- Multilingual datasets are scarce, and math CoT data is scarcer still, especially CoT or solutions that include pictures, code blocks, etc.
- It can serve as a base benchmark for reasoning-LLM post-training: SFT & RL.
- It's tiny but entropy-rich.
- It helps mitigate overfitting to a single, text-only solution CoT.
-- -- --
# 🗡 Template
This repo is structured as follows:
```
/Sci-Bench-AIME25
├── data
│   ├── I_1
│   │   ├── problem.md
│   │   ├── ground_truth.md
│   │   ├── solution1
│   │   │   ├── description.md (if the solution contains only handwritten picture(s), put "ONLY HANDWRITTEN PIC" at the very beginning of this file)
│   │   │   ├── code1.py
│   │   │   └── code2.py
│   ├── I_2
│   ├── ...
│   ├── II_1
│   ├── ...
│
├── images (images should be 1080P or higher, with clearly legible text. Tip: the content may be hard to read, but it should be laid out in a relatively structured way)
│   ├── gen_images
│   │   ├── I_2
│   │   │   ├── code_1.png
│   │   │   └── code_2.png
│   │   ├── I_6
│   │   │   └── code_1.png
│   │   └── ...
│   └── hand_written
│       ├── II_12
│       │   ├── description_1.png
│       │   └── description_2.png
│       ├── I_11
│       │   └── description_1.png
│       └── ...
└── README.md
```
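The layout above can be read with a few lines of Python. The following is only a minimal sketch, not part of the repo: the function names `load_problem` and `load_dataset` are hypothetical, and it assumes that every sub-folder whose name starts with `solution` holds one solving path and that `root` points at a local copy of the repository.

```python
from pathlib import Path

# Marker text taken from the directory layout above.
HANDWRITTEN_MARKER = "ONLY HANDWRITTEN PIC"


def load_problem(problem_dir: Path) -> dict:
    """Read one problem folder (e.g. data/I_1) into a plain dict."""
    record = {
        "id": problem_dir.name,
        "problem": (problem_dir / "problem.md").read_text(encoding="utf-8").strip(),
        "ground_truth": (problem_dir / "ground_truth.md").read_text(encoding="utf-8").strip(),
        "solutions": [],
    }
    # Assumption: each sub-folder named "solution*" is one solving path.
    solution_dirs = sorted(
        p for p in problem_dir.iterdir() if p.is_dir() and p.name.startswith("solution")
    )
    for sol_dir in solution_dirs:
        desc_file = sol_dir / "description.md"
        description = desc_file.read_text(encoding="utf-8") if desc_file.exists() else ""
        record["solutions"].append({
            "name": sol_dir.name,
            "description": description,
            # True when the file starts with the handwritten-only marker.
            "handwritten_only": description.lstrip().startswith(HANDWRITTEN_MARKER),
            "code_files": sorted(f.name for f in sol_dir.glob("*.py")),
        })
    return record


def load_dataset(root: Path) -> list[dict]:
    """Load every folder under <root>/data that contains a problem.md."""
    data_dir = root / "data"
    return [load_problem(d) for d in sorted(data_dir.iterdir()) if (d / "problem.md").is_file()]
```

Calling `load_dataset(Path("Sci-Bench-AIME25"))` would then return one record per problem folder. Solutions flagged `handwritten_only` carry their content in `images/hand_written` rather than in text.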
- Detailed format of `problem.md`
```markdown
求所有整数底数 $b>9$ 的和,使得 $17_b$ 是 $97_b$ 的因数。
```
- Detailed format of `ground_truth.md`
```markdown
70
```
- Detailed format of `description.md`: use `## step_n. xxx` for the step-by-step subtitles.
Note: when `description.md` contains Python code, place it under its own step-wise title `## step_n. xxx` as well and explain what the code does.
Furthermore, Python code should be wrapped in a Markdown code fence tagged as follows: ```python ./code_n.py ```
```markdown
## 1. 线段比例分析与平行关系
首先,分析边 $\overline{AB}$ 和 $\overline{AC}$ 上的线段长度和比例。
* **边 $\overline{AB}$**:
...
* **边 $\overline{AC}$**:
...
根据以上比例,我们发现:
$\frac{AD}{AB} = \frac{AF}{AC} = \frac{1}{7} \quad \text{以及} \quad \frac{AE}{AB} = \frac{AG}{AC} = \frac{5}{7}$
根据泰勒斯定理(逆定理),这些比例关系意味着:
...
## 2. 计算 $\triangle ABC$ 的面积
...
## 5. 答案/结论
七边形 $AFNBCEM$ 的面积等于 $\triangle ABC$ 的面积。
$S_{AFNBCEM} = S_{ABC} = 588$
```
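The step-heading convention lends itself to a simple parser. The sketch below is illustrative only and not part of the repo: `split_steps` and `missing_step_numbers` are hypothetical helpers, and the regular expression assumes that both the documented `## step_n. xxx` form and the shorter `## n. xxx` form used in the sample above should be accepted.

```python
import re

# Matches "## step_3. Title" as well as the shorter "## 3. Title" from the sample.
STEP_HEADING = re.compile(r"^##\s+(?:step_)?(\d+)\.\s*(.*)$")


def split_steps(markdown_text: str) -> list[dict]:
    """Split a description.md body into steps, in order of appearance."""
    steps = []
    current = None
    for line in markdown_text.splitlines():
        match = STEP_HEADING.match(line)
        if match:
            current = {"index": int(match.group(1)), "title": match.group(2).strip(), "body": []}
            steps.append(current)
        elif current is not None:
            # Lines before the first step heading are ignored.
            current["body"].append(line)
    for step in steps:
        step["body"] = "\n".join(step["body"]).strip()
    return steps


def missing_step_numbers(steps: list[dict]) -> list[int]:
    """Return step numbers skipped between 1 and the highest number seen."""
    seen = {step["index"] for step in steps}
    return [n for n in range(1, max(seen, default=0) + 1) if n not in seen]
```

Running `split_steps` on a `description.md` yields one dict per step, which can be handy when building step-wise SFT samples. `missing_step_numbers` gives a quick check for skipped step numbers. This sketch does not treat fenced code specially, so a line inside a code block that happens to look like a step heading would be misread as one.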
Provider:
maas
Created:
2025-09-14



