AIME25-CoT-CN

ModelScope community · Updated 2025-11-30 · Indexed 2025-09-20
Download link:
https://modelscope.cn/datasets/AI-ModelScope/AIME25-CoT-CN
Description:
# Sci-Bench-AIME25

This repo is a branch of Sci-Bench maintained by the IPF team. It mainly contains solutions to the AIME 2025 problems, with multi-modal chain-of-thought (CoT) and diverse solving paths.

## 📚 Cite

If you use the **Sci-Bench-AIME25 (IPF/AIME25-CoT-CN)** dataset in your research, please cite:

```bibtex
@dataset{zhang2025scibench_aime25,
  title     = {{Sci-Bench-AIME25}: A Multi-Modal Chain-of-Thought Dataset for Advanced Tool-Intergrated Mathematical Reasoning},
  author    = {Zhang, Haoxiang and Wang, Siyuan and Fang, Xueji and Zou, Xinkai and Lyu, Tiange and Cao, Siyuan and Huang, Jingyuan and Xing, Jie},
  year      = {2025},
  publisher = {Hugging Face},
  doi       = {10.5281/zenodo.17112481},
  url       = {https://huggingface.co/datasets/IPF/AIME25-CoT-CN},
  note      = {Also available on Zenodo}
}
```

---

# Brief intro

## 💻 Overview

A brief template and the final report will be posted on [Isaac's Blog](https://isaacghx.github.io/2025/08/08/solutions/Hand-made-Solution-and-CoT-for-AIME25/). The Markdown template can be found in `data/I_2`.

## ❓ Why we do this?

- Multilingual datasets are scarce, and math CoT data is scarcer still, especially CoT or solutions that contain pictures, code blocks, etc.
- It can serve as a base benchmark for post-training reasoning LLMs (SFT and RL).
- It is tiny but entropy-rich.
- It can help mitigate overfitting to a single, text-only solution CoT.

---

# 🗡 Template

The repo is structured as follows:

```
/Sci-Bench-AIME25
├── data
│   ├── I_1
│   │   ├── problem.md
│   │   ├── ground_truth.md
│   │   ├── solution1
│   │   ├── description.md (if it contains only handwritten picture(s), start this file with "ONLY HANDWRITTEN PIC")
│   │   ├── code1.py
│   │   ├── code2.py
│   ├── I_2
│   ├── ...
│   ├── II_1
│   ├── ...
│   ├── images (images should be at least 1080p with clearly legible text; tip: handwriting may be hard to read, but the layout should be reasonably structured)
│   ├── gen_images
│   │   ├── I_2
│   │   │   ├── code_1.png
│   │   ├── └── code_2.png
│   │   ├── I_6
│   │   ├── └── code_1.png
│   │   └── ...
│   └── hand_written
│   ├── II_12
│   │   ├── description_1.png
│   │   └── description_2.png
│   ├── I_11
│   │   └── description_1.png
│   └── ...
└── README.md
```

- Format of `problem.md` (the sample reads: "Find the sum of all integer bases $b>9$ for which $17_b$ is a divisor of $97_b$."):

  ```markdown
  求所有整数底数 $b>9$ 的和,使得 $17_b$ 是 $97_b$ 的因数。
  ```

- Format of `ground_truth.md`:

  ```markdown
  70
  ```

- Format of `description.md`: use `## step_n. xxx` headings for the step-by-step subtitles. In addition, when `description.md` contains Python code, give that code its own step heading (`## step_n. xxx`) and explain what it does. Python code should be wrapped in a Markdown code fence:

  ```python
  ./code_n.py
  ```

  A sample `description.md` (the headings read "1. Segment ratio analysis and parallel relations", "2. Computing the area of $\triangle ABC$", and "5. Answer/conclusion"):

  ```markdown
  ## 1. 线段比例分析与平行关系

  首先,分析边 $\overline{AB}$ 和 $\overline{AC}$ 上的线段长度和比例。

  * **边 $\overline{AB}$**: ...
  * **边 $\overline{AC}$**: ...

  根据以上比例,我们发现:
  $\frac{AD}{AB} = \frac{AF}{AC} = \frac{1}{7} \quad \text{以及} \quad \frac{AE}{AB} = \frac{AG}{AC} = \frac{5}{7}$

  根据泰勒斯定理(逆定理),这些比例关系意味着: ...

  ## 2. 计算 $\triangle ABC$ 的面积

  ...

  ## 5. 答案/结论

  七边形 $AFNBCEM$ 的面积等于 $\triangle ABC$ 的面积。
  $S_{AFNBCEM} = S_{ABC} = 588$
  ```
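
The step-heading convention for `description.md` makes a solution easy to split into individual reasoning steps. The following is a minimal sketch, not part of the dataset's tooling: the function name `split_steps` is hypothetical, and it assumes every step begins at a level-2 Markdown heading (`## ...`), as in the sample above. It does not skip headings that happen to appear inside fenced code blocks.

```python
import re
from typing import List, Tuple

# Assumption: each step starts at a level-2 heading such as "## 1. xxx".
# Deeper headings ("### ...") are treated as part of the step body.
STEP_HEADING = re.compile(r"^##\s+(.*\S)\s*$")


def split_steps(markdown_text: str) -> List[Tuple[str, str]]:
    """Return (heading, body) pairs, one per level-2 heading (hypothetical helper)."""
    steps: List[Tuple[str, str]] = []
    heading = None
    body: List[str] = []
    for line in markdown_text.splitlines():
        match = STEP_HEADING.match(line)
        if match:
            # Close the previous step before starting a new one.
            if heading is not None:
                steps.append((heading, "\n".join(body).strip()))
            heading = match.group(1)
            body = []
        elif heading is not None:
            body.append(line)
        # Text before the first heading is ignored.
    if heading is not None:
        steps.append((heading, "\n".join(body).strip()))
    return steps


if __name__ == "__main__":
    sample = "## 1. Ratios\nAD/AB = 1/7\n\n## 5. Answer\n588\n"
    for title, text in split_steps(sample):
        print(title, "->", text)
```

A file that starts with the "ONLY HANDWRITTEN PIC" marker would yield no steps under this sketch, so a caller may want to check for that marker first.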

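
The sample `problem.md` and `ground_truth.md` pair can be checked by brute force. Since $17_b = b + 7$ and $97_b = 9b + 7 = 9(b + 7) - 56$, the divisibility condition holds exactly when $b + 7$ divides 56, which bounds $b$ by 49. The sketch below is an illustration only; the helper `base_value` and the search bound of 100 are assumptions introduced here, not part of the dataset.

```python
from typing import List


def base_value(digits: List[int], base: int) -> int:
    """Interpret a list of digits (most significant first) in the given base."""
    value = 0
    for digit in digits:
        value = value * base + digit
    return value


# (b + 7) must divide 56, so b <= 49; searching up to 100 is a safe bound.
valid_bases = [
    b
    for b in range(10, 100)
    if base_value([9, 7], b) % base_value([1, 7], b) == 0
]

print(valid_bases, sum(valid_bases))  # [21, 49] 70
```

The sum matches the value stored in the sample `ground_truth.md`.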
Provider: maas
Created: 2025-09-14