oieieio/NuminaMath-CoT
Source: Hugging Face. Updated 2025-02-13. Indexed 2025-02-15.
Download link:
https://hf-mirror.com/datasets/oieieio/NuminaMath-CoT
Description:
The NuminaMath CoT dataset consists of approximately 860k math problems, each with a solution written in Chain of Thought (CoT) format. Sources range from Chinese high school math exercises to US and international mathematics olympiad competition problems. The data were collected primarily from online exam paper PDFs and mathematics discussion forums. The processing steps include (a) OCR from the original PDFs, (b) segmentation into problem-solution pairs, (c) translation into English, (d) realignment to produce a CoT reasoning format, and (e) final answer formatting.
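As an illustration of what the final answer formatting step makes possible, the following minimal sketch pulls the final answer out of a CoT solution string. It assumes that final answers are wrapped in `\boxed{...}` and that each record has `problem` and `solution` fields; neither is stated in the description above, and the sample record is made up for the example.

```python
from typing import Optional


def extract_boxed_answer(solution: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in a solution, or None.

    Assumption: final answers are marked with \\boxed{...}.
    """
    marker = r"\boxed{"
    start = solution.rfind(marker)
    if start == -1:
        return None
    i = start + len(marker)
    depth = 1  # we are already inside the opening brace of \boxed{
    chars = []
    while i < len(solution):
        ch = solution[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
        i += 1
    return None  # braces never balanced


# Hypothetical record shaped like a problem-solution pair.
record = {
    "problem": "What is 12 * 7?",
    "solution": "We compute 12 * 7 = 84. The final answer is $\\boxed{84}$.",
}
answer = extract_boxed_answer(record["solution"])
print(answer)
```

Brace counting is used instead of a regular expression so that nested LaTeX such as `\boxed{\frac{1}{2}}` is returned whole.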
Provided by:
oieieio



