huggingworld/gsm8k
Source: Hugging Face. Updated 2026-04-26. Indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/huggingworld/gsm8k
Description:
GSM8K (Grade School Math 8K) is a dataset of 8.5K high-quality, linguistically diverse grade school math word problems. It was created to support question answering on basic mathematical problems that require multi-step reasoning. Each problem takes between 2 and 8 steps to solve, and solutions primarily involve a sequence of elementary calculations using basic arithmetic operations (+ − × ÷) to reach the final answer. A bright middle school student should be able to solve every problem. From the paper: "Problems require no concepts beyond the level of early Algebra, and the vast majority of problems can be solved without explicitly defining a variable." Solutions are provided in natural language rather than as pure math expressions. From the paper: "We believe this is the most generally useful data format, and we expect it to shed light on the properties of large language models' internal monologues."
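To make the "2 to 8 steps of basic arithmetic, explained in natural language" description concrete, here is a minimal sketch in Python. The word problem, the function name `solve_muffin_problem`, and the wording of each step are hypothetical and written only for illustration; they are not taken from the dataset and do not reflect its actual record format.

```python
# Hypothetical word problem, written for illustration (NOT from GSM8K):
# "A baker makes 4 trays of 12 muffins. She sells 30 muffins and packs
#  the rest into boxes of 6. How many boxes does she fill?"

def solve_muffin_problem():
    """Solve the problem in three arithmetic steps.

    Returns a tuple (steps, answer), where steps is a list of
    natural-language sentences, one per calculation.
    """
    steps = []

    # Step 1: multiplication
    baked = 4 * 12
    steps.append(f"She bakes 4 * 12 = {baked} muffins.")

    # Step 2: subtraction
    left = baked - 30
    steps.append(f"After selling 30, {baked} - 30 = {left} muffins remain.")

    # Step 3: division (the remainder is zero here, so integer division is exact)
    boxes = left // 6
    steps.append(f"She fills {left} / 6 = {boxes} boxes.")

    return steps, boxes


steps, answer = solve_muffin_problem()
for line in steps:
    print(line)
print("Final answer:", answer)
```

The sketch uses three steps and no explicit variables in the algebraic sense, which is the kind of reasoning chain the description attributes to the dataset's problems.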
Provided by:
huggingworld



