GSM8K-Struct
Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/OpenCausaLab/StructuralGeneration
Description:
GSM8K-Struct is generated from the original GSM8K dataset through a stepwise expansion process and is intended to improve the reasoning capabilities of large language models (LLMs). It contains 39K questions with labeled intermediate steps and sets up a more challenging benchmark. The dataset comprises 39K training samples and 6.1K test samples, and the task focus is mathematical reasoning.
Provider:
OpenCausaLab



