endurasolution/ron-math-dataset

Name: endurasolution/ron-math-dataset
Creator: endurasolution
Published: 2026-01-20 16:28:22
License: No description available

Source: Hugging Face · Updated: 2026-01-20 · Indexed: 2026-03-29

Download link:

https://hf-mirror.com/datasets/endurasolution/ron-math-dataset

Description:

---
license: apache-2.0
task_categories:
- text-generation
- table-question-answering
language:
- en
tags:
- math
- mathematics
- reasoning
- synthetic
- instruction-tuning
- openron
size_categories:
- 100M<n<1B
---

# OPENRON Math Instruction Dataset

A **massive-scale mathematical reasoning dataset** developed by **OPENRON**, designed for training and evaluating high-performance large language models (LLMs) on mathematical instruction following and reasoning tasks.

---

## Dataset Overview

The **OPENRON Math Instruction Dataset** contains high-quality, synthetic mathematical instruction–response pairs generated at scale. It is specifically curated to support **reasoning-focused training**, including structured problem solving with step-by-step logic.

Each data sample follows a standardized schema and is suitable for supervised fine-tuning (SFT), instruction tuning, and reasoning research.

---

## Key Features

- **Large Scale**: Available in multiple sizes ranging from **1M to 500M samples**
- **Reasoning-Oriented**: Includes detailed **Chain-of-Thought (CoT)** reasoning
- **LaTeX Support**: Mathematical expressions are written in LaTeX for clarity and correctness
- **Structured Format**: JSONL with clearly separated fields
- **Model-Friendly**: Optimized for modern reasoning-capable LLMs

---

## Dataset Structure

Each JSONL entry contains the following fields:

```json
{
  "instruction": "Problem description or question",
  "input": "Optional additional context (may be empty)",
  "output": "Detailed solution with step-by-step reasoning"
}
```
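As a rough illustration of the three-field schema above, the following Python sketch parses JSONL lines and assembles a prompt from `instruction` and `input`. It uses only the standard library. The helper names `iter_records` and `build_prompt`, the prompt layout, and the two sample rows are assumptions made for this example; they are not taken from the dataset or its tooling.

```python
import json
from typing import Iterable, Iterator

# Field names as documented in the dataset card.
REQUIRED_FIELDS = ("instruction", "input", "output")


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty JSONL line, checking the documented fields."""
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        missing = [name for name in REQUIRED_FIELDS if name not in record]
        if missing:
            raise ValueError(f"line {line_number}: missing fields {missing}")
        yield record


def build_prompt(record: dict) -> str:
    """Join instruction and optional input (hypothetical layout, not the authors')."""
    if record["input"]:
        return f"{record['instruction']}\n\n{record['input']}"
    return record["instruction"]


# Made-up rows that only mimic the schema; they are not real dataset samples.
SAMPLE_JSONL = [
    json.dumps({"instruction": "Compute $2 + 3$.", "input": "",
                "output": "$2 + 3 = 5$."}),
    json.dumps({"instruction": "Solve for $x$.", "input": "$x + 1 = 4$",
                "output": "Subtract 1 from both sides: $x = 3$."}),
]

records = list(iter_records(SAMPLE_JSONL))
prompts = [build_prompt(record) for record in records]
```

To read an actual shard, an open file handle can be passed in place of `SAMPLE_JSONL`, since `iter_records` accepts any iterable of lines and so never needs to hold a large file in memory.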

Provider:

endurasolution
