
endurasolution/ron-math-dataset

Source: Hugging Face. Updated 2026-01-20. Indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/endurasolution/ron-math-dataset
Resource description:
---
license: apache-2.0
task_categories:
- text-generation
- table-question-answering
language:
- en
tags:
- math
- mathematics
- reasoning
- synthetic
- instruction-tuning
- openron
size_categories:
- 100M<n<1B
---

# OPENRON Math Instruction Dataset

A **massive-scale mathematical reasoning dataset** developed by **OPENRON**, designed for training and evaluating high-performance large language models (LLMs) on mathematical instruction following and reasoning tasks.

---

## Dataset Overview

The **OPENRON Math Instruction Dataset** contains high-quality, synthetic mathematical instruction–response pairs generated at scale. It is specifically curated to support **reasoning-focused training**, including structured problem solving with step-by-step logic.

Each data sample follows a standardized schema and is suitable for supervised fine-tuning (SFT), instruction tuning, and reasoning research.

---

## Key Features

- **Large Scale**: Available in multiple sizes ranging from **1M to 500M samples**
- **Reasoning-Oriented**: Includes detailed **Chain-of-Thought (CoT)** reasoning
- **LaTeX Support**: Mathematical expressions are written in LaTeX for clarity and correctness
- **Structured Format**: JSONL with clearly separated fields
- **Model-Friendly**: Optimized for modern reasoning-capable LLMs

---

## Dataset Structure

Each JSONL entry contains the following fields:

```json
{
  "instruction": "Problem description or question",
  "input": "Optional additional context (may be empty)",
  "output": "Detailed solution with step-by-step reasoning"
}
```
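
The three-field schema described above can be read and checked with a few lines of Python. This is a minimal sketch, not the dataset's official loader: the helper name `iter_records` and the sample rows are hypothetical and made up for illustration, and the only assumption taken from the card is that each JSONL line is an object with `instruction`, `input`, and `output` keys.

```python
import json
from typing import Iterable, Iterator

# Field names taken from the schema in the dataset card.
REQUIRED_FIELDS = ("instruction", "input", "output")


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one validated record per non-empty JSONL line."""
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        missing = [name for name in REQUIRED_FIELDS if name not in record]
        if missing:
            raise ValueError(f"line {line_number}: missing fields {missing}")
        yield record


# Hypothetical sample rows, invented for this sketch (not taken from the dataset).
sample_jsonl = [
    '{"instruction": "Compute $2 + 3$.", "input": "", "output": "Adding gives $2 + 3 = 5$."}',
    "",
    '{"instruction": "Solve for $x$.", "input": "$x - 1 = 0$", "output": "Add 1 to both sides: $x = 1$."}',
]

records = list(iter_records(sample_jsonl))
print(len(records), "records parsed")
```

Because `iter_records` accepts any iterable of strings, an open file handle for a downloaded `.jsonl` shard can be passed in directly, which streams records one at a time instead of loading a large split into memory.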
Provider:
endurasolution