prithivMLmods/Pegasus-Tiny-250K
收藏Hugging Face2025-11-26 更新2025-12-20 收录
下载链接:
https://hf-mirror.com/datasets/prithivMLmods/Pegasus-Tiny-250K
下载链接
链接失效反馈官方服务:
资源简介:
---
license: apache-2.0
task_categories:
- text-generation
- question-answering
language:
- en
tags:
- code-x
- code
- math
- agent
size_categories:
- 100K<n<1M
---

# **Pegasus-Tiny-250K**
> **Pegasus-Tiny-250K** is a compact, high-quality mathematical reasoning dataset curated by **prithivMLmods** and hosted on Hugging Face. It contains approximately **~291K structured reasoning traces** in Parquet format, optimized for efficient training, evaluation, and reasoning-aligned fine-tuning of AI models. This dataset provides diverse mathematics-focused problem statements paired with detailed step-by-step reasoning solutions. Pegasus-Tiny-250K emphasizes clear reasoning flow and structured problem solving, making it suitable for training lightweight reasoning models, educational tools, and benchmarking tasks.
## Quick Start
```bash
pip install -U datasets
```
```python
from datasets import load_dataset
dataset = load_dataset("prithivMLmods/Pegasus-Tiny-250K", split="train")
```
## Dataset Overview
| Feature | Value |
| ---------------------- | ------------------------------------------------------ |
| **Rows** | ~291,505 |
| **Preview-shard rows** | 221,720 |
| **Size[partial]** | 2.22 GB |
| **Format** | Parquet |
| **Language** | English |
| **License** | Apache-2.0 |
| **Primary Focus** | Mathematical reasoning, structured step-wise solutions |
## Data Structure
* **problem**: Math or logic-based task prompt
* **solution**: Chain-of-thought reasoning ending with final answer
## Source Inputs
Includes reasoning from:
* **Xen-Arc AI CodeX-2M-Thinking**: [Small traces, depending on the specific problem] Code-x structured programming logic, [XenArcAI/CodeX-2M-Thinking](https://huggingface.co/datasets/XenArcAI/CodeX-2M-Thinking)
* **Math-aligned custom prompts** : [Gargantua-R1-Wee](https://huggingface.co/datasets/prithivMLmods/Gargantua-R1-Wee)
* **Hybrid algorithmic reasoning tasks**: [Gargantua-R1-Wee](https://huggingface.co/datasets/prithivMLmods/Gargantua-R1-Wee)
## Use Cases
* Fine-tuning compact reasoning models
* Training models on problem-solving trace generation
* Benchmarking math reasoning ability
* Research in chain-of-thought modeling
* Educational AI and tutoring systems
## Maintainer
| Author | Last Updated |
| --------------------------------------------------------- | ------------ |
| **[prithivMLmods](https://huggingface.co/prithivMLmods)** | **Nov 2025** |
提供机构:
prithivMLmods



