prithivMLmods/Pegasus-Tiny-250K

Name: prithivMLmods/Pegasus-Tiny-250K
Creator: prithivMLmods
Published: 2025-11-26 04:21:13
License: Apache-2.0

Source: Hugging Face. Updated 2025-11-26; indexed 2025-12-20.

Download link:

https://hf-mirror.com/datasets/prithivMLmods/Pegasus-Tiny-250K


Resource description:

---
license: apache-2.0
task_categories:
- text-generation
- question-answering
language:
- en
tags:
- code-x
- code
- math
- agent
size_categories:
- 100K<n<1M
---

![1](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/qnCISDW9_NK4KCp_PAaG4.png)

# **Pegasus-Tiny-250K**

> **Pegasus-Tiny-250K** is a compact, high-quality mathematical reasoning dataset curated by **prithivMLmods** and hosted on Hugging Face. It contains approximately **291K structured reasoning traces** in Parquet format, optimized for efficient training, evaluation, and reasoning-aligned fine-tuning of AI models.

This dataset provides diverse mathematics-focused problem statements paired with detailed step-by-step reasoning solutions. Pegasus-Tiny-250K emphasizes clear reasoning flow and structured problem solving, making it suitable for training lightweight reasoning models, educational tools, and benchmarking tasks.

## Quick Start

```bash
pip install -U datasets
```

```python
from datasets import load_dataset

dataset = load_dataset("prithivMLmods/Pegasus-Tiny-250K", split="train")
```

## Dataset Overview

| Feature                | Value                                                  |
| ---------------------- | ------------------------------------------------------ |
| **Rows**               | ~291,505                                               |
| **Preview-shard rows** | 221,720                                                |
| **Size (partial)**     | 2.22 GB                                                |
| **Format**             | Parquet                                                |
| **Language**           | English                                                |
| **License**            | Apache-2.0                                             |
| **Primary Focus**      | Mathematical reasoning, structured step-wise solutions |

## Data Structure

* **problem**: Math or logic-based task prompt
* **solution**: Chain-of-thought reasoning ending with the final answer

## Source Inputs

Includes reasoning from:

* **Xen-Arc AI CodeX-2M-Thinking**: [Small traces, depending on the specific problem] Code-x structured programming logic, [XenArcAI/CodeX-2M-Thinking](https://huggingface.co/datasets/XenArcAI/CodeX-2M-Thinking)
* **Math-aligned custom prompts**: [Gargantua-R1-Wee](https://huggingface.co/datasets/prithivMLmods/Gargantua-R1-Wee)
* **Hybrid algorithmic reasoning tasks**: [Gargantua-R1-Wee](https://huggingface.co/datasets/prithivMLmods/Gargantua-R1-Wee)

## Use Cases

* Fine-tuning compact reasoning models
* Training models on problem-solving trace generation
* Benchmarking math reasoning ability
* Research in chain-of-thought modeling
* Educational AI and tutoring systems

## Maintainer

| Author                                                    | Last Updated |
| --------------------------------------------------------- | ------------ |
| **[prithivMLmods](https://huggingface.co/prithivMLmods)** | **Nov 2025** |
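
## Example: Preparing Records for Fine-Tuning

The card lists two fields per record, `problem` and `solution`, and names fine-tuning compact reasoning models as a use case. The sketch below shows one way such a record could be turned into a prompt/completion pair. Only the two field names come from the card. The function name `to_prompt_completion`, the prompt template, and the sample row are hypothetical and chosen purely for illustration.

```python
from typing import Dict


def to_prompt_completion(record: Dict[str, str]) -> Dict[str, str]:
    """Map one dataset row to a prompt/completion pair.

    The keys `problem` and `solution` are the fields named on the dataset
    card; the "Problem:/Solution:" template is an assumed, illustrative choice.
    """
    prompt = f"Problem:\n{record['problem'].strip()}\n\nSolution:\n"
    completion = record["solution"].strip()
    return {"prompt": prompt, "completion": completion}


# Hypothetical row, invented for illustration (not taken from the dataset).
sample = {
    "problem": "  What is 12 * 7?  ",
    "solution": "12 * 7 = 84. The final answer is 84.",
}

pair = to_prompt_completion(sample)
print(pair["prompt"] + pair["completion"])
```

With the dataset loaded as in Quick Start, `dataset.map(to_prompt_completion)` should apply the same transformation to every row, since `Dataset.map` accepts a function that returns a dictionary of new columns.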

Provider:

prithivMLmods
