five

prithivMLmods/Gacrux-Tiny-1M

收藏
Hugging Face2025-11-26 更新2025-12-20 收录
下载链接:
https://hf-mirror.com/datasets/prithivMLmods/Gacrux-Tiny-1M
下载链接
链接失效反馈
官方服务:
资源简介:
--- license: apache-2.0 task_categories: - text-generation - question-answering language: - en tags: - code-x - code - math - agent size_categories: - 1M<n<10M --- ![1](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/u4BHIlU7aiN6J6abaSHbJ.png) # **Gacrux-Tiny-1M** > **Gacrux-Tiny-1M** is a compact, high-quality reasoning dataset curated by **prithivMLmods**, containing **~1.06M chain-of-thought reasoning traces** optimized for mathematical problem solving, algorithmic coding challenges, and structured reasoning across competitive programming tasks. This dataset is ideal for lightweight reasoning model training and benchmarking. The dataset provides real structured problem statements with detailed reasoning step-by-step solutions that demonstrate problem-solving methods relevant for AI tutoring systems, reasoning LLMs, and code-based reasoning tasks. ## Quick Start ```bash pip install -U datasets ``` ```python from datasets import load_dataset dataset = load_dataset("prithivMLmods/Gacrux-Tiny-1M", split="train") ``` ## Dataset Overview | Feature | Value | | ---------------- | ---------------------------------------- | | **Total Rows** | ~1,066,324 | | **Approx. Size** | 12.3 GB | | **Format** | Parquet | | **Language** | English | | **License** | Apache-2.0 | | **Domains** | Math, competitive programming, reasoning | | **Tags** | code-x, math, code, agent | ## Data Structure * **problem**: Task description from math, programming, and logic domains * **solution**: Chain-of-thought reasoning and final resolution ## Source Inputs Includes reasoning from: * **Xen-Arc AI CodeX-2M-Thinking**: [Small traces, depending on the specific problem] Code-x structured programming logic, [XenArcAI/CodeX-2M-Thinking](https://huggingface.co/datasets/XenArcAI/CodeX-2M-Thinking) * **Math-aligned custom prompts** : [Gargantua-R1-Compact](https://huggingface.co/datasets/prithivMLmods/Gargantua-R1-Compact) * **Hybrid algorithmic reasoning tasks**: [Gargantua-R1-Compact](https://huggingface.co/datasets/prithivMLmods/Gargantua-R1-Compact) ## Ideal Use Cases * Fine-tuning small-to-mid scale reasoning models * LLM alignment on stepwise chain-of-thought reasoning * Competitive programming tutoring and explanation agents * Math problem solver model development * Code reasoning and debugging training frameworks ## Maintainer | Author | Last Updated | | --------------------------------------------------------- | ------------ | | **[prithivMLmods](https://huggingface.co/prithivMLmods)** | **Nov 2025** |
提供机构:
prithivMLmods
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作