five

LAP2-K-Think-v1.b

收藏
魔搭社区2025-12-03 更新2025-12-06 收录
下载链接:
https://modelscope.cn/datasets/prithivMLmods/LAP2-K-Think-v1.b
下载链接
链接失效反馈
官方服务:
资源简介:
![1](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/g94LSXim37LbRN2TJ3hOl.png) # **LAP2-K-Think-v1.b** > **LAP2-K-Think-v1.b** is the refined continuation of the LAP2 reasoning series, curated by **prithivMLmods** and hosted on Hugging Face. This version expands the dataset to **~380K entries** and focuses heavily on **coding-based mathematical reasoning**, algorithmic logic, and structured step-wise problem solving. This dataset integrates a macro mixture of reasoning traces, combined with curated mathematical and computational reasoning tasks. Each example pairs a complex prompt with a detailed reasoning-based solution formatted for training advanced code-aware and reasoning-capable LLMs. ## Quick Start ```bash pip install -U datasets ``` ```python from datasets import load_dataset dataset = load_dataset("prithivMLmods/LAP2-K-Think-v1.b", split="train") ``` ## Dataset Overview | Attribute | Value | | -------------------- | ---------------------------------------------------- | | **Rows** | ~380,190 | | **Size[partial]** | ~2.29 GB | | **Format** | Parquet | | **Language** | English | | **License** | Apache-2.0 | | **Focus Areas** | Code reasoning, algorithmic math, logic, computation | ## Structure * **problem**: A coding, math, or mixed reasoning challenge * **solution**: Structured and chain-of-thought style reasoning with final answer --- ## Source Inputs Includes reasoning from: * **Xen-Arc AI CodeX-2M-Thinking**: [Small traces, depending on the specific problem] Code-x structured programming logic, [XenArcAI/CodeX-2M-Thinking](https://huggingface.co/datasets/XenArcAI/CodeX-2M-Thinking) * **Math-aligned custom prompts** : [Gargantua-R1-Wee](https://huggingface.co/datasets/prithivMLmods/Gargantua-R1-Wee) * **Hybrid algorithmic reasoning tasks**: [Gargantua-R1-Wee](https://huggingface.co/datasets/prithivMLmods/Gargantua-R1-Wee) ## Use Cases * Fine-tuning LLMs for programming reasoning * Training models to think step-by-step before producing code * Evaluation of reasoning alignment and computational logic * Building tutoring or solver AI systems for coding and mathematics ## Maintainer | Author | Last Updated | | --------------------------------------------------------- | ------------ | | **[prithivMLmods](https://huggingface.co/prithivMLmods)** | **Nov 2025** |

![1](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/g94LSXim37LbRN2TJ3hOl.png) # **LAP2-K-Think-v1.b** > **LAP2-K-Think-v1.b** 是 LAP2 推理系列的精修续作,由 **prithivMLmods** 整理并托管于 Hugging Face。该版本将数据集规模扩展至约38万条数据,核心聚焦于基于编码的数学推理、算法逻辑与结构化分步问题求解。本数据集整合了推理轨迹的宏观混合集,并搭配精选的数学与计算推理任务。每个样本均将复杂提示词与基于推理的详细解决方案配对,专为训练具备高级代码理解与推理能力的大语言模型(Large Language Model, LLM)设计。 ## 快速上手 bash pip install -U datasets python from datasets import load_dataset dataset = load_dataset("prithivMLmods/LAP2-K-Think-v1.b", split="train") ## 数据集概览 | 属性名 | 值 | | -------------------- | ---------------------------------------------------- | | **数据行数** | 约380,190条 | | **[部分]大小** | 约2.29 GB | | **数据格式** | Parquet | | **语言** | 英语 | | **许可证** | Apache-2.0 | | **核心领域** | 代码推理、算法数学、逻辑推理、计算 | ## 数据结构 * **problem**:涵盖编码、数学或混合推理挑战 * **solution**:采用结构化链式思维风格的推理过程,并包含最终答案 --- ## 来源输入 包含来自以下数据源的推理轨迹: * **Xen-Arc AI CodeX-2M-Thinking**:[针对特定问题的小型推理轨迹] 采用代码-x结构化编程逻辑,详见 [XenArcAI/CodeX-2M-Thinking](https://huggingface.co/datasets/XenArcAI/CodeX-2M-Thinking) * **数学对齐自定义提示词**:详见 [Gargantua-R1-Wee](https://huggingface.co/datasets/prithivMLmods/Gargantua-R1-Wee) * **混合算法推理任务**:详见 [Gargantua-R1-Wee](https://huggingface.co/datasets/prithivMLmods/Gargantua-R1-Wee) ## 应用场景 * 面向编程推理的大语言模型微调 * 训练模型在生成代码前进行分步思考 * 评估推理对齐性与计算逻辑能力 * 构建面向编码与数学的辅导或求解AI智能体(AI Agent)系统 ## 维护者 | 作者 | 最后更新时间 | | --------------------------------------------------------- | ------------ | | **[prithivMLmods](https://huggingface.co/prithivMLmods)** | **2025年11月** |
提供机构:
maas
创建时间:
2025-11-27
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作