
Math-Glot-Cleaned-10K

ModelScope community (魔搭社区). Updated 2025-12-10. Indexed 2025-06-28.
Download link:
https://modelscope.cn/datasets/prithivMLmods/Math-Glot-Cleaned-10K
Description:
# **Math-Glot-Cleaned-10K**

**Math-Glot-Cleaned-10K** is a curated subset of 10,000 math and code reasoning samples, extracted and cleaned from the original [NVIDIA AceReason-1.1-SFT](https://huggingface.co/datasets/NVIDIA/AceReason-1.1-SFT) dataset. This refined version focuses exclusively on high-quality, structured mathematical prompts paired with chain-of-thought style reasoning.

## Dataset Summary

* **Source**: Derived from NVIDIA's AceReason-1.1-SFT dataset
* **Total Entries**: 10,000
* **Format**: Text-to-text (input → output)
* **Modality**: Text
* **Language**: English
* **License**: Apache 2.0

Each entry in this dataset contains:

* **input**: A mathematical or computational problem, often using LaTeX-style expressions
* **output**: A detailed reasoning response, typically in chain-of-thought format

## Processing Details

The following steps were applied during dataset creation:

* **Filtered**: Retained only math- and code-related samples from the full AceReason dataset
* **Normalized**: Standardized LaTeX formatting, spacing, and markdown structure
* **Cleaned**: Removed noisy, unrelated, or improperly formatted records
* **Column Reduction**: Dropped unnecessary columns from the original dataset

## Use Cases

This dataset is intended for:

* Fine-tuning models on math reasoning tasks
* Developing instruction-following LLMs for structured problem solving
* Benchmarking reasoning capabilities of large models in math or algorithmic domains

## Example

| input | output |
| ----------------------------------------------------- | --------------------------------------------------------------------------------------------------------------- |
| Find all prime numbers $p$ such that $p^3 + 1 = k^2$. | `<think>` Okay, let's see. I need to find all prime numbers p where p cubed plus one equals a perfect square... |

## Citation

If using this dataset, please cite the original [AceReason-1.1-SFT dataset](https://huggingface.co/datasets/NVIDIA/AceReason-1.1-SFT) and credit this cleaned 10K subset for its focused structure.
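
## Processing Sketch

The four processing steps listed under Processing Details (filter, normalize, clean, column reduction) can be illustrated with a minimal sketch on toy records. This is not the script used to build the dataset, which the card does not publish. The `category` and `source` columns, the toy records, and the whitespace-only normalization rule are all assumptions made for illustration. Only the `input` and `output` column names come from the card.

```python
import re

# Toy records standing in for raw rows. The "category" and "source"
# columns are hypothetical; the card names only "input" and "output".
RAW = [
    {"input": "Find  all primes $p$   with $p^3 + 1 = k^2$. ",
     "output": "<think> Okay, let's see...  </think>",
     "category": "math", "source": "toy"},
    {"input": "Tell me a joke.", "output": "Why did the...",
     "category": "chat", "source": "toy"},
    {"input": "Reverse a linked list.", "output": "   ",
     "category": "code", "source": "toy"},
    {"input": "Sum the integers\tfrom 1 to $n$.",
     "output": "<think> Use $n(n+1)/2$. </think>",
     "category": "code", "source": "toy"},
]

KEEP_CATEGORIES = {"math", "code"}   # assumed label values
KEEP_COLUMNS = ("input", "output")   # columns named in the card


def normalize(text: str) -> str:
    """Collapse runs of spaces/tabs and trim each line (a stand-in for
    the card's unspecified spacing and markdown normalization)."""
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines()]
    return "\n".join(lines).strip()


def is_clean(record: dict) -> bool:
    """Reject records whose input or output is missing or blank."""
    return all(
        isinstance(record.get(key), str) and record[key].strip()
        for key in KEEP_COLUMNS
    )


def process(records: list) -> list:
    """Apply filter, clean, normalize, and column reduction in order."""
    result = []
    for record in records:
        if record.get("category") not in KEEP_CATEGORIES:  # filter
            continue
        if not is_clean(record):                           # clean
            continue
        # normalize and keep only the two published columns
        result.append({key: normalize(record[key]) for key in KEEP_COLUMNS})
    return result


if __name__ == "__main__":
    for row in process(RAW):
        print(row)
```

On the toy data, the chat record is dropped by the filter and the blank-output record by the cleaning check, leaving two rows that carry only `input` and `output`.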

Provider:
maas
Created:
2025-06-25