
sklmindforge/llm_addition_training

Source: Hugging Face. Updated 2026-03-20. Indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/sklmindforge/llm_addition_training
Description:
# LLM Addition Training Dataset (Odometer-Style Logic)

## Overview

This dataset is designed to teach Large Language Models (LLMs) the foundational logic of addition through **Chain-of-Thought (CoT)** and **Place-Value Expansion**. Instead of simple $A + B = C$ pairs, this dataset forces the model to "think" through the process of splitting numbers into their constituent parts (units, tens, hundreds, thousands) and adding them step-by-step.

## Dataset Structure

The dataset consists of approximately 60,000 examples across three complexity tiers:

1. **Foundation Tier (1-20):** Direct recall for small-number addition.
2. **Expansion Tier (21-150):** Introduction to splitting tens and units.
3. **Odometer Tier (151-99,999):** Multi-digit addition using recursive place-value logic.

### Format

Each entry follows a consistent text completion format:

- **Question:** The addition problem.
- **Think:** The logical breakdown (e.g., "Split 532 into 500 then 30 then 2").
- **Result:** The final verified sum.

## Intended Use

This dataset is ideal for:

- **Continued Pre-training:** Injecting arithmetic stability into small models (0.1B - 3B parameters).
- **Fine-Tuning:** Teaching a model to follow a specific "scratchpad" reasoning format.
- **Arithmetic Benchmarking:** Testing if a model can handle multi-digit carry-over logic.

## Logic Example

**Question:** 53242 + 123

**Think:** Split 123 into 100 then 20 then 3. 53242 + 100 = 53342 -> 53342 + 20 = 53362 -> 53362 + 3 = 53365

**Result:** 53365

## License

MIT
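The place-value expansion in the Logic Example can be sketched in a few lines of Python. This is an illustrative reconstruction only, not the dataset's actual generation script: the function names `place_value_parts` and `format_example` are hypothetical, the `Question:` / `Think:` / `Result:` line layout is assumed from the card's description, and the handling of a zero addend (no `Think` line) is a guess.

```python
def place_value_parts(n: int) -> list[int]:
    """Split a non-negative integer into its non-zero place-value parts,
    largest first (e.g. hundreds, then tens, then units)."""
    parts = []
    place = 1
    while n > 0:
        n, digit = divmod(n, 10)
        if digit:
            parts.append(digit * place)
        place *= 10
    return parts[::-1]


def format_example(a: int, b: int) -> str:
    """Build one Question/Think/Result entry (layout is an assumption)."""
    parts = place_value_parts(b)
    running = a
    steps = []
    # Add each place-value part of b to the running total, one step at a time.
    for part in parts:
        steps.append(f"{running} + {part} = {running + part}")
        running += part

    lines = [f"Question: {a} + {b}"]
    if parts:
        split = " then ".join(str(p) for p in parts)
        lines.append(f"Think: Split {b} into {split}. " + " -> ".join(steps))
    lines.append(f"Result: {running}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_example(53242, 123))
```

Calling `format_example(53242, 123)` reproduces the chain from the Logic Example: 53242 + 100 = 53342, then 53342 + 20 = 53362, then 53362 + 3 = 53365. Zero digits are skipped, so an addend such as 105 is split into 100 then 5.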
Provider:
sklmindforge