Intermediate-Thinking-130k
收藏魔搭社区2025-08-08 更新2025-08-09 收录
下载链接:
https://modelscope.cn/datasets/HelpingAI/Intermediate-Thinking-130k
下载链接
链接失效反馈官方服务:
资源简介:
# Intermediate-Thinking-130k
A comprehensive dataset of 135,000 high-quality samples designed to advance language model reasoning capabilities through structured intermediate thinking processes. This dataset enables training and evaluation of models with sophisticated self-correction and iterative reasoning abilities across 42 languages.
## Overview
Intermediate-Thinking-130k addresses a fundamental limitation in current language models: their inability to pause, reflect, and refine reasoning during response generation. Traditional models produce responses in a single pass, often leading to incomplete or incorrect reasoning. This dataset provides examples of **Intermediate Thinking**—a cognitive approach that mirrors human reasoning processes through iterative refinement and self-correction.
### Key Innovations
- **Multi-Phase Reasoning**: Structured thinking blocks that demonstrate iterative problem-solving
- **Self-Correction Mechanisms**: Examples of error identification and mid-process refinement
- **Emotional Context Awareness**: Optional structured emotional reasoning (``) blocks
- **Multilingual Support**: Coverage across 42 languages for global applicability
- **Human-Like Cognition**: Patterns that mirror natural human reasoning processes
## Dataset Structure
### Format
Each example follows a standardized JSON structure:
```json
{
"instruction": "What is the direction of the gravitational force exerted by the Moon on an object at the Earth's equator?",
"input": "",
"output": "\nThis is a physics question about gravitational forces. The Moon exerts a gravitational pull on objects on Earth. The direction of this force would be along the line connecting the object at the equator to the center of the Moon.\n\n\nThe direction of the gravitational force is along the line connecting the object to the Moon.\n\n\nWait, I should clarify that gravitational force follows Newton's law of universal gravitation. The direction is not static—it varies depending on the Moon's position relative to Earth.\n\n\nSo, to be more precise, the gravitational force from the Moon on an object at the equator points directly toward the Moon's position in the sky at that time. As the Earth rotates and the Moon orbits, the direction of this force constantly shifts.\n\n\nMaybe I should connect this to real-world effects like ocean tides.\n\n\nThis changing gravitational pull is what drives ocean tides. When the Moon is overhead, the pull is upward; when it's on the horizon, the pull is nearly horizontal."
}
```
### Reasoning Architecture
The dataset implements a three-tier thinking framework:
#### 1. Initial Thinking (``)
- **Purpose**: First-pass problem analysis and approach formulation
- **Characteristics**: Basic understanding, initial strategy development
- **Frequency**: Present in every example
#### 2. Intermediate Thinking (``)
- **Purpose**: Self-correction and iterative refinement
- **Characteristics**: Error identification, approach adjustment, deeper exploration
- **Frequency**: Present in every example (1-17 blocks per example)
#### 3. End Thinking (``)
- **Purpose**: Final reflection for complex scenarios
- **Characteristics**: Human-like reasoning steps, comprehensive synthesis
- **Frequency**: Rare, only for complex challenges
### Emotional Context Awareness
Optional `` blocks encode contextual awareness:
```xml
Emotion ==> curiosity
Cause ==> ambiguous mathematical structure
Mind ==> analytical but cautious
Growth ==> refining symbolic manipulation
```
## Comparison with Traditional Approaches
### Chain-of-Thought (CoT) Limitations
- **Linear reasoning**: Single-pass thinking process
- **No self-correction**: Once generated, thoughts remain static
- **Limited perspective**: Single approach to problem-solving
- **No iteration**: No refinement or improvement cycles
### Intermediate Thinking Advantages
- **Iterative refinement**: Multiple thinking cycles with self-correction
- **Self-evaluation**: Recognition and correction of logical errors
- **Adaptive reasoning**: Dynamic approach adjustment based on insights
- **Multi-perspective synthesis**: Integration of multiple viewpoints
- **Human-like cognition**: Natural reasoning patterns
## Dataset Statistics
| Metric | Value |
|--------|-------|
| **Total Examples** | 135,000 |
| **Languages Supported** | 42 languages |
| **Thinking Blocks per Example** | 1-17 |
| **Topics Covered** | Mathematics, Physics, Logic, Philosophy, Ethics |
| **Format** | JSON (auto-converted to Parquet) |
| **License** | Apache 2.0 |
## Use Cases
### Primary Applications
- **Model Training**: Fine-tuning language models for intermediate reasoning
- **Performance Evaluation**: Benchmarking reasoning capabilities
- **Research Development**: Advancing AI reasoning methodologies
- **Educational Applications**: Teaching structured problem-solving
### Research Areas
- Cognitive modeling and reasoning in AI systems
- Emotional intelligence and context awareness
- Multi-step problem solving evaluation
- Self-reflective AI capabilities assessment
## Quick Start
### Installation
```bash
pip install datasets pandas
```
### Basic Usage
```python
from datasets import load_dataset
# Load the dataset
dataset = load_dataset("HelpingAI/Intermediate-Thinking-130k", split="train")
# Explore a sample
print(dataset[0]["instruction"])
print(dataset[0]["output"])
# Stream for large datasets
dataset_stream = load_dataset("HelpingAI/Intermediate-Thinking-130k", split="train", streaming=True)
for row in dataset_stream:
print(row["instruction"])
break
```
### Working with Thinking Blocks
```python
import re
def extract_thinking_blocks(text):
"""Extract all blocks from a response"""
think_pattern = r'(.*?)'
return re.findall(think_pattern, text, re.DOTALL)
# Example usage
sample = dataset[0]
thinking_blocks = extract_thinking_blocks(sample["output"])
print(f"Found {len(thinking_blocks)} thinking blocks")
```
## Data Collection Methodology
### Source and Processing
- **Origin**: Synthetic generation with human curation
- **Regeneration**: Quality improvement through synthetic regeneration
- **Quality Control**: Manual review of reasoning structure
- **Validation**: Heuristic parsing for structural integrity
- **Deduplication**: Elimination of near-identical pairs
### Processing Pipeline
1. **Synthetic Generation**: Creation of multi-step reasoning examples
2. **Regeneration**: Quality enhancement through iterative refinement
3. **Filtering**: Structural integrity and quality validation
4. **Annotation**: Emotional context where applicable
5. **Validation**: Random sample review for quality assurance
## Limitations and Considerations
### Current Limitations
- **Synthetic Nature**: Primarily generated content, not all pedagogically verified
- **Language Distribution**: Multilingual support with English-dominant distribution
- **Reasoning Variation**: Styles may vary across different examples
### Best Practices
- **Training Focus**: Use for model training rather than direct inference
- **Complementary Datasets**: Combine with domain-specific datasets
- **Diverse Evaluation**: Test across various reasoning tasks
- **Capability Testing**: Thoroughly evaluate self-correction abilities
## Contributing
We welcome contributions from the research community to advance intermediate reasoning capabilities.
### Contribution Areas
**Dataset Improvements**
- Report data quality issues or inconsistencies
- Suggest new reasoning patterns or problem types
- Contribute validated examples following established format
- Help expand language coverage and improve multilingual examples
**Research and Evaluation**
- Share evaluation results and benchmarks
- Contribute to reasoning capability assessments
- Develop new metrics for intermediate thinking
- Publish research using this dataset
**Documentation and Resources**
- Improve dataset documentation and examples
- Create tutorials or educational materials
- Develop tools for working with reasoning blocks
- Contribute to the broader reasoning research ecosystem
## Citation
If you use this dataset in your research, please cite:
```bibtex
@misc{intermediate-thinking-130k,
title = {Intermediate-Thinking-130k: A Dataset for Multi-Step Mathematical and Logical Reasoning},
author = {HelpingAI},
year = {2025},
publisher = {HelpingAI},
howpublished = {\url{https://huggingface.co/datasets/HelpingAI/Intermediate-Thinking-130k}},
license = {Apache 2.0}
}
```
## Contact and Support
- **Issues**: [Hugging Face Issues](https://huggingface.co/datasets/HelpingAI/Intermediate-Thinking-130k/issues)
- **Discussions**: [Hugging Face Discussions](https://huggingface.co/datasets/HelpingAI/Intermediate-Thinking-130k/discussions)
- **Email**: [team@helpingai.co](mailto:team@helpingai.co)
---
*Intermediate-Thinking-130k represents a significant advancement in AI reasoning research, providing the foundation for developing language models with sophisticated intermediate thinking capabilities that move beyond traditional single-pass reasoning toward more human-like cognitive processes.*
# 中间思维-130k(Intermediate-Thinking-130k)
本数据集包含13.5万条高质量样本,旨在通过结构化的中间思维流程,提升语言模型的推理能力。本数据集支持针对42种语言的、具备复杂自我修正与迭代推理能力的模型训练与评估。
## 数据集概览
中间思维-130k旨在解决当前语言模型的一项核心局限:无法在生成回复的过程中暂停、反思并优化推理流程。传统模型仅通过单次流程生成回复,常导致推理过程不完整或出现错误。本数据集提供了**中间思维(Intermediate Thinking)**的示例——一种通过迭代优化与自我修正来模拟人类推理过程的认知方法。
### 核心创新点
- **多阶段推理(Multi-Phase Reasoning)**:用于展示迭代式问题求解的结构化思维模块
- **自我修正机制(Self-Correction Mechanisms)**:包含错误识别与流程中期优化的示例
- **情感上下文感知(Emotional Context Awareness)**:可选的结构化情感推理模块
- **多语言支持(Multilingual Support)**:覆盖42种语言,适配全球应用场景
- **类人认知(Human-Like Cognition)**:模拟自然人类推理过程的模式
## 数据集结构
### 数据格式
每条样本均遵循标准化JSON格式:
json
{
"instruction": "What is the direction of the gravitational force exerted by the Moon on an object at the Earth's equator?",
"input": "",
"output": "
This is a physics question about gravitational forces. The Moon exerts a gravitational pull on objects on Earth. The direction of this force would be along the line connecting the object at the equator to the center of the Moon.
The direction of the gravitational force is along the line connecting the object to the Moon.
Wait, I should clarify that gravitational force follows Newton's law of universal gravitation. The direction is not static—it varies depending on the Moon's position relative to Earth.
So, to be more precise, the gravitational force from the Moon on an object at the equator points directly toward the Moon's position in the sky at that time. As the Earth rotates and the Moon orbits, the direction of this force constantly shifts.
Maybe I should connect this to real-world effects like ocean tides.
This changing gravitational pull is what drives ocean tides. When the Moon is overhead, the pull is upward; when it's on the horizon, the pull is nearly horizontal."
}
### 推理架构
本数据集采用三级思维框架:
#### 1. 初始思维(Initial Thinking)
- **核心用途**:首次问题分析与解题思路构建
- **典型特征**:基础认知理解与初始策略制定
- **出现频率**:所有样本均包含
#### 2. 中间思维(Intermediate Thinking)
- **核心用途**:自我修正与迭代优化
- **典型特征**:错误识别、思路调整与深度探索
- **出现频率**:所有样本均包含(单样本含1-17个思维模块)
#### 3. 收尾思维(End Thinking)
- **核心用途**:针对复杂场景的最终反思
- **典型特征**:类人推理步骤与全面综合
- **出现频率**:仅在复杂挑战场景中少量出现
### 情感上下文感知
可选的``模块用于编码上下文感知信息:
xml
Emotion ==> curiosity
Cause ==> ambiguous mathematical structure
Mind ==> analytical but cautious
Growth ==> refining symbolic manipulation
## 与传统方法的对比
### 思维链(Chain-of-Thought,CoT)的局限
- **线性推理**:单次流程式思维过程
- **无自我修正能力**:思维生成后保持静态,无法调整
- **视角单一**:仅采用单一解题思路
- **无迭代机制**:不存在优化与改进循环
### 中间思维的优势
- **迭代优化**:包含自我修正的多轮思维循环
- **自我评估**:识别并修正逻辑错误
- **自适应推理**:基于新认知动态调整解题思路
- **多视角综合**:整合多种观点
- **类人认知**:自然的推理模式
## 数据集统计数据
| 指标 | 数值 |
|--------|-------|
| **总样本数** | 135,000 |
| **支持语言** | 42种 |
| **单样本思维模块数** | 1-17 |
| **覆盖主题** | 数学、物理、逻辑、哲学、伦理学 |
| **数据格式** | JSON(可自动转换为Parquet格式) |
| **开源协议** | Apache 2.0 |
## 应用场景
### 核心应用场景
- **模型训练**:针对中间思维推理任务微调语言模型
- **性能评估**:对模型推理能力进行基准测试
- **研究开发**:推动AI推理方法的迭代升级
- **教育应用**:用于教授结构化问题求解方法
### 研究方向
- AI系统中的认知建模与推理
- 情感智能与上下文感知
- 多步问题求解能力评估
- 自反思型AI能力评估
## 快速入门
### 安装步骤
bash
pip install datasets pandas
### 基础使用方法
python
from datasets import load_dataset
# Load the dataset
dataset = load_dataset("HelpingAI/Intermediate-Thinking-130k", split="train")
# Explore a sample
print(dataset[0]["instruction"])
print(dataset[0]["output"])
# Stream for large datasets
dataset_stream = load_dataset("HelpingAI/Intermediate-Thinking-130k", split="train", streaming=True)
for row in dataset_stream:
print(row["instruction"])
break
### 思维模块提取示例
python
import re
def extract_thinking_blocks(text):
"""Extract all blocks from a response"""
think_pattern = r'(.*?)'
return re.findall(think_pattern, text, re.DOTALL)
# Example usage
sample = dataset[0]
thinking_blocks = extract_thinking_blocks(sample["output"])
print(f"Found {len(thinking_blocks)} thinking blocks")
## 数据采集方法
### 数据来源与处理流程
- **数据来源**:人工审核的合成生成样本
- **二次生成**:通过合成重生成提升数据质量
- **质量管控**:对推理结构进行人工审核
- **有效性验证**:通过启发式解析确保结构完整性
- **去重处理**:移除近似重复的样本对
### 处理流程
1. **合成生成**:构建多步推理样本
2. **二次生成**:通过迭代优化提升数据质量
3. **筛选过滤**:验证样本结构完整性与质量
4. **标注处理**:为适配样本添加情感上下文标注
5. **最终验证**:随机抽样审核以确保质量
## 局限与注意事项
### 当前局限
- **合成属性**:样本主要为生成内容,未完全经过教学验证
- **语言分布**:虽支持多语言,但样本以英语为主
- **推理风格差异**:不同样本的推理风格可能存在差异
### 最佳实践
- **优先用于训练**:建议将数据集用于模型微调而非直接推理
- **搭配其他数据集**:与领域专用数据集结合使用
- **多样化评估**:在多种推理任务中开展模型测试
- **能力测试**:全面评估模型的自我修正能力
## 贡献指南
我们欢迎研究社区为推动中间思维推理能力发展贡献力量。
### 贡献方向
**数据集优化**
- 报告数据质量问题或不一致之处
- 提出新的推理模式或问题类型建议
- 按照既定格式提交经过验证的样本
- 协助扩展语言覆盖范围并优化多语言样本
**研究与评估**
- 分享评估结果与基准测试数据
- 参与推理能力评估相关工作
- 开发针对中间思维的新型评估指标
- 使用本数据集发表相关研究成果
**文档与资源**
- 完善数据集文档与示例
- 编写教程或教育材料
- 开发用于处理思维模块的工具
- 参与构建更广泛的推理研究生态
## 引用方式
如果您在研究中使用本数据集,请引用以下内容:
bibtex
@misc{intermediate-thinking-130k,
title = {Intermediate-Thinking-130k: A Dataset for Multi-Step Mathematical and Logical Reasoning},
author = {HelpingAI},
year = {2025},
publisher = {HelpingAI},
howpublished = {url{https://huggingface.co/datasets/HelpingAI/Intermediate-Thinking-130k}},
license = {Apache 2.0}
}
## 联系与支持
- **问题反馈**:[Hugging Face 社区议题](https://huggingface.co/datasets/HelpingAI/Intermediate-Thinking-130k/issues)
- **讨论交流**:[Hugging Face 社区讨论](https://huggingface.co/datasets/HelpingAI/Intermediate-Thinking-130k/discussions)
- **邮箱联系**:[team@helpingai.co](mailto:team@helpingai.co)
---
*中间思维-130k是AI推理研究领域的一项重要进展,为开发具备复杂中间思维能力的语言模型奠定了基础——这类模型能够跳出传统单次流程推理的局限,实现更接近人类的认知过程。*
提供机构:
maas
创建时间:
2025-08-07



