PXIN/reasoning-cocktail-6k
收藏Hugging Face2026-03-18 更新2026-03-29 收录
下载链接:
https://hf-mirror.com/datasets/PXIN/reasoning-cocktail-6k
下载链接
链接失效反馈官方服务:
资源简介:
---
language:
- en
license: apache-2.0
size_categories:
- 1K<n<10K
task_categories:
- text-generation
dataset_info:
features:
- name: messages
list:
- name: role
dtype: string
- name: content
dtype: string
splits:
- name: train
num_examples: 5114
---
# Summarized Reasoning Cocktail 6k
### **Dataset Overview**
**Summarized Reasoning Cocktail 6k** is a meticulously curated and blended dataset consisting of **5,114 examples** in **ChatML format**. It is specifically designed to fine-tune sub-1B and small parameter models (like Qwen 0.8B) by balancing **Claude 4.6 Opus summarized reasoning** with high-quality human tone, instruction following, and formatting.
## Technical Note: "Summarized" vs. "Raw" CoT
This dataset uses "Extended Thinking" summaries from Claude 4.6 Opus. While not the raw, unedited internal CoT (which is restricted by the Messages API), these summaries represent a massive upgrade in logical structure and articulation for small-parameter models.
### **The "Golden Ratio" Sources**
The dataset is a balanced mix of four specialized sources to prevent catastrophic forgetting:
1. **[PXIN/reasoning-chatml-3k](https://huggingface.co/datasets/PXIN/reasoning-chatml-3k)** (50%): ~2,800 rows of Claude Opus **summarized reasoning** distillation.
2. **[HuggingFaceH4/no_robots](https://huggingface.co/datasets/HuggingFaceH4/no_robots)** (20%): ~1,200 rows of 100% human-written data for natural tone.
3. **[teknium/OpenHermes-2.5](https://huggingface.co/datasets/teknium/OpenHermes-2.5)** (20%): ~1,200 rows of general conversational intelligence.
4. **[LDJnr/Capybara](https://huggingface.co/datasets/LDJnr/Capybara)** (10%): ~600 rows for complex markdown and table formatting.
### **Processing & Cleaning**
* **Format Unification**: All sources converted strictly to ChatML `messages` format.
* **Refusal Removal**: 103 model refusals (e.g., "I cannot provide", "As an AI") were purged.
* **Shuffled**: Balanced for simultaneous learning of reasoning and formatting.
### **Intended Use**
Ideal for Unsloth fine-tuning on models where parameter efficiency is critical. Teaches small models to "think" with the depth and clarity of frontier architectures.
提供机构:
PXIN



