Jongsim/claude-opus-4.6-reasoning-12k
Source: Hugging Face. Updated 2026-04-04. Indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/Jongsim/claude-opus-4.6-reasoning-12k
Resource description:
---
language:
- en
license: other
tags:
- reasoning
- chain-of-thought
- distillation
- claude-opus
- math
- logic
size_categories:
- 10K<n<100K
task_categories:
- text-generation
pretty_name: Claude Opus 4.6 Reasoning 12K
dataset_info:
  features:
  - name: id
    dtype: string
  - name: source
    dtype: string
  - name: messages
    dtype: string
  - name: domain
    dtype: string
  - name: difficulty
    dtype: string
  - name: teacher_model
    dtype: string
  splits:
  - name: train
    num_examples: 12842
---
# Claude Opus 4.6 Reasoning 12K
A curated dataset of **12,842** reasoning samples, most of them distilled from Claude Opus 4.6, with smaller shares from Claude Opus 4.5 and Qwen3.5-27B.
Each sample pairs a detailed chain-of-thought reasoning trace with a final answer.
## Dataset Description
This dataset merges and deduplicates four existing reasoning datasets, listed under Data Sources.
It covers math, logic, science, code, and instruction-following problems with step-by-step reasoning.
## Data Sources
| Source | Count |
|--------|-------|
| [Roman1111111/claude-opus-4.6-10000x](https://huggingface.co/datasets/Roman1111111/claude-opus-4.6-10000x) | 9,633 |
| [nohurry/Opus-4.6-Reasoning-3000x-filtered](https://huggingface.co/datasets/nohurry/Opus-4.6-Reasoning-3000x-filtered) | 2,326 |
| [Jackrong/Qwen3.5-reasoning-700x](https://huggingface.co/datasets/Jackrong/Qwen3.5-reasoning-700x) | 633 |
| [TeichAI/claude-4.5-opus-high-reasoning-250x](https://huggingface.co/datasets/TeichAI/claude-4.5-opus-high-reasoning-250x) | 250 |
| **Total** | **12,842** |
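The card does not say how the sources were merged and deduplicated. The following is only a minimal sketch of one common approach, assuming duplicates are detected by hashing the normalized user prompt. The helpers `make_row`, `prompt_key`, and `merge_and_dedupe` are hypothetical names introduced for illustration, and the sample rows are made up.

```python
import hashlib
import json


def make_row(row_id: str, question: str) -> dict:
    """Build a hypothetical row with only the fields this sketch needs."""
    messages = [
        {"role": "user", "content": question},
        {"role": "assistant", "content": "", "reasoning": ""},
    ]
    return {"id": row_id, "messages": json.dumps(messages)}


def prompt_key(row: dict) -> str:
    """Hash the lowercased, whitespace-normalized user prompt of a row."""
    messages = json.loads(row["messages"])
    # Assumption: the first message with role "user" holds the problem statement.
    prompt = next(m["content"] for m in messages if m["role"] == "user")
    normalized = " ".join(prompt.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def merge_and_dedupe(*sources: list) -> list:
    """Concatenate row lists, keeping the first row seen for each prompt key."""
    seen = set()
    merged = []
    for rows in sources:
        for row in rows:
            key = prompt_key(row)
            if key not in seen:
                seen.add(key)
                merged.append(row)
    return merged


# Made-up rows: "b-1" repeats the prompt of "a-1" with different case and spacing.
source_a = [make_row("a-1", "What is 2 + 2?")]
source_b = [make_row("b-1", "what is  2 + 2? "), make_row("b-2", "What is 3 * 3?")]

merged = merge_and_dedupe(source_a, source_b)
print([row["id"] for row in merged])  # ['a-1', 'b-2']
```

Exact-match hashing like this only catches prompts that differ in case or whitespace. Near-duplicate detection would need a fuzzier comparison.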
## Dataset Structure
### Fields
| Field | Type | Description |
|-------|------|-------------|
| `id` | string | Unique sample identifier |
| `source` | string | Original dataset source |
| `messages` | string | JSON-serialized array of conversation messages (user, then assistant with reasoning) |
| `domain` | string | Problem domain |
| `difficulty` | string | Problem difficulty level |
| `teacher_model` | string | Model used to generate the reasoning |
### Message Format
Each `messages` value is a JSON-serialized string that decodes to an array with the following structure:
```json
[
{"role": "user", "content": "problem statement"},
{"role": "assistant", "content": "final answer", "reasoning": "step-by-step chain-of-thought"}
]
```
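Because `messages` is stored as a string, it helps to decode and check it before use. The sketch below shows one way to do that. The function name `parse_messages` is hypothetical, and treating a missing `reasoning` key as an empty trace is an assumption rather than something the card specifies.

```python
import json


def parse_messages(raw: str) -> tuple:
    """Return (question, reasoning, answer) from a serialized `messages` value."""
    messages = json.loads(raw)
    if not isinstance(messages, list) or len(messages) < 2:
        raise ValueError("expected a JSON array with at least two messages")
    user, assistant = messages[0], messages[1]
    if user.get("role") != "user" or assistant.get("role") != "assistant":
        raise ValueError("expected a user message followed by an assistant message")
    # Assumption: a missing "reasoning" key is treated as an empty trace.
    return user["content"], assistant.get("reasoning", ""), assistant["content"]


# Made-up example in the documented format.
raw = json.dumps([
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "4", "reasoning": "2 + 2 = 4"},
])

question, reasoning, answer = parse_messages(raw)
print(question)   # What is 2 + 2?
print(reasoning)  # 2 + 2 = 4
print(answer)     # 4
```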
### Domain Distribution
| Domain | Count |
|--------|-------|
| simple logic and math | 7,473 |
| math | 4,447 |
| code | 366 |
| science | 166 |
| instruction_following | 140 |

The listed domains account for 12,592 of the 12,842 samples; the remaining 250 are not broken out in this table.
### Difficulty Distribution
| Difficulty | Count |
|------------|-------|
| medium | 11,827 |
| hard | 61 |
| phd | 71 |

The listed difficulty levels account for 11,959 of the 12,842 samples; the remaining 883 are not broken out in this table.
### Teacher Model Distribution
| Model | Count |
|-------|-------|
| claude-opus-4.6 | 11,959 |
| Qwen3.5-27B | 633 |
| claude-opus-4.5 | 250 |
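The distribution tables above can be recomputed from the rows with a simple tally. This is a sketch under assumptions: `tally` is a hypothetical helper, the `"unlabeled"` bucket is an illustrative choice, and the card does not say whether rows without a label hold an empty string or a null value. The sample rows are made up.

```python
from collections import Counter


def tally(rows, field: str) -> Counter:
    """Count values of `field`, grouping missing or empty values as 'unlabeled'."""
    return Counter((row.get(field) or "unlabeled") for row in rows)


# Made-up rows standing in for dataset records.
rows = [
    {"domain": "math", "difficulty": "medium", "teacher_model": "claude-opus-4.6"},
    {"domain": "code", "difficulty": "medium", "teacher_model": "claude-opus-4.6"},
    {"domain": "math", "difficulty": "", "teacher_model": "Qwen3.5-27B"},
]

for field in ("domain", "difficulty", "teacher_model"):
    print(field, dict(tally(rows, field)))
```

With the dataset loaded through `load_dataset`, passing `ds["train"]` as `rows` should work the same way, since iterating over a split yields one dictionary per row.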
## Usage
```python
from datasets import load_dataset
import json

ds = load_dataset("Jongsim/claude-opus-4.6-reasoning-12k")

# `messages` is stored as a JSON string, so decode it before indexing.
sample = ds["train"][0]
messages = json.loads(sample["messages"])
print(messages[0]["content"])    # user question
print(messages[1]["reasoning"])  # chain-of-thought
print(messages[1]["content"])    # final answer
```
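For distillation-style fine-tuning, each row usually needs to be flattened into a prompt and a completion. The sketch below shows one possible layout. Wrapping the trace in `<think>` tags is an assumption, not a format prescribed by this dataset, and `to_training_text` is a hypothetical helper name.

```python
import json


def to_training_text(row: dict) -> dict:
    """Turn one dataset row into a prompt/completion pair for fine-tuning."""
    messages = json.loads(row["messages"])
    user, assistant = messages[0], messages[1]
    reasoning = assistant.get("reasoning", "")
    # Assumption: <think> tags delimit the reasoning trace; adapt this to the
    # chat template of the student model being trained.
    completion = f"<think>\n{reasoning}\n</think>\n\n{assistant['content']}"
    return {"prompt": user["content"], "completion": completion}


# Made-up row in the documented format.
row = {
    "id": "example-0",
    "messages": json.dumps([
        {"role": "user", "content": "What is 2 + 2?"},
        {"role": "assistant", "content": "4", "reasoning": "2 + 2 = 4"},
    ]),
}

pair = to_training_text(row)
print(pair["prompt"])
print(pair["completion"])
```

On the loaded dataset, `ds["train"].map(to_training_text)` would add `prompt` and `completion` columns, and `ds["train"].filter(lambda r: r["teacher_model"] == "claude-opus-4.6")` would restrict training to a single teacher model.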
## Related Datasets
- 🇰🇷 Korean translation: [Jongsim/claude-opus-4.6-reasoning-12k-ko](https://huggingface.co/datasets/Jongsim/claude-opus-4.6-reasoning-12k-ko)
Provided by:
Jongsim



