reasoning-degeneration-dev/wmc-sft-warmup-v1
收藏Hugging Face2026-03-22 更新2026-03-29 收录
下载链接:
https://hf-mirror.com/datasets/reasoning-degeneration-dev/wmc-sft-warmup-v1
下载链接
链接失效反馈官方服务:
资源简介:
---
license: mit
tags:
- world-model-curiosity
- sft-warmup
- countdown
---
# wmc-sft-warmup-v1
SFT warmup data for World Model Curiosity experiment. Two conditions: baseline (no confidence tags) and cpb (with <c>X.X</c> confidence tags). Generated from 1200 Countdown problems (5 numbers, +/-/*, targets 10-500).
## Dataset Info
- **Rows**: 2400
- **Columns**: 8
## Columns
| Column | Type | Description |
|--------|------|-------------|
| condition | Value('string') | baseline (no confidence) or cpb (with confidence tag) |
| system_prompt | Value('string') | Full system prompt used for this condition |
| user_prompt | Value('string') | Countdown problem description |
| assistant_response | Value('string') | Full model response including <think> block |
| correct | Value('bool') | Whether the model solved the problem correctly |
| confidence | Value('float64') | Annotated confidence value (CPB only, None for baseline). Sampled from Beta(8,2) if correct, Beta(2,5) if wrong. |
| difficulty | Value('string') | Problem difficulty tier: easy (targets 10-80), medium (50-200), hard (100-500) |
| response_length | Value('int64') | Character length of assistant response |
## Generation Parameters
```json
{
"script_name": "generate_sft_data.py",
"model": "Qwen/Qwen3-1.7B",
"description": "SFT warmup data for World Model Curiosity experiment. Two conditions: baseline (no confidence tags) and cpb (with <c>X.X</c> confidence tags). Generated from 1200 Countdown problems (5 numbers, +/-/*, targets 10-500).",
"hyperparameters": {
"n_problems": 1200,
"temperature": 0.7,
"max_tokens": 512,
"correct_confidence_distribution": "Beta(8, 2), mean ~0.80",
"wrong_confidence_distribution": "Beta(2, 5), mean ~0.28"
},
"input_datasets": []
}
```
## Experiment Documentation
For complete experiment details, see [https://github.com/Zayne-sprague/SC-Research-Notes/tree/main/experiments/world-model-curiosity](https://github.com/Zayne-sprague/SC-Research-Notes/tree/main/experiments/world-model-curiosity)
## Usage
```python
from datasets import load_dataset
dataset = load_dataset("reasoning-degeneration-dev/wmc-sft-warmup-v1", split="train")
print(f"Loaded {len(dataset)} rows")
```
---
*This dataset is tracked in [reasoning-degeneration-dev/PROJECT-MANIFEST](https://huggingface.co/datasets/reasoning-degeneration-dev/PROJECT-MANIFEST)*
提供机构:
reasoning-degeneration-dev



