
reasoning-degeneration-dev/wmc-sft-warmup-v1

Hugging Face | Updated 2026-03-22 | Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/reasoning-degeneration-dev/wmc-sft-warmup-v1
Resource description:
---
license: mit
tags:
- world-model-curiosity
- sft-warmup
- countdown
---

# wmc-sft-warmup-v1

SFT warmup data for World Model Curiosity experiment. Two conditions: baseline (no confidence tags) and cpb (with `<c>X.X</c>` confidence tags). Generated from 1200 Countdown problems (5 numbers, +/-/*, targets 10-500).

## Dataset Info

- **Rows**: 2400
- **Columns**: 8

## Columns

| Column | Type | Description |
|--------|------|-------------|
| condition | Value('string') | baseline (no confidence) or cpb (with confidence tag) |
| system_prompt | Value('string') | Full system prompt used for this condition |
| user_prompt | Value('string') | Countdown problem description |
| assistant_response | Value('string') | Full model response including `<think>` block |
| correct | Value('bool') | Whether the model solved the problem correctly |
| confidence | Value('float64') | Annotated confidence value (CPB only, None for baseline). Sampled from Beta(8,2) if correct, Beta(2,5) if wrong. |
| difficulty | Value('string') | Problem difficulty tier: easy (targets 10-80), medium (50-200), hard (100-500) |
| response_length | Value('int64') | Character length of assistant response |

## Generation Parameters

```json
{
  "script_name": "generate_sft_data.py",
  "model": "Qwen/Qwen3-1.7B",
  "description": "SFT warmup data for World Model Curiosity experiment. Two conditions: baseline (no confidence tags) and cpb (with <c>X.X</c> confidence tags). Generated from 1200 Countdown problems (5 numbers, +/-/*, targets 10-500).",
  "hyperparameters": {
    "n_problems": 1200,
    "temperature": 0.7,
    "max_tokens": 512,
    "correct_confidence_distribution": "Beta(8, 2), mean ~0.80",
    "wrong_confidence_distribution": "Beta(2, 5), mean ~0.28"
  },
  "input_datasets": []
}
```

## Experiment Documentation

For complete experiment details, see [https://github.com/Zayne-sprague/SC-Research-Notes/tree/main/experiments/world-model-curiosity](https://github.com/Zayne-sprague/SC-Research-Notes/tree/main/experiments/world-model-curiosity)

## Usage

```python
from datasets import load_dataset

dataset = load_dataset("reasoning-degeneration-dev/wmc-sft-warmup-v1", split="train")
print(f"Loaded {len(dataset)} rows")
```

---

*This dataset is tracked in [reasoning-degeneration-dev/PROJECT-MANIFEST](https://huggingface.co/datasets/reasoning-degeneration-dev/PROJECT-MANIFEST)*
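
The confidence annotation described in the Columns table can be illustrated with a small sketch. This is not the code from `generate_sft_data.py`, which is not shown here. It assumes only what the card states: confidence is drawn from Beta(8, 2) for correct responses and Beta(2, 5) for wrong ones, and is written as a `<c>X.X</c>` tag. The helper names, the one-decimal rounding, and the regular expression are hypothetical choices made for illustration.

```python
import random
import re

# Assumption: these parameters come from the card's description of the
# confidence column; how the real script draws them is not shown.
BETA_PARAMS = {True: (8.0, 2.0), False: (2.0, 5.0)}

# Hypothetical pattern for a <c>X.X</c> tag holding a value in [0, 1].
CONF_TAG = re.compile(r"<c>([01](?:\.\d+)?)</c>")


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution: alpha / (alpha + beta)."""
    return alpha / (alpha + beta)


def sample_confidence(correct, rng):
    """Draw a confidence from the Beta distribution matching the outcome."""
    alpha, beta = BETA_PARAMS[correct]
    return rng.betavariate(alpha, beta)


def format_confidence_tag(confidence):
    """Render a confidence as <c>X.X</c>; one decimal place is an assumption."""
    return f"<c>{confidence:.1f}</c>"


def parse_confidence_tag(text):
    """Return the first tagged confidence found in text, or None if absent."""
    match = CONF_TAG.search(text)
    return float(match.group(1)) if match else None


rng = random.Random(0)  # fixed seed so the sketch is repeatable
correct_samples = [sample_confidence(True, rng) for _ in range(10_000)]
wrong_samples = [sample_confidence(False, rng) for _ in range(10_000)]

print("analytic means:", beta_mean(8, 2), beta_mean(2, 5))
print("empirical mean (correct):", sum(correct_samples) / len(correct_samples))
print("empirical mean (wrong):", sum(wrong_samples) / len(wrong_samples))
print(format_confidence_tag(correct_samples[0]))
```

The analytic means are 8/10 = 0.8 and 2/7 ≈ 0.286, which line up with the approximate means listed under Generation Parameters. For baseline rows the `confidence` column is None, so `parse_confidence_tag` returning None for untagged text mirrors that case.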
Provider:
reasoning-degeneration-dev