reasoning-degeneration-dev/precommittal-canary-results-v1
Source: Hugging Face. Updated 2026-03-23; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/reasoning-degeneration-dev/precommittal-canary-results-v1
Description:
---
license: mit
tags:
- precommittal
- canary
- cosine-similarity
- layer-sweep
- probing
---
# precommittal-canary-results-v1
Canary shard results for the precommittal experiment. Contains signal-check metrics, layer-sweep results (layers 8/17/26/35, alpha 0.8-0.95), and answer-extraction check samples. The canary FAILED: rho ≈ 0 at all layers and cosine@25% < 0.7.
## Dataset Info
- **Rows**: 41
- **Columns**: 17
## Columns
| Column | Type | Description |
|--------|------|-------------|
| artifact_type | Value('string') | Type: signal_check, layer_sweep, layer_sweep_alpha, or answer_extraction_check |
| key | Value('string') | Identifier within artifact type (e.g., layer index, sample index) |
| model | Value('string') | Model used for generation (Qwen/Qwen3-8B) |
| n_samples | Value('int64') | *No description provided* |
| accuracy | Value('float64') | *No description provided* |
| rho_mean | Value('float64') | Mean precommittal index (1=immediate commit, 0=no commit). Expected >0.2, actual ≈0.001 |
| rho_median | Value('float64') | Median precommittal index |
| cosine_at_25pct | Value('float64') | Mean cosine similarity between hidden state at 25% of CoT and final state |
| cosine_at_50pct | Value('float64') | Mean cosine similarity at 50% of CoT |
| cosine_at_75pct | Value('float64') | Mean cosine similarity at 75% of CoT |
| n_correction_tokens | Value('int64') | Number of self-correction tokens (Wait, Actually, etc.) detected |
| genuineness_rate | Value('float64') | Fraction of correction tokens showing genuine cosine dip (>0.02 magnitude) |
| mean_dip_magnitude | Value('float64') | Average cosine dip magnitude at correction tokens |
| canary_rho_pass | Value('bool') | Whether rho>0.2 check passed (bool or null) |
| canary_cos25_pass | Value('bool') | Whether cosine@25%>0.7 check passed (bool or null) |
| canary_corrections_pass | Value('bool') | Whether corrections>=20 check passed (bool or null) |
| details_json | Value('string') | Full JSON details for this row (signal check results, layer data, or extraction sample) |
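
The column descriptions above state three pass thresholds (rho > 0.2, cosine@25% > 0.7, corrections >= 20) and describe `cosine_at_25pct` as the cosine similarity between the hidden state 25% of the way through the CoT and the final state. The following is a minimal sketch of those two ideas, not the experiment's actual code: the function names `cosine_at_fraction` and `canary_checks` are hypothetical, and the way the 25% position is rounded to a token index is an assumption.

```python
import numpy as np


def cosine_at_fraction(hidden_states: np.ndarray, fraction: float) -> float:
    """Cosine similarity between the state at `fraction` of the CoT and the final state.

    hidden_states: array of shape (n_tokens, hidden_dim) for one sample at one layer.
    """
    n_tokens = hidden_states.shape[0]
    # Assumption: the position is floor(fraction * (n_tokens - 1)); the
    # original experiment may pick the index differently.
    idx = int(fraction * (n_tokens - 1))
    a, b = hidden_states[idx], hidden_states[-1]
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def canary_checks(rho_mean: float, cosine_at_25pct: float, n_correction_tokens: int) -> dict:
    """Re-derive the three canary flags from the thresholds in the column table."""
    return {
        "canary_rho_pass": rho_mean > 0.2,
        "canary_cos25_pass": cosine_at_25pct > 0.7,
        "canary_corrections_pass": n_correction_tokens >= 20,
    }
```

Comparing the output of `canary_checks` against the stored `canary_*_pass` columns is one way to sanity-check a row; rows where the stored flag is null would need to be skipped.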
## Generation Parameters
```json
{
"script_name": "scripts/upload_canary_artifacts.py",
"model": "Qwen/Qwen3-8B",
"description": "Canary shard results for precommittal experiment. Contains signal check metrics, layer sweep results (layers 8/17/26/35, alpha 0.8-0.95), and answer extraction check samples. Canary FAILED: rho\u22480 at all layers, cosine@25%<0.7.",
"hyperparameters": {
"n_samples": 200,
"cosine_alpha": 0.95,
"max_tokens": 4096
},
"input_datasets": []
}
```
## Experiment Documentation
For complete experiment details, see [https://github.com/Zayne-sprague/SC-Research-Notes/tree/main/experiments/precommittal](https://github.com/Zayne-sprague/SC-Research-Notes/tree/main/experiments/precommittal)
## Usage
```python
from datasets import load_dataset
dataset = load_dataset("reasoning-degeneration-dev/precommittal-canary-results-v1", split="train")
print(f"Loaded {len(dataset)} rows")
```
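
Because each row carries an `artifact_type` and a JSON string in `details_json`, a natural next step is to group rows by type and decode the JSON. The sketch below assumes only the column names listed above; the helper name `group_by_artifact_type` and the example rows are hypothetical and do not reflect actual dataset contents.

```python
import json
from collections import defaultdict


def group_by_artifact_type(rows):
    """Group row dicts by artifact_type and decode details_json into `details`."""
    grouped = defaultdict(list)
    for row in rows:
        decoded = dict(row)
        raw = row.get("details_json")
        # details_json may be empty or null for some rows; keep None in that case.
        decoded["details"] = json.loads(raw) if raw else None
        grouped[row["artifact_type"]].append(decoded)
    return dict(grouped)


# Hypothetical rows for illustration only; not actual dataset contents.
example_rows = [
    {"artifact_type": "layer_sweep", "key": "8", "details_json": '{"layer": 8}'},
    {"artifact_type": "layer_sweep", "key": "17", "details_json": '{"layer": 17}'},
    {"artifact_type": "signal_check", "key": "0", "details_json": None},
]
grouped = group_by_artifact_type(example_rows)
print({name: len(items) for name, items in grouped.items()})
```

Iterating over a loaded `datasets.Dataset` yields one dict per row, so `group_by_artifact_type(dataset)` should work on the `dataset` object from the snippet above without changes.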
---
*This dataset is tracked in [reasoning-degeneration-dev/PROJECT-MANIFEST](https://huggingface.co/datasets/reasoning-degeneration-dev/PROJECT-MANIFEST)*
Provider:
reasoning-degeneration-dev



