iamseungpil/metacognition-behavior-uncertainty-snapshot
Source: Hugging Face. Updated 2026-04-10; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/iamseungpil/metacognition-behavior-uncertainty-snapshot
Description:
# Four Habits Mechanism Lab
Project root: `/home/v-seungplee/metacognition-behavior-uncertainty`
This repository studies one question:
**Why do the Four Habits improve reasoning performance?**
## First Read
There are two papers in scope, and they are not the same experiment:
1. `Four Habits` paper
   - this repository's main target
   - exact-paper question: are data generation, SFT, PPO, and behavioral evaluation being run the same way as in the paper?
2. `epistemic analysis` paper
   - used here as a separate analysis layer
   - fixed-prefix and token-suppression interventions belong here, not to the original Four Habits training recipe
## Current Answer
The repository now reconstructs the Four Habits public experiment structure correctly, but the current local pipeline is **not yet an exact paper-method rerun**.
Why:
1. the released priming generator uses `claude-3-5-sonnet-20241022`
2. the current shell does not expose `ANTHROPIC_API_KEY`
3. paper-style raw priming assets are not present locally
4. the local evaluation path is a portable wrapper rather than the released `gpt-4o-mini` batch path
5. the current derivative priming plan uses `TRAPI + gpt-5.4`, which is not the paper's exact Claude generator
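Reasons 2 and 3 above can be checked mechanically before any run. The following is a minimal sketch, not one of the repository's scripts: the function name `exactness_blockers` and the `priming_dir` argument are hypothetical, and the actual location of paper-style priming assets is not stated here.

```python
import os
from pathlib import Path


def exactness_blockers(env, priming_dir):
    """Return human-readable reasons why an exact-paper priming run is blocked.

    `env` is a mapping such as os.environ. `priming_dir` is a hypothetical
    directory where raw paper-style priming assets would live.
    """
    blockers = []
    # Reason 2: the Claude-backed generator needs an Anthropic API key.
    if not env.get("ANTHROPIC_API_KEY"):
        blockers.append("ANTHROPIC_API_KEY is not set")
    # Reason 3: raw priming assets must exist locally.
    path = Path(priming_dir)
    if not path.is_dir() or not any(path.iterdir()):
        blockers.append(f"no raw priming assets under {path}")
    return blockers


if __name__ == "__main__":
    # "priming_assets" is a placeholder path, not a documented location.
    for reason in exactness_blockers(os.environ, "priming_assets"):
        print("BLOCKED:", reason)
```

An empty result only means these two preconditions hold; the evaluation-path and generator-model differences still have to be audited separately.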
## What Is Valid Right Now
### A. Exactness and Release Audits
The repository can now audit:
1. the Four Habits dataset and condition structure
2. the released SFT, PPO, and behavioral-eval chain
3. the gap between exact-paper execution and the current local setup
4. the gap between the released script paths and public Hugging Face assets
### B. Public Executable Baseline
The repository also has one valid executable public baseline:
1. model: `obiwan96/qwen-cd-100`
2. dataset: `obiwan96/countdown-env` `eval`
3. node: reserved 4-GPU analysis node
Current synced summary:
1. `n_samples = 100`
2. `accuracy = 0.22`
3. `mean_avg_logprob = -0.0737`
4. `approx_mean_entropy = 0.1847`
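The exact definitions behind these summary fields are not given here. As a hedged illustration, the sketch below assumes that `mean_avg_logprob` is the per-sample mean token log-probability averaged over samples, and that `approx_mean_entropy` is the entropy of a renormalised top-k distribution averaged over tokens and then over samples. The record layout (`correct`, `token_logprobs`, `topk_logprobs`) is hypothetical.

```python
import math


def avg_logprob(token_logprobs):
    """Mean log-probability of the generated tokens in one sample."""
    return sum(token_logprobs) / len(token_logprobs)


def approx_entropy(topk_logprobs):
    """Entropy in nats of one token's top-k distribution after renormalising.

    This is only an approximation of the full-vocabulary entropy, because
    probability mass outside the top k is ignored.
    """
    probs = [math.exp(lp) for lp in topk_logprobs]
    total = sum(probs)
    return -sum((p / total) * math.log(p / total) for p in probs if p > 0)


def summarize(samples):
    """Aggregate per-sample records into the four summary fields."""
    n = len(samples)
    mean_entropy_per_sample = [
        sum(approx_entropy(tok) for tok in s["topk_logprobs"]) / len(s["topk_logprobs"])
        for s in samples
    ]
    return {
        "n_samples": n,
        "accuracy": sum(s["correct"] for s in samples) / n,
        "mean_avg_logprob": sum(avg_logprob(s["token_logprobs"]) for s in samples) / n,
        "approx_mean_entropy": sum(mean_entropy_per_sample) / n,
    }
```

Under these assumptions, a low entropy together with a mean log-probability close to zero would indicate that the model is confident in most of its tokens, which is how the baseline numbers above read.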
### C. Public Intervention Analysis
The repository has also executed an epistemic-style intervention sweep on that same public checkpoint:
1. `baseline`: `accuracy=0.22`, `entropy=0.1847`
2. `fixed_prefix_okay_so_i`: `accuracy=0.11`, `entropy=0.5693`
3. `suppress_epistemic_tokens`: `accuracy=0.22`, `entropy=0.1847`
Current read:
1. the fixed prefix damages the released public model
2. suppressing the currently tracked epistemic lexical tokens leaves accuracy and entropy unchanged, so these tokens do not appear to drive the public baseline's performance
3. the strongest visible useful behavior is lightweight verification
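The sweep can be read as differences from the baseline. The sketch below uses only the values reported above; the helper name `deltas_vs_baseline` is illustrative and not part of the repository.

```python
# Values copied from the intervention sweep summary.
SWEEP = {
    "baseline": {"accuracy": 0.22, "entropy": 0.1847},
    "fixed_prefix_okay_so_i": {"accuracy": 0.11, "entropy": 0.5693},
    "suppress_epistemic_tokens": {"accuracy": 0.22, "entropy": 0.1847},
}


def deltas_vs_baseline(sweep, baseline_key="baseline"):
    """Return metric differences (condition minus baseline) per condition."""
    base = sweep[baseline_key]
    return {
        name: {k: round(metrics[k] - base[k], 4) for k in base}
        for name, metrics in sweep.items()
        if name != baseline_key
    }


if __name__ == "__main__":
    for name, delta in deltas_vs_baseline(SWEEP).items():
        print(name, delta)
```

With `n_samples = 100`, the fixed prefix corresponds to 11 fewer correct answers and an entropy increase of 0.3846, while token suppression changes neither metric.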
### D. Derivative TRAPI Priming Path
The repository now supports a derivative priming path based on:
1. original Four Habits condition prompts
2. `TRAPI` as the API transport
3. `gpt-5.4` as the generator model
This path is useful for a controlled follow-up study, but it is not an exact-paper priming run.
Current smoke status:
1. all five core habit conditions now have derivative raw JSON outputs
2. all five core habit conditions now have derivative `train.parquet` and `test.parquet` outputs
3. this confirms derivative infrastructure readiness, not paper-faithful learning-stage reproduction
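A readiness check of this kind can be sketched as follows. The directory layout (`<root>/<condition>/train.parquet` and `test.parquet`) and the function name are assumptions made for illustration; the repository's actual output paths and condition names are not given here, so the caller supplies both.

```python
from pathlib import Path

# File names taken from the smoke status above.
EXPECTED_FILES = ("train.parquet", "test.parquet")


def missing_outputs(root, conditions):
    """List expected parquet outputs that are absent.

    Assumes a hypothetical layout of one subdirectory per condition under
    `root`. Returns paths relative to `root` as strings.
    """
    root = Path(root)
    missing = []
    for condition in conditions:
        for filename in EXPECTED_FILES:
            candidate = root / condition / filename
            if not candidate.is_file():
                missing.append(str(candidate.relative_to(root)))
    return missing
```

An empty list only shows that the files exist. It says nothing about their contents and, as noted above, nothing about paper-faithful reproduction.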
## What Is Not Valid To Claim Yet
Do not currently claim:
1. exact Four Habits data generation
2. exact Four Habits learning-stage rerun
3. exact Four Habits behavioral evaluation rerun
4. learning-stage causal conclusions about why each habit improves performance
## Repository Layout
Core documents:
1. `PLAN.md`
   - full experiment plan in `Intent / Hypothesis / Validation Method / Current Result` form
2. `CURRENT_STATUS.md`
   - current exactness and execution state
3. `NODE_POLICY.md`
   - node policy and runtime notes
4. `docs/EXPERIMENT_DESIGN.md`
   - experiment-stage design
5. `docs/TRAINING_TRACKS.md`
   - exact-vs-derivative training split
6. `docs/EPISTEMIC_ANALYSIS_PLAN.md`
   - entropy and intervention analysis plan
7. `docs/EXTERNAL_SOURCES.md`
   - upstream provenance
Core scripts:
1. `scripts/run_smoke.py`
2. `scripts/audit_four_habits_repro.py`
3. `scripts/audit_public_release_closure.py`
4. `scripts/audit_exact_method_alignment.py`
5. `scripts/prepare_training_study.py`
6. `scripts/prepare_epistemic_analysis.py`
7. `scripts/run_critic.py`
8. `scripts/render_report.py`
9. `scripts/build_working_note_pdf.sh`
## External Repositories
The local source-of-truth repositories are:
1. `external/cognitive-behaviors`
2. `external/strategic-information-allocation-llm-reasoning`
## Recommended Audit Loop
```bash
cd /home/v-seungplee/metacognition-behavior-uncertainty
bash scripts/run_loop.sh
bash scripts/build_working_note_pdf.sh
```
## Exact Training Gate
Before any honest exact-paper learning-stage run, the repository still needs:
1. exact paper-style priming assets or exact Claude-backed priming access
2. exact learning-stage inputs with explicit provenance
3. an exact behavioral-eval path or an explicitly documented reason for any deviation
Until then, this repository should be read as:
1. an exact-structure audit
2. a public-baseline mechanism study
3. a guarded derivative-training scaffold
Provider:
iamseungpil



