reasoning-degeneration-dev/sdc-responses-hard-v1-partial
Source: Hugging Face. Updated 2026-03-25; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/reasoning-degeneration-dev/sdc-responses-hard-v1-partial
Description:
---
license: mit
tags:
- semantic-distance-coding
- hard
- responses
---
# sdc-responses-hard-v1-partial
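Partial set of zero-shot GPT-5.2 responses to the hard-tier problems (H01-H20) of EsoLang-Bench, with per-row compilation status and test results.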
## Dataset Info
- **Rows**: 420
- **Columns**: 15
## Columns
| Column | Type | Description |
|--------|------|-------------|
| problem_id | Value('string') | Problem identifier from EsoLang-Bench (H01-H20) |
| language | Value('string') | Target programming language name |
| tiobe_rank | Value('int64') | TIOBE index rank (1=Python, 47=OCaml) |
| tiobe_pct | Value('float64') | TIOBE index percentage share |
| condition | Value('string') | Prompting strategy: zero-shot |
| run | Value('int64') | Independent run index (0, 1, 2) |
| iteration | Value('int64') | Self-scaffolding iteration (always 1 for zero-shot) |
| prompt | Value('string') | Full prompt text sent to GPT-5.2 |
| response | Value('string') | Full untruncated model response |
| code_extracted | Value('string') | Code parsed from response via markdown code block extraction |
| compiled | Value('bool') | Whether compilation succeeded |
| compile_errors | Value('string') | Full compiler stderr if failed, empty string otherwise |
| test_results | List({'actual': Value('string'), 'error': Value('string'), 'expected': Value('string'), 'input': Value('string'), 'passed': Value('bool'), 'time_ms': Value('float64')}) | List of dicts with keys: input, expected, actual, error, passed, time_ms |
| all_passed | Value('bool') | True iff all test cases passed (correctness criterion) |
| tokens_used | {'input': Value('int64'), 'output': Value('int64')} | Dict with input and output token counts from API |
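
As an illustration of how the `language` and `all_passed` columns can be combined, the sketch below computes a per-language pass rate. The helper name `pass_rate_by_language` and the sample rows are hypothetical and are not part of the dataset or its tooling; the function only assumes that each row is a dict with those two keys.

```python
from collections import defaultdict


def pass_rate_by_language(rows):
    """Return {language: fraction of rows whose all_passed is True}."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for row in rows:
        totals[row["language"]] += 1
        if row["all_passed"]:
            passes[row["language"]] += 1
    return {lang: passes[lang] / totals[lang] for lang in totals}


# Hypothetical rows for illustration only; not taken from the dataset.
sample_rows = [
    {"language": "Python", "all_passed": True},
    {"language": "Python", "all_passed": False},
    {"language": "OCaml", "all_passed": False},
]

rates = pass_rate_by_language(sample_rows)
print(rates)
```

The same function should work on rows obtained by iterating over the loaded dataset, since each row is exposed as a dict keyed by column name.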
## Generation Parameters
```json
{
"script_name": "run_hard_zero_shot.py",
"model": "gpt-5-2",
"description": "Partial responses (hard tier)",
"tier": "hard",
"hyperparameters": {
"temperature": 0.7,
"max_tokens": "model_maximum"
},
"input_datasets": [
"Lossfunk/Esolang-Bench"
]
}
```
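
Because responses are sampled at temperature 0.7 and each problem is attempted in several independent runs, one possible aggregate is whether a (problem, language) pair was solved in at least one run. The sketch below shows that grouping; the helper name `solved_in_any_run` and the sample rows are hypothetical, and this is one reasonable way to summarize the runs rather than the metric used by the experiment.

```python
from collections import defaultdict


def solved_in_any_run(rows):
    """Map (problem_id, language) to True if any run has all_passed == True."""
    solved = defaultdict(bool)
    for row in rows:
        key = (row["problem_id"], row["language"])
        solved[key] = solved[key] or bool(row["all_passed"])
    return dict(solved)


# Hypothetical rows for illustration only; not taken from the dataset.
sample_rows = [
    {"problem_id": "H01", "language": "Python", "run": 0, "all_passed": False},
    {"problem_id": "H01", "language": "Python", "run": 1, "all_passed": True},
    {"problem_id": "H01", "language": "Python", "run": 2, "all_passed": False},
    {"problem_id": "H02", "language": "OCaml", "run": 0, "all_passed": False},
]

summary = solved_in_any_run(sample_rows)
print(summary)
```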
## Experiment Documentation
For complete experiment details, see [https://github.com/Zayne-sprague/SC-Research-Notes/tree/main/experiments/semantic-distance-coding](https://github.com/Zayne-sprague/SC-Research-Notes/tree/main/experiments/semantic-distance-coding)
## Usage
```python
from datasets import load_dataset
dataset = load_dataset("reasoning-degeneration-dev/sdc-responses-hard-v1-partial", split="train")
print(f"Loaded {len(dataset)} rows")
```
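
Once rows are loaded, the `compiled`, `all_passed`, and `test_results` columns can be used to bucket each response by outcome. The following is a minimal sketch: the helper `classify_outcome` and the category labels are hypothetical, and it assumes that a non-empty `error` string in a test result indicates a runtime failure, which the dataset card does not state explicitly.

```python
from collections import Counter


def classify_outcome(row):
    """Bucket one row as compile_error, passed, runtime_error, or wrong_output."""
    if not row["compiled"]:
        return "compile_error"
    if row["all_passed"]:
        return "passed"
    # Assumption: a non-empty `error` string marks a runtime failure.
    if any(t.get("error") for t in row["test_results"]):
        return "runtime_error"
    return "wrong_output"


# Hypothetical rows for illustration only; not taken from the dataset.
sample_rows = [
    {"compiled": False, "all_passed": False, "test_results": []},
    {"compiled": True, "all_passed": True,
     "test_results": [{"passed": True, "error": ""}]},
    {"compiled": True, "all_passed": False,
     "test_results": [{"passed": False, "error": "Segmentation fault"}]},
    {"compiled": True, "all_passed": False,
     "test_results": [{"passed": False, "error": ""}]},
]

outcome_counts = Counter(classify_outcome(r) for r in sample_rows)
print(outcome_counts)
```

To apply this to the real data, pass the rows of the loaded `dataset` in place of `sample_rows`.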
---
*This dataset is tracked in [reasoning-degeneration-dev/PROJECT-MANIFEST](https://huggingface.co/datasets/reasoning-degeneration-dev/PROJECT-MANIFEST)*
Provided by:
reasoning-degeneration-dev



