reasoning-degeneration-dev/sdc-scores-xhard-v1
Source: Hugging Face | Updated: 2026-03-25 | Indexed: 2026-03-29
Download link:
https://hf-mirror.com/datasets/reasoning-degeneration-dev/sdc-scores-xhard-v1
Resource description:
---
license: mit
tags:
- semantic-distance-coding
- extra-hard
- scores
---
# sdc-scores-xhard-v1
Aggregated scores for the Extra Hard tier, mainstream languages only.
## Dataset Info
- **Rows**: 8
- **Columns**: 10
## Columns
| Column | Type | Description |
|--------|------|-------------|
| language | Value('string') | Programming language name |
| tiobe_rank | Value('int64') | TIOBE index rank (1=Python, 47=OCaml) |
| tiobe_pct | Value('float64') | TIOBE index percentage share |
| condition | Value('string') | Prompting condition (zero-shot) |
| pass_at_1 | Value('float64') | % of 20 Extra Hard problems solved, averaged over 3 runs |
| pass_at_1_std | Value('float64') | Standard deviation of pass@1 across 3 runs |
| compile_rate | Value('float64') | % of generated solutions that compiled successfully |
| num_problems | Value('int64') | Number of problems evaluated |
| num_runs | Value('int64') | Number of independent runs |
| per_problem | List({'pass_rate': Value('float64'), 'problem_id': Value('string')}) | List of per-problem pass rates across runs |
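The relationship between `pass_at_1`, `pass_at_1_std`, and `per_problem` can be illustrated with a minimal sketch. It assumes each run yields a solved/unsolved flag per problem, that `pass_at_1` is on a 0-100 scale, that `pass_rate` is a 0-1 fraction, and that the standard deviation is the population form; none of these details are confirmed by the card, and `aggregate_runs` plus the toy data are hypothetical.

```python
from statistics import mean, pstdev


def aggregate_runs(run_results):
    """Aggregate per-run outcomes into one score row (hypothetical helper).

    run_results: list of runs, each a dict mapping problem_id -> bool (solved).
    """
    # Percentage of problems solved in each independent run.
    per_run_pct = [100.0 * sum(run.values()) / len(run) for run in run_results]
    problem_ids = sorted(run_results[0])
    # Fraction of runs in which each problem was solved.
    per_problem = [
        {
            "problem_id": pid,
            "pass_rate": mean(1.0 if run[pid] else 0.0 for run in run_results),
        }
        for pid in problem_ids
    ]
    return {
        "pass_at_1": mean(per_run_pct),
        # Assumption: population standard deviation; the card does not say which form is used.
        "pass_at_1_std": pstdev(per_run_pct),
        "num_problems": len(problem_ids),
        "num_runs": len(run_results),
        "per_problem": per_problem,
    }


# Toy data (not from the dataset): 3 runs over 4 problems.
runs = [
    {"p1": True, "p2": False, "p3": False, "p4": True},
    {"p1": True, "p2": False, "p3": True, "p4": True},
    {"p1": True, "p2": False, "p3": False, "p4": False},
]
summary = aggregate_runs(runs)
print(summary["pass_at_1"], round(summary["pass_at_1_std"], 2))
```

Under these assumptions, the mean of the per-problem `pass_rate` values times 100 equals `pass_at_1`, which gives a quick consistency check on a row.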
## Generation Parameters
```json
{
  "script_name": "run_extra_hard_zero_shot.py",
  "model": "gpt-5-2",
  "description": "Aggregated scores \u2014 Extra Hard tier, mainstream only",
  "tier": "extra_hard",
  "hyperparameters": {
    "temperature": 0.7,
    "max_tokens": "model_maximum"
  },
  "input_datasets": [
    "Lossfunk/Esolang-Bench"
  ]
}
```
## Experiment Documentation
For complete experiment details, see [https://github.com/Zayne-sprague/SC-Research-Notes/tree/main/experiments/semantic-distance-coding](https://github.com/Zayne-sprague/SC-Research-Notes/tree/main/experiments/semantic-distance-coding)
## Usage
```python
from datasets import load_dataset
dataset = load_dataset("reasoning-degeneration-dev/sdc-scores-xhard-v1", split="train")
print(f"Loaded {len(dataset)} rows")
```
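Once loaded, the rows can be ranked and inspected with plain Python. The sketch below assumes each row behaves like a dict keyed by the column names above and that `per_problem` is a list of dicts; the helper names and the sample rows are hypothetical and are not values from the dataset.

```python
def rank_languages(rows):
    """Return (language, pass_at_1, compile_rate) tuples, highest pass@1 first."""
    return sorted(
        ((r["language"], r["pass_at_1"], r["compile_rate"]) for r in rows),
        key=lambda t: t[1],
        reverse=True,
    )


def unsolved_problems(row):
    """List problem IDs that were never solved in any run for this row."""
    return [p["problem_id"] for p in row["per_problem"] if p["pass_rate"] == 0.0]


# Made-up rows with the same shape as the dataset, for illustration only.
sample_rows = [
    {
        "language": "LangA",
        "pass_at_1": 10.0,
        "compile_rate": 90.0,
        "per_problem": [
            {"problem_id": "X01", "pass_rate": 0.0},
            {"problem_id": "X02", "pass_rate": 1.0},
        ],
    },
    {
        "language": "LangB",
        "pass_at_1": 35.0,
        "compile_rate": 100.0,
        "per_problem": [
            {"problem_id": "X01", "pass_rate": 0.0},
            {"problem_id": "X02", "pass_rate": 0.0},
        ],
    },
]

for language, pass_at_1, compile_rate in rank_languages(sample_rows):
    print(f"{language}: pass@1={pass_at_1:.1f}%, compiled={compile_rate:.1f}%")
```

With the real data, passing `dataset` from the snippet above in place of `sample_rows` should work, since iterating a `datasets.Dataset` yields one dict per row.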
---
*This dataset is tracked in [reasoning-degeneration-dev/PROJECT-MANIFEST](https://huggingface.co/datasets/reasoning-degeneration-dev/PROJECT-MANIFEST)*
Provider:
reasoning-degeneration-dev



