raca-workspace-v1/sdc-all-responses-v1
Hugging Face. Updated 2026-04-04; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/raca-workspace-v1/sdc-all-responses-v1
Resource description:
---
license: mit
tags:
- semantic-distance
- esolang-bench
- reasoning-stickiness
- tiobe
- coding
- programming-languages
---
# sdc-all-responses-v1
The complete semantic-distance-coding experiment: 15 programming languages × 4 difficulty tiers × 20 EsoLang-Bench problems per tier × 3 runs. Each row includes the full prompt, model response, extracted code, compilation result, and per-test-case outcomes. Wave 1 covers 8 mainstream languages (Python, C++, Java, Perl, Rust, Go, Haskell, OCaml); Wave 2 adds 7 more (Fortran, Ada, Prolog, COBOL, F#, Erlang, Tcl). All runs use the zero-shot condition with GPT-5.2.
## Dataset Info
- **Rows**: 3600
- **Columns**: 16
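
The row count follows from the experimental design in the description. A minimal sanity check of that arithmetic (the variable names are illustrative and not taken from the dataset tooling):

```python
# Design factors as stated in the dataset description.
n_languages = 15          # 8 in Wave 1 + 7 in Wave 2
n_tiers = 4               # easy, medium, hard, extra_hard
n_problems_per_tier = 20  # e.g. E01-E20 for the easy tier
n_runs = 3                # independent runs per configuration

expected_rows = n_languages * n_tiers * n_problems_per_tier * n_runs
print(expected_rows)  # 3600
```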
## Columns
| Column | Type | Description |
|--------|------|-------------|
| problem_id | Value('string') | EsoLang-Bench problem ID (E01-E20 easy, M01-M20 medium, H01-H20 hard, X01-X20 extra hard) |
| language | Value('string') | Programming language name (15 total: Python, C++, Java, Perl, Rust, Go, Haskell, OCaml, Fortran, Ada, Prolog, COBOL, F#, Erlang, Tcl) |
| tiobe_rank | Value('int64') | TIOBE index rank of the language (1=Python, 47=OCaml) |
| tiobe_pct | Value('float64') | TIOBE index percentage share |
| condition | Value('string') | Experimental condition (zero-shot for all rows) |
| run | Value('int64') | Run index (0-2) for 3 independent runs per config |
| iteration | Value('int64') | Self-scaffolding iteration (always 1 for zero-shot) |
| prompt | Value('string') | Full prompt sent to GPT-5.2 |
| response | Value('string') | Full model response (not truncated) |
| code_extracted | Value('string') | Code extracted from model response (from code block) |
| compiled | Value('bool') | Whether the code compiled/parsed successfully |
| compile_errors | Value('string') | Compilation error messages (empty if compiled) |
| test_results | Value('string') | JSON array of per-test-case results (input, expected, actual, passed, time_ms, error) |
| all_passed | Value('bool') | Whether all 6 test cases passed (exact match) |
| tokens_used | Value('string') | JSON object with input/output token counts |
| tier | Value('string') | Difficulty tier: easy, medium, hard, extra_hard |
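
Because `test_results` is stored as a JSON string rather than a nested feature, it has to be decoded before use. The sketch below assumes each decoded entry is an object with the fields listed in the table (`input`, `expected`, `actual`, `passed`, `time_ms`, `error`); the helper name `summarize_tests`, the example row, and the value types shown for `time_ms` and `error` are hypothetical.

```python
import json

def summarize_tests(row):
    """Return (n_passed, n_total) for one row's per-test-case results."""
    results = json.loads(row["test_results"])
    n_passed = sum(1 for r in results if r.get("passed"))
    return n_passed, len(results)

# Hypothetical row for illustration only; real rows carry all 16 columns
# and, per the all_passed description, 6 test cases.
example_row = {
    "problem_id": "E01",
    "language": "Python",
    "test_results": json.dumps([
        {"input": "1 2", "expected": "3", "actual": "3",
         "passed": True, "time_ms": 12, "error": ""},
        {"input": "2 2", "expected": "4", "actual": "5",
         "passed": False, "time_ms": 11, "error": ""},
    ]),
}

print(summarize_tests(example_row))  # (1, 2)
```

The same `json.loads` call applies to `tokens_used`; its exact key names are not documented here, so inspect one decoded row before relying on them.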
## Generation Parameters
```json
{
  "script_name": "consolidation from raw JSONL results",
  "model": "openai/gpt-5.2",
  "description": "Complete semantic-distance-coding experiment: 15 programming languages \u00d7 4 difficulty tiers \u00d7 20 EsoLang-Bench problems \u00d7 3 runs. Full prompts, model responses, extracted code, compilation results, and per-test-case outcomes. Wave 1 (8 mainstream: Python, C++, Java, Perl, Rust, Go, Haskell, OCaml) + Wave 2 (7 added: Fortran, Ada, Prolog, COBOL, F#, Erlang, Tcl). Zero-shot condition with GPT-5.2.",
  "hyperparameters": {
    "temperature": 0.7,
    "condition": "zero-shot",
    "runs_per_config": 3
  },
  "input_datasets": [
    "Lossfunk/Esolang-Bench"
  ]
}
```
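
With a temperature of 0.7 and 3 runs per configuration, outcomes for the same problem and language may differ between runs. One way to look at that, sketched under the assumption that rows are available as dictionaries keyed by the column names above (the function name `passes_per_config` and the toy rows are hypothetical):

```python
from collections import defaultdict

def passes_per_config(rows):
    """Map (problem_id, language) -> number of runs where all tests passed."""
    counts = defaultdict(int)
    for row in rows:
        key = (row["problem_id"], row["language"])
        counts[key] += int(row["all_passed"])
    return dict(counts)

# Toy rows for illustration only; they are not real results.
toy_rows = [
    {"problem_id": "E01", "language": "Python", "run": 0, "all_passed": True},
    {"problem_id": "E01", "language": "Python", "run": 1, "all_passed": False},
    {"problem_id": "E01", "language": "Python", "run": 2, "all_passed": True},
    {"problem_id": "E01", "language": "COBOL", "run": 0, "all_passed": False},
]

per_config = passes_per_config(toy_rows)
print(per_config[("E01", "Python")])  # 2
```

A count of 0 or 3 indicates a stable outcome across runs, while 1 or 2 indicates run-to-run variation for that configuration.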
## Usage
```python
from datasets import load_dataset
dataset = load_dataset("raca-workspace-v1/sdc-all-responses-v1", split="train")
print(f"Loaded {len(dataset)} rows")
```
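
A common next step is a pass rate per language and tier. The sketch below is one possible aggregation, not the authors' analysis code; `pass_rate_by` is a hypothetical helper that accepts any iterable of row dictionaries, and the toy rows are not real results.

```python
from collections import defaultdict

def pass_rate_by(rows, keys=("language", "tier")):
    """Return {group: fraction of rows with all_passed} grouped by `keys`."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    for row in rows:
        group = tuple(row[k] for k in keys)
        totals[group] += 1
        passed[group] += int(row["all_passed"])
    return {group: passed[group] / totals[group] for group in totals}

# Toy rows for illustration only.
toy = [
    {"language": "Python", "tier": "easy", "all_passed": True},
    {"language": "Python", "tier": "easy", "all_passed": False},
    {"language": "COBOL", "tier": "easy", "all_passed": False},
]

rates = pass_rate_by(toy)
print(rates[("Python", "easy")])  # 0.5
```

Passing the loaded `dataset` object in place of `toy` should work as well, since iterating a `datasets.Dataset` yields one dictionary per row; selecting only the needed columns first avoids decoding the long `prompt` and `response` strings.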
---
*Uploaded via [RACA](https://github.com/Zayne-sprague/Dr-Claude-Code) hf_utility.*
Provider:
raca-workspace-v1



