raca-workspace-v1/sdc-all-responses-v1

Name: raca-workspace-v1/sdc-all-responses-v1
Creator: raca-workspace-v1
Published: 2026-04-04 23:45:33
License: No description available

Hugging Face · Updated 2026-04-04 · Indexed 2026-04-12

Download link:

https://hf-mirror.com/datasets/raca-workspace-v1/sdc-all-responses-v1

Description:

---
license: mit
tags:
- semantic-distance
- esolang-bench
- reasoning-stickiness
- tiobe
- coding
- programming-languages
---

# sdc-all-responses-v1

Complete semantic-distance-coding experiment: 15 programming languages × 4 difficulty tiers × 20 EsoLang-Bench problems × 3 runs. Full prompts, model responses, extracted code, compilation results, and per-test-case outcomes. Wave 1 (8 mainstream: Python, C++, Java, Perl, Rust, Go, Haskell, OCaml) + Wave 2 (7 added: Fortran, Ada, Prolog, COBOL, F#, Erlang, Tcl). Zero-shot condition with GPT-5.2.

## Dataset Info

- **Rows**: 3600
- **Columns**: 16

## Columns

| Column | Type | Description |
|--------|------|-------------|
| problem_id | Value('string') | EsoLang-Bench problem ID (E01-E20 easy, M01-M20 medium, H01-H20 hard, X01-X20 extra hard) |
| language | Value('string') | Programming language name (15 total: Python, C++, Java, Perl, Rust, Go, Haskell, OCaml, Fortran, Ada, Prolog, COBOL, F#, Erlang, Tcl) |
| tiobe_rank | Value('int64') | TIOBE index rank of the language (1=Python, 47=OCaml) |
| tiobe_pct | Value('float64') | TIOBE index percentage share |
| condition | Value('string') | Experimental condition (zero-shot for all rows) |
| run | Value('int64') | Run index (0-2) for 3 independent runs per config |
| iteration | Value('int64') | Self-scaffolding iteration (always 1 for zero-shot) |
| prompt | Value('string') | Full prompt sent to GPT-5.2 |
| response | Value('string') | Full model response (not truncated) |
| code_extracted | Value('string') | Code extracted from model response (from code block) |
| compiled | Value('bool') | Whether the code compiled/parsed successfully |
| compile_errors | Value('string') | Compilation error messages (empty if compiled) |
| test_results | Value('string') | JSON array of per-test-case results (input, expected, actual, passed, time_ms, error) |
| all_passed | Value('bool') | Whether all 6 test cases passed (exact match) |
| tokens_used | Value('string') | JSON object with input/output token counts |
| tier | Value('string') | Difficulty tier: easy, medium, hard, extra_hard |

## Generation Parameters

```json
{
  "script_name": "consolidation from raw JSONL results",
  "model": "openai/gpt-5.2",
  "description": "Complete semantic-distance-coding experiment: 15 programming languages \u00d7 4 difficulty tiers \u00d7 20 EsoLang-Bench problems \u00d7 3 runs. Full prompts, model responses, extracted code, compilation results, and per-test-case outcomes. Wave 1 (8 mainstream: Python, C++, Java, Perl, Rust, Go, Haskell, OCaml) + Wave 2 (7 added: Fortran, Ada, Prolog, COBOL, F#, Erlang, Tcl). Zero-shot condition with GPT-5.2.",
  "hyperparameters": {
    "temperature": 0.7,
    "condition": "zero-shot",
    "runs_per_config": 3
  },
  "input_datasets": [
    "Lossfunk/Esolang-Bench"
  ]
}
```

## Usage

```python
from datasets import load_dataset

dataset = load_dataset("raca-workspace-v1/sdc-all-responses-v1", split="train")
print(f"Loaded {len(dataset)} rows")
```

---

*Uploaded via [RACA](https://github.com/Zayne-sprague/Dr-Claude-Code) hf_utility.*
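Because `test_results` is stored as a JSON string rather than a nested feature, it has to be decoded before per-test-case analysis. The following is a minimal sketch, not part of the original card, of how rows might be summarized: it groups by `language` and `tier` and computes the share of rows with `all_passed` set. The helper names `pass_rates` and `failed_cases` and the `sample_rows` values are hypothetical, and the code assumes each decoded test case carries a `passed` key, as the column description suggests.

```python
import json
from collections import defaultdict


def pass_rates(rows):
    """Return {(language, tier): fraction of rows where all_passed is true}."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    for row in rows:
        key = (row["language"], row["tier"])
        totals[key] += 1
        if row["all_passed"]:
            passed[key] += 1
    return {key: passed[key] / totals[key] for key in totals}


def failed_cases(row):
    """Decode the test_results JSON string and keep only failing test cases."""
    # Assumption: test_results decodes to a list of objects with a "passed" key.
    cases = json.loads(row["test_results"])
    return [case for case in cases if not case.get("passed")]


# Hypothetical rows that mimic the documented schema; these are not real data.
sample_rows = [
    {
        "language": "Python",
        "tier": "easy",
        "all_passed": True,
        "test_results": json.dumps([{"input": "1", "passed": True}]),
    },
    {
        "language": "Python",
        "tier": "easy",
        "all_passed": False,
        "test_results": json.dumps(
            [{"input": "1", "passed": True}, {"input": "2", "passed": False}]
        ),
    },
    {
        "language": "COBOL",
        "tier": "hard",
        "all_passed": False,
        "test_results": json.dumps([{"input": "3", "passed": False}]),
    },
]

rates = pass_rates(sample_rows)
for (language, tier), rate in sorted(rates.items()):
    print(f"{language:8s} {tier:6s} {rate:.2f}")
```

The same functions should work on the `dataset` object from the Usage section, since iterating a `datasets.Dataset` yields one dictionary per row; for example, `pass_rates(dataset)` would return one entry per language and tier pair.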

Provider:

raca-workspace-v1
