
raca-workspace-v1/sdc-all-responses-v1

Hugging Face | Updated 2026-04-04 | Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/raca-workspace-v1/sdc-all-responses-v1
Description:
---
license: mit
tags:
- semantic-distance
- esolang-bench
- reasoning-stickiness
- tiobe
- coding
- programming-languages
---

# sdc-all-responses-v1

Complete semantic-distance-coding experiment: 15 programming languages × 4 difficulty tiers × 20 EsoLang-Bench problems × 3 runs. Full prompts, model responses, extracted code, compilation results, and per-test-case outcomes. Wave 1 (8 mainstream: Python, C++, Java, Perl, Rust, Go, Haskell, OCaml) + Wave 2 (7 added: Fortran, Ada, Prolog, COBOL, F#, Erlang, Tcl). Zero-shot condition with GPT-5.2.

## Dataset Info

- **Rows**: 3600
- **Columns**: 16

## Columns

| Column | Type | Description |
|--------|------|-------------|
| problem_id | Value('string') | EsoLang-Bench problem ID (E01-E20 easy, M01-M20 medium, H01-H20 hard, X01-X20 extra hard) |
| language | Value('string') | Programming language name (15 total: Python, C++, Java, Perl, Rust, Go, Haskell, OCaml, Fortran, Ada, Prolog, COBOL, F#, Erlang, Tcl) |
| tiobe_rank | Value('int64') | TIOBE index rank of the language (1=Python, 47=OCaml) |
| tiobe_pct | Value('float64') | TIOBE index percentage share |
| condition | Value('string') | Experimental condition (zero-shot for all rows) |
| run | Value('int64') | Run index (0-2) for 3 independent runs per config |
| iteration | Value('int64') | Self-scaffolding iteration (always 1 for zero-shot) |
| prompt | Value('string') | Full prompt sent to GPT-5.2 |
| response | Value('string') | Full model response (not truncated) |
| code_extracted | Value('string') | Code extracted from model response (from code block) |
| compiled | Value('bool') | Whether the code compiled/parsed successfully |
| compile_errors | Value('string') | Compilation error messages (empty if compiled) |
| test_results | Value('string') | JSON array of per-test-case results (input, expected, actual, passed, time_ms, error) |
| all_passed | Value('bool') | Whether all 6 test cases passed (exact match) |
| tokens_used | Value('string') | JSON object with input/output token counts |
| tier | Value('string') | Difficulty tier: easy, medium, hard, extra_hard |

## Generation Parameters

```json
{
  "script_name": "consolidation from raw JSONL results",
  "model": "openai/gpt-5.2",
  "description": "Complete semantic-distance-coding experiment: 15 programming languages \u00d7 4 difficulty tiers \u00d7 20 EsoLang-Bench problems \u00d7 3 runs. Full prompts, model responses, extracted code, compilation results, and per-test-case outcomes. Wave 1 (8 mainstream: Python, C++, Java, Perl, Rust, Go, Haskell, OCaml) + Wave 2 (7 added: Fortran, Ada, Prolog, COBOL, F#, Erlang, Tcl). Zero-shot condition with GPT-5.2.",
  "hyperparameters": {
    "temperature": 0.7,
    "condition": "zero-shot",
    "runs_per_config": 3
  },
  "input_datasets": [
    "Lossfunk/Esolang-Bench"
  ]
}
```

## Usage

```python
from datasets import load_dataset

dataset = load_dataset("raca-workspace-v1/sdc-all-responses-v1", split="train")
print(f"Loaded {len(dataset)} rows")
```

---

*Uploaded via [RACA](https://github.com/Zayne-sprague/Dr-Claude-Code) hf_utility.*
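The column descriptions above suggest two common analyses: pass rate per language and tier, and inspection of individual failing test cases. The following is a minimal sketch, not part of the original card. The helper names `pass_rates` and `failed_cases` are hypothetical, the `sample_rows` are made-up records that only mimic the documented schema, and the sketch assumes each entry in `test_results` carries a boolean `passed` key as the column description lists.

```python
import json
from collections import defaultdict


def pass_rates(rows):
    """Return {(language, tier): fraction of rows whose all_passed is True}."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    for row in rows:
        key = (row["language"], row["tier"])
        totals[key] += 1
        if row["all_passed"]:
            passed[key] += 1
    return {key: passed[key] / totals[key] for key in totals}


def failed_cases(row):
    """Decode the test_results JSON string and keep only failing test cases."""
    results = json.loads(row["test_results"])
    # Assumption: each result object has a boolean "passed" field.
    return [case for case in results if not case.get("passed")]


# Hypothetical rows for illustration only; these are not real dataset records.
sample_rows = [
    {
        "language": "Python", "tier": "easy", "all_passed": True,
        "test_results": json.dumps(
            [{"input": "1", "expected": "1", "actual": "1", "passed": True}]
        ),
    },
    {
        "language": "Python", "tier": "easy", "all_passed": False,
        "test_results": json.dumps([
            {"input": "1", "expected": "1", "actual": "1", "passed": True},
            {"input": "2", "expected": "4", "actual": "5", "passed": False},
        ]),
    },
    {
        "language": "COBOL", "tier": "hard", "all_passed": False,
        "test_results": json.dumps(
            [{"input": "3", "expected": "9", "actual": "", "passed": False}]
        ),
    },
]

for (language, tier), rate in sorted(pass_rates(sample_rows).items()):
    print(f"{language:8s} {tier:6s} {rate:.2f}")
```

Because iterating over a `datasets.Dataset` yields one dict per row, the same helpers should work on the object returned by `load_dataset(...)` in place of `sample_rows`.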
Provider:
raca-workspace-v1