reasoning-degeneration-dev/sdc-scores-v1

Name: reasoning-degeneration-dev/sdc-scores-v1
Creator: reasoning-degeneration-dev
Published: 2026-03-25 06:02:30
License: MIT

Source: Hugging Face. Updated 2026-03-25. Indexed 2026-03-29.

Download link:

https://hf-mirror.com/datasets/reasoning-degeneration-dev/sdc-scores-v1


Description:

---
license: mit
tags:
- semantic-distance-coding
- scores
- easy
---

# sdc-scores-v1

Aggregated pass@1 scores: Easy tier, mainstream languages only. Esoteric baselines removed (paper transcriptions, not per-tier measurements).

## Dataset Info

- **Rows**: 8
- **Columns**: 10

## Columns

| Column | Type | Description |
|--------|------|-------------|
| language | Value('string') | Programming language name |
| tiobe_rank | Value('int64') | TIOBE index rank (1=Python, 47=OCaml) |
| tiobe_pct | Value('float64') | TIOBE index percentage share |
| condition | Value('string') | zero-shot |
| pass_at_1 | Value('float64') | % of 20 problems solved, averaged over 3 runs |
| pass_at_1_std | Value('float64') | Standard deviation of pass@1 across 3 runs |
| compile_rate | Value('float64') | % that compiled successfully |
| num_problems | Value('int64') | Number of problems evaluated |
| num_runs | Value('int64') | Number of independent runs |
| per_problem | List({'pass_rate': Value('float64'), 'problem_id': Value('string')}) | List of per-problem pass rates across runs |

## Generation Parameters

```json
{
  "script_name": "run_zero_shot.py",
  "model": "gpt-5-2",
  "description": "Aggregated pass@1 scores \u2014 Easy tier, mainstream languages only. Esoteric baselines removed (paper transcriptions, not per-tier measurements).",
  "tier": "easy",
  "hyperparameters": {},
  "input_datasets": []
}
```

## Experiment Documentation

For complete experiment details, see [https://github.com/Zayne-sprague/SC-Research-Notes/tree/main/experiments/semantic-distance-coding](https://github.com/Zayne-sprague/SC-Research-Notes/tree/main/experiments/semantic-distance-coding)

## Usage

```python
from datasets import load_dataset

dataset = load_dataset("reasoning-degeneration-dev/sdc-scores-v1", split="train")
print(f"Loaded {len(dataset)} rows")
```

---

*This dataset is tracked in [reasoning-degeneration-dev/PROJECT-MANIFEST](https://huggingface.co/datasets/reasoning-degeneration-dev/PROJECT-MANIFEST)*
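
The `per_problem` column holds one `pass_rate` per `problem_id`, so a row-level average can be recomputed from it and compared against `pass_at_1`. The following is a minimal sketch, not the project's own aggregation code. The helper name `mean_pass_rate` and the example row are hypothetical, and the values are made up for illustration. The card does not state whether `pass_rate` is stored as a fraction or a percentage, so check the scale before comparing it with `pass_at_1`.

```python
from statistics import mean


def mean_pass_rate(per_problem):
    """Average the per-problem pass rates for one row (hypothetical helper)."""
    rates = [entry["pass_rate"] for entry in per_problem]
    if not rates:
        raise ValueError("per_problem is empty")
    return mean(rates)


# Hypothetical row shaped like the documented schema; these values are
# invented for illustration and do not come from the dataset.
example_row = {
    "language": "ExampleLang",
    "num_problems": 4,
    "per_problem": [
        {"problem_id": "p1", "pass_rate": 1.0},
        {"problem_id": "p2", "pass_rate": 0.5},
        {"problem_id": "p3", "pass_rate": 0.0},
        {"problem_id": "p4", "pass_rate": 0.5},
    ],
}

avg = mean_pass_rate(example_row["per_problem"])
print(f"{example_row['language']}: mean per-problem pass rate = {avg:.2f}")
```

With real data, the same helper could be applied to each row returned by `load_dataset`, for example `mean_pass_rate(row["per_problem"])` inside a loop over the split.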


Provider:

reasoning-degeneration-dev
