reasoning-degeneration-dev/sdc-scores-v1
Source: Hugging Face. Updated 2026-03-25; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/reasoning-degeneration-dev/sdc-scores-v1
Resource overview:
---
license: mit
tags:
- semantic-distance-coding
- scores
- easy
---
# sdc-scores-v1
Aggregated pass@1 scores for the Easy tier, mainstream languages only. Esoteric baselines were removed because they are transcribed from papers rather than measured per tier.
## Dataset Info
- **Rows**: 8
- **Columns**: 10
## Columns
| Column | Type | Description |
|--------|------|-------------|
| language | Value('string') | Programming language name |
| tiobe_rank | Value('int64') | TIOBE index rank (1=Python, 47=OCaml) |
| tiobe_pct | Value('float64') | TIOBE index percentage share |
| condition | Value('string') | Evaluation condition (zero-shot) |
| pass_at_1 | Value('float64') | % of 20 problems solved, averaged over 3 runs |
| pass_at_1_std | Value('float64') | Standard deviation of pass@1 across 3 runs |
| compile_rate | Value('float64') | % of generated solutions that compiled successfully |
| num_problems | Value('int64') | Number of problems evaluated |
| num_runs | Value('int64') | Number of independent runs |
| per_problem | List({'pass_rate': Value('float64'), 'problem_id': Value('string')}) | List of per-problem pass rates across runs |
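The aggregate columns can be read as simple statistics over per-run outcomes. The sketch below is illustrative only and is not taken from `run_zero_shot.py`: the function name `aggregate_runs` and the input layout (one dict per run mapping `problem_id` to a solved flag) are hypothetical, and it assumes a population standard deviation and a `pass_rate` expressed as a fraction in [0, 1], neither of which the card confirms.

```python
from statistics import mean, pstdev


def aggregate_runs(runs):
    """Aggregate per-run outcomes into card-style fields (hypothetical helper).

    runs: list of dicts, one per run, mapping problem_id -> bool (solved or not).
    """
    # pass@1 for each run, as a percentage of problems solved in that run
    per_run_pct = [100.0 * sum(r.values()) / len(r) for r in runs]

    # Fraction of runs in which each problem was solved
    problem_ids = sorted(runs[0])
    per_problem = [
        {"problem_id": pid, "pass_rate": mean(1.0 if r[pid] else 0.0 for r in runs)}
        for pid in problem_ids
    ]

    return {
        "pass_at_1": mean(per_run_pct),
        # Assumption: population std; the original script may use sample std
        "pass_at_1_std": pstdev(per_run_pct),
        "num_problems": len(problem_ids),
        "num_runs": len(runs),
        "per_problem": per_problem,
    }
```

With 20 problems and 3 runs, `pass_at_1` would be the mean of three percentages, and `per_problem` would hold 20 entries.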
## Generation Parameters
```json
{
"script_name": "run_zero_shot.py",
"model": "gpt-5-2",
"description": "Aggregated pass@1 scores \u2014 Easy tier, mainstream languages only. Esoteric baselines removed (paper transcriptions, not per-tier measurements).",
"tier": "easy",
"hyperparameters": {},
"input_datasets": []
}
```
## Experiment Documentation
For complete experiment details, see [https://github.com/Zayne-sprague/SC-Research-Notes/tree/main/experiments/semantic-distance-coding](https://github.com/Zayne-sprague/SC-Research-Notes/tree/main/experiments/semantic-distance-coding)
## Usage
```python
from datasets import load_dataset
dataset = load_dataset("reasoning-degeneration-dev/sdc-scores-v1", split="train")
print(f"Loaded {len(dataset)} rows")
```
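Once loaded, the rows can be compared across languages. The following is a minimal sketch that relies only on the column names documented above; the helper name `rank_languages` is hypothetical and is not part of the dataset tooling.

```python
def rank_languages(rows):
    """Return (language, pass_at_1, compile_rate) tuples, highest pass@1 first.

    rows: any iterable of dict-like records with the documented column names.
    """
    return sorted(
        ((r["language"], r["pass_at_1"], r["compile_rate"]) for r in rows),
        key=lambda t: t[1],
        reverse=True,
    )
```

Passing the `dataset` object from the snippet above, as in `rank_languages(dataset)`, should work because iterating over it yields one dict per row.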
---
*This dataset is tracked in [reasoning-degeneration-dev/PROJECT-MANIFEST](https://huggingface.co/datasets/reasoning-degeneration-dev/PROJECT-MANIFEST)*
Provider:
reasoning-degeneration-dev



