
reasoning-degeneration-dev/sdc-scores-hard-v1

Hugging Face. Updated 2026-03-25. Indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/reasoning-degeneration-dev/sdc-scores-hard-v1
Description:
---
license: mit
tags:
- semantic-distance-coding
- scores
- hard
---

# sdc-scores-hard-v1

Aggregated pass@1 scores: Hard tier, mainstream languages only. Esoteric baselines removed (paper transcriptions, not per-tier measurements).

## Dataset Info

- **Rows**: 8
- **Columns**: 10

## Columns

| Column | Type | Description |
|--------|------|-------------|
| language | Value('string') | Programming language name |
| tiobe_rank | Value('int64') | TIOBE index rank (1=Python, 47=OCaml) |
| tiobe_pct | Value('float64') | TIOBE index percentage share |
| condition | Value('string') | zero-shot |
| pass_at_1 | Value('float64') | % of 20 problems solved, averaged over 3 runs |
| pass_at_1_std | Value('float64') | Standard deviation of pass@1 across 3 runs |
| compile_rate | Value('float64') | % that compiled successfully |
| num_problems | Value('int64') | Number of problems evaluated |
| num_runs | Value('int64') | Number of independent runs |
| per_problem | List({'pass_rate': Value('float64'), 'problem_id': Value('string')}) | List of per-problem pass rates across runs |

## Generation Parameters

```json
{
  "script_name": "run_hard_zero_shot.py",
  "model": "gpt-5-2",
  "description": "Aggregated pass@1 scores \u2014 Hard tier, mainstream languages only. Esoteric baselines removed (paper transcriptions, not per-tier measurements).",
  "tier": "hard",
  "hyperparameters": {},
  "input_datasets": []
}
```

## Experiment Documentation

For complete experiment details, see [https://github.com/Zayne-sprague/SC-Research-Notes/tree/main/experiments/semantic-distance-coding](https://github.com/Zayne-sprague/SC-Research-Notes/tree/main/experiments/semantic-distance-coding)

## Usage

```python
from datasets import load_dataset

dataset = load_dataset("reasoning-degeneration-dev/sdc-scores-hard-v1", split="train")
print(f"Loaded {len(dataset)} rows")
```

---

*This dataset is tracked in [reasoning-degeneration-dev/PROJECT-MANIFEST](https://huggingface.co/datasets/reasoning-degeneration-dev/PROJECT-MANIFEST)*
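The `pass_at_1` and `per_problem` columns describe the same results at two levels of detail, so one can be checked against the other. The sketch below assumes that each `pass_rate` is a fraction in [0, 1] (the card does not state the scale) and that every problem was attempted in every run. Under those assumptions, the percentage of problems solved, averaged over runs, equals the mean per-problem pass rate times 100. The helper name `pass_at_1_from_per_problem` and the sample row are hypothetical placeholders, not values from the dataset.

```python
from statistics import mean


def pass_at_1_from_per_problem(per_problem, scale=100.0):
    """Return the mean per-problem pass rate as a percentage.

    Assumes each entry looks like {"problem_id": str, "pass_rate": float}
    with pass_rate in [0, 1]; pass scale=1.0 if the rates are already percentages.
    """
    if not per_problem:
        raise ValueError("per_problem is empty")
    return scale * mean(entry["pass_rate"] for entry in per_problem)


# Made-up row for illustration only; these are NOT real dataset values.
example_row = {
    "language": "ExampleLang",
    "per_problem": [
        {"problem_id": "p1", "pass_rate": 1.0},
        {"problem_id": "p2", "pass_rate": 0.5},
        {"problem_id": "p3", "pass_rate": 0.0},
        {"problem_id": "p4", "pass_rate": 0.5},
    ],
}

recomputed = pass_at_1_from_per_problem(example_row["per_problem"])
print(f"{example_row['language']}: recomputed pass@1 = {recomputed:.1f}%")
```

Applied to a real row, comparing the recomputed value with the stored `pass_at_1` is a quick consistency check. A large mismatch would suggest the scale assumption is wrong.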
Provider:
reasoning-degeneration-dev