kth8/gpt-oss-20b-MMLU-Pro-benchmark
Source: Hugging Face. Updated 2026-04-10; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/kth8/gpt-oss-20b-MMLU-Pro-benchmark
Resource description:
---
license: apache-2.0
language:
- en
base_model: openai/gpt-oss-20b
datasets:
- TIGER-Lab/MMLU-Pro
---
Benchmark of [openai/gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b) against the [TIGER-Lab/MMLU-Pro](https://huggingface.co/datasets/TIGER-Lab/MMLU-Pro) dataset.
Accuracy: 75.5% (755 of 1000 samples) with the Python tool.
| Metric                      | Value     |
|-----------------------------|-----------|
| **Correct**                 | 755       |
| **Incorrect**               | 245       |
| **Errors**                  | 0         |
| **Total samples**           | 1000      |
| **Python tool calls**       | 411       |
| **Total completion tokens** | 1,179,150 |
Raw stats:
```json
{
  "accuracy": 0.755,
  "correct": 755,
  "incorrect": 245,
  "error": 0,
  "total": 1000,
  "python_tool_calls": 411,
  "completion_tokens": 1179150
}
```
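The reported figures can be cross-checked directly from the raw stats. The following is a minimal sketch, not part of the original benchmark code: the `summarize` helper and the derived per-sample averages are illustrative assumptions, and the averages are simple totals divided by sample count (the stats do not say how many individual samples actually used the Python tool).

```python
import json

# Raw stats copied from the dataset card above.
RAW_STATS = """
{
  "accuracy": 0.755,
  "correct": 755,
  "incorrect": 245,
  "error": 0,
  "total": 1000,
  "python_tool_calls": 411,
  "completion_tokens": 1179150
}
"""


def summarize(stats: dict) -> dict:
    """Hypothetical helper: recompute accuracy and per-sample averages."""
    total = stats["total"]
    return {
        # Accuracy recomputed from the counts rather than read from the file.
        "accuracy": stats["correct"] / total,
        # The three outcome counts should add up to the sample total.
        "counts_consistent": (
            stats["correct"] + stats["incorrect"] + stats["error"] == total
        ),
        # Averages over all samples (assumed definitions, see lead-in).
        "avg_completion_tokens": stats["completion_tokens"] / total,
        "avg_tool_calls": stats["python_tool_calls"] / total,
    }


summary = summarize(json.loads(RAW_STATS))
for name, value in summary.items():
    print(f"{name}: {value}")
```

Recomputing accuracy from `correct` and `total` rather than trusting the stored `accuracy` field is a cheap way to catch a stats file whose fields have drifted out of sync.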
Provider:
kth8



