kth8/Qwen3.6-35B-A3B-SuperGPQA-benchmark
Source: Hugging Face. Updated 2026-04-23; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/kth8/Qwen3.6-35B-A3B-SuperGPQA-benchmark
Resource description:
---
license: apache-2.0
language:
- en
base_model: Qwen/Qwen3.6-35B-A3B
datasets:
- m-a-p/SuperGPQA
---
Benchmark of [Qwen/Qwen3.6-35B-A3B](https://huggingface.co/Qwen/Qwen3.6-35B-A3B) against the [m-a-p/SuperGPQA](https://huggingface.co/datasets/m-a-p/SuperGPQA) dataset.
Accuracy: 64.8% (648 of 1000 samples correct) with Python tool.
| Metric | Value |
|----------------------|---------------|
| **Correct** | 648 |
| **Incorrect** | 337 |
| **Errors** | 15 |
| **Total samples** | 1000 |
| **Python tool calls**| 1473 |
| **Total completion tokens** | 4,837,137 |
Raw stats:
```json
{
"accuracy": 0.648,
"correct": 648,
"incorrect": 337,
"error": 15,
"total": 1000,
"python_tool_calls": 1473,
"completion_tokens": 4837137
}
```
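The raw stats can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original benchmark harness: the `summarize` helper and the derived field names are hypothetical, and the "excluding errors" figure is an extra derived number that the card itself does not report. The card does not say what counts as an error, so the sketch only assumes that errored samples stay in the denominator of the reported accuracy, which matches 648 / 1000 = 0.648.

```python
import json

# Raw stats copied verbatim from the card above.
RAW_STATS = """
{
"accuracy": 0.648,
"correct": 648,
"incorrect": 337,
"error": 15,
"total": 1000,
"python_tool_calls": 1473,
"completion_tokens": 4837137
}
"""


def summarize(stats: dict) -> dict:
    """Derive sanity checks and per-sample averages (hypothetical helper)."""
    total = stats["total"]
    # Samples that finished with a gradable answer (assumption: correct + incorrect).
    answered = stats["correct"] + stats["incorrect"]
    return {
        # correct + incorrect + error should add up to the total sample count.
        "counts_consistent": answered + stats["error"] == total,
        # Reported accuracy: errored samples remain in the denominator.
        "accuracy": stats["correct"] / total,
        # Derived, not reported in the card: accuracy over answered samples only.
        "accuracy_excluding_errors": stats["correct"] / answered,
        "tool_calls_per_sample": stats["python_tool_calls"] / total,
        "completion_tokens_per_sample": stats["completion_tokens"] / total,
    }


summary = summarize(json.loads(RAW_STATS))
for name, value in summary.items():
    print(f"{name}: {value}")
```

On these numbers the averages work out to roughly 1.47 Python tool calls and about 4,837 completion tokens per sample.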
Provider:
kth8



