kth8/gemma-4-E4B-it-ValleyBench-benchmark
Source: Hugging Face. Updated 2026-04-30; indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/kth8/gemma-4-E4B-it-ValleyBench-benchmark
Resource description:
---
license: apache-2.0
language:
- en
base_model: google/gemma-4-E4B-it
datasets:
- kth8/ValleyBench
---
Benchmark of [google/gemma-4-E4B-it](https://huggingface.co/google/gemma-4-E4B-it) on the [kth8/ValleyBench](https://huggingface.co/datasets/kth8/ValleyBench) dataset. A model answer is counted as correct if it is within 0.01 of the ground-truth answer.
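
The correctness rule can be illustrated with a minimal sketch. It assumes the 0.01 threshold is an absolute, inclusive tolerance and that answers can be parsed as floats; the names `is_correct` and `TOLERANCE` are hypothetical and are not taken from the actual evaluation harness.

```python
TOLERANCE = 0.01  # absolute tolerance stated in this card


def is_correct(model_answer, ground_truth, tol=TOLERANCE):
    """Return True if model_answer is within tol of ground_truth.

    Assumption: the comparison is on the absolute difference and
    includes the boundary; the real harness may differ.
    """
    try:
        predicted = float(model_answer)
        expected = float(ground_truth)
    except (TypeError, ValueError):
        # Answers that cannot be parsed as numbers are treated as
        # not correct in this sketch.
        return False
    return abs(predicted - expected) <= tol


print(is_correct(3.14159, 3.14))  # difference of about 0.0016
print(is_correct(2.5, 2.52))      # difference of about 0.02
```

The first call falls inside the tolerance and the second falls outside it. Values that sit exactly on the 0.01 boundary are sensitive to floating-point rounding, so a real harness might compare with `decimal.Decimal` instead.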
Accuracy: 80.3% (4014 of 5000 samples), with the Python tool available to the model.
| Metric | Value |
|----------------------|---------------|
| **Correct** | 4014 |
| **Incorrect** | 958 |
| **Errors** | 28 |
| **Total samples** | 5000 |
| **Python tool calls** | 4843 |
| **Total completion tokens** | 4,595,312 |
Raw stats:
```json
{
"accuracy": 0.803,
"correct": 4014,
"incorrect": 958,
"error": 28,
"total": 5000,
"python_tool_calls": 4843,
"completion_tokens": 4595312
}
```
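
As a consistency check, the reported accuracy can be recomputed from the raw stats. The sketch below assumes accuracy is `correct / total`, so errored samples count against the score; this matches the reported figure (4014 / 5000 = 0.8028, which rounds to 0.803). The helper name `summarize` and the per-sample averages are illustrative additions, not part of the original report.

```python
import json

# Raw stats copied from the card above.
raw_stats = json.loads("""
{
  "accuracy": 0.803,
  "correct": 4014,
  "incorrect": 958,
  "error": 28,
  "total": 5000,
  "python_tool_calls": 4843,
  "completion_tokens": 4595312
}
""")


def summarize(stats):
    """Recompute derived figures from the raw counts."""
    total = stats["total"]
    return {
        # correct + incorrect + error should account for every sample
        "accounted": stats["correct"] + stats["incorrect"] + stats["error"],
        # assumption: errors are included in the denominator
        "accuracy": stats["correct"] / total,
        "tool_calls_per_sample": stats["python_tool_calls"] / total,
        "tokens_per_sample": stats["completion_tokens"] / total,
    }


summary = summarize(raw_stats)
print(summary["accounted"] == raw_stats["total"])
print(round(summary["accuracy"], 3) == raw_stats["accuracy"])
print(summary)
```

Both checks should print `True`: the three outcome counts sum to the 5000 total, and the recomputed accuracy rounds to the reported 0.803.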
Provider:
kth8



