LinhIcey/mathematics_competition
Source: Hugging Face. Updated: 2026-04-09. Indexed: 2026-04-12.
Download link:
https://hf-mirror.com/datasets/LinhIcey/mathematics_competition
Resource description:
---
language:
- en
- zh
license: apache-2.0
task_categories:
- text-generation
tags:
- math
- competition
- evaluation
size_categories:
- 1K<n<10K
configs:
- config_name: default
  data_files:
  - split: test
    path: data/competition_math_stream.jsonl
---
# Mathematics Competition Evaluation
Competition-level mathematics evaluation dataset with predictions from three independent runs of a Gemini model.
## Dataset Structure
Each row contains:
- `uuid`: unique identifier
- `question`: math competition problem
- `answer`: ground truth answer
- `source`: problem source
- `run_0`, `run_1`, `run_2`: each a dict with:
  - `prediction`: model's answer
  - `stream_output`: list of stream output segments
  - `stream_output_kinds`: list of output kinds (thought/text/tool_call)
  - `correct`: whether prediction matches answer
- `reserve`: `True` if at least one run is correct, `False` if all three are wrong
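
The relationship between the per-run `correct` flags and `reserve` can be sketched in a few lines of Python. This is a minimal illustration, not the code used to build the dataset. It assumes that `correct` is stored as a boolean inside each `run_*` dict and that the file is read as one JSON object per line. The helper names `compute_reserve` and `iter_rows`, and the sample row itself, are hypothetical and invented for this example.

```python
import json

RUN_KEYS = ("run_0", "run_1", "run_2")


def compute_reserve(row: dict) -> bool:
    """Return True if at least one of the three runs is marked correct."""
    # Assumption: each run dict carries a boolean-like `correct` field.
    return any(bool(row[key]["correct"]) for key in RUN_KEYS)


def iter_rows(lines):
    """Yield one parsed row per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical row, made up for illustration only (not taken from the dataset).
sample_line = json.dumps({
    "uuid": "example-0001",
    "question": "What is 1 + 1?",
    "answer": "2",
    "source": "illustrative",
    "run_0": {"prediction": "3", "stream_output": ["3"],
              "stream_output_kinds": ["text"], "correct": False},
    "run_1": {"prediction": "2", "stream_output": ["2"],
              "stream_output_kinds": ["text"], "correct": True},
    "run_2": {"prediction": "3", "stream_output": ["3"],
              "stream_output_kinds": ["text"], "correct": False},
})

rows = list(iter_rows([sample_line, ""]))
print(compute_reserve(rows[0]))
```

To apply the same check to the real file, pass an open handle on `data/competition_math_stream.jsonl` to `iter_rows` and compare the result of `compute_reserve` with the stored `reserve` value of each row.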
## Statistics
- **Total**: 3625 problems (with all 3 runs completed)
- **reserve=True**: 3165 (87.3%)
- **reserve=False**: 460 (12.7%)
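
As a quick consistency check, the percentages above follow directly from the counts. The snippet below only redoes that arithmetic with the numbers stated in this card; the helper name `share` is hypothetical.

```python
TOTAL = 3625          # problems with all 3 runs completed
RESERVE_TRUE = 3165   # at least one run correct
RESERVE_FALSE = 460   # all three runs wrong


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# The two groups should partition the full set.
assert RESERVE_TRUE + RESERVE_FALSE == TOTAL

print(share(RESERVE_TRUE, TOTAL))   # 87.3
print(share(RESERVE_FALSE, TOTAL))  # 12.7
```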
Provider:
LinhIcey



