llm-compe-2025-kato/test_eval_hle

Name: llm-compe-2025-kato/test_eval_hle
Creator: llm-compe-2025-kato
Published: 2025-08-20 14:04:11
License: Not specified

Source: Hugging Face (updated 2025-08-20, indexed 2025-09-13)

Download link:

https://hf-mirror.com/datasets/llm-compe-2025-kato/test_eval_hle

Description:

This is an evaluation dataset for student-model selection. It contains inference results from Qwen3 models of different sizes, together with correctness judgments produced by the o3-mini-2025-01-31 model. For each question, the dataset records the answer attempts and whether each attempt was judged correct.
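As a rough illustration of how judged answer attempts could feed student-model selection, the sketch below computes per-model accuracy and picks the highest-scoring model. The listing does not document the dataset's schema, so the field names (`model`, `question_id`, `judged_correct`), the model labels, and the sample records are all hypothetical placeholders, not the dataset's actual columns or contents.

```python
from collections import defaultdict


def accuracy_by_model(records):
    """Return the fraction of attempts judged correct for each model.

    Assumes each record is a dict with the hypothetical keys
    "model" (str) and "judged_correct" (bool).
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for record in records:
        totals[record["model"]] += 1
        correct[record["model"]] += int(record["judged_correct"])
    return {model: correct[model] / totals[model] for model in totals}


def select_student(records):
    """Pick the model with the highest judged accuracy."""
    accuracy = accuracy_by_model(records)
    return max(accuracy, key=accuracy.get)


# Made-up sample records, for illustration only.
sample = [
    {"model": "model-small", "question_id": 1, "judged_correct": False},
    {"model": "model-small", "question_id": 2, "judged_correct": True},
    {"model": "model-large", "question_id": 1, "judged_correct": True},
    {"model": "model-large", "question_id": 2, "judged_correct": True},
]

print(accuracy_by_model(sample))
print(select_student(sample))
```

With real data, the records would come from the dataset's own columns once its schema has been inspected, and a tie-breaking rule (for example, preferring the smaller model) may be worth adding.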

Provider:

llm-compe-2025-kato
