haoranli-ml/AIME2025_Qwen3-4B-Instruct-raw-sft_rl_16384budget_16rollouts_0.8temp
Source: Hugging Face. Updated 2025-11-08. Indexed 2025-11-15.
Download link:
https://hf-mirror.com/datasets/haoranli-ml/AIME2025_Qwen3-4B-Instruct-raw-sft_rl_16384budget_16rollouts_0.8temp
Description:
Each record in the dataset contains a problem, its answer, the tokenized prompt, multiple responses, the reward for each response, the maximum reward, and the average reward. The dataset has a single split, AIME2025, with 30 examples.
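As a minimal sketch of how the maximum and average reward fields relate to the per-response rewards, the snippet below recomputes both from a list of rewards. The record layout, the field names (`problem`, `answer`, `rewards`), and the 0/1 reward values are assumptions made for illustration; the listing does not give the actual column names or reward scale. The 16 responses per problem is inferred from "16rollouts" in the dataset name.

```python
from statistics import mean


def summarize_rewards(rewards):
    """Return (max_reward, avg_reward) for one problem's per-response rewards."""
    if not rewards:
        raise ValueError("expected at least one reward")
    return max(rewards), mean(rewards)


# Hypothetical record: field names and values are illustrative only,
# not taken from the dataset itself.
example = {
    "problem": "placeholder problem text",
    "answer": "placeholder answer",
    "rewards": [1.0, 0.0, 1.0, 1.0] + [0.0] * 12,  # 16 rollouts, assumed 0/1 rewards
}

max_reward, avg_reward = summarize_rewards(example["rewards"])
print(max_reward, avg_reward)
```

Under the 0/1 assumption, the maximum reward indicates whether any response solved the problem, and the average reward is the fraction of responses that did.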
Provider:
haoranli-ml



