haoranli-ml/AIME2025_Qwen3-4B-Instruct_shallow-summed-sft_rl_32768budget_16rollouts_0.8temp
Source: Hugging Face. Updated 2025-11-09; indexed 2025-11-15.
Download link:
https://hf-mirror.com/datasets/haoranli-ml/AIME2025_Qwen3-4B-Instruct_shallow-summed-sft_rl_32768budget_16rollouts_0.8temp
Description:
The dataset includes fields for the problem, answer, tokenized prompt, responses, rewards, maximum reward, and mean reward. It has a single split, AIME2025, containing 30 examples. The README does not describe the dataset's content or intended use in detail.
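The relationship between the per-response rewards and the two summary fields can be illustrated with a minimal sketch. The key names (`rewards`, `max_reward`, `mean_reward`) and the sample record below are hypothetical, since the README does not document the exact schema; the sketch also assumes that the summary fields are simply the maximum and the arithmetic mean of one problem's rewards.

```python
from statistics import mean


def summarize_rewards(record):
    """Return the max and mean of one record's per-response rewards.

    Assumes `record["rewards"]` is a non-empty list of numbers, one per
    sampled response (hypothetical key name, not confirmed by the README).
    """
    rewards = record["rewards"]
    return {"max_reward": max(rewards), "mean_reward": mean(rewards)}


# Hypothetical record with four responses; the real rows may hold more.
example = {
    "problem": "placeholder problem text",
    "answer": "42",
    "responses": ["response A", "response B", "response C", "response D"],
    "rewards": [1.0, 0.0, 0.0, 1.0],
}

summary = summarize_rewards(example)
print(summary)
```

Under this reading, the maximum reward indicates whether any sampled response solved the problem, while the mean reward is the fraction of correct responses when rewards are 0 or 1.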
Provider:
haoranli-ml



