internlm/OREAL-RL-Prompts
Source: Hugging Face | Updated: 2025-02-17 | Indexed: 2025-02-15
Download link:
https://hf-mirror.com/datasets/internlm/OREAL-RL-Prompts
Description:
The OREAL-RL-Prompts dataset contains the prompts used in the reinforcement learning training phase of the OREAL project. The prompts are drawn mainly from MATH, Numina, and historical AMC/AIME exams (excluding 2024). For each prompt, the OREAL-7B-SFT model was run 16 times, and the fraction of correct runs is recorded as the prompt's pass rate.
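As a rough illustration of the pass-rate idea, the sketch below computes the fraction of correct responses among a set of sampled responses for a single prompt. It is not the OREAL project's code: the function name `pass_rate`, the exact-match checker, and the sample answers are all assumptions made for this example, and the real pipeline presumably uses its own answer verifier.

```python
from typing import Callable, Sequence


def pass_rate(responses: Sequence[str], is_correct: Callable[[str], bool]) -> float:
    """Return the fraction of sampled responses that the checker judges correct."""
    if not responses:
        raise ValueError("need at least one sampled response")
    return sum(1 for r in responses if is_correct(r)) / len(responses)


# Hypothetical data: 16 sampled final answers for one prompt whose
# reference answer is "42"; 10 of them match.
samples = ["42"] * 10 + ["41"] * 6
rate = pass_rate(samples, lambda answer: answer.strip() == "42")
print(rate)  # 0.625
```

With 16 samples per prompt, the pass rate can only take values in steps of 1/16, from 0.0 (never solved) to 1.0 (always solved).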
Provider:
internlm



