OREAL-RL-Prompts
Source: ModelScope community. Updated 2026-01-06. Indexed 2025-05-31.
Download link:
https://modelscope.cn/datasets/Shanghai_AI_Laboratory/OREAL-RL-Prompts
Resource description:
# OREAL-RL-Prompts
## Links
- [Arxiv](https://arxiv.org/abs/2502.06781)
- [Github](https://github.com/InternLM/OREAL)
- [OREAL-7B Model](https://huggingface.co/internlm/OREAL-7B)
- [OREAL-32B Model](https://huggingface.co/internlm/OREAL-32B)
- [Data](https://huggingface.co/datasets/InternLM/OREAL-RL-Prompts)
## Introduction
This repository contains the prompts used in the RL training phase of the OREAL project. The prompts are collected from MATH, Numina, and historical AMC/AIME problems (2024 is excluded). The pass rate of each prompt is calculated from 16 inference runs with OREAL-7B-SFT.
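As an illustration of the pass-rate statistic described above, here is a minimal sketch of how a per-prompt pass rate over 16 sampled answers could be computed. The function names (`pass_rate`, `exact_match`), the exact-match check, and the sample answers are all hypothetical; the source does not describe how OREAL verifies answers or how responses are sampled.

```python
from typing import Callable, Sequence


def exact_match(answer: str, reference: str) -> bool:
    """Hypothetical verifier: compares answers after trimming whitespace.

    A real math verifier would likely normalize expressions; this is an
    assumption made only to keep the sketch self-contained.
    """
    return answer.strip() == reference.strip()


def pass_rate(
    answers: Sequence[str],
    reference: str,
    is_correct: Callable[[str, str], bool] = exact_match,
) -> float:
    """Return the fraction of sampled answers judged correct."""
    if not answers:
        raise ValueError("at least one sampled answer is required")
    correct = sum(1 for a in answers if is_correct(a, reference))
    return correct / len(answers)


# Made-up example: 16 sampled answers for one prompt, 12 of them correct.
sampled_answers = ["42"] * 12 + ["41"] * 4
rate = pass_rate(sampled_answers, reference="42")
print(rate)  # 0.75
```

A pass rate computed this way is a value between 0 and 1 per prompt, which can be used to filter or weight prompts by difficulty.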
Provider:
maas
Created:
2025-05-29



