OREAL-RL-Prompts

Name: OREAL-RL-Prompts
Creator: maas
Published: 2026-01-06 16:34:11
License: no description provided

ModelScope community; updated 2026-01-06; indexed 2025-05-31

Download link:

https://modelscope.cn/datasets/Shanghai_AI_Laboratory/OREAL-RL-Prompts


Resource description:

# OREAL-RL-Prompts

## Links

- [arXiv](https://arxiv.org/abs/2502.06781)
- [GitHub](https://github.com/InternLM/OREAL)
- [OREAL-7B Model](https://huggingface.co/internlm/OREAL-7B)
- [OREAL-32B Model](https://huggingface.co/internlm/OREAL-32B)
- [Data](https://huggingface.co/datasets/InternLM/OREAL-RL-Prompts)

## Introduction

This repository contains the prompts used in the RL training phase of the OREAL project. The prompts are collected from MATH, Numina, and historical AMC/AIME problems (2024 is excluded). The pass rate of each prompt is calculated from 16 inference runs with OREAL-7B-SFT.
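The pass-rate figure described in the introduction can be illustrated with a minimal sketch. It assumes the pass rate is the fraction of the 16 sampled responses judged correct; the function name `pass_rate` and the example correctness flags are hypothetical and are not taken from the OREAL codebase.

```python
from typing import Sequence


def pass_rate(correct_flags: Sequence[bool]) -> float:
    """Return the fraction of sampled responses that were judged correct.

    Assumption: one boolean per inference run (16 runs per prompt in the
    dataset description), True when the final answer was verified correct.
    """
    if not correct_flags:
        raise ValueError("at least one sampled response is required")
    return sum(correct_flags) / len(correct_flags)


# Hypothetical outcome for one prompt: 10 of 16 samples were correct.
example_flags = [True] * 10 + [False] * 6
print(pass_rate(example_flags))  # 0.625
```

Under this reading, a prompt with a pass rate near 0 or 1 gives little training signal, which is one plausible reason to record the value alongside each prompt.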


Provider:

maas

Created:

2025-05-29
