nvidia/Nemotron-Cascade-RL-Instruction-Following
Source: Hugging Face | Updated: 2025-12-16 | Indexed: 2025-12-20
Download link:
https://hf-mirror.com/datasets/nvidia/Nemotron-Cascade-RL-Instruction-Following
Description:
The Nemotron-Cascade-RL-IF-RL dataset is designed for instruction-following reinforcement learning (IF-RL). It contains prompts and associated metadata for improving a language model's instruction-following capability. The dataset is ready for commercial use (with attribution). It contains the following subset: Training Data, with 108,938 samples used for IF-RL training. The training data includes prompts, data sources, and the instruction-following meta annotations required by the rule verifier. The data come from two sources: the filtered and pre-processed [Llama-Nemotron-Post-Training-Dataset](https://huggingface.co/datasets/nvidia/Llama-Nemotron-Post-Training-Dataset/viewer/RL), and augmented instruction-following data built from prompts in [LMSYS-Chat-1M](https://huggingface.co/datasets/lmsys/lmsys-chat-1m).
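To illustrate what "meta annotations required by the rule verifier" could mean in practice, here is a minimal sketch of a rule-based verifier that scores a response against a list of constraints. The constraint ids (`max_words`, `bullet_count`, `contains_keywords`), the `{"id": ..., "kwargs": ...}` annotation layout, and the all-or-nothing reward are assumptions made for illustration; they are not taken from this dataset's schema, so consult the dataset card for the real field names.

```python
import re
from typing import Any, Callable, Dict, List

# Hypothetical constraint checkers. Names and arguments are illustrative
# only and do not reflect the dataset's actual annotation format.
def max_words(response: str, limit: int) -> bool:
    """Pass if the response has at most `limit` whitespace-separated tokens."""
    return len(response.split()) <= limit

def bullet_count(response: str, count: int) -> bool:
    """Pass if the response has exactly `count` lines starting with '-' or '*'."""
    bullets = [ln for ln in response.splitlines() if re.match(r"\s*[-*]\s+", ln)]
    return len(bullets) == count

def contains_keywords(response: str, keywords: List[str]) -> bool:
    """Pass if every keyword appears in the response (case-insensitive)."""
    lowered = response.lower()
    return all(k.lower() in lowered for k in keywords)

CHECKERS: Dict[str, Callable[..., bool]] = {
    "max_words": max_words,
    "bullet_count": bullet_count,
    "contains_keywords": contains_keywords,
}

def verify(response: str, constraints: List[Dict[str, Any]]) -> float:
    """Return 1.0 only if every constraint passes, otherwise 0.0.

    The binary reward is an assumption of this sketch.
    """
    for constraint in constraints:
        checker = CHECKERS[constraint["id"]]
        if not checker(response, **constraint["kwargs"]):
            return 0.0
    return 1.0

# Example annotation attached to a prompt (hypothetical layout).
example_constraints = [
    {"id": "max_words", "kwargs": {"limit": 10}},
    {"id": "bullet_count", "kwargs": {"count": 2}},
    {"id": "contains_keywords", "kwargs": {"keywords": ["Apples"]}},
]
example_response = "- apples are red\n- bananas are yellow"
reward = verify(example_response, example_constraints)
```

Because the checks are deterministic functions of the response text, a reward of this kind can be computed without a learned reward model, which is the usual motivation for pairing prompts with verifier annotations in an RL setup.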
Provider:
nvidia



