Nemotron-RL-instruction_following

Name: Nemotron-RL-instruction_following
Creator: maas
Published: 2025-12-04 09:19:27
License: No description available

ModelScope community. Updated 2025-12-04. Indexed 2025-11-22.

Download link:

https://modelscope.cn/datasets/nv-community/Nemotron-RL-instruction_following

Resource overview:

## Dataset Description:

Nemotron-RL-instruction_following is a dataset created by combining prompts from the [WildChat-1M dataset](https://huggingface.co/datasets/allenai/WildChat-1M) (made available under the [ODC Attribution License](https://opendatacommons.org/licenses/by/1-0/)) with instructions from the [Open-Instruct code base](https://github.com/allenai/open-instruct). The instructions are designed to be easily verifiable, such as requiring responses under 200 words. This makes the dataset well suited for evaluating and training models on objective instruction adherence.

This dataset is released as part of NVIDIA [NeMo Gym](https://github.com/NVIDIA-NeMo/Gym), a framework for building reinforcement learning environments to train large language models. NeMo Gym contains a growing collection of training environments and datasets to enable Reinforcement Learning from Verifiable Reward (RLVR).

NeMo Gym is an open-source library within the [NVIDIA NeMo framework](https://github.com/NVIDIA-NeMo/), NVIDIA's GPU-accelerated, end-to-end training framework for large language models (LLMs), multi-modal models, and speech models.

This dataset is part of the [NeMo Gym Collection](https://huggingface.co/collections/nvidia/nemo-gym).

This dataset is ready for commercial use.

## Dataset Owner(s):

NVIDIA Corporation

## Dataset Creation Date:

September 1st, 2025

## License/Terms of Use:

ODC Attribution License

## Intended Usage:

To be used with [NeMo-Gym](https://github.com/NVIDIA-NeMo/Gym) for post-training LLMs.
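As a rough illustration of what "easily verifiable" means in an RLVR setting, the sketch below checks the example constraint mentioned in the description (a response under 200 words) and maps the result to a binary reward. The function names `under_word_limit` and `binary_reward` are hypothetical, and counting words by whitespace splitting is an assumption; this is not the Open-Instruct or NeMo Gym implementation.

```python
def under_word_limit(response: str, max_words: int = 200) -> bool:
    """Return True if the response has fewer than max_words words.

    Words are counted by splitting on whitespace, which is an
    assumption made for this sketch.
    """
    return len(response.split()) < max_words


def binary_reward(response: str, max_words: int = 200) -> float:
    """Map the check to a 0/1 reward (hypothetical verifier helper)."""
    return 1.0 if under_word_limit(response, max_words) else 0.0


if __name__ == "__main__":
    short_answer = "Paris is the capital of France."
    long_answer = "word " * 250
    print(binary_reward(short_answer))
    print(binary_reward(long_answer))
```

Because the check is a deterministic function of the response text, it needs no human labels or learned reward model, which is the property that makes such instructions usable as a verifiable reward.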
## Dataset Characterization

Data Collection Method
* [Automated]

Labeling Method
* [Automated]

## Dataset Format

Text Only, Compatible with [NeMo-Gym](https://github.com/NVIDIA-NeMo/Gym)

## Dataset Quantification

* Number of records: 46391 tuples of (question, verifiable instruction)
* Features present in record count above: N/A
* Total Data Storage: 93 MB

## Reference(s):

* [NeMo-Gym](https://github.com/NVIDIA-NeMo/Gym)
* [PAPER LINK](https://github.com/allenai/IFBench/blob/main/Precise_IF_Generalization_Abilities.pdf)
* [BLOG POST LINK]()

## Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).

Provider:

maas

Created:

2025-11-15
