Nemotron-3-Nano-RL-Training-Blend

Name: Nemotron-3-Nano-RL-Training-Blend
Creator: maas
Published: 2026-01-02 16:57:01
License: No description available

ModelScope community. Updated 2026-01-02, indexed 2026-01-03.

Download link:

https://modelscope.cn/datasets/nv-community/Nemotron-3-Nano-RL-Training-Blend

Resource description:

## Dataset Description:

Nemotron-3-Nano-RL-Training-Blend is a curated dataset blend used to train the Nemotron-3-Nano-30B-A3B model. The blend consists of the following component datasets, with mixing ratios shown in parentheses:

* [nvidia/Nemotron-RL-instruction_following](https://huggingface.co/datasets/nvidia/Nemotron-RL-instruction_following) (0.17)
* [nvidia/Nemotron-RL-knowledge-mcqa](https://huggingface.co/datasets/nvidia/Nemotron-RL-knowledge-mcqa) (0.20)
* [nvidia/Nemotron-RL-agent-workplace_assistant](https://huggingface.co/datasets/nvidia/Nemotron-RL-agent-workplace_assistant) (0.10)
* [nvidia/Nemotron-RL-instruction_following-structured_outputs](https://huggingface.co/datasets/nvidia/Nemotron-RL-instruction_following-structured_outputs) (0.05)
* [nvidia/Nemotron-RL-coding-competitive_coding](https://huggingface.co/datasets/nvidia/Nemotron-RL-coding-competitive_coding) (0.25)
* [BytedTsinghua-SIA/DAPO-Math-17k](https://huggingface.co/datasets/BytedTsinghua-SIA/DAPO-Math-17k) (0.10)
* [Skywork/Skywork-OR1-RL-Data (excluding OmniMath)](https://huggingface.co/datasets/Skywork/Skywork-OR1-RL-Data) (0.12)

For the BytedTsinghua-SIA/DAPO-Math-17k and Skywork/Skywork-OR1-RL-Data data in this blend, the original records are not replicated in this dataset; instead, placeholders point to entries in the original datasets. A script is provided that downloads the data from the original datasets into the blend. For each dataset a user elects to use, the user is responsible for checking whether the dataset license is fit for the intended purpose.

For the nvidia/Nemotron-RL-coding-competitive_coding data, we do not include samples from the `tacos` or `apps` subsets.

The dataset is preprocessed according to the curriculum described in the Nemotron-3-Nano-30B-A3B technical report (http://research.nvidia.com/labs/nemotron). Samples are ordered from higher pass rate (easier) to lower pass rate (harder), ensuring a balanced learning progression.
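As a rough illustration of the two mechanics described above, the following Python sketch checks the listed mixing ratios and orders samples by pass rate. It is not the NeMo Gym preprocessing code: the `pass_rate` field name, the helper functions, and the reading of each ratio as a fraction of the blend's record count are all assumptions made for this example. The ratios as listed sum to 0.99, so the sketch renormalizes them before use.

```python
# Mixing ratios copied from the component list in the dataset card.
MIX = {
    "nvidia/Nemotron-RL-instruction_following": 0.17,
    "nvidia/Nemotron-RL-knowledge-mcqa": 0.20,
    "nvidia/Nemotron-RL-agent-workplace_assistant": 0.10,
    "nvidia/Nemotron-RL-instruction_following-structured_outputs": 0.05,
    "nvidia/Nemotron-RL-coding-competitive_coding": 0.25,
    "BytedTsinghua-SIA/DAPO-Math-17k": 0.10,
    "Skywork/Skywork-OR1-RL-Data": 0.12,
}


def normalized(mix):
    """Rescale the ratios so that they sum to exactly 1."""
    total = sum(mix.values())
    return {name: weight / total for name, weight in mix.items()}


def approx_counts(mix, n_records):
    """Ballpark records per component.

    Assumption: each ratio is a fraction of the total record count.
    The card does not state this explicitly.
    """
    return {name: round(w * n_records) for name, w in normalized(mix).items()}


def order_by_pass_rate(samples, key="pass_rate"):
    """Order samples from higher pass rate (easier) to lower (harder).

    `key` is a hypothetical field name; the real schema may differ.
    sorted() is stable, so samples with equal pass rates keep their order.
    """
    return sorted(samples, key=lambda sample: sample[key], reverse=True)


if __name__ == "__main__":
    for name, count in approx_counts(MIX, 93244).items():
        print(f"{name}: ~{count}")

    toy = [
        {"id": "a", "pass_rate": 0.2},
        {"id": "b", "pass_rate": 0.9},
        {"id": "c", "pass_rate": 0.5},
    ]
    print([s["id"] for s in order_by_pass_rate(toy)])
```

With the 93244 records reported for the blend, `approx_counts(MIX, 93244)` gives only a ballpark per-component size under the stated assumption; the actual composition may differ.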
This dataset is released as part of NVIDIA [NeMo Gym](https://github.com/NVIDIA-NeMo/Gym), a framework for building reinforcement learning environments to train large language models. NeMo Gym contains a growing collection of training environments and datasets to enable Reinforcement Learning from Verifiable Reward (RLVR). NeMo Gym is an open-source library within the [NVIDIA NeMo framework](https://github.com/NVIDIA-NeMo/), NVIDIA's GPU-accelerated, end-to-end training framework for large language models (LLMs), multi-modal models and speech models.

This dataset is part of the [NeMo Gym](https://github.com/NVIDIA-NeMo/Gym) family.

This dataset is ready for commercial use.

## Dataset Owner(s):

NVIDIA Corporation

## Dataset Creation Date:

12/20/2025

## License/Terms of Use:

ODC Attribution License

## Intended Usage:

To be used with [NeMo Gym](https://github.com/NVIDIA-NeMo/Gym) for post-training LLMs.

## Dataset Characterization

** Data Collection Method
* [Synthetic]

** Labeling Method
* [Synthetic]

This dataset contains synthetic data created using:

* [nvidia/Nemotron-RL-instruction_following](https://huggingface.co/datasets/nvidia/Nemotron-RL-instruction_following) (0.17)
* [nvidia/Nemotron-RL-knowledge-mcqa](https://huggingface.co/datasets/nvidia/Nemotron-RL-knowledge-mcqa) (0.20)
* [nvidia/Nemotron-RL-agent-workplace_assistant](https://huggingface.co/datasets/nvidia/Nemotron-RL-agent-workplace_assistant) (0.10)
* [nvidia/Nemotron-RL-instruction_following-structured_outputs](https://huggingface.co/datasets/nvidia/Nemotron-RL-instruction_following-structured_outputs) (0.05)
* [nvidia/Nemotron-RL-coding-competitive_coding](https://huggingface.co/datasets/nvidia/Nemotron-RL-coding-competitive_coding) (0.25)
* [BytedTsinghua-SIA/DAPO-Math-17k](https://huggingface.co/datasets/BytedTsinghua-SIA/DAPO-Math-17k) (0.10)
* [Skywork/Skywork-OR1-RL-Data (excluding OmniMath)](https://huggingface.co/datasets/Skywork/Skywork-OR1-RL-Data) (0.12)

## Dataset Format

Text Only,
Compatible with [NeMo Gym](https://github.com/NVIDIA-NeMo/Gym)

## Dataset Quantification

* Number of records: 93244 samples
* Total Data Storage: 6.92 GB

## Reference(s):

* [NeMo Gym](https://github.com/NVIDIA-NeMo/Gym)

## Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns here.

Provider: maas

Created: 2025-12-16
