nvidia/Nemotron-Cascade-2-SFT-Data
Source: Hugging Face. Updated 2026-03-19; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/nvidia/Nemotron-Cascade-2-SFT-Data
Resource description:
---
license: other
license_name: nvidia-open-model-license
license_link: >-
  https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/
configs:
- config_name: math
  data_files:
  - split: train
    path: math/*
- config_name: science
  data_files:
  - split: train
    path: science/*
- config_name: chat
  data_files:
  - split: train
    path: chat/*
- config_name: instruction_following
  data_files:
  - split: train
    path: instruction_following/*
- config_name: safety
  data_files:
  - split: train
    path: safety/*
- config_name: conversational_agent
  data_files:
  - split: train
    path: conversational_agent/*
- config_name: swe
  data_files:
  - split: train
    path: swe/*
- config_name: terminal_agent
  data_files:
  - split: train
    path: terminal_agent/*
---
# Nemotron-Cascade-2-SFT-Data
We release the SFT data used for training [Nemotron-Cascade-2](https://huggingface.co/nvidia/Nemotron-Cascade-2-30B-A3B).
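Each domain is published as its own config with a single `train` split. A minimal sketch of loading one domain with the Hugging Face `datasets` library follows; the helper name `load_domain` is hypothetical, and the record schema is not documented in this card, so inspect a sample before relying on field names.

```python
# Config names copied from the YAML front matter of this dataset card.
CONFIG_NAMES = (
    "math",
    "science",
    "chat",
    "instruction_following",
    "safety",
    "conversational_agent",
    "swe",
    "terminal_agent",
)
REPO_ID = "nvidia/Nemotron-Cascade-2-SFT-Data"


def load_domain(config_name: str, streaming: bool = True):
    """Load the train split of one domain config (hypothetical helper)."""
    if config_name not in CONFIG_NAMES:
        raise ValueError(
            f"unknown config {config_name!r}; expected one of {CONFIG_NAMES}"
        )
    # Imported lazily so the helper can be defined without `datasets` installed.
    from datasets import load_dataset

    # streaming=True avoids downloading a whole domain before reading it.
    return load_dataset(REPO_ID, config_name, split="train", streaming=streaming)
```

For example, `next(iter(load_domain("safety")))` should return the first record of the smallest domain, which is a cheap way to check the fields before processing the larger configs.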
## Data sources
#### Math
Our non-proof math prompts are sourced from [Nemotron-Cascade-1-SFT](https://huggingface.co/datasets/nvidia/Nemotron-Cascade-SFT-Stage-2) and [Nemotron-Math-v2](https://huggingface.co/datasets/nvidia/Nemotron-Math-v2), with responses generated by DeepSeek-V3.2, DeepSeek-V3.2-Speciale, and GPT-OSS-120B. For mathematical proofs, prompts are taken from [Nemotron-Math-Proofs-v1](https://huggingface.co/datasets/nvidia/Nemotron-Math-Proofs-v1), and responses are generated using DeepSeek-V3.2-Speciale.
#### Science
We collect science prompts from [Nemotron-Cascade-1-SFT](https://huggingface.co/datasets/nvidia/Nemotron-Cascade-SFT-Stage-2) and [Nemotron-Science-v1](https://huggingface.co/datasets/nvidia/Nemotron-Science-v1), covering physics, chemistry, and biology. Responses are generated by GPT-OSS-120B.
#### General Chat
We source general chat samples from [Nemotron-Cascade-1-SFT](https://huggingface.co/datasets/nvidia/Nemotron-Cascade-SFT-Stage-2) and [Nemotron-Instruction-Following-Chat-v1](https://huggingface.co/datasets/nvidia/Nemotron-Instruction-Following-Chat-v1).
#### Instruction Following
The samples are sourced from [Nemotron-Cascade-1-SFT](https://huggingface.co/datasets/nvidia/Nemotron-Cascade-SFT-Stage-2) and [Nemotron-Instruction-Following-Chat-v1](https://huggingface.co/datasets/nvidia/Nemotron-Instruction-Following-Chat-v1).
#### Safety
The samples are sourced from [Nemotron-SFT-Safety-v1](https://huggingface.co/datasets/nvidia/Nemotron-SFT-Safety-v1).
#### Conversational Agent
The prompts are sourced from [Nemotron-Agentic-v1](https://huggingface.co/datasets/nvidia/Nemotron-Agentic-v1) and [Nemotron-RL-Agentic-Conversational-Tool-Use-Pivot-v1](https://huggingface.co/datasets/nvidia/Nemotron-RL-Agentic-Conversational-Tool-Use-Pivot-v1), with responses generated by Qwen3-235B-A22B-Thinking-2507, Qwen3-32B, Qwen3-235B-A22B-Instruct-2507, and GPT-OSS-120B.
#### Software Engineering Agent
We collect agentless samples from [Nemotron-Cascade-1-SFT-SWE](https://huggingface.co/datasets/nvidia/Nemotron-Cascade-1-SFT-SWE), covering buggy code localization, code repair, and test case generation. Agentic samples are drawn from [SWE-Gym](https://huggingface.co/datasets/SWE-Gym/SWE-Gym), [SWE-rebench](https://huggingface.co/datasets/nebius/SWE-rebench), and [R2E-Gym-Subset](https://huggingface.co/datasets/R2E-Gym/R2E-Gym-Subset).
#### Terminal Agent
The samples are sourced from [Nemotron-Terminal-Corpus](https://huggingface.co/datasets/nvidia/Nemotron-Terminal-Corpus).
## Training
We pack all SFT samples into sequences of up to 256K tokens and train the model in a single stage. Empirically, we find that the SFT model reaches optimal performance after approximately 1.5 epochs.
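The card does not describe the packing algorithm itself. As an illustration only, the sketch below assumes a simple greedy, order-preserving scheme that groups tokenized samples until the next one would exceed the length budget. The function name `pack_samples`, the reading of 256K as 262,144 tokens, and the choice to skip over-long samples are all assumptions.

```python
from typing import Iterable, List

MAX_PACKED_TOKENS = 256 * 1024  # assumption: "256K" means 262,144 tokens


def pack_samples(
    lengths: Iterable[int], max_tokens: int = MAX_PACKED_TOKENS
) -> List[List[int]]:
    """Greedily group sample indices so each group's total length fits max_tokens."""
    packs: List[List[int]] = []
    current: List[int] = []
    used = 0
    for idx, n_tokens in enumerate(lengths):
        if n_tokens > max_tokens:
            # Illustrative choice: skip samples that cannot fit in any pack.
            continue
        if used + n_tokens > max_tokens:
            # Current pack is full; close it and start a new one.
            packs.append(current)
            current, used = [], 0
        current.append(idx)
        used += n_tokens
    if current:
        packs.append(current)
    return packs
```

With a toy budget, `pack_samples([4, 3, 5, 9, 2], max_tokens=8)` groups samples 0 and 1 together, skips the 9-token sample, and groups samples 2 and 4. A production pipeline would also need attention masking or position resets between packed samples, which this sketch omits.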
| Hyperparameter | Value |
| :--- | :---: |
| Global Batch Size | 64 |
| Packed Sequence Length | 256K |
| Max Learning Rate | 5e-5 |
| Min Learning Rate | 5e-6 |
| Learning Rate Warmup Steps | 200 |
| Scheduler | cosine |
| Max Steps | 40,000 |
| Optimizer | AdamW |
| Optimizer Config | beta_1=0.9<br>beta_2=0.98 |
| Weight Decay | 0.1 |
| # of training steps | 33,000 |
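To make the schedule in the table concrete, the sketch below computes the learning rate at a given step. The table only states "cosine" with 200 warmup steps, so the exact shape is an assumption: linear warmup from zero, then cosine decay from the max to the min learning rate over the 40,000-step horizon. The function name `lr_at` is hypothetical.

```python
import math

MAX_LR = 5e-5
MIN_LR = 5e-6
WARMUP_STEPS = 200
MAX_STEPS = 40_000      # scheduler horizon from the table
TRAINED_STEPS = 33_000  # steps actually run, per the table


def lr_at(step: int) -> float:
    """Learning rate at `step` under the assumed warmup + cosine shape."""
    if step < WARMUP_STEPS:
        # Assumption: linear warmup starting from zero.
        return MAX_LR * step / WARMUP_STEPS
    # Fraction of the decay phase completed, clamped at the horizon.
    progress = min(1.0, (step - WARMUP_STEPS) / (MAX_STEPS - WARMUP_STEPS))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + (MAX_LR - MIN_LR) * cosine
```

Under these assumptions, stopping at 33,000 of 40,000 steps means training ends before the schedule reaches its floor, so `lr_at(TRAINED_STEPS)` is still somewhat above the min learning rate.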
## Statistics
| Domain | # Samples |
| :--- | :---: |
| Math | 5,226,364 |
| Science | 2,717,163 |
| General Chat | 13,972,873 |
| Instruction Following | 820,263 |
| Safety | 3,570 |
| Conversational Agent | 822,213 |
| Software Engineering Agent | 439,610 |
| Terminal Agent | 822,213 |
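As a quick check on the table above, the following sketch sums the per-domain counts and prints each domain's share of the total. The counts are copied verbatim from the table; the variable names are illustrative.

```python
# Sample counts copied from the statistics table.
SAMPLES_PER_DOMAIN = {
    "Math": 5_226_364,
    "Science": 2_717_163,
    "General Chat": 13_972_873,
    "Instruction Following": 820_263,
    "Safety": 3_570,
    "Conversational Agent": 822_213,
    "Software Engineering Agent": 439_610,
    "Terminal Agent": 822_213,
}

total_samples = sum(SAMPLES_PER_DOMAIN.values())
shares = {domain: n / total_samples for domain, n in SAMPLES_PER_DOMAIN.items()}

# Print domains from largest to smallest with their share of all samples.
for domain, n in sorted(SAMPLES_PER_DOMAIN.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{domain:<28}{n:>12,}{shares[domain]:>9.2%}")
print(f"{'Total':<28}{total_samples:>12,}")
```

The counts sum to 24,824,269 samples, with General Chat accounting for a little over half of them. Sample counts do not reflect token counts, so the share of training tokens per domain may differ.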
## Release Date
Mar 19, 2026
## License
Your use of this dataset is governed by the [NVIDIA Open Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/).
## Citation
```
@article{Nemotron_Cascade_2,
title={Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation},
author={Yang, Zhuolin and Liu, Zihan and Chen, Yang and Dai, Wenliang and Wang, Boxin and Lin, Sheng-Chieh and Lee, Chankyu and Chen, Yangyi and Jiang, Dongfu and He, Jiafan and Pi, Renjie and Lam, Grace and Lee, Nayeon and Bukharin, Alexander and Shoeybi, Mohammad and Catanzaro, Bryan and Ping, Wei},
year={2026}
}
```
Provider:
nvidia



