nvidia/Nemotron-Cascade-2-SFT-Data

Name: nvidia/Nemotron-Cascade-2-SFT-Data
Creator: nvidia
Published: 2026-03-19 23:57:48
License: No description provided

Hugging Face | Updated 2026-03-19 | Indexed 2026-03-29

Download link:

https://hf-mirror.com/datasets/nvidia/Nemotron-Cascade-2-SFT-Data
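As a minimal sketch of how the data could be pulled programmatically, the snippet below uses `load_dataset` from the Hugging Face `datasets` library. The repository ID and the eight config names are taken from the dataset card further down; the helper name `load_domain` is hypothetical, and `streaming=True` is chosen here only to avoid downloading a whole domain up front. The record fields are not documented on this page, so the example prints the keys of the first record rather than assuming any.

```python
# Repository ID and config names are copied from the dataset card's front matter.
REPO_ID = "nvidia/Nemotron-Cascade-2-SFT-Data"
CONFIGS = [
    "math",
    "science",
    "chat",
    "instruction_following",
    "safety",
    "conversational_agent",
    "swe",
    "terminal_agent",
]


def load_domain(config_name, streaming=True):
    """Return the train split of one domain config (hypothetical helper)."""
    if config_name not in CONFIGS:
        raise ValueError(f"unknown config {config_name!r}; expected one of {CONFIGS}")
    # Imported lazily so the constants above can be used without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name, split="train", streaming=streaming)


if __name__ == "__main__":
    # Peek at one record of the smallest domain to discover its fields.
    stream = load_domain("safety")
    first_record = next(iter(stream))
    print(sorted(first_record.keys()))
```

Each config exposes a single `train` split, so no other split name should be needed.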


Description:

---
license: other
license_name: nvidia-open-model-license
license_link: >-
  https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/
configs:
- config_name: math
  data_files:
  - split: train
    path: math/*
- config_name: science
  data_files:
  - split: train
    path: science/*
- config_name: chat
  data_files:
  - split: train
    path: chat/*
- config_name: instruction_following
  data_files:
  - split: train
    path: instruction_following/*
- config_name: safety
  data_files:
  - split: train
    path: safety/*
- config_name: conversational_agent
  data_files:
  - split: train
    path: conversational_agent/*
- config_name: swe
  data_files:
  - split: train
    path: swe/*
- config_name: terminal_agent
  data_files:
  - split: train
    path: terminal_agent/*
---

# Nemotron-Cascade-2-SFT-Data

We release the SFT data used for training [Nemotron-Cascade-2](https://huggingface.co/nvidia/Nemotron-Cascade-2-30B-A3B).

## Data sources

#### Math

Our non-proof math prompts are sourced from [Nemotron-Cascade-1-SFT](https://huggingface.co/datasets/nvidia/Nemotron-Cascade-SFT-Stage-2) and [Nemotron-Math-v2](https://huggingface.co/datasets/nvidia/Nemotron-Math-v2), with responses generated by DeepSeek-V3.2, DeepSeek-V3.2-Speciale, and GPT-OSS-120B. For mathematical proofs, prompts are taken from [Nemotron-Math-Proofs-v1](https://huggingface.co/datasets/nvidia/Nemotron-Math-Proofs-v1), and responses are generated using DeepSeek-V3.2-Speciale.

#### Science

We collect science prompts from [Nemotron-Cascade-1-SFT](https://huggingface.co/datasets/nvidia/Nemotron-Cascade-SFT-Stage-2) and [Nemotron-Science-v1](https://huggingface.co/datasets/nvidia/Nemotron-Science-v1), covering physics, chemistry, and biology. Responses are generated by GPT-OSS-120B.

#### General Chat

We source general chat samples from [Nemotron-Cascade-1-SFT](https://huggingface.co/datasets/nvidia/Nemotron-Cascade-SFT-Stage-2) and [Nemotron-Instruction-Following-Chat-v1](https://huggingface.co/datasets/nvidia/Nemotron-Instruction-Following-Chat-v1).

#### Instruction Following

The samples are sourced from [Nemotron-Cascade-1-SFT](https://huggingface.co/datasets/nvidia/Nemotron-Cascade-SFT-Stage-2) and [Nemotron-Instruction-Following-Chat-v1](https://huggingface.co/datasets/nvidia/Nemotron-Instruction-Following-Chat-v1).

#### Safety

The samples are sourced from [Nemotron-SFT-Safety-v1](https://huggingface.co/datasets/nvidia/Nemotron-SFT-Safety-v1).

#### Conversational Agent

The prompts are sourced from [Nemotron-Agentic-v1](https://huggingface.co/datasets/nvidia/Nemotron-Agentic-v1) and [Nemotron-RL-Agentic-Conversational-Tool-Use-Pivot-v1](https://huggingface.co/datasets/nvidia/Nemotron-RL-Agentic-Conversational-Tool-Use-Pivot-v1), with responses generated by Qwen3-235B-A22B-Thinking-2507, Qwen3-32B, Qwen3-235B-A22B-Instruct-2507, and GPT-OSS-120B.

#### Software Engineering Agent

We collect agentless samples from [Nemotron-Cascade-1-SFT-SWE](https://huggingface.co/datasets/nvidia/Nemotron-Cascade-1-SFT-SWE), covering buggy code localization, code repair, and test case generation. Agentic samples are drawn from [SWE-Gym](https://huggingface.co/datasets/SWE-Gym/SWE-Gym), [SWE-rebench](https://huggingface.co/datasets/nebius/SWE-rebench), and [R2E-Gym-Subset](https://huggingface.co/datasets/R2E-Gym/R2E-Gym-Subset).

#### Terminal Agent

The samples are sourced from [Nemotron-Terminal-Corpus](https://huggingface.co/datasets/nvidia/Nemotron-Terminal-Corpus).

## Training

We pack all SFT samples into sequences of up to 256K tokens and train the model in a single stage. Empirically, we find that the SFT model reaches optimal performance after approximately 1.5 epochs.

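To make the packing step concrete, here is a minimal sketch of one way to group tokenized samples into packed sequences. The card does not describe the packing algorithm, so everything here is an assumption for illustration: a simple greedy, order-preserving strategy, "256K" read as 262,144 tokens, truncation of over-long samples, and the hypothetical function name `pack_samples`.

```python
from typing import Iterable, List

# Assumption: "256K" is interpreted as 256 * 1024 tokens.
PACKED_SEQ_LEN = 256 * 1024


def pack_samples(
    samples: Iterable[List[int]], max_len: int = PACKED_SEQ_LEN
) -> List[List[List[int]]]:
    """Greedily group tokenized samples so each pack holds at most max_len tokens."""
    packs: List[List[List[int]]] = []
    current: List[List[int]] = []
    current_len = 0
    for tokens in samples:
        # Assumption: a sample longer than max_len is truncated rather than dropped.
        tokens = tokens[:max_len]
        # Start a new pack when the next sample would overflow the current one.
        if current and current_len + len(tokens) > max_len:
            packs.append(current)
            current, current_len = [], 0
        current.append(tokens)
        current_len += len(tokens)
    if current:
        packs.append(current)
    return packs


if __name__ == "__main__":
    # Toy example with a tiny limit of 8 tokens per pack.
    toy = [[1] * 3, [2] * 4, [3] * 2, [4] * 8, [5] * 1]
    for pack in pack_samples(toy, max_len=8):
        print([len(sample) for sample in pack])
```

A greedy pass like this keeps samples in their original order; a bin-packing heuristic could waste fewer tokens per sequence at the cost of reordering.
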
| Hyperparameter | Value |
| :--- | :---: |
| Global Batch Size | 64 |
| Packed Sequence Length | 256K |
| Max Learning Rate | 5e-5 |
| Min Learning Rate | 5e-6 |
| Learning Rate Warmup Steps | 200 |
| Scheduler | cosine |
| Max Steps | 40,000 |
| Optimizer | AdamW |
| Optimizer Config | beta_1=0.9<br>beta_2=0.98 |
| Weight Decay | 0.1 |
| # of training steps | 33,000 |

## Statistics

| Domain | # Samples |
| :--- | :---: |
| Math | 5,226,364 |
| Science | 2,717,163 |
| General Chat | 13,972,873 |
| Instruction Following | 820,263 |
| Safety | 3,570 |
| Conversational Agent | 822,213 |
| Software Engineering Agent | 439,610 |
| Terminal Agent | 822,213 |

## Release Date

Mar 19, 2026

## License

Your use of this model is governed by the [NVIDIA Open Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/).

## Citation

```
@article{Nemotron_Cascade_2,
  title={Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation},
  author={Yang, Zhuolin and Liu, Zihan and Chen, Yang and Dai, Wenliang and Wang, Boxin and Lin, Sheng-Chieh and Lee, Chankyu and Chen, Yangyi and Jiang, Dongfu and He, Jiafan and Pi, Renjie and Lam, Grace and Lee, Nayeon and Bukharin, Alexander and Shoeybi, Mohammad and Catanzaro, Bryan and Ping, Wei},
  year={2026}
}
```
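The learning-rate rows of the hyperparameter table in the Training section (max 5e-5, min 5e-6, 200 warmup steps, cosine scheduler, 40,000 max steps) can be sketched as a function of the step index. The card gives only those values, so the shape below is an assumption: linear warmup from near zero, followed by cosine decay to the minimum rate over the remaining steps up to Max Steps. The function name `learning_rate` is hypothetical.

```python
import math

# Values copied from the hyperparameter table.
MAX_LR = 5e-5
MIN_LR = 5e-6
WARMUP_STEPS = 200
MAX_STEPS = 40_000


def learning_rate(step: int) -> float:
    """Learning rate at a 0-indexed optimizer step (assumed schedule shape)."""
    if step < WARMUP_STEPS:
        # Assumption: linear warmup that reaches MAX_LR on the last warmup step.
        return MAX_LR * (step + 1) / WARMUP_STEPS
    # Assumption: cosine decay spans the steps between warmup and MAX_STEPS.
    progress = min(1.0, (step - WARMUP_STEPS) / (MAX_STEPS - WARMUP_STEPS))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + (MAX_LR - MIN_LR) * cosine


if __name__ == "__main__":
    for step in (0, 199, 200, 20_000, 33_000, 40_000):
        print(step, f"{learning_rate(step):.3e}")
```

Under these assumptions, a run that stops at the 33,000 training steps listed in the table ends before the schedule has fully decayed to the minimum learning rate.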

Provider:

nvidia
