nvidia/Nemotron-Cascade-SFT-Stage-2

Source: Hugging Face. Updated 2025-12-18; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/nvidia/Nemotron-Cascade-SFT-Stage-2
Description:
Nemotron-Cascade-SFT-Stage-2 is a supervised fine-tuning (SFT) dataset belonging to the second stage of the Nemotron-Cascade project. It covers math, code, science, tool calling, software engineering (SWE), instruction following, and general-domain data. Samples are sourced from several public datasets, including OpenMathReasoning, OpenCodeReasoning, and MagicoderEvolInstruct. Responses were generated with DeepSeek-series models, and some responses include explicit reasoning traces. The dataset also provides per-domain statistics on sample counts and source distribution.
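
To check which fields a record actually contains, one option is to stream a single example instead of downloading the whole dataset. The sketch below assumes the Hugging Face `datasets` library is installed and that the repository is reachable. The helper name `peek_first_example` is hypothetical, and the configuration and split names are discovered at run time because this page does not list them. Setting `HF_ENDPOINT` to the mirror host from the download link is likewise an assumption about how that mirror is meant to be used.

```python
import os

REPO_ID = "nvidia/Nemotron-Cascade-SFT-Stage-2"
MIRROR_ENDPOINT = "https://hf-mirror.com"  # host taken from the download link on this page


def peek_first_example(repo_id: str = REPO_ID, use_mirror: bool = False):
    """Return (config_name, split_name, first_record) via streaming.

    Hypothetical helper: config and split names are looked up at run time,
    since this listing does not document them.
    """
    if use_mirror:
        # Assumption: the mirror is used through HF_ENDPOINT. This only takes
        # effect if it is set before the Hugging Face libraries are imported.
        os.environ.setdefault("HF_ENDPOINT", MIRROR_ENDPOINT)

    # Imported lazily so the environment variable above is seen first.
    from datasets import get_dataset_config_names, load_dataset

    config = get_dataset_config_names(repo_id)[0]
    # Without a split argument, streaming returns a dict-like of iterable splits.
    splits = load_dataset(repo_id, name=config, streaming=True)
    split_name = next(iter(splits))
    record = next(iter(splits[split_name]))
    return config, split_name, record


# Example call (requires network access):
#   config, split, record = peek_first_example()
#   print(config, split, sorted(record.keys()))
```

Streaming keeps the check cheap. Only the first record is fetched, which is enough to see the column names before committing to a full download.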
Provider:
nvidia