saurabh5/nemotron-post-training-dataset-v1-code
收藏Hugging Face2025-08-01 更新2025-10-25 收录
下载链接:
https://hf-mirror.com/datasets/saurabh5/nemotron-post-training-dataset-v1-code
下载链接
链接失效反馈官方服务:
资源简介:
该数据集包含多个字段,如唯一标识符(uuid)、许可证信息(license)、数据生成器(generator)、版本(version)、类别(category)、推理类型(reasoning)等。数据集中的消息包含内容(content)、角色(role)和工具调用(tool_calls)。数据集分为训练集,其大小为约94.5亿字节,共有约189.6万个示例。数据集的总下载大小约为36.8亿字节。
The dataset includes fields such as unique identifier (uuid), license information (license), data generator (generator), version (version), category (category), reasoning type (reasoning), etc. Messages in the dataset contain content (content), role (role), and tool calls (tool_calls). The dataset is split into a training set, which is about 9.45 billion bytes in size and contains about 1.896 million examples. The total download size of the dataset is about 3.68 billion bytes.
提供机构:
saurabh5



