SYNTHETIC-2-SFT-unverified
Source: ModelScope community. Updated: 2025-12-05. Indexed: 2025-07-12.
Download link:
https://modelscope.cn/datasets/PrimeIntellect/SYNTHETIC-2-SFT-unverified
Resource description:
# SYNTHETIC-2
SYNTHETIC-2 is an open reasoning dataset spanning a variety of math, coding, and general reasoning tasks, along with collaboratively generated reasoning traces. The dataset contains both high-quality reasoning traces from DeepSeek-R1-0528, which are well suited for SFT, and multiple reasoning traces from smaller models, which can be used for difficulty estimation.
To read more about our data collection approach, check out our [blog post](https://www.primeintellect.ai/blog/synthetic-2-release).

We release the following final dataset splits on Hugging Face:
- [SYNTHETIC-2](https://huggingface.co/datasets/PrimeIntellect/SYNTHETIC-2): The full SYNTHETIC-2 dataset consisting of all prompts and completions along with rewards
- [SYNTHETIC-2-SFT-verified](https://huggingface.co/datasets/PrimeIntellect/SYNTHETIC-2-SFT-verified): The SFT split of SYNTHETIC-2 with responses from DeepSeek-R1-0528 verified as correct (a reward of 1 for binary rewards and above 0.7 for non-binary rewards)
- [SYNTHETIC-2-SFT-unverified](https://huggingface.co/datasets/PrimeIntellect/SYNTHETIC-2-SFT-unverified): The SFT split of SYNTHETIC-2 with all responses, including those not verified as correct
- [SYNTHETIC-2-RL](https://huggingface.co/datasets/PrimeIntellect/SYNTHETIC-2-RL): The RL subset of SYNTHETIC-2 with difficulty annotations from Qwen3-32B, Qwen3-4B and DeepSeek-R1-0528-Qwen3-8B
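
The verification rule stated for the SFT-verified split (a reward of 1 for binary rewards, above 0.7 for non-binary rewards) can be sketched as a small filter. This is only an illustration of the stated thresholds, not the authors' pipeline: the field names `reward` and `binary_reward` and the toy records are hypothetical, and the actual column names in the released splits may differ.

```python
from typing import Dict, List

# Threshold taken from the split description above ("over 0.7" for non-binary rewards).
NON_BINARY_THRESHOLD = 0.7


def is_verified(reward: float, binary: bool) -> bool:
    """Apply the stated rule: reward == 1 for binary tasks, reward > 0.7 otherwise."""
    if binary:
        return reward == 1
    return reward > NON_BINARY_THRESHOLD


def filter_verified(records: List[Dict]) -> List[Dict]:
    """Keep only records whose reward passes the rule.

    Assumes each record carries hypothetical 'reward' and 'binary_reward' fields.
    """
    return [r for r in records if is_verified(r["reward"], r["binary_reward"])]


# Toy records for illustration only; these are not rows from the dataset.
records = [
    {"id": "a", "reward": 1.0, "binary_reward": True},    # kept
    {"id": "b", "reward": 0.0, "binary_reward": True},    # dropped
    {"id": "c", "reward": 0.85, "binary_reward": False},  # kept
    {"id": "d", "reward": 0.7, "binary_reward": False},   # dropped: not strictly above 0.7
]

verified = filter_verified(records)
verified_ids = [r["id"] for r in verified]
```

Applied to the unverified split, a filter of this shape would in principle recover the verified subset, provided the rows expose a reward and the reward type; whether they do should be checked against the dataset card.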
Provider:
maas
Created:
2025-07-10



