Tachibana2-DeepSeek-R1
Source: ModelScope community. Updated 2025-07-11. Indexed 2025-07-12.
Download link:
https://modelscope.cn/datasets/sequelbox/Tachibana2-DeepSeek-R1
Description:
**[Click here to support our open-source dataset and model releases!](https://huggingface.co/spaces/sequelbox/SupportOpenSource)**
**Tachibana2-DeepSeek-R1** is a code-reasoning dataset, testing the limits of [DeepSeek R1's](https://huggingface.co/deepseek-ai/DeepSeek-R1) coding skills!
This dataset contains:
- 27.2k synthetically generated code-reasoning prompts. All responses are generated using [DeepSeek R1](https://huggingface.co/deepseek-ai/DeepSeek-R1).
- Synthetic prompts are generated using [Llama 3.1 405b Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-405B-Instruct). They are based on the original [sequelbox/Tachibana](https://huggingface.co/datasets/sequelbox/Tachibana) dataset, with increased task complexity.
- Responses demonstrate the code-reasoning capabilities of DeepSeek's 685b-parameter R1 reasoning model.
**Responses have not been filtered or edited at all:** the Tachibana 2 dataset strives to accurately represent the R1 model. Potential issues include inaccurate answers and infinite thought loops. Tachibana 2 is presented as-is, to be used at your discretion.
Users should consider applying their own filtering and manually examining the dataset before using it in training.
Do as you will.
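As one possible starting point for the filtering suggested above, here is a minimal sketch of a heuristic pass that drops empty responses, extremely long responses, and responses that repeat the same line many times (a rough proxy for an infinite thought loop). The column name `response`, the helper names, and all thresholds are assumptions made for illustration; they are not taken from the dataset card, so check the actual schema and tune the values before relying on this.

```python
from collections import Counter


def looks_like_loop(text, min_line_len=20, max_repeats=5):
    """Return True if any line of at least min_line_len characters
    appears more than max_repeats times in the text."""
    lines = [ln.strip() for ln in text.splitlines()]
    counts = Counter(ln for ln in lines if len(ln) >= min_line_len)
    return any(n > max_repeats for n in counts.values())


def sub_filter(rows, response_key="response", max_chars=200_000):
    """Keep rows whose response is non-empty, not excessively long,
    and does not look like a repetition loop.

    response_key is a hypothetical column name; adjust it to the real schema.
    """
    kept = []
    for row in rows:
        response = row.get(response_key) or ""
        if not response.strip():
            continue  # empty or missing response
        if len(response) > max_chars:
            continue  # suspiciously long, possibly a runaway generation
        if looks_like_loop(response):
            continue  # the same long line repeated many times
        kept.append(row)
    return kept


# Toy rows standing in for dataset records (made up for this example).
rows = [
    {"prompt": "p1", "response": "def add(a, b):\n    return a + b"},
    {"prompt": "p2", "response": "Wait, let me reconsider the approach.\n" * 50},
    {"prompt": "p3", "response": ""},
]
kept = sub_filter(rows)
```

A heuristic like this only catches obvious failures. It says nothing about whether an answer is correct, so manual spot checks of the kept rows are still worthwhile.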
Provider:
maas
Created:
2025-07-10



