Tachibana3-Part2-DeepSeek-V3.2
Source: ModelScope community. Updated 2025-12-05; indexed 2025-12-06.
Download link:
https://modelscope.cn/datasets/sequelbox/Tachibana3-Part2-DeepSeek-V3.2
Resource description:
**[Click here to support our open-source dataset and model releases!](https://huggingface.co/spaces/sequelbox/SupportOpenSource)**
**Tachibana3-Part2-DeepSeek-V3.2** is a dataset focused on high-difficulty code-production tasks, testing the limits of [DeepSeek V3.2's](https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp) code-reasoning skills!
This dataset contains 9.3k high-difficulty code-production prompts:
- Questions prioritize real-world, challenging coding tasks across a variety of programming languages and topics.
- Areas of focus include back-end and front-end development, mobile, gamedev, cloud, QA, custom tooling, and embedded systems.
- A wide variety of emphasized languages improves development capability: Python, C, C++, C#, TypeScript, Java, JavaScript, Go, Haskell, R, Ruby, SQL, shell scripts, assembly code, and more!
- Some questions test the model's ability to follow very specific coding instructions, while others provide general business requirements and leave the specific implementation to the model.
- Responses demonstrate the code-reasoning capabilities of DeepSeek's 685B-parameter V3.2 model in reasoning mode.
**Responses have not been edited at all:** the Tachibana dataset strives to accurately represent the V3.2 model. Potential issues may include inaccurate answers and infinite thought loops. Tachibana 3 is presented as-is to be used at your discretion.
**Tachibana 3 is a multi-part dataset;** additional coding queries answered by DeepSeek-V3.1-Terminus [can be found here.](https://huggingface.co/datasets/sequelbox/Tachibana3-Part1-DeepSeek-V3.1-Terminus) The sorting of responses into the two parts is approximate; 1-3% of rows may be placed in the wrong part of the dataset (that is, actually answered by V3.1-Terminus instead of V3.2). This has no meaningful impact on the user.
Users should consider applying their own sub-filtering and manual examination of the dataset before use in training.
Do as you will.
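As one possible starting point for the sub-filtering suggested above, the following is a minimal sketch, not part of the dataset's tooling. It assumes each row is a dict with a text column named `response`; that column name, the thresholds, and both function names are hypothetical and should be adapted to the actual schema. The loop check is a crude heuristic (many repeats of the same long line) and will miss some loops and flag some legitimate code.

```python
from collections import Counter


def looks_like_thought_loop(text, min_line_len=20, max_repeats=5):
    """Heuristic: True if any sufficiently long line repeats more than max_repeats times."""
    lines = (ln.strip() for ln in text.splitlines())
    counts = Counter(ln for ln in lines if len(ln) >= min_line_len)
    return any(n > max_repeats for n in counts.values())


def sub_filter(rows, response_key="response", max_chars=200_000):
    """Keep rows whose response is non-empty, not oversized, and not loop-like.

    `response_key` is an assumed column name, not confirmed by the dataset card.
    """
    kept = []
    for row in rows:
        response = row.get(response_key) or ""
        if not response.strip():
            continue  # empty or missing response
        if len(response) > max_chars:
            continue  # suspiciously long output, possibly a runaway generation
        if looks_like_thought_loop(response):
            continue  # repeated-line pattern typical of a thought loop
        kept.append(row)
    return kept
```

Rows that pass a filter like this still merit manual spot checks, since it does nothing to verify that an answer is correct.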
Provider:
maas
Created:
2025-10-09



