ansulev/Deepseek-v4-pro-max-distill-1000x
Source: Hugging Face. Updated: 2026-04-29. Indexed: 2026-05-03.
Download link:
https://hf-mirror.com/datasets/ansulev/Deepseek-v4-pro-max-distill-1000x
Description:
This dataset contains reasoning traces and final answers generated by DeepSeek-V4-Pro, with samples drawn from the `train` split of the `Jackrong/GLM-5.1-Reasoning-1M-Cleaned` dataset. It is intended primarily for distillation: DeepSeek-V4-Pro exposes its full chain-of-thought (CoT), whereas OpenAI and Gemini models return only summaries of their reasoning. The dataset consists of 1000 samples in JSON Lines format, and each sample contains fields such as id, domain, prompt, reasoning, response, model, and usage.
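The record layout described above can be read with the standard library alone. The following is a minimal sketch, not the dataset author's own loader: it assumes each line is one JSON object carrying the seven listed field names, and the `<think>` wrapper used to join the reasoning and the response into a distillation target is an illustrative convention that the description does not specify. The helper names `iter_records` and `to_distillation_pair` are hypothetical.

```python
import json
from typing import Iterable, Iterator

# Field names taken from the description; the real schema may contain more.
EXPECTED_FIELDS = ("id", "domain", "prompt", "reasoning", "response", "model", "usage")


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty JSON Lines row, checking the expected fields."""
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        missing = [name for name in EXPECTED_FIELDS if name not in record]
        if missing:
            raise ValueError(f"line {line_no}: missing fields {missing}")
        yield record


def to_distillation_pair(record: dict) -> dict:
    """Build a prompt/target pair whose target keeps the full reasoning trace.

    The <think> tags are an assumed convention, not part of the dataset.
    """
    target = f"<think>\n{record['reasoning']}\n</think>\n{record['response']}"
    return {"prompt": record["prompt"], "target": target}
```

Because `iter_records` accepts any iterable of strings, an open file handle can be passed to it directly, and records are then processed one at a time without loading all 1000 samples into memory.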
Provider:
ansulev



