ansulev/Deepseek-v4-pro-max-distill-1000x

Name: ansulev/Deepseek-v4-pro-max-distill-1000x
Creator: ansulev
Published: 2026-04-29 00:33:08
License: No description available

Source: Hugging Face. Updated: 2026-04-29. Indexed: 2026-05-03.

Download link:

https://hf-mirror.com/datasets/ansulev/Deepseek-v4-pro-max-distill-1000x

Description:

This dataset contains reasoning traces and final answers generated by DeepSeek-V4-Pro for prompts sampled from the `train` split of the `Jackrong/GLM-5.1-Reasoning-1M-Cleaned` dataset. It is intended primarily for distillation: DeepSeek-V4-Pro returns the full chain-of-thought (CoT), whereas models from OpenAI and the Gemini family expose only summaries of their reasoning. The dataset consists of 1000 samples in JSON Lines format, and each sample contains fields such as `id`, `domain`, `prompt`, `reasoning`, `response`, `model`, and `usage`.
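The record layout described above can be sketched in Python using only the standard library. This is a minimal illustration, not the dataset author's tooling: the sample record is invented, the helper names `parse_jsonl` and `to_distillation_pair` are hypothetical, the contents of `usage` are left opaque because the description does not specify them, and wrapping the reasoning in `<think>` tags is just one assumed convention for building a distillation target.

```python
import json

# Field names as listed in the dataset description.
EXPECTED_FIELDS = ("id", "domain", "prompt", "reasoning", "response", "model", "usage")


def parse_jsonl(lines):
    """Parse an iterable of JSON Lines strings into dicts, skipping blank lines."""
    records = []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        missing = [name for name in EXPECTED_FIELDS if name not in record]
        if missing:
            raise ValueError(f"line {line_number}: missing fields {missing}")
        records.append(record)
    return records


def to_distillation_pair(record):
    """Build a (prompt, target) pair.

    The <think> wrapper is an assumed convention, not something the
    dataset description prescribes.
    """
    target = f"<think>\n{record['reasoning']}\n</think>\n{record['response']}"
    return record["prompt"], target


# Invented sample record for illustration only; real values will differ.
sample_line = json.dumps({
    "id": "example-0",
    "domain": "math",
    "prompt": "What is 2 + 2?",
    "reasoning": "Adding 2 and 2 gives 4.",
    "response": "4",
    "model": "DeepSeek-V4-Pro",
    "usage": {},  # structure not specified in the description
})

records = parse_jsonl([sample_line, ""])
prompt, target = to_distillation_pair(records[0])
print(prompt)
print(target)
```

To run this against the real data, the same `parse_jsonl` function should accept an open file handle for the downloaded `.jsonl` file, since iterating a text file yields one line at a time.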

Provided by:

ansulev
