five

Trilogix1/reasoning-distill-opus-4-7-max-sft

收藏
Hugging Face2026-04-23 更新2026-04-26 收录
下载链接:
https://hf-mirror.com/datasets/Trilogix1/reasoning-distill-opus-4-7-max-sft
下载链接
链接失效反馈
官方服务:
资源简介:
该数据集包含7,823个单轮推理对话,这些对话来自Claude Opus 4.7模型,并经过重新格式化以适用于监督微调(SFT)。每个对话都包含一个完整的Qwen风格聊天模板对话,其中包含模型生成的思考过程和最终答案。数据集适用于使用`trl.SFTTrainer`和`train_on_responses_only`进行训练,特别关注模型生成的思考部分。数据集的平均每行约4k tokens,部分长尾推理链可达32k tokens。此外,该数据集已用于训练一个名为`lordx64/Qwen3.6-35B-A3B-Claude-4.7-Opus-Reasoning-Distilled`的模型。

This dataset contains 7,823 single-turn reasoning conversations from Claude Opus 4.7, reformatted for supervised fine-tuning (SFT). Each row is a complete Qwen-style chat-template conversation, including the models generated thinking process (`<think>...</think>` block) and final answer. The dataset is designed for use with `trl.SFTTrainer` and `train_on_responses_only`, focusing on the models generated reasoning. On average, each row contains ~4k tokens (Qwen3 tokenizer), with some long-tail reasoning chains extending up to 32k tokens. Additionally, this dataset has been used to train a model named `lordx64/Qwen3.6-35B-A3B-Claude-4.7-Opus-Reasoning-Distilled`.
提供机构:
Trilogix1
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作