CL-From-Nothing/RLVE-Eval20-Qwen3-1.7B-SSD-N20-SFT-Train
收藏Hugging Face2026-04-24 更新2026-04-26 收录
下载链接:
https://hf-mirror.com/datasets/CL-From-Nothing/RLVE-Eval20-Qwen3-1.7B-SSD-N20-SFT-Train
下载链接
链接失效反馈官方服务:
资源简介:
该数据集名为RLVE-Eval20-Qwen3-1.7B-SSD-N20-SFT-Train,主要用于在简单自蒸馏(SSD)设置中进行监督微调(SFT)。数据集包含来自rlve_eval20_filtered的800个提示,每个提示有20个完成样本,总计16,000行数据。数据格式为聊天对话形式,适合多轮SFT训练。数据集适用于RLVE风格的长时任务的学生自训练和分析。
This dataset, named RLVE-Eval20-Qwen3-1.7B-SSD-N20-SFT-Train, is designed for supervised fine-tuning (SFT) in a Simple Self-Distillation (SSD) style setup. It contains 800 prompts from rlve_eval20_filtered, with 20 completions per prompt, totaling 16,000 rows. The data is formatted as chat turns, suitable for multi-turn SFT training. The dataset is intended for training and analysis of SSD-style student self-training on RLVE-style long-horizon tasks.
提供机构:
CL-From-Nothing



