CL-From-Nothing/RLVE-Eval20-Qwen3-4B-SSD-N20-SFT-Train
Source: Hugging Face. Updated 2026-04-26; indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/CL-From-Nothing/RLVE-Eval20-Qwen3-4B-SSD-N20-SFT-Train
Description:
---
license: mit
language:
- en
task_categories:
- text-generation
tags:
- simple-self-distillation
- ssd
- rlve
- qwen3
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train.parquet
---
# RLVE-Eval20 Qwen3-4B SSD N=20 SFT Train
Self-generated SFT corpus for **Simple Self-Distillation (SSD)** with **Qwen/Qwen3-4B**.
- 800 RLVE Eval20 (filtered) prompts × 20 self-samples = **16,000 rows**
- Sampled from a frozen Qwen3-4B (vLLM, `max_tokens=16384`, thinking enabled).
- Stored as VERL `MultiTurnSFTDataset` parquet with a `messages` column.
Companion 1.7B dataset: [CL-From-Nothing/RLVE-Eval20-Qwen3-1.7B-SSD-N20-SFT-Train](https://huggingface.co/datasets/CL-From-Nothing/RLVE-Eval20-Qwen3-1.7B-SSD-N20-SFT-Train).
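
A minimal sketch of loading the train split and checking the 800 × 20 layout described above. It assumes the Hugging Face `datasets` library is installed and that each entry in `messages` is a dict with `role` and `content` keys, which is the usual chat format but is not spelled out on this card. The helper names (`load_train_split`, `prompt_key`, `samples_per_prompt`) are hypothetical and introduced only for illustration.

```python
from collections import Counter

REPO_ID = "CL-From-Nothing/RLVE-Eval20-Qwen3-4B-SSD-N20-SFT-Train"
EXPECTED_PROMPTS = 800    # from the card: 800 filtered RLVE Eval20 prompts
SAMPLES_PER_PROMPT = 20   # from the card: 20 self-samples per prompt


def load_train_split():
    """Load the train split from the Hub (requires `datasets` and network access)."""
    # Imported lazily so the helpers below can be used offline.
    from datasets import load_dataset
    return load_dataset(REPO_ID, split="train")


def prompt_key(messages):
    """Join every non-assistant turn into one string identifying the prompt.

    Assumption: each message is a dict with "role" and "content" keys.
    """
    return "\n".join(m["content"] for m in messages if m["role"] != "assistant")


def samples_per_prompt(rows):
    """Count how many rows share each prompt."""
    return Counter(prompt_key(row["messages"]) for row in rows)
```

To use it, call `ds = load_train_split()` and then `counts = samples_per_prompt(ds)`. If the layout matches the description above, `len(ds)` should equal `EXPECTED_PROMPTS * SAMPLES_PER_PROMPT`, and most values in `counts` should equal `SAMPLES_PER_PROMPT`, though duplicate prompt texts would change the per-prompt counts.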
Provided by:
CL-From-Nothing



