RLVR-SvS/Variational-DAPO

Name: RLVR-SvS/Variational-DAPO
Creator: RLVR-SvS
Published: 2025-08-23 05:40:39
License: No description available

Source: Hugging Face. Updated: 2025-08-23. Indexed: 2025-11-30.

Download link:

https://hf-mirror.com/datasets/RLVR-SvS/Variational-DAPO


Resource overview:

This dataset contains 314k variational problems, each paired with a reference answer, synthesized by the Qwen2.5-32B-Instruct policy over 600 steps of RLVR training on DAPO-17k. The problems were deduplicated with MinHash at a similarity threshold of 0.85 to preserve diversity.
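
The MinHash deduplication mentioned above can be illustrated with a small standard-library sketch. The page does not say which tool or settings the dataset authors used beyond the 0.85 threshold, so everything else here is an assumption made for illustration: the word 3-gram shingling, the 128 hash functions, the salted BLAKE2b hashing, the greedy keep-first pass, and the choice to treat a similarity of exactly 0.85 as a duplicate. The helper names (`shingles`, `minhash_signature`, `estimated_jaccard`, `deduplicate`) are hypothetical.

```python
import hashlib
import re

NUM_PERM = 128    # assumed number of hash functions per signature
THRESHOLD = 0.85  # similarity threshold stated in the dataset description


def shingles(text, n=3):
    """Return the set of word n-grams of a lowercased text (assumed shingling)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < n:
        return {" ".join(words)}
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def minhash_signature(items, num_perm=NUM_PERM):
    """For each salted hash function, keep the minimum hash value over all items."""
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(4, "little")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature slots, an estimate of Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def deduplicate(problems, threshold=THRESHOLD):
    """Greedily keep a problem only if it is below the threshold against all kept ones."""
    kept, kept_signatures = [], []
    for problem in problems:
        signature = minhash_signature(shingles(problem))
        if all(estimated_jaccard(signature, other) < threshold for other in kept_signatures):
            kept.append(problem)
            kept_signatures.append(signature)
    return kept


if __name__ == "__main__":
    sample = [
        "Find x such that 2x + 3 = 7.",
        "Find x such that 2x + 3 = 7.",
        "Compute the area of a circle with radius 5.",
    ]
    for problem in deduplicate(sample):
        print(problem)
```

This pairwise comparison is quadratic in the number of kept problems, which is fine for a demonstration but not for 314k items; at that scale an index such as locality-sensitive hashing over the signatures is the usual choice.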

Provider:

RLVR-SvS
