hkust-nlp/drkernel-rl-data
收藏Hugging Face2026-02-06 更新2026-02-07 收录
下载链接:
https://hf-mirror.com/datasets/hkust-nlp/drkernel-rl-data
下载链接
链接失效反馈官方服务:
资源简介:
DR.Kernel RL数据集是一个用于强化学习训练的数据集,主要提供单轮查询池和参考代码元数据,以支持在线多轮滚动和奖励评估。数据集以Parquet格式存储,包含71,996行数据,每行代表一个优化任务。数据集结构包括数据源、提示、能力标签、奖励模型元数据和额外信息等字段。该数据集主要用于KernelGYM环境中的强化学习训练,支持多轮反馈生成和奖励评估。
The DR.Kernel RL Dataset is designed for reinforcement learning training, providing a single-turn query pool and reference code metadata for reward evaluation. The dataset is stored in Parquet format and contains 71,996 rows, each representing an optimization task. The dataset structure includes fields such as data source, prompt, ability tag, reward model metadata, and extra information. It is primarily used in the KernelGYM environment for RL training, supporting multi-turn feedback generation and reward evaluation.
提供机构:
hkust-nlp



