carbonteq/rg-rubiks_cube-instruct-100k
Source: Hugging Face. Updated 2026-04-21. Indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/carbonteq/rg-rubiks_cube-instruct-100k
Description:
---
license: mit
pretty_name: RLVR (verl) dataset
---
# RLVR generated dataset
Procedural rows from [reasoning-gym](https://github.com/open-thought/reasoning-gym), formatted for [verl](https://github.com/verl-project/verl) GRPO.
## Build metadata
```json
{
"config": "/home/owais/Projects/rlvr/rlvr/configs/datasets/rubiks_cube-instruct.yaml",
"template_type": "qwen-instruct",
"developer_prompt": null,
"data_source": "reasoning_gym",
"default_extract": "answer_tag",
"train_rows": 100000,
"test_rows": 4096,
"train_seed": 42,
"test_seed": 43,
"tasks": {
"rubiks_cube": {
"weight": 1,
"config": {}
}
},
"reasoning_gym_version": "0.1.19"
}
```
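
The `"default_extract": "answer_tag"` entry suggests that a model's final answer is pulled out of a tagged span in the completion before it is scored. The following is a minimal sketch of that idea, not the extraction code used to build or score this dataset. It assumes the tag is literally `<answer>...</answer>` and that the last such span wins; the function name `extract_answer_tag` is hypothetical.

```python
import re
from typing import Optional

# Assumption: "answer_tag" means the answer is wrapped in <answer>...</answer>.
# DOTALL lets an answer span several lines.
_ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def extract_answer_tag(completion: str) -> Optional[str]:
    """Return the stripped contents of the last <answer> span, or None if absent."""
    matches = _ANSWER_RE.findall(completion)
    if not matches:
        return None
    # Assumption: when several spans appear, the final one is the intended answer.
    return matches[-1].strip()


if __name__ == "__main__":
    sample = "Work through the scramble first.\n<answer>R U R' U'</answer>"
    print(extract_answer_tag(sample))
```

Returning `None` for a completion with no tag lets a reward function treat a missing answer differently from a wrong one, which is a common choice in GRPO-style setups.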
Provider:
carbonteq



