open-thoughts/OpenThoughts-Agent-v1-RL

Source: Hugging Face. Updated 2026-01-27. Indexed 2026-01-03.
Download link:
https://hf-mirror.com/datasets/open-thoughts/OpenThoughts-Agent-v1-RL
Description:
OpenThoughts-Agent-v1-RL is a curated reinforcement learning (RL) dataset of ~720 tasks, each with an instruction, an environment, and a verifier, for agentic training. It is part of the OpenThoughts-Agent project, an open-source effort to curate the best datasets for training agents. The dataset is released in two stages: supervised fine-tuning (SFT) and reinforcement learning (RL). The RL dataset contains tasks drawn from the nl2bash verified dataset. To stabilize training, the tasks pass through a three-stage filtration pipeline: pruning tasks with flaky verifiers, checking environment stability, and optionally filtering by difficulty. A task is defined as a triplet of an instruction (a Markdown file), an environment (a Dockerfile), and a verifier (pytest tests). All environments in this release are generic Ubuntu Dockerfiles.
Provided by:
open-thoughts