open-thoughts/OpenThoughts-Agent-v1-SFT

Source: Hugging Face | Updated: 2026-01-27 | Indexed: 2025-12-20
Download link:
https://hf-mirror.com/datasets/open-thoughts/OpenThoughts-Agent-v1-SFT
Overview:
OpenThinker-Agent-v1 is an open-source effort to curate the best datasets for training agents. The first release includes datasets, models, and the research codebase. The OpenThinker-Agent-v1 model is trained for agentic tasks such as Terminal-Bench 2.0 and SWE-Bench. It is post-trained from Qwen/Qwen3-8B in two stages: supervised fine-tuning (SFT) on the OpenThoughts-Agent-v1-SFT dataset, followed by reinforcement learning (RL) on the OpenThoughts-Agent-v1-RL dataset. The OpenThoughts-Agent-v1-SFT dataset contains approximately 15,200 traces from two data sources: nl2bash (simple, synthetically generated tasks for formatting shell commands) and InferredBugs (a set of C# and Java bugs collected by Microsoft). The OpenThoughts-Agent-v1-RL dataset contains approximately 720 tasks drawn from the nl2bash verified dataset. The project also documents the training hyperparameters and framework versions used.
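
For readers who want to inspect the SFT traces, the following is a minimal sketch of how the dataset might be loaded with the Hugging Face `datasets` library through the mirror listed above. It is not taken from the project itself. The helper names `dataset_page_url` and `peek_first_record` are hypothetical, the split name `"train"` is an assumption, and the record schema is not stated in this listing, so the sketch makes no claims about column names.

```python
import os

# Values taken from this listing.
MIRROR = "https://hf-mirror.com"
REPO_ID = "open-thoughts/OpenThoughts-Agent-v1-SFT"


def dataset_page_url(repo_id: str, endpoint: str = MIRROR) -> str:
    """Build the dataset page URL for a repo id on a given Hub endpoint."""
    return f"{endpoint.rstrip('/')}/datasets/{repo_id}"


def peek_first_record(repo_id: str = REPO_ID,
                      split: str = "train",  # assumption: split name not given in the listing
                      endpoint: str = MIRROR) -> dict:
    """Stream a single record instead of downloading the whole dataset.

    Requires `pip install datasets` and network access.
    """
    # HF_ENDPOINT is read when huggingface_hub is first imported, so it is set
    # before the import below; an already exported value is left untouched.
    os.environ.setdefault("HF_ENDPOINT", endpoint)
    from datasets import load_dataset

    stream = load_dataset(repo_id, split=split, streaming=True)
    return next(iter(stream))


# Offline-safe part of the example: reconstruct the download link shown above.
print(dataset_page_url(REPO_ID))
```

Calling `peek_first_record()` and printing `sorted(record.keys())` on the result is one way to discover the actual fields before writing any processing code. Streaming is used here so that a quick look does not require fetching all of the roughly 15,200 traces.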
Provided by:
open-thoughts