aisa-group/PostTrainBench-Trajectories

Name: aisa-group/PostTrainBench-Trajectories
Creator: aisa-group
Published: 2026-04-22 16:54:08
License: No description provided

Source: Hugging Face (updated 2026-04-22, indexed 2026-03-29)

Download link:

https://hf-mirror.com/datasets/aisa-group/PostTrainBench-Trajectories

Description:

The PostTrainBench Agent Traces dataset contains agent traces from the PostTrainBench benchmark, which measures the ability of CLI agents to post-train pre-trained LLMs. Each agent is given a pre-trained base LLM, an evaluation script for a specific benchmark, and 10 hours on an NVIDIA H100 80GB GPU. The agent must autonomously improve the model's performance on the target benchmark using any post-training strategy it chooses (SFT, LoRA, RLHF, etc.). The dataset includes 224 traces across 8 agent runs, with 28 tasks per run (7 benchmarks × 4 base models), and each trace covers up to 10 hours of autonomous agent activity.
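
The counts in the description can be cross-checked with a few lines of arithmetic. The following is a minimal sketch; the variable names are illustrative and not part of the dataset, and the GPU-hour figure is only an upper bound implied by the 10-hour limit per trace, not a reported measurement.

```python
# Counts taken from the dataset description; names are illustrative only.
AGENT_RUNS = 8
BENCHMARKS = 7
BASE_MODELS = 4
MAX_HOURS_PER_TRACE = 10  # each trace covers at most 10 hours

# One task per (benchmark, base model) pair.
tasks_per_run = BENCHMARKS * BASE_MODELS

# One trace per task per agent run.
total_traces = AGENT_RUNS * tasks_per_run

# Upper bound on H100 GPU-hours across all traces (assumes every trace
# uses its full budget, which the description does not claim).
max_gpu_hours = total_traces * MAX_HOURS_PER_TRACE

print(tasks_per_run, total_traces, max_gpu_hours)
```

The product 8 × 7 × 4 matches the 224 traces stated in the description.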

Provided by:

aisa-group
