vex-0/mcparena-runs
Source: Hugging Face | Updated: 2026-04-26 | Indexed: 2026-05-03
Download link:
https://hf-mirror.com/datasets/vex-0/mcparena-runs
Description:
The MCPArena run logs dataset contains training and evaluation JSONL files from the MCPArena platform, intended for reinforcement learning experiments on multi-agent collaborative tasks (MCP). It records episode logs from different training phases (e.g., Phase 1 and Phase 2), including training runs with Qwen2.5 model variants (such as 0.5B, 1.5B, 3B, and 7B), together with evaluation results. Each episode records the seed, phase, and agent information, plus a detailed score breakdown across rubrics such as task success, cost efficiency, adversarial avoidance, selection calibration, and well-formed acting, as well as reward shaping and leak detection mechanisms. The dataset is intended to support reproduction and analysis of reinforcement learning algorithms.
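As a rough illustration of how episode logs like those described above might be analyzed, the sketch below parses JSONL lines and averages per-rubric scores within each phase. The field names (`seed`, `phase`, `agent`, `scores`) and the sample records are assumptions made up for this example; the actual schema of the files is not documented here and should be checked before use.

```python
import json
from collections import defaultdict
from statistics import mean

# Made-up sample records; the field names are assumptions, not the real schema.
SAMPLE_LINES = [
    '{"seed": 1, "phase": 1, "agent": "a", "scores": {"task_success": 1.0, "cost_efficiency": 0.5}}',
    '{"seed": 2, "phase": 1, "agent": "a", "scores": {"task_success": 0.0, "cost_efficiency": 0.5}}',
    '',
    '{"seed": 3, "phase": 2, "agent": "b", "scores": {"task_success": 1.0}}',
]


def load_episodes(lines):
    """Parse an iterable of JSONL lines into dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def mean_scores_by_phase(episodes, scores_key="scores"):
    """Return {phase: {rubric: mean score}} for the given episodes.

    `scores_key` names the (assumed) field that holds the rubric-to-score map.
    Episodes without that field contribute nothing to the averages.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for episode in episodes:
        phase = episode.get("phase")
        for rubric, value in episode.get(scores_key, {}).items():
            buckets[phase][rubric].append(value)
    return {
        phase: {rubric: mean(values) for rubric, values in rubrics.items()}
        for phase, rubrics in buckets.items()
    }


episodes = load_episodes(SAMPLE_LINES)
summary = mean_scores_by_phase(episodes)
print(summary)
```

To run this on a real file, pass an open file handle to `load_episodes` instead of `SAMPLE_LINES`, since iterating a text file yields one line at a time, and adjust `scores_key` to whatever the files actually use.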
Provider:
vex-0



