Transfer Set for Tool Use
Source: arXiv. Indexed: 2025-09-30
Download link:
https://github.com/fomorians/gym_tool_use
Description:
This dataset consists of multiple transfer sets for measuring how well reinforcement learning agents generalize tool use across tasks. It is also used to quantify agents' success rates over all possible transfer sets inspired by tool-use behavior. The evaluation covers five independent trials, each run with a different random seed. The research task is to assess the generalization ability of agents trained on tool-use tasks.
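As a minimal sketch of how success rates over several seeded trials might be summarized per transfer set, the Python snippet below averages per-seed success rates and reports their spread. The transfer-set names, the episode outcomes, and the `summarize` helper are hypothetical placeholders for illustration only; they are not taken from the dataset or from the gym_tool_use repository, and they are not real results.

```python
from statistics import mean, stdev

# Hypothetical placeholder outcomes (NOT real results): for each transfer set,
# one list of episode outcomes per seed, where 1 = success and 0 = failure.
outcomes = {
    "transfer_set_a": [[1, 0, 1, 1], [1, 1, 1, 0], [0, 0, 1, 1], [1, 1, 1, 1], [1, 0, 0, 1]],
    "transfer_set_b": [[0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0], [1, 0, 0, 1], [0, 0, 1, 0]],
}


def summarize(per_seed_outcomes):
    """Return (mean, sample standard deviation) of the per-seed success rates."""
    rates = [sum(episodes) / len(episodes) for episodes in per_seed_outcomes]
    return mean(rates), stdev(rates)


# One (mean, std) pair per transfer set, computed across the five seeds.
summary = {name: summarize(trials) for name, trials in outcomes.items()}

for name, (avg, spread) in summary.items():
    n_seeds = len(outcomes[name])
    print(f"{name}: success rate {avg:.2f} +/- {spread:.2f} over {n_seeds} seeds")
```

Reporting the spread alongside the mean makes it visible when an apparent gap between two transfer sets is within seed-to-seed variation.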



