AgentEval
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/WooooDyy/AgentGym
Description:
This dataset provides a framework with diverse environments and tasks, designed to support broad, real-time, uniform-format, and concurrent agent exploration. It also includes a cross-environment benchmark suite and high-quality trajectories. Its purpose is to support research on the self-evolution of LLM-based agents across different environments.
Provided by:
WooooDyy



