X1AOX1A/WMDataDir
Source: Hugging Face. Updated 2025-11-13; indexed 2025-11-15.
Download link:
https://hf-mirror.com/datasets/X1AOX1A/WMDataDir
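A minimal sketch of fetching the dataset through the mirror host in the link above, assuming the `huggingface_hub` package is installed and that the mirror honors the standard `HF_ENDPOINT` environment variable. The helper names `download_kwargs` and `download` and the default `local_dir` are hypothetical and chosen for illustration; they are not part of the listing.

```python
import os

REPO_ID = "X1AOX1A/WMDataDir"              # repository name from the listing title
MIRROR_ENDPOINT = "https://hf-mirror.com"  # host of the download link above


def download_kwargs(local_dir="WMDataDir"):
    """Build keyword arguments for huggingface_hub.snapshot_download.

    local_dir is an assumed target directory, not something the listing specifies.
    """
    return {
        "repo_id": REPO_ID,
        "repo_type": "dataset",  # the link path is /datasets/..., so this is a dataset repo
        "local_dir": local_dir,
    }


def download(local_dir="WMDataDir"):
    """Download the full dataset snapshot and return the local path."""
    # HF_ENDPOINT is read when huggingface_hub is first imported, so set it
    # before the import. setdefault keeps any endpoint the user already chose.
    os.environ.setdefault("HF_ENDPOINT", MIRROR_ENDPOINT)
    from huggingface_hub import snapshot_download

    return snapshot_download(**download_kwargs(local_dir))
```

Calling `download()` performs the network transfer and returns the directory containing the files. If `huggingface_hub` was already imported elsewhere in the process, the environment variable will not take effect and the default Hugging Face endpoint is used instead.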
Description:
AgentGym-RL is a dataset and benchmark for training large language model (LLM) agents through multi-turn reinforcement learning. The framework supports multi-turn decision-making across a range of real-world scenarios, including web navigation, deep search, digital games, embodied tasks, and scientific tasks. The dataset is designed to facilitate long-horizon decision-making without relying on supervised fine-tuning (SFT).
Provided by:
X1AOX1A



