Go-Right
Source: arXiv. Indexed: 2025-09-30
Download link:
https://github.com/LACE-Lab/bounding-box
Description:
Go-Right is a simple, easy-to-understand reinforcement learning problem in which an agent navigates a corridor to earn a reward while observing state indicators whose intensity keeps changing. The dataset covers both idealized hand-coded models and learned models. Training data was gathered under a uniform random behavior policy, and each interaction consists of 500 steps.
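
As a rough illustration of what "a uniform random behavior policy" over a 500-step interaction means, the sketch below collects one such interaction in a toy corridor. Only the 500-step length and the uniform random action choice come from the description above. The corridor length, the reward placement, the indicator values, and the names `step` and `collect_interaction` are hypothetical stand-ins, not the actual Go-Right environment or the code in the linked repository.

```python
import random

# Hypothetical toy corridor, NOT the actual Go-Right environment.
# Corridor length, reward value, and indicator dynamics are assumptions.
CORRIDOR_LENGTH = 10
ACTIONS = ("left", "right")
NUM_STEPS = 500  # steps per interaction, as stated in the description


def step(position, action):
    """Move one cell left or right, clamped to the corridor bounds.

    Assumed reward: 1.0 when the agent ends up in the rightmost cell, else 0.0.
    """
    delta = 1 if action == "right" else -1
    next_position = max(0, min(CORRIDOR_LENGTH - 1, position + delta))
    reward = 1.0 if next_position == CORRIDOR_LENGTH - 1 else 0.0
    return next_position, reward


def collect_interaction(seed=0):
    """Collect one 500-step interaction under a uniform random policy."""
    rng = random.Random(seed)
    position = 0
    transitions = []
    for _ in range(NUM_STEPS):
        action = rng.choice(ACTIONS)  # each action is equally likely
        indicator = rng.random()      # placeholder for an indicator intensity
        next_position, reward = step(position, action)
        transitions.append((position, indicator, action, reward, next_position))
        position = next_position
    return transitions
```

Calling `collect_interaction()` returns a list of 500 `(position, indicator, action, reward, next_position)` tuples, the kind of transition data a learned model could be fit to.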



