Puddle-Jump Gridworld
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/baicenxiao/Shaping-Advice
Description:
This dataset describes an environment with discrete state and action spaces, built on a 10×10 grid. An agent navigates the grid to reach a goal while avoiding obstacles (puddles). The agent receives a reward of -0.05 for each action it takes and a reward of +1000 on reaching the goal. Some states in the environment are indistinguishable from one another, which complicates finding the optimal policy. Overall, the task is a navigation task with sparse rewards.
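The reward structure described above can be illustrated with a minimal Python sketch. Only the grid size (10×10), the per-action reward (-0.05), and the goal reward (+1000) come from the description. Everything else is an assumption made for illustration and is not taken from the linked repository: the `PuddleGrid` class name, the start and goal cells, the puddle cells, the four-direction action set, the choice to treat puddles as blocked cells, and the choice to return only +1000 on the final step. The indistinguishable states mentioned in the description are not modeled here.

```python
STEP_REWARD = -0.05   # per-action reward, from the description
GOAL_REWARD = 1000.0  # reward on reaching the goal, from the description
GRID_SIZE = 10        # 10x10 grid, from the description

# Assumed action set: (row delta, column delta) for four directions.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


class PuddleGrid:
    """Hypothetical sketch of the gridworld; the layout is illustrative only."""

    def __init__(self, start=(0, 0), goal=(9, 9),
                 puddles=((4, 4), (4, 5), (5, 4))):
        self.start = start            # assumed start cell
        self.goal = goal              # assumed goal cell
        self.puddles = set(puddles)   # assumed puddle cells
        self.pos = start

    def reset(self):
        """Put the agent back on the start cell and return its position."""
        self.pos = self.start
        return self.pos

    def step(self, action):
        """Apply one action and return (position, reward, done)."""
        dr, dc = ACTIONS[action]
        # Clamp the move so the agent stays inside the grid.
        row = min(max(self.pos[0] + dr, 0), GRID_SIZE - 1)
        col = min(max(self.pos[1] + dc, 0), GRID_SIZE - 1)
        # Assumption: a puddle blocks the move and the agent stays in place.
        if (row, col) not in self.puddles:
            self.pos = (row, col)
        if self.pos == self.goal:
            # Assumption: the final step yields the goal reward only.
            return self.pos, GOAL_REWARD, True
        return self.pos, STEP_REWARD, False


if __name__ == "__main__":
    env = PuddleGrid()
    env.reset()
    total = 0.0
    # Walk along the top row, then down the last column to the assumed goal.
    for action in ["right"] * 9 + ["down"] * 9:
        position, reward, done = env.step(action)
        total += reward
    print(position, done, round(total, 2))
```

Under these assumptions the 18-step path above collects seventeen step penalties before the goal reward. The penalties are tiny next to the goal reward, and nothing in the per-step signal points toward the goal, which is why the task counts as sparse-reward.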



