Tight Task Grid-World
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/SuReLI/llrl
Description:
This dataset is a variant of the 11×11 grid-world task used in lifelong reinforcement learning experiments. The agent starts in the central cell, can move in the four cardinal directions, and faces a specific reward configuration. The reward of the goal cell is randomly sampled between 0.8 and 1, and the action slip probability is randomly sampled between 0 and 0.1. In terms of scale, 15 tasks are sampled from a pool of 5 Markov Decision Processes (MDPs). Each task runs for 2000 episodes, and each episode lasts 10 steps. All of these tasks fall within the scope of lifelong reinforcement learning.
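The setup described above can be sketched roughly as follows. This is a minimal illustration and not the code from the linked repository. The names `GridWorldTask`, `sample_mdp`, `step`, `run_episode`, and `sample_tasks` are hypothetical. The goal position, the uniform sampling distributions, the slip behavior (a uniformly random action replaces the chosen one), the handling of the grid boundary, and ending the episode at the goal are all assumptions, since the description does not specify them.

```python
import random
from dataclasses import dataclass

SIZE = 11                        # 11x11 grid, from the description
START = (SIZE // 2, SIZE // 2)   # central initial state
EPISODE_LENGTH = 10              # steps per episode, from the description
ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}


@dataclass(frozen=True)
class GridWorldTask:
    goal: tuple          # assumption: a single goal cell
    goal_reward: float   # sampled in [0.8, 1]
    slip_prob: float     # sampled in [0, 0.1]


def sample_mdp(rng, goal=(SIZE - 1, SIZE // 2)):
    """Draw one MDP; the default goal cell is a placeholder assumption."""
    return GridWorldTask(
        goal=goal,
        goal_reward=rng.uniform(0.8, 1.0),  # assumption: uniform sampling
        slip_prob=rng.uniform(0.0, 0.1),    # assumption: uniform sampling
    )


def step(task, state, action, rng):
    """Apply one action; returns (next_state, reward)."""
    if rng.random() < task.slip_prob:
        # Assumption: a slip replaces the action with a uniformly random one.
        action = rng.choice(sorted(ACTIONS))
    dx, dy = ACTIONS[action]
    # Assumption: moves that would leave the grid keep the agent at the edge.
    x = min(max(state[0] + dx, 0), SIZE - 1)
    y = min(max(state[1] + dy, 0), SIZE - 1)
    next_state = (x, y)
    reward = task.goal_reward if next_state == task.goal else 0.0
    return next_state, reward


def run_episode(task, policy, rng):
    """Run one episode of at most EPISODE_LENGTH steps; returns total reward."""
    state, total = START, 0.0
    for _ in range(EPISODE_LENGTH):
        state, reward = step(task, state, policy(state), rng)
        total += reward
        if state == task.goal:
            break  # assumption: the episode ends at the goal
    return total


def sample_tasks(rng, n_mdps=5, n_tasks=15):
    """Build a pool of n_mdps MDPs, then draw n_tasks tasks from that pool."""
    pool = [sample_mdp(rng) for _ in range(n_mdps)]
    return [rng.choice(pool) for _ in range(n_tasks)]
```

A lifelong learning run under these assumptions would iterate over `sample_tasks(random.Random(0))` and call `run_episode` 2000 times per task with the learner's current policy.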
Provided by:
Experimental setup described in the paper



