Go-Right

Indexed from arXiv, 2025-09-30

Download link:

https://github.com/LACE-Lab/bounding-box

Description:

Go-Right is a simple, easy-to-understand problem scenario in which an agent navigates a corridor to earn a reward while encountering status indicators whose intensity keeps changing. The dataset includes both idealized hand-coded models and learned models; training used a uniform random behavior policy. Each interaction consists of 500 steps, and the task type is reinforcement learning.
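As a rough illustration of what "a uniform random behavior policy" over a 500-step interaction could look like, here is a minimal sketch. Only the 500-step length and the uniform random policy come from the description above. The two-action set, the corridor length of 10, and the helper names (`uniform_random_policy`, `toy_corridor_step`, `collect_interaction`) are assumptions made for illustration and are not taken from the linked repository; the status indicators and the reward are left out because the description does not specify them.

```python
import random

ACTIONS = ("left", "right")   # assumed action set; not specified in the description
STEPS_PER_INTERACTION = 500   # stated in the description


def uniform_random_policy(rng):
    """Pick every action with equal probability, ignoring the current state."""
    return rng.choice(ACTIONS)


def toy_corridor_step(position, action, length=10):
    """Hypothetical corridor dynamics: move one cell, clipped to [0, length - 1]."""
    delta = 1 if action == "right" else -1
    return min(max(position + delta, 0), length - 1)


def collect_interaction(seed=0, length=10):
    """Roll out one interaction and return (position, action, next_position) tuples."""
    rng = random.Random(seed)
    position = 0
    transitions = []
    for _ in range(STEPS_PER_INTERACTION):
        action = uniform_random_policy(rng)
        next_position = toy_corridor_step(position, action, length)
        transitions.append((position, action, next_position))
        position = next_position
    return transitions
```

Transitions gathered this way are the kind of data a learned model might be fit to, whereas a hand-coded model would encode the dynamics directly.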
