Gym-MiniGrid
Source: arXiv | Indexed: 2025-09-30
Download link:
https://github.com/akifumi-wachi-4/spolf
Resource description:
This dataset provides a simulated environment built on a 25x25 grid. Each grid cell is associated with a randomly generated feature vector and with ground-truth values of the reward and safety functions. The agent observes the feature vectors of neighboring cells and receives an initial set of samples used to initialize a generalized linear model (GLM) of safety. The environment is intended for large-scale problems in reinforcement learning with safety constraints.
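The environment described above can be sketched roughly as follows. This is a minimal illustration and not the code from the linked repository. The feature dimension, the Gaussian feature distribution, the linear reward, the logistic link for safety, the 4-neighbor observation rule, and the names `make_grid` and `neighbor_features` are all assumptions made for the example; only the 25x25 grid size comes from the description.

```python
import numpy as np

GRID_SIZE = 25    # from the description: a 25x25 grid
FEATURE_DIM = 3   # assumption: the description does not state the dimension


def make_grid(seed=0):
    """Build random per-cell features and illustrative ground-truth values."""
    rng = np.random.default_rng(seed)
    # One random feature vector per grid cell (Gaussian is an assumption).
    features = rng.normal(size=(GRID_SIZE, GRID_SIZE, FEATURE_DIM))
    # Hypothetical weight vectors; the real generating process may differ.
    w_reward = rng.normal(size=FEATURE_DIM)
    w_safety = rng.normal(size=FEATURE_DIM)
    # Linear reward in the features (assumption).
    reward = features @ w_reward
    # Logistic link, used here as one example of a GLM mean function.
    safety = 1.0 / (1.0 + np.exp(-(features @ w_safety)))
    return features, reward, safety


def neighbor_features(features, row, col):
    """Return {(r, c): feature vector} for the in-bounds 4-neighbors of a cell."""
    observed = {}
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < GRID_SIZE and 0 <= c < GRID_SIZE:
            observed[(r, c)] = features[r, c]
    return observed
```

Calling `neighbor_features(features, 0, 0)` on a corner cell returns two entries, while an interior cell returns four. Fitting the safety GLM from the initial samples is left out of this sketch.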
Provider:
Open-source



