Offline Reinforcement Learning Dataset
Source: arXiv | Indexed: 2025-09-30
Download link:
https://github.com/sled-group/Teachable_RL
Description:
This dataset contains agent trajectories, each pairing a task description with the corresponding tuples of rewards, states, and actions. At every time step, the dataset also provides language feedback that combines retrospective and prospective information. Its distinguishing features are diverse suboptimal trajectories and GPT-augmented language feedback, both of which can help models generalize. The target task is reinforcement learning with language feedback.
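As a reading aid, the record layout described above could be sketched roughly as follows. This is a minimal illustration only: the class names, field names (`hindsight`, `foresight`, `task_description`), the helper method, and the example values are all hypothetical and are not taken from the Teachable_RL repository, whose actual file format may differ.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Step:
    """One time step of a trajectory (hypothetical field names)."""
    state: Any
    action: Any
    reward: float
    hindsight: str = ""  # retrospective part of the language feedback
    foresight: str = ""  # prospective part of the language feedback


@dataclass
class Trajectory:
    """A task description plus the ordered steps taken by the agent."""
    task_description: str
    steps: List[Step] = field(default_factory=list)

    def total_reward(self) -> float:
        """Sum of the per-step rewards over the whole trajectory."""
        return sum(step.reward for step in self.steps)


# Made-up example values, purely for illustration.
traj = Trajectory(
    task_description="pick up the key",
    steps=[
        Step(state=[0, 0], action="forward", reward=0.0,
             foresight="The key is to your left."),
        Step(state=[0, 1], action="pickup", reward=1.0,
             hindsight="Moving forward did not bring you closer to the key."),
    ],
)
```

Keeping the two kinds of feedback in separate per-step fields is one possible design choice; a loader for the real data might instead store a single feedback string per step.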
Provided by:
SLED Group



