B-Pref

Name: B-Pref
Creator: University of California, Berkeley
Published: 2021-11-05 01:32:06
License: Not specified

Source: arXiv. Updated: 2021-11-05. Indexed: 2024-06-21.

Download link:

https://github.com/rll-research/B-Pref

Overview:

B-Pref is a benchmark dataset designed for preference-based reinforcement learning, created by researchers at the University of California, Berkeley. It covers a range of locomotion and robotic manipulation tasks drawn from the DeepMind Control Suite and Meta-world. B-Pref simulates teachers that exhibit a broad spectrum of irrational behaviors, and it proposes metrics that evaluate not only performance but also robustness to those potential irrationalities. The dataset addresses the difficulty of specifying reward functions for complex tasks: policies are learned from a teacher's preferences, with no predefined reward required.

Provided by:

University of California, Berkeley

Created:

2021-11-05
