Robotic manipulation datasets for offline compositional reinforcement learning

DataONE · Updated 2023-06-22 · Indexed 2024-06-08

Download link:

https://search.dataone.org/view/sha256:f763bb9e41904798f143b3d1f6a77dd84401092ce4fb801a93ade513c6d0c820


Description:

Offline reinforcement learning (RL) is a promising direction that allows RL agents to be pre-trained from large datasets, avoiding the need to repeat expensive data collection. To advance the field, it is crucial to generate large-scale datasets. Compositional RL is particularly appealing for generating such large datasets, since 1) it permits creating many tasks from few components, and 2) the task structure may enable trained agents to solve new tasks by combining relevant learned components. This submission provides four offline RL datasets for simulated robotic manipulation created using the 256 tasks from CompoSuite [Mendez et al., 2022] (https://github.com/Lifelong-ML/CompoSuite). In every task in CompoSuite, a *robot* arm is used to manipulate an *object* to achieve an *objective*, all while trying to avoid an *obstacle*. There are four components for each of these four axes, and they can be combined arbitrarily, leading to a total of 256 tasks. The component choices are * Robot: IIWA, Jaco, ...

The datasets were collected by using several deep reinforcement learning agents trained to the various degrees of performance described above on the CompoSuite benchmark (https://github.com/Lifelong-ML/CompoSuite), which builds on top of robosuite (https://github.com/ARISE-Initiative/robosuite) and uses the MuJoCo simulator (https://github.com/deepmind/mujoco). During reinforcement learning training, we stored the data collected by each agent in a separate buffer for post-processing. Then, after training, to collect the expert, medium and random datasets, we ran the trained agents for 2000 trajectories of length 500 online in the CompoSuite benchmark and stored the trajectories. These add up to 1 million state-transition tuples per task, for a total of 256 million data points per dataset. The medium-replay-subsampled dataset instead contains trajectories from the stored training buffer of the medium agent. We uniformly sample trajectories from the training buffer u...

As mentioned before, the data was derived using a deep reinforcement learning algorithm (Proximal Policy Optimization) as well as CompoSuite (https://github.com/Lifelong-ML/CompoSuite), which builds on top of robosuite (https://github.com/ARISE-Initiative/robosuite) and uses the MuJoCo simulator (https://github.com/deepmind/mujoco). The datasets can, for instance, be recreated by using trained models from https://github.com/Lifelong-ML/CompoSuite-Data and our recreate_data.py Python script at https://github.com/Lifelong-ML/offline-compositional-rl-datasets. The data comes in the standard HDF5 format, which is supported by most programming languages and can be read with standard tools. The language of most interest for now is expected to be Python, due to its widespread use in the deep learning community. For this purpose, the files can easily be read using the h5py Python package (https://pypi.org/project/h5py/). For convenience, we provide an implementation of an offline RL env...
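
The sizes quoted in the description, and a first look inside one of the HDF5 files, can be sketched in Python as follows. This is a minimal illustration, not the authors' tooling. The component indices are placeholders (the full component lists are elided above), and the function names are made up for this example. The internal layout of the files is not stated in this description, so the helper only lists whatever arrays a given file contains instead of assuming key names.

```python
from itertools import product

# Sizes stated in the dataset description.
AXES = ("robot", "object", "objective", "obstacle")
COMPONENTS_PER_AXIS = 4
TRAJECTORIES_PER_TASK = 2000
STEPS_PER_TRAJECTORY = 500


def enumerate_tasks():
    """Return every task as a tuple of placeholder component indices, one per axis."""
    return list(product(range(COMPONENTS_PER_AXIS), repeat=len(AXES)))


def transitions_per_task():
    """Number of state-transition tuples stored for a single task."""
    return TRAJECTORIES_PER_TASK * STEPS_PER_TRAJECTORY


def transitions_per_dataset():
    """Number of state-transition tuples across all tasks in one dataset."""
    return len(enumerate_tasks()) * transitions_per_task()


def list_hdf5_datasets(path):
    """List (name, shape, dtype) for every array stored in an HDF5 file.

    No key names are assumed; the file is simply walked recursively.
    """
    import h5py  # third-party; imported lazily so the arithmetic above runs without it

    entries = []

    def visit(name, obj):
        # Groups are skipped; only array-like datasets are recorded.
        if isinstance(obj, h5py.Dataset):
            entries.append((name, obj.shape, obj.dtype))

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return entries


if __name__ == "__main__":
    print(len(enumerate_tasks()), "tasks")
    print(transitions_per_task(), "transitions per task")
    print(transitions_per_dataset(), "transitions per dataset")
```

Calling `list_hdf5_datasets` on a downloaded file (the path is whatever the file is saved as locally) shows which arrays are available and their shapes before any training code is written against them.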

Created:

2023-11-30
