compsciencelab/BricksRL-Datasets

Name: compsciencelab/BricksRL-Datasets
Creator: compsciencelab
Published: 2024-09-30 13:38:18
License: MIT

Source: Hugging Face. Updated: 2024-09-30. Indexed: 2025-04-26.

Download link:

https://hf-mirror.com/datasets/compsciencelab/BricksRL-Datasets

Resource summary:

---
license: mit
---

# BricksRL Dataset Card

## Dataset Summary

The BricksRL dataset contains curated data for three robotic configurations: 2Wheeler, Walker, and RoboArm. The dataset includes expert and random data for four key tasks: Walker-v0, RoboArm-v0, RunAway-v0, and Spinning-v0. The expert data was collected using a trained Soft Actor-Critic (SAC) agent, while the random data was generated by executing a random policy.

This dataset is presented in the paper [BricksRL: A Platform for Democratizing Robotics and Reinforcement Learning Research and Education with LEGO](https://arxiv.org/abs/2406.17490) (NeurIPS 2024). For more information feel free to check out the project [website](https://bricksrl.github.io/ProjectPage/).

## Supported Tasks

The dataset supports the following tasks across various robot configurations:

- Walker-v0
- RoboArm-v0
- RunAway-v0
- Spinning-v0

## Dataset Structure

The dataset contains two types of data:

- Expert Data: Collected by a trained SAC agent solving the tasks on the real robot. The agent was evaluated over 100 episodes for each task, recording all transitions.
- Random Data: Generated by executing a random policy on the real robot for 100 episodes per task.

The datasets are TensorDicts, which can be directly loaded into the replay buffer. When initiating (pre-)training, provide the path to the desired TensorDict when prompted to load the replay buffer.

Table 1 shows the dataset statistics regarding mean reward (expert data), number of transitions collected, and collection episodes.

<div style="text-align: center;">
<img src="imgs/offline_dataset_stats.png" alt="stats" width="600"/>
</div>

## Results and Evaluation

The dataset was used to train both online and offline RL algorithms (Table 2). Performance comparisons between these methods demonstrated the effectiveness of offline RL algorithms, particularly when using expert data. Online RL algorithms struggled to generalize or often overfit when provided with expert demonstrations. For more detailed information about the hyperparameters, please refer to the appendix of the [paper](https://arxiv.org/abs/2406.17490).

<div style="text-align: center;">
<img src="imgs/offline_results.png" alt="stats" width="600"/>
</div>

## Citation

If you use the BricksRL dataset in your research, please cite the following paper:

```
@article{dittert2024bricksrl,
  title={BricksRL: A Platform for Democratizing Robotics and Reinforcement Learning Research and Education with LEGO},
  author={Sebastian Dittert and Vincent Moens and Gianni De Fabritiis},
  journal={arXiv preprint arXiv:2406.17490},
  year={2024}
}
```
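
The dataset statistics in Table 1 are only available as an image, so the numbers are not reproduced here. As a minimal sketch of how statistics of that kind (episodes, transitions, mean reward) could be recomputed from collected data, the Python snippet below works on a flat list of transitions. Everything about the layout is an assumption made for illustration: the `reward` and `done` keys, the function name `summarize_transitions`, and the reading of "mean reward" as mean undiscounted episode return are hypothetical and not taken from the BricksRL TensorDicts.

```python
from statistics import mean


def summarize_transitions(transitions):
    """Return episode count, transition count, and mean episode return.

    `transitions` is a hypothetical flat list of dicts, each holding a
    float "reward" and a boolean "done" that marks the last step of an
    episode. The real BricksRL key names may differ.
    """
    episode_returns = []
    running_return = 0.0
    steps_in_episode = 0
    for step in transitions:
        running_return += step["reward"]
        steps_in_episode += 1
        if step["done"]:
            # Episode finished: store its return and reset the counters.
            episode_returns.append(running_return)
            running_return = 0.0
            steps_in_episode = 0
    # Count a trailing, unterminated episode too so no reward is dropped.
    if steps_in_episode:
        episode_returns.append(running_return)
    return {
        "episodes": len(episode_returns),
        "transitions": len(transitions),
        "mean_episode_return": mean(episode_returns) if episode_returns else 0.0,
    }


# Toy data: two episodes of two steps each (made-up values).
toy = [
    {"reward": 1.0, "done": False},
    {"reward": 2.0, "done": True},
    {"reward": 0.5, "done": False},
    {"reward": 0.5, "done": True},
]
summary = summarize_transitions(toy)
print(summary)
```

To apply this to the actual data, the reward and termination entries would first have to be read out of the loaded TensorDict and converted to plain Python values; the key names to use should be checked against the files themselves.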

Provider:

compsciencelab
