Experimental Results of EDRL in the DMControl-GB Environment

Name: Experimental Results of EDRL in the DMControl-GB Environment
Creator: Science Data Bank
Published: 2026-04-28 02:07:21
License: No description available

DataCite Commons | Updated 2026-04-28 | Indexed 2026-05-05

Download link:

https://www.scidb.cn/detail?dataSetId=3e95986f90df4322a60e3d693bfcac57

Resource description:

This dataset contains evaluation results for visual reinforcement learning generalization on DMControl-GB (DeepMind Control Generalization Benchmark). Its purpose is to provide data for quantifying and comparing the robustness of different reinforcement learning algorithms under significant visual differences. The data were generated on the continuous physical control benchmark suite developed by DeepMind, by running automated test scripts in a fixed simulation environment. Every participating model completed 500,000 exploration and training steps in the original benchmark environment without visual interference, and was then deployed to a test environment with no reward signal and no prior knowledge for performance evaluation. To reduce the influence of chance, every test was run independently under 5 different random seeds, and the final results were statistically aggregated.

The dataset records the experimental results of the EDRL algorithm and its comparison with seven mainstream baseline methods (SAC, DrQ, RAD, SODA, SVEA, SGQN, MaDi). In the core performance comparison table, for example, the records cover algorithm performance under two levels of visual perturbation: video_easy mode, which swaps in dynamic backgrounds but retains ground shadows, and video_hard mode, which removes the ground and shadow references and causes a severe distribution shift. Each row of the table corresponds to a specific combination of algorithm and test task. The columns give the key quantitative indicators: environment name, interference mode, mean episode reward (Mean), and standard deviation (Std). The unit of measurement is the total return per episode, and valid values are restricted to the range 0 to 1000.

The dataset also includes a detailed table of ablation results for the internal mechanisms of the EDRL algorithm. It covers independent ablations of the feature decoupling, reconstruction, and state transition loss functions, as well as performance data across several dimensions when the backbone network is switched between CNN and CNN-LSTM. The statistical error in the data comes mainly from the randomness of exploration in reinforcement learning and the inherent variance of neural network initialization. To account for this expected variation, the results are reported as a mean and a standard deviation.
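The Mean and Std columns described above can be illustrated with a minimal sketch of how per-seed results for one table row might be aggregated. The function name `summarize_seeds`, the example values, and the use of the sample standard deviation are assumptions made for illustration; the description does not state which estimator the authors used, and the numbers below are not taken from the dataset.

```python
from statistics import mean, stdev

# Valid range for episode returns, as stated in the resource description.
RETURN_MIN, RETURN_MAX = 0.0, 1000.0


def summarize_seeds(returns):
    """Return (mean, sample standard deviation) of per-seed episode returns.

    Assumption: the sample standard deviation (n - 1 denominator) is used;
    the dataset description does not say which estimator was applied.
    """
    if len(returns) < 2:
        raise ValueError("need at least two seeds to compute a standard deviation")
    for r in returns:
        if not RETURN_MIN <= r <= RETURN_MAX:
            raise ValueError(f"return {r} is outside [{RETURN_MIN}, {RETURN_MAX}]")
    return mean(returns), stdev(returns)


# Hypothetical returns from 5 seeds for one (algorithm, task, mode) row.
example_returns = [700.0, 720.0, 680.0, 710.0, 690.0]
row_mean, row_std = summarize_seeds(example_returns)
print(f"Mean={row_mean:.1f}, Std={row_std:.1f}")
```

If the published tables were computed with the population standard deviation instead, `statistics.pstdev` would be the matching call.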

Provider:

Science Data Bank

Created:

2026-04-28
