Dataset from ES-C51: Expected Sarsa Based C51 Distributional Reinforcement Learning Algorithm
Source: Research Data Australia (indexed 2025-12-20)
Download link:
https://researchdata.edu.au/dataset-es-c51-learning-algorithm/3952529
Description:
This dataset contains the results of experiments comparing the standard Q-learning-based distributional deep reinforcement learning algorithm (QL-C51) with a novel variant that uses Expected Sarsa temporal-difference updates (ES-C51). Each algorithm was executed for 10 runs with independent seeds on 22 environments (Acrobot, Cartpole, and the Atari-10 environments with and without stochasticity). For each run, the reported result is the mean episodic reward over the last 10% of learning episodes. Full details are in the corresponding paper.
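The per-run metric described above can be illustrated with a minimal sketch. It assumes a run is available as a plain list of episodic rewards; the function name `final_performance`, the rounding rule for the tail length, and the sample rewards are hypothetical and are not taken from the dataset or the paper.

```python
from statistics import mean


def final_performance(episode_rewards):
    """Mean episodic reward over the last 10% of episodes in one run."""
    if not episode_rewards:
        raise ValueError("episode_rewards must not be empty")
    # Assumption: the tail is the last tenth of episodes, rounded down,
    # with at least one episode kept for very short runs.
    tail_len = max(1, len(episode_rewards) // 10)
    return mean(episode_rewards[-tail_len:])


# Hypothetical rewards for a single 20-episode run (illustration only).
rewards = [float(i) for i in range(1, 21)]
score = final_performance(rewards)
print(score)
```

Under this reading, each algorithm and environment pair would yield 10 such scores, one per seed.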
Provider:
Federation University Australia



