Data_Sheet_1_Curiosity model policy optimization for robotic manipulator tracking control with input saturation in uncertain environment.pdf
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Data_Sheet_1_Curiosity_model_policy_optimization_for_robotic_manipulator_tracking_control_with_input_saturation_in_uncertain_environment_pdf/25922356
Resource description:
In uncertain environments with robot input saturation, both model-based reinforcement learning (MBRL) and traditional controllers struggle to perform control tasks optimally. In this study, an algorithmic framework called Curiosity Model Policy Optimization (CMPO) is proposed by combining curiosity with a model-based approach, in which tracking errors are reduced by training agents to tune the control gains of traditional model-free controllers. First, a metric for judging positive and negative curiosity is proposed. Constrained optimization is then employed to update the curiosity ratio, which improves the efficiency of agent training. Next, the novelty distance buffer ratio is defined to reduce the bias between the environment and the learned model. Finally, CMPO is compared in simulation with traditional controllers and baseline MBRL algorithms in a robotic environment designed with non-linear rewards. The experimental results illustrate that the algorithm achieves superior tracking performance and generalization capabilities.
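The description mentions agents that choose control gains for a model-free controller whose input saturates. The sketch below is an illustration of that setting only, not the authors' CMPO implementation: it assumes a single unit-inertia joint, a sinusoidal reference, and a PD controller, and the names `saturate` and `tracking_error` as well as all numeric values are hypothetical. It shows how a pair of gains (the quantity an agent would pick) maps to a tracking error under a torque limit.

```python
import math


def saturate(u, u_max):
    """Clip the control input to the actuator limit [-u_max, u_max]."""
    return max(-u_max, min(u_max, u))


def tracking_error(kp, kd, u_max=2.0, dt=0.01, steps=1000):
    """Mean absolute tracking error of a saturated PD controller.

    Assumed toy plant: one joint with unit inertia (ddq = u),
    tracking q_ref(t) = sin(t). Not the environment used in the dataset.
    """
    q, dq = 0.0, 0.0
    total = 0.0
    for k in range(steps):
        t = k * dt
        q_ref, dq_ref = math.sin(t), math.cos(t)
        # PD law on position and velocity error, then input saturation.
        u = saturate(kp * (q_ref - q) + kd * (dq_ref - dq), u_max)
        # Semi-implicit Euler integration of the unit-inertia joint.
        dq += u * dt
        q += dq * dt
        total += abs(q_ref - q)
    return total / steps


if __name__ == "__main__":
    # Compare zero gains with a hand-picked gain pair (assumed values).
    print("no control :", tracking_error(0.0, 0.0))
    print("kp=20, kd=5:", tracking_error(20.0, 5.0))
```

In a learning setup, a function like `tracking_error` would play the role of a (negated) reward signal for the chosen gains; the curiosity and model-learning components described above are not represented here.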
Created:
2024-05-29



