Hindsight Proximal Policy Optimization based Deep Reinforcement Learning Manipulator Control
Source: DataCite Commons | Updated: 2024-12-27 | Indexed: 2025-04-16
Download link:
https://ieee-dataport.org/documents/hindsight-proximal-policy-optimization-based-deep-reinforcement-learning-manipulator
Description:
Demand for intelligent automation in factories has been increasing steadily. Traditional robotic arms perform simple automated tasks, and deep reinforcement learning enables them to execute more complex operations. In robotics, however, deep reinforcement learning often faces difficult learning tasks, especially in three-dimensional, continuous environments where rewards are sparse. To address this issue, this article proposes the Hindsight Proximal Policy Optimization (HPPO) method for intelligent robotic control. HPPO combines the ideas of Proximal Policy Optimization (PPO) and Hindsight Experience Replay (HER) to improve the adaptability and sample efficiency of PPO in sparse-reward environments. In contrast to conventional reinforcement learning architectures, HPPO introduces a multi-goal formulation, which gives the agent an explicit objective during its interactions with the environment. It also uses the synthetic data generated by the HER algorithm, so that the agent can learn from failed attempts and reach goals more efficiently. A series of experiments was conducted in a simulated robotic arm control environment, comparing HPPO with other deep reinforcement learning algorithms. In these experiments, HPPO showed better adaptability and higher sample efficiency than the compared algorithms in sparse-reward environments. The results verify the practicality of HPPO for robotic arm control and indicate that the approach may apply to other robotic control scenarios.
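The description mentions synthetic data generated by HER and a multi-goal setting with sparse rewards. The sketch below illustrates only the general hindsight relabeling idea: a failed episode is rewritten as if the position the arm actually reached had been the goal. It is a minimal illustration under stated assumptions, not the HPPO implementation. The `Transition` class, the `sparse_reward` function, the 0.05 tolerance, and the choice of the final achieved position as the substitute goal are all hypothetical, and the sketch does not show how relabeled data enters the PPO update, which the description does not specify.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Vec = Tuple[float, ...]


@dataclass(frozen=True)
class Transition:
    """One step of a goal-conditioned episode (hypothetical layout)."""
    achieved_goal: Vec  # position actually reached after the action
    desired_goal: Vec   # goal the agent was asked to reach
    reward: float


def sparse_reward(achieved: Vec, desired: Vec, tol: float = 0.05) -> float:
    """Assumed sparse reward: 0.0 within `tol` of the goal, otherwise -1.0."""
    dist = sum((a - d) ** 2 for a, d in zip(achieved, desired)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def relabel_with_final_goal(episode: List[Transition]) -> List[Transition]:
    """Return a copy of the episode whose goal is the final achieved position.

    Rewards are recomputed against the substitute goal, so an episode that
    failed for the original goal ends with a successful step.
    """
    if not episode:
        return []
    new_goal = episode[-1].achieved_goal
    return [
        replace(
            t,
            desired_goal=new_goal,
            reward=sparse_reward(t.achieved_goal, new_goal),
        )
        for t in episode
    ]
```

With an episode that never reaches its original goal, every original reward is -1.0; after relabeling, the last transition receives reward 0.0, which gives the learner a non-trivial signal from a failed attempt.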
Provider:
IEEE DataPort
Created:
2024-12-27



