Adaptive optimal control of discrete-time systems by reinforcement learning: damping coefficients-based stabilizing policy iterations
Source: China Science Data (中国科学数据). Updated: 2026-02-12. Indexed: 2026-04-25.
Download link:
https://www.sciengine.com/AA/doi/10.1007/s11432-025-4737-8
Resource description:
Policy iteration is one of the classical frameworks of adaptive dynamic programming, but it requires a known initial stabilizing control policy to start the iteration. To relax this requirement, two stabilizing policy iteration algorithms based on variable damping coefficients are designed for unknown discrete-time linear systems. First, a stabilizing artificial system is constructed and then iterated gradually toward the original system by accumulating damping coefficients, which yields a stabilizing control policy. Second, a data-driven version of the stabilizing policy iteration framework is designed, together with a model-free scheme for determining the damping coefficients. Because traditional policy iteration-based $\mathcal{Q}$-learning imposes the same initial condition, a novel data-driven $\mathcal{Q}$-learning algorithm based on stabilizing policy iteration is also developed. Theoretical analysis shows that the proposed $\mathcal{Q}$-learning algorithm is equivalent to the stabilizing policy iteration framework. Finally, the effectiveness of the two proposed algorithms is verified by a numerical example.
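
The idea of starting from a damped artificial system and relaxing the damping step by step can be illustrated with a small model-based sketch. This is not the authors' algorithm: it assumes the matrices A and B are known (the paper treats unknown systems with data-driven schemes), it assumes a standard quadratic cost with weights Q and R, and the damping update rule (the hypothetical `step` parameter) is chosen here only for illustration. The damped system is taken as x_{k+1} = a(Ax_k + Bu_k) with coefficient a in (0, 1], so that the zero gain is stabilizing for a small enough a.

```python
import numpy as np

def spectral_radius(M):
    """Largest eigenvalue magnitude of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))

def solve_dlyap(Ac, W):
    """Solve P = Ac^T P Ac + W by vectorization (fine for small systems)."""
    n = Ac.shape[0]
    lhs = np.eye(n * n) - np.kron(Ac.T, Ac.T)
    return np.linalg.solve(lhs, W.reshape(-1)).reshape(n, n)

def damped_policy_iteration(A, B, Q, R, step=0.5, tol=1e-10, max_iter=500):
    """Policy iteration started from K = 0 on a damped system (a*A, a*B).

    Illustrative assumption: after each improvement step, the damping
    coefficient a is moved a fraction `step` of the way toward the largest
    value for which the current gain is still stabilizing, capped at 1.
    """
    n, m = B.shape
    K = np.zeros((m, n))
    rho = spectral_radius(A)
    # Pick a so that K = 0 stabilizes the damped system a*A.
    a = 1.0 if rho < 1.0 else 0.9 / rho
    for _ in range(max_iter):
        Ad, Bd = a * A, a * B
        # Policy evaluation on the damped closed loop.
        P = solve_dlyap(Ad - Bd @ K, Q + K.T @ R @ K)
        # Policy improvement on the damped system.
        K_new = np.linalg.solve(R + Bd.T @ P @ Bd, Bd.T @ P @ Ad)
        converged = a >= 1.0 and np.linalg.norm(K_new - K) < tol
        K = K_new
        if converged:
            break
        # K keeps a*(A - B K) stable as long as a < 1 / rho(A - B K).
        bound = 1.0 / max(spectral_radius(A - B @ K), 1e-12)
        a = min(1.0, a + step * (bound - a))
    return K, P, a

if __name__ == "__main__":
    # Open-loop unstable example; the values are illustrative only.
    A = np.array([[1.1, 0.3], [0.0, 1.2]])
    B = np.array([[0.0], [1.0]])
    K, P, a = damped_policy_iteration(A, B, np.eye(2), np.eye(1))
    print("final damping coefficient:", a)
    print("closed-loop spectral radius:", spectral_radius(A - B @ K))
```

Once the coefficient reaches 1, the damped system coincides with the original one and the loop reduces to ordinary policy iteration, which no longer needs an externally supplied stabilizing gain.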
Created:
2025-12-25



