Adaptive optimal control of discrete-time systems by reinforcement learning: damping coefficients-based stabilizing policy iterations

中国科学数据 (China Science Data), updated 2026-02-12, indexed 2026-04-25

Download link:

https://www.sciengine.com/AA/doi/10.1007/s11432-025-4737-8

Resource description:

Policy iteration is one of the classical frameworks of adaptive dynamic programming, but it requires a known initial stabilizing control policy to start the iteration. To relax this requirement, two stabilizing policy iteration algorithms based on variable damping coefficients are designed for unknown discrete-time linear systems. First, a stabilizing artificial system is constructed and then driven step by step toward the original system by accumulating damping coefficients, which yields a stabilizing control policy. Next, a data-driven version of the stabilizing policy iteration framework is designed, together with a model-free scheme for determining the damping coefficients. Because traditional policy iteration-based $\mathcal{Q}$-learning imposes the same initial condition, a novel data-driven $\mathcal{Q}$-learning algorithm based on stabilizing policy iteration is also developed. Theoretical analysis shows that the proposed $\mathcal{Q}$-learning algorithm is equivalent to the stabilizing policy iteration framework. Finally, the effectiveness of the two proposed algorithms is verified by a numerical example.
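
The damping idea in the description can be illustrated with a small model-based sketch. It is not the paper's algorithm: the paper targets unknown systems with data-driven updates, whereas this sketch assumes the matrices A and B are known, assumes Q is positive definite, and uses an assumed rule for growing the damping coefficient `a` (scale both A and B by `a`, start from a value that makes the zero policy stabilizing, then raise `a` toward 1 while the current gain keeps the scaled closed loop stable). The function names `evaluate_policy`, `improve_policy`, and `stabilizing_pi` are hypothetical.

```python
import numpy as np

def spectral_radius(M):
    """Largest eigenvalue magnitude of M."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))

def evaluate_policy(Ad, Bd, K, Q, R):
    """Solve P = Q + K'RK + Ac' P Ac with Ac = Ad - Bd K (discrete Lyapunov)."""
    Ac = Ad - Bd @ K
    n = Ad.shape[0]
    rhs = (Q + K.T @ R @ K).reshape(-1)
    # Row-major vectorization: vec(Ac' P Ac) = kron(Ac', Ac') vec(P)
    P = np.linalg.solve(np.eye(n * n) - np.kron(Ac.T, Ac.T), rhs).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off

def improve_policy(Ad, Bd, P, R):
    """Greedy gain K for the control law u = -K x."""
    return np.linalg.solve(R + Bd.T @ P @ Bd, Bd.T @ P @ Ad)

def stabilizing_pi(A, B, Q, R, max_iter=500, tol=1e-9):
    n, m = B.shape
    K = np.zeros((m, n))  # no initial stabilizing gain is required
    # Assumed initialization: damp the system until K = 0 is stabilizing.
    a = min(1.0, 0.9 / max(spectral_radius(A), 1e-12))
    for _ in range(max_iter):
        P = evaluate_policy(a * A, a * B, K, Q, R)
        K_new = improve_policy(a * A, a * B, P, R)
        if a == 1.0 and np.linalg.norm(K_new - K) < tol:
            return K_new, P
        K = K_new
        if a < 1.0:
            # Assumed update rule: keep a * rho(A - B K) < 1 while increasing a.
            rho = spectral_radius(A - B @ K)
            a = 1.0 if rho < 1.0 else 0.5 * (a + 1.0 / rho)
    raise RuntimeError("damping coefficient did not reach 1 or iteration did not converge")

if __name__ == "__main__":
    # Hypothetical open-loop unstable example system.
    A = np.array([[1.1, 0.3], [0.0, 1.2]])
    B = np.array([[0.0], [1.0]])
    K, P = stabilizing_pi(A, B, np.eye(2), np.eye(1))
    print("closed-loop spectral radius:", spectral_radius(A - B @ K))
```

Once `a` reaches 1, the loop reduces to ordinary policy iteration on the original system, so the damping stage only serves to supply the stabilizing starting gain that classical policy iteration needs.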

Created:

2025-12-25
