
Adaptive optimal control of discrete-time systems by reinforcement learning: damping coefficients-based stabilizing policy iterations

Source: 中国科学数据. Updated: 2026-02-12. Indexed: 2026-04-25.
Download link:
https://www.sciengine.com/AA/doi/10.1007/s11432-025-4737-8
Resource description:
Policy iteration is one of the classical frameworks of adaptive dynamic programming, but it requires a known initial stabilizing control policy to start the iteration. To relax this requirement, two stabilizing policy iteration algorithms based on variable damping coefficients are designed for unknown discrete-time linear systems. First, a stabilizing artificial system is constructed and then iterated gradually toward the original system by accumulating damping coefficients, which yields a stabilizing control policy. Next, a data-driven version of the stabilizing policy iteration framework is designed, together with a model-free scheme for determining the damping coefficients. Traditional policy iteration-based $\mathcal{Q}$-learning imposes the same initial condition; to relax it, a novel data-driven $\mathcal{Q}$-learning algorithm based on stabilizing policy iteration is developed. Theoretical analysis shows that the proposed $\mathcal{Q}$-learning algorithm is equivalent to the stabilizing policy iteration framework. Finally, the effectiveness of the two proposed algorithms is verified by a numerical example.
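
The idea of starting from a stabilized artificial system and moving gradually toward the original one can be illustrated with a minimal model-based sketch. It is not the paper's algorithm: it assumes the system matrices `A` and `B` are known (the paper targets unknown systems), it assumes the artificial system takes the scaled form `(c*A, c*B)` with a damping coefficient `c` in (0, 1], and the midpoint rule used to raise `c` is a hypothetical choice made only for illustration. All function names are hypothetical as well.

```python
import numpy as np


def spectral_radius(M):
    """Largest eigenvalue magnitude of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))


def evaluate_policy(A, B, K, Q, R, c):
    """Solve P = Q + K'RK + c^2 (A - BK)' P (A - BK) for the policy u = -Kx."""
    n = A.shape[0]
    Acl = c * (A - B @ K)  # closed loop of the damped (artificial) system
    rhs = (Q + K.T @ R @ K).reshape(-1)
    # vec(Acl' P Acl) = kron(Acl', Acl') vec(P), so the equation is linear in vec(P).
    P = np.linalg.solve(np.eye(n * n) - np.kron(Acl.T, Acl.T), rhs).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off


def improve_policy(A, B, P, R, c):
    """Greedy gain for the damped system (c*A, c*B)."""
    return np.linalg.solve(R + c**2 * B.T @ P @ B, c**2 * B.T @ P @ A)


def stabilizing_policy_iteration(A, B, Q, R, max_iter=500, tol=1e-10):
    """Policy iteration that starts from K = 0 instead of a stabilizing gain."""
    n, m = B.shape
    K = np.zeros((m, n))
    # Assumption: pick c so that the damped open loop c*A is already stable.
    c = min(1.0, 0.9 / max(spectral_radius(A), 1e-12))
    for _ in range(max_iter):
        P = evaluate_policy(A, B, K, Q, R, c)
        K_new = improve_policy(A, B, P, R, c)
        if c == 1.0 and np.linalg.norm(K_new - K) < tol:
            return K_new, P  # converged on the original (undamped) system
        K = K_new
        # Assumption: raise c halfway toward the stability limit 1/rho(A - BK),
        # which keeps c * (A - BK) stable for the next evaluation step.
        rho = max(spectral_radius(A - B @ K), 1e-12)
        c = min(1.0, 0.5 * (c + 1.0 / rho))
    raise RuntimeError("no convergence within max_iter iterations")


if __name__ == "__main__":
    A = np.array([[1.1, 0.5], [0.0, 1.2]])  # open-loop unstable example
    B = np.array([[0.0], [1.0]])
    K, P = stabilizing_policy_iteration(A, B, np.eye(2), np.eye(1))
    print("gain:", K, "closed-loop spectral radius:", spectral_radius(A - B @ K))
```

Once `c` reaches 1 the loop reduces to standard policy iteration on the original system, so the returned gain can be checked against the discrete-time Riccati equation. A data-driven variant would have to replace both the model-based evaluation step and the spectral-radius computation with quantities estimated from trajectories, which this sketch does not attempt.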
Created:
2025-12-25