A latent space control policy learning method based on identifiable latent dynamic models
Source: 中国科学数据 (China Scientific Data) · Updated 2026-02-09 · Indexed 2026-04-25
Download link:
https://www.sciengine.com/AA/doi/10.1360/SST-2025-0203
Resource description:
Complex space manipulation tasks require intelligent spacecraft to plan and control from high-dimensional information in unstructured environments. Control policies based on imitation learning or reinforcement learning can achieve end-to-end control, but they usually have poor interpretability, which makes the stability and robustness of the system difficult to analyze. This limits their use in fields with extremely high safety requirements, such as aerospace.

To obtain an interpretable end-to-end control policy, this paper proposes a latent space control policy learning method based on identifiable latent dynamic models. The method exploits a property of identifiable representation learning: controllable latent variables can be recovered from high-dimensional observations with theoretical guarantees. First, an identifiable open-loop latent dynamic model is trained on open-loop data (high-dimensional observation variables and input variables) to recover the dynamic mechanism of the latent dynamic system. The representation function of this model then maps closed-loop data from the expert control process (trajectories of high-dimensional observation variables) into the latent space, yielding trajectories of estimated latent variables. On this basis, a target encoder and a controller model are trained to recover the expert's planning process and control process, respectively, and together they form the latent space control policy.

Theoretical analysis and simulation results show that the learned latent space control policy can drive the latent dynamic system to reach, in succession, the latent desired states planned by the expert. In addition, if a hand-designed controller replaces the learned controller model, control performance better than that of the expert controller can be achieved while the expert's planning results are retained. This paper offers a new approach to improving the interpretability of end-to-end control policies and lays a theoretical foundation for safer and more reliable intelligent decision-making and control of spacecraft.
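The last point in the abstract, replacing the learned controller with a hand-designed one while keeping the expert's planned latent targets, can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the latent system is a scalar linear model `z_next = a * z + b * u`, the list `targets` stands in for the output of the target encoder, and `make_controller`, `rollout`, `lam`, and `tol` are hypothetical names. The controller is chosen so that the tracking error shrinks by the factor `lam` at every step.

```python
def make_controller(a, b, lam):
    """Hand-designed tracking law for the assumed model z_next = a*z + b*u.

    It picks u so that (z_next - z_target) = lam * (z - z_target),
    i.e. the tracking error contracts by lam each step (0 <= lam < 1).
    """
    def controller(z, z_target):
        return (z_target + lam * (z - z_target) - a * z) / b
    return controller


def rollout(a, b, controller, z0, targets, tol=1e-3, max_steps=200):
    """Drive the latent state through the targets one after another.

    Returns the latent trajectory and the list of targets reached
    within the tolerance tol.
    """
    z = z0
    trajectory = [z]
    reached = []
    idx = 0
    for _ in range(max_steps):
        if idx == len(targets):
            break  # every planned target has been reached
        u = controller(z, targets[idx])
        z = a * z + b * u  # assumed (hypothetical) latent dynamics
        trajectory.append(z)
        if abs(z - targets[idx]) < tol:
            reached.append(targets[idx])
            idx += 1  # move on to the next planned latent target
    return trajectory, reached


# Hypothetical parameters: an open-loop unstable latent system (a > 1).
a, b = 1.1, 0.5
controller = make_controller(a, b, lam=0.5)
# Stand-in for the sequence of latent desired states planned by the expert.
trajectory, reached = rollout(a, b, controller, z0=0.0, targets=[1.0, -0.5, 2.0])
```

In this sketch the planning part (the target sequence) and the control part (the tracking law) are decoupled, so the controller can be swapped without touching the targets. A real identifiable latent model would be nonlinear and learned from data; the linear model is only a placeholder.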
Created:
2025-11-10



