A Decision Method for Orbital Game Based on Behavior Prediction and Strategy Fusion
Source: China Scientific Data | Updated: 2026-04-02 | Indexed: 2026-04-25
Download link:
https://www.sciengine.com/AA/doi/10.16383/j.aas.c250268
Resource description:
The high uncertainty and behavioral diversity of evasion strategies in the orbital pursuit-evasion game pose significant challenges to the generalization capability of pursuit strategies. Although deep reinforcement learning can enhance the pursuer's performance, the policy network often produces suboptimal or even invalid decisions when facing evasion strategies that deviate from the training distribution. To address this issue, this paper proposes a decision method for orbital games based on behavior prediction and strategy fusion, named predictor-actor-critic with fusion. During the training phase, a set of diverse evasion strategies is modeled using a prediction-guided approach combined with the artificial potential field method. Building on the traditional actor-critic framework, a predictor-actor-critic algorithm is developed by introducing a prediction network, and a corresponding pursuit sub-policy is trained for each type of evasion strategy. The prediction network estimates the evader's actions, and the similarity between predicted and actual actions quantifies the matching degree between each sub-policy and the unknown evasion strategy. During the execution phase, the fusion module takes the evader's historical actions and the pursuit sub-policies' prediction outputs as input, dynamically evaluates the matching degree, and selects the most appropriate sub-policy for decision-making. Experimental results demonstrate that the prediction network effectively evaluates the adaptability of each sub-policy to unknown evasion strategies, and that the fusion module significantly enhances the generalization capability and reliability of the pursuer when confronted with diverse evasion strategies.
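The execution-phase selection described above can be illustrated with a minimal sketch. It is not the paper's implementation: the description does not state which similarity measure or history window is used, so the mean cosine similarity over a short action history is an assumption, and the names `cosine_similarity`, `matching_degree`, and `select_sub_policy` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two action vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def matching_degree(predicted: Sequence[Vector], actual: Sequence[Vector]) -> float:
    """Assumed matching degree: mean cosine similarity between one sub-policy's
    predicted evader actions and the evader's actual actions over the window."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("predicted and actual histories must be non-empty and equal in length")
    total = sum(cosine_similarity(p, a) for p, a in zip(predicted, actual))
    return total / len(actual)


def select_sub_policy(
    predictions_per_policy: Sequence[Sequence[Vector]],
    actual_history: Sequence[Vector],
) -> Tuple[int, List[float]]:
    """Score every sub-policy and return (index of best match, all scores)."""
    scores = [matching_degree(pred, actual_history) for pred in predictions_per_policy]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores
```

In this sketch, `predictions_per_policy[i]` holds the evader actions that the prediction network of sub-policy `i` produced for the same time steps as `actual_history`; the returned index identifies the sub-policy that would make the next pursuit decision. A smoothed or weighted score over a longer history would be an equally plausible choice.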
Created:
2026-04-01



