Robust model-based MARL via masked cross-agent completion under observation loss
Source: China Scientific Data. Updated 2026-03-16. Indexed 2026-04-25.
Download link:
https://www.sciengine.com/AA/doi/10.1007/s11432-025-4808-x
Resource description:
Recent advances in model-based reinforcement learning (MBRL) have shown potential for reducing sample complexity in multi-agent reinforcement learning (MARL) by generating synthetic environment interactions. However, conventional MBRL approaches typically assume that agents have continuous access to observations during inference, whereas real-world deployments often face observation loss, where specific agents temporarily lose their observations due to environmental interference or system failures. To address this challenge, we present RMIOv2, a novel model-based MARL framework that delivers competitive performance in standard environments while maintaining robust decision-making under transient observation loss. Specifically, RMIOv2 enhances the world model's ability to represent agent states consistently through cross-agent Transformer fusion modules. Furthermore, RMIOv2 uses dynamic reward trend modeling to mitigate reward prediction errors. Building on this pre-training, the framework employs masked fine-tuning to improve the world model's ability to reconstruct observations for agents experiencing observation loss, ensuring coordinated multi-agent decision-making. Our experiments demonstrate RMIOv2's superiority over state-of-the-art approaches in both final performance after convergence and robustness when some agents experience observation loss.
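The idea of completing a lost agent's observation from its teammates can be illustrated with a small sketch. This is not RMIOv2's architecture, which the abstract does not specify. It is a single-head attention step in NumPy under assumed simplifications: identity projections instead of learned weights, an all-zero mask token, and hypothetical function names (`masked_cross_agent_attention`, `masked_reconstruction_loss`). Lost agents are replaced by the mask token, excluded as attention keys, and filled in from the agents that still observe. The loss is computed only on the lost agents, as a masked fine-tuning objective might be.

```python
import numpy as np


def masked_cross_agent_attention(obs, lost, mask_token=None):
    """Fill in lost agents' features by attending over visible agents.

    obs:  (n_agents, dim) array of per-agent observation features.
    lost: (n_agents,) boolean array, True where the observation is missing.
    Assumption: identity query/key/value projections (no learned weights).
    """
    obs = np.asarray(obs, dtype=float)
    lost = np.asarray(lost, dtype=bool)
    if lost.all():
        raise ValueError("at least one agent must retain its observation")
    n_agents, dim = obs.shape
    if mask_token is None:
        mask_token = np.zeros(dim)  # assumed placeholder for a learned token

    # Replace the inputs of lost agents with the mask token.
    x = np.where(lost[:, None], mask_token, obs)

    # Scaled dot-product scores between every pair of agents.
    scores = x @ x.T / np.sqrt(dim)
    # Lost agents carry no information, so they cannot serve as keys.
    scores[:, lost] = -np.inf

    # Numerically stable softmax over the key axis.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)

    fused = weights @ x
    # Visible agents keep their own observation; lost agents get the fusion.
    return np.where(lost[:, None], fused, obs)


def masked_reconstruction_loss(pred, target, lost):
    """Mean squared error restricted to the agents whose input was masked."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    lost = np.asarray(lost, dtype=bool)
    return float(np.mean((pred[lost] - target[lost]) ** 2))


obs = np.array([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]])
lost = np.array([False, False, True])
completed = masked_cross_agent_attention(obs, lost)
```

With a zero mask token the lost agent's query is uninformative, so its attention is uniform over the visible agents and the completion reduces to their mean. A trained model would instead learn the token and the projections so that the completion depends on context.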
Created:
2026-02-27



