Fig. 5 | Performance of the generalized world model across various environments.
Source: Figshare. Updated: 2025-05-06. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/_b_Fig_5_b_b_Performance_of_the_generalized_world_model_across_various_environments_b_/28464422
Description:
a. Reward progression over training steps across nine distinct environments, including Empty, 4 Squares, Racetrack, Vascular, and various Maze configurations. Each line shows the average reward for one environment, illustrating the model's learning trajectory. Simpler environments converge faster and more complex ones take longer, but all eventually converge. The dashed vertical red line at 4.5 million steps marks the transition from pretraining across ten simulation environments to adaptation within a new multi-output tributary channel. The model improves significantly over the initial 4.5 million steps of pretraining and then adapts rapidly, reaching stable performance within just 50,000 steps (approximately 30 minutes) in the new environment.
b. Success rate of targets reached across different environments, plotted against training steps. Box plots illustrate the variability and distribution of the MBRL algorithm's performance in reaching targets. While simpler environments converge more quickly, our MBRL model consistently converges across all scenarios.
Created:
2025-05-06



