Learning to Navigate from Novel Origins: Enhancing Reinforcement Learning Agents with Multi-Instruction Training in Complex Space
Source: DataCite Commons. Updated: 2025-12-16. Indexed: 2026-05-07.
Download link:
https://figshare.unimelb.edu.au/articles/dataset/Learning_to_Navigate_from_Novel_Origins_Enhancing_Reinforcement_Learning_Agents_with_Multi-Instruction_Training_in_Complex_Space/30888971/1
Description:
Importance: Humans can navigate to a known destination even from a starting point they have never seen before. For autonomous agents to be useful and reliable in real-world settings like warehouses or hospitals, they must also be able to handle such unpredictable starting conditions, a key challenge in AI development.

Research Gap: It is currently impossible to test how these complex navigation skills are acquired in humans due to difficulties in controlling for their prior experience and abilities. While Reinforcement Learning (RL) agents are used to simulate navigation, their ability to adapt to situations that differ from their training, such as starting from a new location, is under-explored.

Objective: This study investigates how an RL agent can learn to successfully navigate to a known destination from a novel, unseen origin. We test the hypothesis that training an agent on diverse routes to the same destination, augmented by spatial cues (bearing and distance), enables generalizable wayfinding skills.

Methodology: We used a Proximal Policy Optimisation (PPO) agent in a simulated text-based environment. Our central method was a multi-instruction training paradigm, where the agent was trained on diverse routes from multiple origins that all converged on the same destination. We evaluated the agent's performance when given different navigational aids, such as complete instructions, only spatial cues (azimuth and distance), or a combination of both, and tested its ability to generalise to unseen environments and novel origins.

Key Findings: Results demonstrate that multi-origin training fosters transferable navigation skills, with combined instructions and spatial cues yielding the fastest learning and highest success rates, especially in complex environments. This highlights the importance of training data diversity over volume.

Implications: These findings demonstrate that the key to building more adaptable autonomous agents lies in the structural diversity of the training data, not just the volume. This multi-instruction training method provides a clear pathway for developing more robust and trustworthy navigation systems that can function effectively without being tied to specific, pre-programmed starting locations.
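The spatial cue described above (azimuth and distance to the destination) and the multi-origin setup can be illustrated with a minimal sketch. This is not the authors' environment or code: the 2D coordinate convention, the function names `spatial_cue` and `sample_episode`, and the example coordinates are all assumptions made for illustration only.

```python
import math
import random


def spatial_cue(position, destination):
    """Return (azimuth_degrees, distance) from position to destination.

    Assumption: 2D (x, y) coordinates with y increasing northward.
    Azimuth is measured clockwise from north and lies in [0, 360).
    """
    dx = destination[0] - position[0]
    dy = destination[1] - position[1]
    distance = math.hypot(dx, dy)
    # atan2(dx, dy) gives the angle from the +y (north) axis, clockwise.
    azimuth = math.degrees(math.atan2(dx, dy)) % 360.0
    return azimuth, distance


def sample_episode(train_origins, destination, rng):
    """Pick one training origin and attach the spatial cue to it.

    Hypothetical helper: every episode shares the same destination,
    while the origin varies across the training set.
    """
    origin = rng.choice(train_origins)
    azimuth, distance = spatial_cue(origin, destination)
    return {
        "origin": origin,
        "destination": destination,
        "azimuth": azimuth,
        "distance": distance,
    }


# Illustrative values only; not taken from the dataset.
destination = (0, 0)
train_origins = [(3, 4), (-6, 8), (5, -12)]
novel_origin = (8, 15)  # held out: never sampled during training

rng = random.Random(0)
episode = sample_episode(train_origins, destination, rng)
novel_cue = spatial_cue(novel_origin, destination)
```

In a setup like this, the agent would be evaluated from `novel_origin`, which is deliberately absent from `train_origins`, so the cue is the only origin-specific information it receives at test time.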
Provider:
The University of Melbourne
Created:
2025-12-16



