HyperAI

Vision-Language Navigation

Vision-Language Navigation (VLN) is a task in which an embodied agent follows natural language instructions to navigate real 3D environments. The goal is for the agent to understand and autonomously navigate complex environments by integrating visual and linguistic information, a capability with significant application value in fields such as intelligent robotics and virtual assistants.