HyperAI

Vision-Language Navigation (VLN) is a task in which an embodied agent follows natural-language instructions to navigate real 3D environments. The goal is for the agent to understand and autonomously navigate complex environments by integrating visual and linguistic information. The task has significant application value in fields such as intelligent robotics and virtual assistants.

Room2Room

R2R+EnvDrop

