Interactive Navigation in Environments with Traversable Obstacles Using Large Language and Vision-Language Models

Zhen Zhang, Anran Lin, Chun Wai Wong, Xiangyu Chu, Qi Dou, K. W. Samuel Au

This paper proposes an interactive navigation framework that uses large language and vision-language models to let robots navigate in environments with traversable obstacles. We use a large language model (GPT-3.5) and an open-set vision-language model (Grounding DINO) to create an action-aware costmap that supports effective path planning without fine-tuning. With the large models, we ca...
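To make the idea of an action-aware costmap concrete, the following is a minimal sketch and not the authors' implementation. It assumes the detector output has already been reduced to labeled grid boxes and that the language model's judgment has been reduced to a per-label traversal cost; both are hard-coded stand-ins here, and no GPT-3.5 or Grounding DINO call is made. The names `Detection`, `build_action_aware_costmap`, `plan_path`, the cost values, and the use of Dijkstra's algorithm are all hypothetical choices made for illustration.

```python
import heapq
from dataclasses import dataclass

FREE_COST = 1.0               # assumed cost of entering a free cell
LETHAL_COST = float("inf")    # assumed cost of a non-traversable cell


@dataclass
class Detection:
    """Hypothetical stand-in for one open-set detection projected onto the grid."""
    label: str
    box: tuple  # (row_min, col_min, row_max, col_max), max values exclusive


def build_action_aware_costmap(height, width, detections, action_costs):
    """Fill a grid with per-cell costs.

    action_costs maps a label to the assumed cost of the interaction needed to
    pass through it (a stand-in for the language model's judgment). Labels that
    are missing from the map are treated as non-traversable.
    """
    grid = [[FREE_COST] * width for _ in range(height)]
    for det in detections:
        cost = action_costs.get(det.label, LETHAL_COST)
        r0, c0, r1, c1 = det.box
        for r in range(max(r0, 0), min(r1, height)):
            for c in range(max(c0, 0), min(c1, width)):
                grid[r][c] = max(grid[r][c], cost)
    return grid


def plan_path(grid, start, goal):
    """Dijkstra on a 4-connected grid; the cost of a move is the entered cell's cost."""
    height, width = len(grid), len(grid[0])
    best = {start: 0.0}
    parent = {}
    frontier = [(0.0, start)]
    while frontier:
        dist, cell = heapq.heappop(frontier)
        if cell == goal:
            path = [cell]
            while cell in parent:
                cell = parent[cell]
                path.append(cell)
            return path[::-1], dist
        if dist > best.get(cell, LETHAL_COST):
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < height and 0 <= nc < width):
                continue
            if grid[nr][nc] == LETHAL_COST:
                continue
            new_dist = dist + grid[nr][nc]
            if new_dist < best.get((nr, nc), LETHAL_COST):
                best[(nr, nc)] = new_dist
                parent[(nr, nc)] = (r, c)
                heapq.heappush(frontier, (new_dist, (nr, nc)))
    return None, LETHAL_COST


# Toy scene: a wall blocks column 2 except for a curtain in the bottom row.
detections = [
    Detection("wall", (0, 2, 2, 3)),
    Detection("curtain", (2, 2, 3, 3)),
]
# Assumed judgment: a curtain can be pushed aside at extra cost; a wall cannot.
costmap = build_action_aware_costmap(3, 5, detections, {"curtain": 3.0})
path, cost = plan_path(costmap, (0, 0), (0, 4))

# Baseline that treats every detected obstacle as non-traversable.
blocked_map = build_action_aware_costmap(3, 5, detections, {})
blocked_path, _ = plan_path(blocked_map, (0, 0), (0, 4))
```

In this toy scene the planner reaches the goal only when the curtain is given a finite cost; with every detection treated as lethal, no path exists. That contrast is the motivation for encoding the required interaction in the costmap rather than marking all obstacles as blocked.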