Multimodal Aggregation Approach for Memory Vision-Voice Indoor Navigation with Meta-Learning

robot,IROS 2020

Liqi Yan, Dongfang Liu, Yaoxian Song, Changbin Yu

Vision and voice are two vital keys for agents’ interaction and learning. In this paper, we present a novel indoor navigation model called Memory Vision-Voice Indoor Navigation (MVV-IN), which receives voice commands and analyzes multimodal information of visual observation in order to enhance robots’ environment understanding. We make use of single RGB images taken by a first-view monocular camera....
