VQA-based Robotic State Recognition Optimized with Genetic Algorithm

Kento Kawaharazuka,Yoshiki Obinata,Naoaki Kanazawa,Kei Okada,Masayuki Inaba,Kento Kawaharazuka,Yoshiki Obinata,Naoaki Kanazawa,Kei Okada,Masayuki Inaba

State recognition of objects and environment in robots has been conducted in various ways. In most cases, this is executed by processing point clouds, learning images with annotations, and using specialized sensors. In contrast, in this study, we propose a state recognition method that applies Visual Question Answering (VQA) in a Pre-Trained Vision-Language Model (PTVLM) trained from a large-scale...