Learning Visual-Audio Representations for Voice-Controlled Robots

Peixin Chang, Shuijing Liu, D. Livingston McPherson, Katherine Driggs-Campbell

Building on recent advances in representation learning, we propose a novel pipeline for task-oriented voice-controlled robots with raw sensor inputs. Previous methods rely on a large number of labels and task-specific reward functions. Such approaches are hard to improve after deployment and generalize poorly across robotic platforms and tasks. To address these...