Task-Oriented Grasp Prediction with Visual-Language Inputs
Chao Tang, Dehao Huang, Lingxiao Meng, Weiyu Liu, Hong Zhang
To perform household tasks, assistive robots receive commands in the form of user language instructions for tool manipulation. The initial stage involves selecting the intended tool (i.e., object grounding) and grasping it in a task-oriented manner (i.e., task grounding). However, prior research on visual-language grasping (VLG) has focused on object grounding while disregarding the fine-grained...