EgoPAT3Dv2: Predicting 3D Action Target from 2D Egocentric Vision for Human-Robot Interaction

Irving Fang, Yuzhong Chen, Yifan Wang, Jianghan Zhang, Qiushi Zhang, Jiali Xu, Xibo He, Weibo Gao, Hao Su, Yiming Li, Chen Feng

A robot’s ability to anticipate the 3D action target location of a hand’s movement from egocentric videos can greatly improve safety and efficiency in human-robot interaction (HRI). While previous research has predominantly focused on semantic action classification or 2D target region prediction, we argue that predicting the action target’s 3D coordinate could pave the way for more versatile downstrea...