Indirect Object-to-Robot Pose Estimation from an External Monocular RGB Camera

Jonathan Tremblay, Stephen Tyree, Terry Mosier, Stan Birchfield

We present a robotic grasping system that uses a single external monocular RGB camera as input. The object-to-robot pose is computed indirectly by combining the output of two neural networks: one that estimates the object-to-camera pose, and another that estimates the robot-to-camera pose. Both networks are trained entirely on synthetic data, relying on domain randomization to bridge the sim-to-re...
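
The indirect computation described above can be illustrated with a minimal sketch. It assumes each network's output has already been converted to a 4x4 homogeneous transform that maps points from the object (or robot) frame into the camera frame; that frame convention, the helper names (`matmul`, `invert_rigid`, `object_to_robot`), and the example numbers are illustrative assumptions, not taken from the paper. Under that assumption, the object-to-robot pose is the inverse of the robot-to-camera transform composed with the object-to-camera transform.

```python
# Sketch only: poses are 4x4 homogeneous transforms stored as nested lists.
# All function names and numeric values here are hypothetical.

def matmul(a, b):
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def invert_rigid(t):
    """Invert a rigid transform [R | p] in closed form as [R^T | -R^T p]."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [t[i][3] for i in range(3)]
    p_inv = [-sum(r_t[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [r_t[i] + [p_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def object_to_robot(cam_T_obj, cam_T_robot):
    """Compose the two camera-frame poses into the object pose in the robot frame."""
    return matmul(invert_rigid(cam_T_robot), cam_T_obj)


# Assumed example: the robot base sits 1 m in front of the camera, rotated
# 90 degrees about z; the object is offset 1 m along the camera x axis.
cam_T_robot = [
    [0.0, -1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
]
cam_T_obj = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
]

robot_T_obj = object_to_robot(cam_T_obj, cam_T_robot)
object_position_in_robot_frame = [row[3] for row in robot_T_obj[:3]]
```

Because the camera frame cancels in the product, the result does not depend on where the camera is placed, provided both estimates come from the same image. Errors in either estimate do propagate into the composed pose.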