GSMR-CNN: An End-to-End Trainable Architecture for Grasping Target Objects from Multi-Object Scenes

Valerija Holomjova, Andrew J. Starkey, Pascal Meißner

We present an end-to-end trainable multi-task model that locates and retrieves target objects from multi-object scenes. The model is an extension of Siamese Mask R-CNN, which combines components of Siamese Neural Networks (SNNs) and Mask R-CNN to perform one-shot instance segmentation. The proposed network, called Grasping Siamese Mask R-CNN (GSMR-CNN), extends Siamese Mask R-CNN by ad...