Weakly Supervised Referring Expression Grounding via Dynamic Self-Knowledge Distillation

Jinpeng Mi, Zhiqian Chen, Jianwei Zhang

Weakly supervised referring expression grounding (WREG) is an attractive and challenging task: grounding target regions in images by understanding given referring expressions. During training, WREG learns to ground target objects without manual annotations linking image regions to referring expressions. Different from the predominant grounding pattern of existing models, whi...