DefFusion: Deformable Multimodal Representation Fusion for 3D Semantic Segmentation
Rongtao Xu, Changwei Wang, Duzhen Zhang, Man Zhang, Shibiao Xu, Weiliang Meng, Xiaopeng Zhang
The complementarity between camera and LiDAR data makes fusion methods a promising approach to improve 3D semantic segmentation performance. Recent transformer-based methods have also demonstrated superiority in segmentation. However, multimodal solutions incorporating transformers are underexplored and face two key inherent difficulties: over-attention and noise from different modal data. To over...