CrossDTR: Cross-view and Depth-guided Transformers for 3D Object Detection

Ching-Yu Tseng, Yi-Rong Chen, Hsin-Ying Lee, Tsung-Han Wu, Wen-Chin Chen, Winston H. Hsu

To achieve accurate 3D object detection at low cost for autonomous driving, many multi-camera methods have been proposed that address the occlusion problem of monocular approaches. However, lacking accurate depth estimates, existing multi-camera methods often generate multiple bounding boxes along a ray in the depth direction for difficult small objects such as pedestrians, resulting in an ...