Learning Temporal Cues by Predicting Objects Move for Multi-camera 3D Object Detection

Seokha Moon,Hongbeen Park,Jaekoo Lee,Jinkyu Kim,Seokha Moon,Hongbeen Park,Jaekoo Lee,Jinkyu Kim

In autonomous driving and robotics, there is a growing interest in utilizing short-term historical data to enhance multi-camera 3D object detection, leveraging the continuous and correlated nature of input video streams. Recent work has focused on spatially aligning BEV-based features over timesteps. However, this is often limited as its gain does not scale well with long-term past observations. T...