Improving Generalization Ability for 3D Object Detection by Learning Sparsity-invariant Features
Hsin-Cheng Lu, ChungYi Lin, Winston H. Hsu
IEEE International Conference on Robotics and Automation (ICRA)
Publication year: 2025

LiDAR-based 3D object detection is crucial in autonomous driving and robotics because of its accuracy. Despite continuous progress on this task, most existing methods share a significant drawback: their performance degrades substantially when detecting objects in unseen domains because of the domain gap. In this work, we introduce a novel training strategy that improves the generalization ability of LiDAR-based 3D object detectors trained on a single source domain.

We primarily aim to generalize from a source domain equipped with high-beam LiDAR to target domains equipped with low-beam LiDAR. To learn sparsity-invariant features, we selectively subsample the source data to a specific beam count, using confidence scores from the current detector to identify the density that matters most to the detector. We then use the pretrained detector as the backbone and employ a teacher-student framework to align the Bird’s Eye View (BEV) features of the student model, which processes the augmented data, with those of the teacher model, which processes the original data.
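To make the beam subsampling step concrete, here is a minimal sketch of one common way to simulate a lower-beam LiDAR from a higher-beam scan: bin points into beams by elevation angle and keep every `stride`-th beam. This is an illustrative assumption, not the paper's implementation. The function names, the 64-beam default, the uniform vertical field of view, and the fixed `stride` are all hypothetical; in particular, the confidence-based choice of density described above is not shown.

```python
import math

def assign_beam_ids(points, num_beams=64, fov_deg=(-24.8, 2.0)):
    """Bin each (x, y, z) point into a beam index by its elevation angle.

    Assumes beams are spread uniformly over the vertical field of view
    `fov_deg` (lower, upper), which is only an approximation of a real sensor.
    """
    lo, hi = fov_deg
    beam_ids = []
    for x, y, z in points:
        # Elevation angle of the point above the sensor's horizontal plane.
        elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
        frac = (elevation - lo) / (hi - lo)
        idx = int(frac * num_beams)
        # Clamp points that fall slightly outside the assumed field of view.
        beam_ids.append(min(max(idx, 0), num_beams - 1))
    return beam_ids

def subsample_beams(points, stride, num_beams=64, fov_deg=(-24.8, 2.0)):
    """Keep only points whose beam index is a multiple of `stride`.

    With num_beams=64, stride=2 approximates a 32-beam scan and stride=4
    approximates a 16-beam scan.
    """
    beam_ids = assign_beam_ids(points, num_beams, fov_deg)
    return [p for p, b in zip(points, beam_ids) if b % stride == 0]
```

In a training loop, the subsampled scan would be fed to the student while the original scan goes to the teacher; how the stride is chosen from detector confidence is left out here because the text above does not specify the criterion.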

We use feature content alignment (FCA) and graph-based embedding relationship alignment (GERA) to guide the student detector toward domain-agnostic features, mitigating the impact of domain shift. FCA maintains content consistency of the BEV features, while GERA keeps the pairwise relationships among the ROI embeddings within the same scene consistent. Extensive experiments show that a model trained with our method generalizes better than previous approaches. Furthermore, our approach, which relies solely on access to the source domain, even outperforms certain domain adaptation methods that have access to target data.
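The two alignment terms can be illustrated with a small sketch. The abstract does not give the loss formulas, so the choices below are assumptions made purely for illustration: the content term is written as a mean squared error between flattened student and teacher BEV features, and the relationship term compares the pairwise cosine-similarity matrices of the ROI embeddings from each model. The function names are hypothetical, and the paper's actual graph-based formulation may differ.

```python
import math

def content_alignment_loss(student_bev, teacher_bev):
    """Assumed FCA-style term: MSE between two flattened BEV feature lists."""
    return sum((s - t) ** 2 for s, t in zip(student_bev, teacher_bev)) / len(student_bev)

def cosine(u, v):
    """Cosine similarity between two embedding vectors (epsilon avoids 0/0)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def relation_matrix(embeddings):
    """Pairwise similarity between every pair of ROI embeddings in one scene."""
    return [[cosine(u, v) for v in embeddings] for u in embeddings]

def relation_alignment_loss(student_embs, teacher_embs):
    """Assumed GERA-style term: mean squared difference between the
    student's and teacher's pairwise relation matrices."""
    rel_s = relation_matrix(student_embs)
    rel_t = relation_matrix(teacher_embs)
    n = len(rel_s)
    total = sum((rel_s[i][j] - rel_t[i][j]) ** 2 for i in range(n) for j in range(n))
    return total / (n * n)
```

Because the relationship term depends only on similarities between embeddings, rescaling each student embedding leaves it unchanged, whereas the content term penalizes any elementwise deviation from the teacher. In this sketch the teacher's outputs would be treated as fixed targets.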