For practical applications, an object detection system requires huge number of classes to meet real world needs. Many successful object detection systems use part-based model which trains several filters (classifiers) for each class to perform multiclass object detection. However, these methods have linear computational complexity in regard to the number of classes and may lead to huge computing time. To solve the problem, some works learn a codebook for the filters and conduct operations only on the codebook to make computational complexity sublinear in regard to the number of classes. But the past studies missed to consider filter characteristics, e.g., filters are weights trained by Support Vector Machine, and rather they applied method such as sparse coding for visual signals’ optimization. This misuse results in huge accuracy loss when a large speedup is required. To remedy this shortcoming, we have developed a new method called Regularized Sparse Coding which is designed to reconstruct filter functionality. That is, it reconstructs the ability of filter to produce accurate score for classification. Our method can reconstruct filters by minimizing score map error, while sparse coding reconstructs filters by minimizing appearance error. This different optimization strategy makes our method be able to have small accuracy loss when a large speedup is achieved. On the ILSVRC 2013 dataset, which has 200 classes, this work represents a 16 times speedup using only 1.25% memory on single CPU with 0.04 mAP drop when compared with the original Deformable Part Model. Moreover, parallel computing on GPUs is also applicable for our work to achieve more speedup.
Leave a Reply