Yen-Liang Lin, Ming-Kuang Tsai, Winston H. Hsu, Chih-Wei Chen
IEEE Transactions on Circuits and Systems for Video Technology, Volume 23, No 3, Pages 401-413, March 2013
Publication year: 2013

Content-based vehicle retrieval in unconstrained environments plays an important role in surveillance systems. However, due to large variations in viewing angle/position, illumination, occlusion, and background, traditional vehicle retrieval is extremely challenging. We approach this problem in a different way by rectifying vehicles from disparate views into the same reference view and searching for vehicles based on informative parts such as the grille, lamp, and wheel. To extract those parts, we fit 3-D vehicle models to a 2-D image using an active shape model. In the experiments, we compare different 3-D model fitting approaches and verify that part rectification has a significant impact on content-based vehicle retrieval performance. We propose a model fitting approach with a weighted Jacobian system, which leverages prior knowledge of part information and shows promising improvements. We then use pyramid histogram of orientation features to describe the rectified parts (e.g., grille, wheel, lamp). We compute the mean average precision of vehicle retrieval with L1 distance on the NetCarShow300 dataset, a new challenging dataset that we constructed. We conclude that content-based vehicle retrieval benefits more from the fusion of informative rectified parts (e.g., grille, lamp, wheel) than from a whole vehicle image described by SIFT features.
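
The evaluation protocol named in the abstract (ranking by L1 distance and scoring with mean average precision) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the leave-one-out query scheme, and the use of NumPy are assumptions made here for illustration, and the feature vectors stand in for whatever part descriptors are actually used.

```python
import numpy as np

def l1_distances(query, gallery):
    """L1 (city-block) distance from one query vector to each gallery row."""
    return np.abs(gallery - query).sum(axis=1)

def average_precision(ranked_relevance):
    """Average precision for one query, given 0/1 relevance in ranked order."""
    rel = np.asarray(ranked_relevance, dtype=float)
    if rel.sum() == 0:
        return 0.0  # no relevant items in the gallery for this query
    hits = np.cumsum(rel)
    precision_at_k = hits / np.arange(1, len(rel) + 1)
    # Average the precision values at the ranks where a relevant item appears.
    return float((precision_at_k * rel).sum() / rel.sum())

def mean_average_precision(features, labels):
    """Leave-one-out mAP (an assumed protocol): each item queries all others."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    aps = []
    for i in range(len(features)):
        mask = np.arange(len(features)) != i  # exclude the query itself
        dist = l1_distances(features[i], features[mask])
        order = np.argsort(dist, kind="stable")  # nearest first
        relevance = (labels[mask][order] == labels[i]).astype(int)
        aps.append(average_precision(relevance))
    return float(np.mean(aps))
```

One simple way to combine several parts under this sketch would be to concatenate their descriptors, since the L1 distance of concatenated vectors equals the sum of the per-part L1 distances. The abstract does not state which fusion scheme the paper uses, so this is only one possibility.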