PASCAL VOC Classification Challenge 2009

Team: Fahad Shahbaz Khan, Joost van de Weijer, Andrew Bagdanov, Noha Elfiky, David Rojas, Marco Pedersoli, Xavier Boix, Pep Gonfaus, Hany SalahEldeen, Robert Benavente, Jordi Gonzalez, Maria Vanrell.

Results: Our method obtained the best score in 2 of the 20 classes, thereby finishing second behind the NEC/UIUC submission. More results can be found on the VOC workshop page.

Method: We follow the standard bag-of-words approach. For feature detection, we use Harris-Laplace, Color Boosted Harris-Laplace, Dense Multi-scale Grid, Blob, and Color Boosted Blob detectors. For feature extraction, we use SIFT, Hue, Color Names, Opp-SIFT, C-SIFT, RGSIFT, and GIST descriptors. Spatial information is incorporated by means of spatial pyramids. For vocabulary construction, we build vocabularies with the standard K-means algorithm and then compress them using the agglomerative information bottleneck method. For learning, we use an intersection kernel SVM. Finally, we combine our image classification results with object localization scores obtained from a HOG pyramid detector and an ESS detector.
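As a rough illustration of two of the steps above, the following Python sketch builds a spatial pyramid histogram from quantized local features and compares two images with the histogram intersection kernel. It is a minimal sketch under stated assumptions, not the code used for the challenge: the function names, the (x, y, word) feature format, the unweighted pyramid levels, and the single L1 normalization over the whole vector are all hypothetical choices made for illustration.

```python
def spatial_pyramid(features, vocab_size, levels=2):
    """Concatenate visual-word counts over 1x1, 2x2, ... grids.

    features: list of (x, y, word) tuples, with x and y scaled to [0, 1]
              and word an integer index into the vocabulary (assumed format).
    Returns one L1-normalized vector (normalization choice is an assumption).
    """
    pyramid = []
    for level in range(levels):
        cells = 2 ** level  # grid is cells x cells at this level
        for cy in range(cells):
            for cx in range(cells):
                counts = [0.0] * vocab_size
                for x, y, word in features:
                    # Clamp so that a coordinate of exactly 1.0 stays in the last cell.
                    fx = min(int(x * cells), cells - 1)
                    fy = min(int(y * cells), cells - 1)
                    if fx == cx and fy == cy:
                        counts[word] += 1.0
                pyramid.extend(counts)
    total = sum(pyramid)
    return [v / total for v in pyramid] if total > 0 else pyramid


def intersection_kernel(h1, h2):
    """Histogram intersection: sum of element-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Toy example: two images described by a 2-word vocabulary.
image_a = [(0.1, 0.1, 0), (0.9, 0.9, 1), (0.6, 0.2, 0)]
image_b = [(0.1, 0.1, 1)]
similarity = intersection_kernel(
    spatial_pyramid(image_a, vocab_size=2),
    spatial_pyramid(image_b, vocab_size=2),
)
```

A matrix of such kernel values between all training images could then be passed to an SVM implementation that accepts precomputed kernels.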

The main novelty of our pipeline is a new method, called Top-Down Color Attention, which combines color and shape features: color is used to compute a top-down, class-specific color attention map, which is then used to modulate the weights of the local shape features. Finally, a class-specific image histogram is constructed for each category. Using our top-down color attention approach, we obtain a significant improvement for all categories in the VOC data set. We also used the results of our classification method for the Segmentation challenge. The color attention method is explained in detail in our ICCV 2009 paper.
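The idea of modulating shape features by a class-specific color attention map can be sketched in Python as follows. This is one plausible reading of the description above, not the authors' implementation: it assumes that every local feature carries both a shape word and a color word, and that the attention weight is an estimate of the class probability given the color word. All names and the toy probabilities are hypothetical; the ICCV 2009 paper remains the authoritative description.

```python
def color_attention_histogram(shape_words, color_words,
                              p_class_given_color, shape_vocab_size):
    """Build a class-specific shape histogram weighted by color attention.

    shape_words, color_words: per-feature word indices (same length).
    p_class_given_color: list where entry c is the assumed attention weight
        of color word c for one particular class.
    Returns an L1-normalized histogram over shape words.
    """
    hist = [0.0] * shape_vocab_size
    for shape_word, color_word in zip(shape_words, color_words):
        # Each shape feature votes with the attention of its color word,
        # instead of the constant weight 1 of a plain bag-of-words histogram.
        hist[shape_word] += p_class_given_color[color_word]
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


# Toy example: three features, two shape words, two color words.
shape_words = [0, 1, 1]
color_words = [0, 1, 1]

# Hypothetical attention tables, one per class.
attention = {
    "class_a": [0.25, 0.75],
    "class_b": [0.75, 0.25],
}

# One histogram per class, as the text describes.
histograms = {
    name: color_attention_histogram(shape_words, color_words, table, 2)
    for name, table in attention.items()
}
```

Because the weights differ per class, the same image yields a different histogram for each category, and features whose color is uninformative for a class contribute less to that class's histogram.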

Data: Please email fahad@cvc(dot)uab(dot)es to request the histograms used in the challenge and the final probabilities.


F. Shahbaz Khan, J. van de Weijer, M. Vanrell, Top-Down Color Attention for Object Recognition, Proc. ICCV 2009, Kyoto, Japan. (webpage + code)

H. Harzallah, F. Jurie, and C. Schmid, Combining efficient object localization and image classification, Proc. ICCV 2009, Kyoto, Japan.

B. Fulkerson, A. Vedaldi, and S. Soatto, Localizing objects with smart dictionaries, Proc. ECCV 2008, France.

S. Lazebnik, C. Schmid, and J. Ponce, Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories, Proc. CVPR 2006, USA.

S. Maji, A. C. Berg, and J. Malik, Classification using intersection kernel support vector machines is efficient, Proc. CVPR 2008, USA.

K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. Van Gool, A comparison of affine region detectors, IJCV, 65(1-2):43-72, 2005.

K. van de Sande, T. Gevers, and C. Snoek, Evaluating color descriptors for object and scene recognition, PAMI (in Press), 2010.

J. van de Weijer, C. Schmid, J. J. Verbeek, and D. Larlus, Learning color names for real-world applications, IEEE Transactions on Image Processing (TIP), 18(7):1512-1524, 2009.

A. Oliva and A. Torralba, Modeling the shape of the scene: a holistic representation of the spatial envelope, IJCV, 42(3):145-175, 2001.


© 2008 Colour in context Group | Computer Vision Center. All rights reserved | Last updated: Monday 11 May 2009