Extended Bag-of-Words Formalism for Image Classification


  1. Extended Bag-of-Words Formalism for Image Classification
     Sandra Avila 1,2 (Cotutelle PhD Candidate), Arnaldo de A. Araújo 1 (Advisor), Matthieu Cord 2 (Advisor), Nicolas Thome 2 (Co-Advisor), Eduardo Valle 3 (Collaborator)
     1 Federal University of Minas Gerais, NPDI Lab – UFMG, Belo Horizonte, Brazil
     2 Pierre and Marie Curie University, UPMC-Sorbonne Universities, LIP6, Paris, France
     3 State University of Campinas, RECOD Lab, FEEC – UNICAMP, Campinas, Brazil
     Sandra Avila (UFMG/UPMC) sandra@dcc.ufmg.br June 2013 1 / 56

  2. Image Classification: Why do we care?

  4. A huge number of images is available

  5. Why is image classification a hard problem?

  6. Many classes and concepts

  7. Much diversity in the data: viewpoint changes, illumination variations, occlusion, background clutter, inter-class similarity, intra-class diversity

  8. How do we classify images?

  10. Problem Statement: Given an image dataset, how should we represent its visual content for a classification task?

  12. [Figure labels: night scenes, sunset scenes, young people, old people]

  13. Bag-of-Visual-Words (BoW) [Sivic and Zisserman, 2003; Csurka et al., 2004]
      Slide credit: Ken Chatfield

  14. Low-level Visual Feature Extraction
      Local feature extraction: each of the M patches (patch 1 to patch M) yields one row of N low-level feature values:
      $$\begin{bmatrix} l_{1,1} & \cdots & l_{1,N} \\ l_{2,1} & \cdots & l_{2,N} \\ \vdots & & \vdots \\ l_{M,1} & \cdots & l_{M,N} \end{bmatrix}$$
      Patch detection: interest points, dense sampling, ...
      Feature extraction: SIFT [Lowe, 2004], SURF [Bay et al., 2008], ...

  15. Visual Codebook, Coding step
      Visual codebook learning: random, unsupervised (e.g., k-means, GMM), supervised [Perronnin et al., 2006; Goh et al., 2012], ...
      Coding: hard-assignment, soft-assignment [van Gemert et al., 2008, 2010], sparse coding [Yang et al., 2009; Boureau et al., 2010], ...
      Feature coding based on the vector difference: VLAD [Jégou et al., 2010], SVC [Zhou et al., 2010], VLAT [Picard et al., 2011], ...
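
As a rough illustration of coding based on the vector difference, the sketch below accumulates, for each codeword, the residuals x - c of the descriptors that are nearest to it, in the spirit of VLAD. The function name `vlad_encode` and the toy data are assumptions made for this example, and the normalization steps used in practice are omitted.

```python
def squared_distance(x, c):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def vlad_encode(descriptors, codebook):
    """Sum, per codeword, the residuals of the descriptors nearest to it.

    Hypothetical helper for illustration: returns a flat list of length
    M * D (M codewords, D descriptor dimensions), without normalization.
    """
    dim = len(codebook[0])
    residuals = [[0.0] * dim for _ in codebook]
    for x in descriptors:
        # Hard-assign the descriptor to its nearest codeword.
        m = min(range(len(codebook)),
                key=lambda k: squared_distance(x, codebook[k]))
        for d in range(dim):
            residuals[m][d] += x[d] - codebook[m][d]
    # Concatenate the per-codeword residual sums into one signature.
    return [value for row in residuals for value in row]


# Toy data (assumed): two 2-D codewords and three descriptors.
codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 2.0], [9.0, 10.0]]
signature = vlad_encode(descriptors, codebook)
print(signature)
```

Unlike a count-based signature, this one keeps one residual vector per codeword, so its length grows with the descriptor dimension.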

  16. Pooling step
      Pooling: sum/average-pooling, max-pooling [Yang et al., 2009], ...
      Spatial pooling: spatial pyramid matching [Lazebnik et al., 2006], [Jia et al., 2012], ...
      [Figure: Spatial Pyramid Matching]
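
The pooling variants named on this slide can be compared on a small example. The sketch below assumes the codes are stored as a list of rows, one row per codeword and one column per local descriptor. The function name `pool` and the toy values are hypothetical, and spatial pooling is not covered.

```python
def pool(codes, mode="sum"):
    """Pool each row of `codes` (one row per codeword) into a single value."""
    if mode == "sum":
        return [sum(row) for row in codes]
    if mode == "average":
        return [sum(row) / len(row) for row in codes]
    if mode == "max":
        return [max(row) for row in codes]
    raise ValueError(f"unknown pooling mode: {mode}")


# Toy codes (assumed): 2 codewords, 4 descriptors.
codes = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 0.5, 0.0, 0.25],
]
sum_pooled = pool(codes, "sum")
avg_pooled = pool(codes, "average")
max_pooled = pool(codes, "max")
print(sum_pooled, avg_pooled, max_pooled)
```

Sum and average pooling differ only by the factor 1/N, while max pooling keeps the strongest activation per codeword and ignores how often it occurs.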

  17. Other Approaches
      Biologically-inspired Models [Fukushima and Miyake, 1982; LeCun et al., 1990; Riesenhuber and Poggio, 1999; Serre et al., 2007; Thériault et al., 2012]
      Deep Learning Models [Hinton and Salakhutdinov, 2006; Ranzato et al., 2007; Bengio, 2009]

  18. BossaNova Representation

  19. Coding & Pooling Matrix Representation
      $$H = \begin{bmatrix} \alpha_{1,1} & \cdots & \alpha_{1,j} & \cdots & \alpha_{1,N} \\ \vdots & & \vdots & & \vdots \\ \alpha_{m,1} & \cdots & \alpha_{m,j} & \cdots & \alpha_{m,N} \\ \vdots & & \vdots & & \vdots \\ \alpha_{M,1} & \cdots & \alpha_{M,j} & \cdots & \alpha_{M,N} \end{bmatrix}$$
      Columns correspond to the descriptors $x_1, \ldots, x_j, \ldots, x_N$ and rows to the codewords $c_1, \ldots, c_m, \ldots, c_M$.
      Notations:
      $X = \{x_j\}$, $j \in \{1, \ldots, N\}$: set of local descriptors (e.g., SIFT)
      $C = \{c_m\}$, $m \in \{1, \ldots, M\}$: visual codebook

  20. Coding & Pooling Matrix Representation
      $f$: Coding (operates on the columns of $H$)
      Coding: $x_j \to f(x_j) = \{\alpha_{m,j}\}$, $\alpha_{m,j} = 1$ iff $m = \arg\min_{k \in \{1, \ldots, M\}} \|x_j - c_k\|_2^2$

  21. Coding & Pooling Matrix Representation
      $g$: Pooling (operates on the rows of $H$)
      Pooling: $g(\{\alpha_j\}) = z$: $\forall m,\ z_m = \sum_{j=1}^{N} \alpha_{m,j}$

  22. Coding & Pooling Matrix Representation
      Pooling row $m$ of $H$ gives the entry $z_m$.
      BoW representation: $z = [z_1, z_2, \cdots, z_M]^T$
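
The coding and pooling functions of the standard BoW can be sketched in a few lines. This is a minimal example with hard-assignment coding and sum pooling. The function names `hard_assign` and `sum_pool` and the toy descriptors are assumptions for illustration, not part of the presentation.

```python
def squared_distance(x, c):
    """Squared Euclidean distance between descriptor x and codeword c."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def hard_assign(X, C):
    """Coding f: build H (M x N) with H[m][j] = 1 iff c_m is nearest to x_j."""
    H = [[0] * len(X) for _ in C]
    for j, x in enumerate(X):
        # arg min over the codewords; ties go to the lowest index.
        m = min(range(len(C)), key=lambda k: squared_distance(x, C[k]))
        H[m][j] = 1
    return H


def sum_pool(H):
    """Pooling g: z_m is the sum of row m of H."""
    return [sum(row) for row in H]


# Toy data (assumed): M = 3 codewords, N = 5 two-dimensional descriptors.
C = [[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]]
X = [[0.5, 0.1], [4.0, 6.0], [5.5, 4.5], [9.0, 1.0], [0.0, 1.0]]
H = hard_assign(X, C)
z = sum_pool(H)
print(z)
```

Each column of `H` contains exactly one 1, so the entries of `z` count how many descriptors fall on each codeword and sum to N.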

  23. Early Ideas
      We pointed out the weakness in the standard pooling operation used in the BoW signature generation.
      Instead of averaging all the values from one row in the H matrix, we proposed to describe their distribution.
      The BOSSA representation (Bag Of Statistical Sampling Analysis) introduces our density function-based pooling strategy.

  26. Our Pooling Illustration
      [Figure panels: Our Pooling, BoW Pooling]

  27. Our Pooling Formalism
      $$g : \mathbb{R}^N \to \mathbb{R}^B, \qquad \alpha_m \to g(\alpha_m) = z_m$$
      $$z_{m,b} = \operatorname{card}\left( x_j \;\middle|\; \alpha_{m,j} \in \left[ \tfrac{b}{B} ; \tfrac{b+1}{B} \right] \right), \qquad \tfrac{b}{B} \geq \alpha_m^{\min} \ \text{and} \ \tfrac{b+1}{B} \leq \alpha_m^{\max}$$
      $B$ denotes the number of bins of each histogram $z_m$, and $[\alpha_m^{\min} ; \alpha_m^{\max}]$ limits the range of distances.
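
A minimal sketch of this histogram-based pooling for a single codeword follows. It assumes that the values being binned are Euclidean distances from the codeword to the descriptors, and that they are rescaled so the range [alpha_min, alpha_max] maps onto [0, 1] before binning. The rescaling, the function name `bossa_pool`, and the toy data are assumptions made for this example.

```python
import math


def euclidean(x, c):
    """Euclidean distance between descriptor x and codeword c."""
    return math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, c)))


def bossa_pool(X, c_m, B, alpha_min, alpha_max):
    """Histogram (B bins) of the distances from codeword c_m to descriptors X.

    Assumption for this sketch: distances are rescaled so that
    [alpha_min, alpha_max] maps to [0, 1]; descriptors whose distance
    falls outside that range are ignored.
    """
    z_m = [0] * B
    for x in X:
        d = euclidean(x, c_m)
        if d < alpha_min or d > alpha_max:
            continue
        scaled = (d - alpha_min) / (alpha_max - alpha_min)
        b = min(int(scaled * B), B - 1)  # d == alpha_max goes to the last bin
        z_m[b] += 1
    return z_m


# Toy data (assumed): distances to c_m are 0.5, 1.5, 3.0, 3.5 and 10.0.
c_m = [0.0, 0.0]
X = [[0.5, 0.0], [0.0, 1.5], [3.0, 0.0], [0.0, 3.5], [10.0, 0.0]]
z_m = bossa_pool(X, c_m, B=4, alpha_min=0.0, alpha_max=4.0)
print(z_m)
```

Where sum pooling would reduce this row to a single number, the histogram records how the descriptors are spread around the codeword.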

  28. BossaNova Representation

  29. BossaNova Representation
      The entries $\alpha_{m,j}$ of the $M \times N$ matrix (rows $c_1, \ldots, c_M$, columns $x_1, \ldots, x_N$) are computed as
      $$\alpha_{m,j} = \frac{\exp\left(-\beta_m d^2(x_j, c_m)\right)}{\sum_{m'=1}^{K} \exp\left(-\beta_m d^2(x_j, c_{m'})\right)}$$
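
The soft-assignment formula can be evaluated directly. The sketch below assumes that d is the Euclidean distance and that the sum in the denominator runs over every codeword passed in, since the slide does not say how K is chosen. The function name `soft_assign` and the toy values are hypothetical.

```python
import math


def squared_distance(x, c):
    """Squared Euclidean distance d^2(x, c)."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def soft_assign(X, C, betas):
    """Return the matrix of alpha[m][j] values.

    Assumption for this sketch: the denominator sums over all codewords
    in C (that is, K = len(C)), using the row's own beta_m as on the slide.
    """
    H = []
    for m, c_m in enumerate(C):
        row = []
        for x in X:
            num = math.exp(-betas[m] * squared_distance(x, c_m))
            den = sum(math.exp(-betas[m] * squared_distance(x, c)) for c in C)
            row.append(num / den)
        H.append(row)
    return H


# Toy data (assumed): two 1-D codewords, three descriptors, equal betas.
C = [[0.0], [2.0]]
X = [[0.0], [1.0], [2.0]]
H = soft_assign(X, C, betas=[1.0, 1.0])
print(H)
```

With equal values of beta, each column sums to 1, and a descriptor halfway between the two codewords receives a weight of 0.5 from each.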

  32. BossaNova Representation
      $$\begin{bmatrix} z_1,\ s\,t_1 \\ \vdots \\ z_m,\ s\,t_m \\ \vdots \\ z_M,\ s\,t_M \end{bmatrix}$$
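
One way to read this layout is as a concatenation: for each codeword, its histogram z_m is followed by one extra term. The sketch below assumes that s is a scalar weight and t_m a scalar per codeword. The slide does not define either symbol, so this interpretation, the function name `bossanova_vector`, and the toy values are all assumptions.

```python
def bossanova_vector(histograms, t, s):
    """Concatenate [z_m, s * t_m] over all codewords m.

    Assumptions for this sketch: `histograms[m]` is the histogram z_m,
    `t[m]` is a scalar t_m, and `s` is a scalar weight.
    """
    if len(histograms) != len(t):
        raise ValueError("need one t_m per histogram z_m")
    z = []
    for z_m, t_m in zip(histograms, t):
        z.extend(z_m)      # the B histogram bins of codeword m
        z.append(s * t_m)  # the weighted extra term of codeword m
    return z


# Toy data (assumed): M = 2 codewords, B = 4 bins each.
histograms = [[1, 1, 0, 2], [0, 3, 1, 0]]
t = [4, 4]
z = bossanova_vector(histograms, t, s=0.5)
print(z)
```

Under these assumptions the final vector has M * (B + 1) entries.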

  33. BossaNova Scheme

  34. BossaNova Scheme
      • SIFT descriptors on a dense spatial grid at multiple scales
      • Dimensionality reduction by applying PCA (128 → 64)
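
To make the dense multi-scale sampling concrete, the sketch below enumerates square patches on a regular grid at several patch sizes. The step, the patch sizes, and the function name `dense_grid` are hypothetical values chosen for the example. Computing SIFT on each patch and the PCA projection are not shown.

```python
def dense_grid(width, height, step, patch_sizes):
    """List (x, y, size) for square patches on a regular grid.

    Patches are placed every `step` pixels at each size in `patch_sizes`
    and are kept only if they fit entirely inside the image.
    """
    patches = []
    for size in patch_sizes:
        for y in range(0, height - size + 1, step):
            for x in range(0, width - size + 1, step):
                patches.append((x, y, size))
    return patches


# Assumed settings: a 64 x 48 image, an 8-pixel step, two patch sizes.
patches = dense_grid(width=64, height=48, step=8, patch_sizes=[16, 32])
print(len(patches))
```

Each returned triple marks where one local descriptor would be computed, so the number of descriptors per image is fixed by the image size and the grid settings.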

  35. BossaNova Scheme
      • k-means algorithm
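
A compact sketch of k-means (Lloyd's algorithm) for learning a codebook from descriptors follows. For reproducibility it initializes the centers with the first k points, which is a simplification assumed for this example. The function name `kmeans` and the toy points are also hypothetical.

```python
def squared_distance(x, c):
    """Squared Euclidean distance between a point and a center."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with the first k points as initial centers."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            m = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[m].append(p)
        # Update step: move each center to the mean of its members.
        for m, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                centers[m] = [sum(v) / len(members) for v in zip(*members)]
    return centers


# Toy data (assumed): two well-separated groups of 2-D points.
points = [[0.0, 0.0], [10.0, 10.0], [0.0, 1.0],
          [1.0, 0.0], [10.0, 9.0], [9.0, 10.0]]
codebook = kmeans(points, k=2)
print(codebook)
```

The returned centers play the role of the codewords c_1, ..., c_M used in the coding step.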

  37. BossaNova Scheme
      • SVM classifiers are applied using a nonlinear Gauss–ℓ2 kernel
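
Assuming that the Gauss–ℓ2 kernel denotes a Gaussian kernel on the ℓ2 distance, K(u, v) = exp(-gamma * ||u - v||^2), the kernel matrix between image signatures could be computed as sketched below. The bandwidth `gamma`, the function names, and the toy signatures are hypothetical, and the SVM training itself is not shown.

```python
import math


def gauss_l2_kernel(u, v, gamma):
    """Gaussian kernel on the squared l2 distance between u and v."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def gram_matrix(Z, gamma):
    """Kernel values between every pair of signatures in Z."""
    return [[gauss_l2_kernel(u, v, gamma) for v in Z] for u in Z]


# Toy signatures (assumed): three 2-D image representations.
Z = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
K = gram_matrix(Z, gamma=0.5)
print(K)
```

Identical signatures give a kernel value of 1, and the value decays toward 0 as the ℓ2 distance between signatures grows.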
