pictorial structures

Pictorial structures, Laurens van der Maaten



  1. Pictorial structures Laurens van der Maaten

  2. Introduction
     • Object detection aims to find a particular object in an image
     • Most popular object detectors are based on a discriminative model:
       • Gather annotated image patches (positive and negative examples)
       • Extract your favorite image features from these image patches
       • Train a classifier on the features to discriminate the object from everything else
       • The classifier is applied at candidate locations to determine object presence
     • The Dalal-Triggs detector is a commonly used object detector

  3. Dalal-Triggs detector
     • Extract histograms of oriented gradients (HOG) features from the image patch
     • HOG features divide an image into small (8x8 pixel) cells, and measure the gradient orientations in each of the cells using a histogram (almost like SIFT)
     * Dalal & Triggs, 2005
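The histogram step described on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the Dalal-Triggs implementation: the function name `cell_orientation_histograms`, the 9 orientation bins, the central-difference gradients, and the skipped border pixels are all assumptions made for the example, and block normalization is left out entirely.

```python
import math

def cell_orientation_histograms(patch, cell=8, bins=9):
    """Magnitude-weighted histograms of unsigned gradient orientation, one per cell.

    patch is a list of rows of grey values. Border pixels are skipped so that
    central differences are always defined (a simplification for this sketch).
    """
    height, width = len(patch), len(patch[0])
    rows, cols = height // cell, width // cell
    hist = [[[0.0] * bins for _ in range(cols)] for _ in range(rows)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (patch[y][x + 1] - patch[y][x - 1]) / 2.0  # horizontal gradient
            gy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0  # vertical gradient
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned orientation
            b = min(int(angle / (180.0 / bins)), bins - 1)
            r, c = y // cell, x // cell
            if r < rows and c < cols:
                hist[r][c][b] += magnitude  # vote weighted by gradient magnitude
    return hist

# A 16x16 patch whose intensity grows from left to right: every gradient
# points along the x-axis, so all votes land in the first orientation bin.
ramp = [[float(x) for x in range(16)] for _ in range(16)]
histograms = cell_orientation_histograms(ramp)
```

Concatenating the per-cell histograms gives a feature vector that can play the role of φ(I; x) in the detector.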

  4. Dalal-Triggs detector
     • Different objects have different HOG features:

  5. Dalal-Triggs detector
     • Train a linear SVM on annotated images to predict object presence:
       Training: $w^* = \operatorname{argmin}_w \max\left(0,\ 1 - y\, w^T \phi(I; x)\right)$
       Detection: $s(I; x) = w^{*T} \phi(I; x)$

  6. Dalal-Triggs detector
     • Train a linear SVM on annotated images to predict object presence:
       Training: $w^* = \operatorname{argmin}_w \max\left(0,\ 1 - y\, w^T \phi(I; x)\right)$
       Detection: $s(I; x) = w^{*T} \phi(I; x)$
     • How do we get the negative examples to train the SVM?

  7. Dalal-Triggs detector
     • Train a linear SVM on annotated images to predict object presence:
       Training: $w^* = \operatorname{argmin}_w \max\left(0,\ 1 - y\, w^T \phi(I; x)\right)$
       Detection: $s(I; x) = w^{*T} \phi(I; x)$
     • How do we get the negative examples to train the SVM? Random patches!
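To make the training objective concrete, here is a small sketch that minimizes the hinge loss by stochastic subgradient descent. The slide shows only the hinge term; the L2 regularizer `reg`, the learning rate `lr`, the epoch count, and the function names are assumptions added for this example. The toy feature vectors stand in for HOG features, with the negatives playing the role of random background patches.

```python
import random

def score(w, phi):
    """Detection score s = w^T phi."""
    return sum(wi * pi for wi, pi in zip(w, phi))

def train_linear_svm(features, labels, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Stochastic subgradient descent on hinge loss plus a small L2 penalty.

    labels are +1 (object) or -1 (background). reg and lr are illustrative
    values chosen for this toy problem, not taken from the slides.
    """
    rng = random.Random(seed)
    w = [0.0] * len(features[0])
    order = list(range(len(features)))
    for _ in range(epochs):
        rng.shuffle(order)
        for n in order:
            phi, y = features[n], labels[n]
            violated = y * score(w, phi) < 1.0  # inside the margin or misclassified
            for d in range(len(w)):
                grad = reg * w[d]
                if violated:
                    grad -= y * phi[d]  # subgradient of max(0, 1 - y w^T phi)
                w[d] -= lr * grad
    return w

# Toy data: the last feature is a constant 1 acting as a bias term.
positives = [[2.0, 2.0, 1.0], [3.0, 1.5, 1.0]]
negatives = [[-2.0, -1.0, 1.0], [-1.5, -2.5, 1.0]]  # stand-ins for random patches
w_star = train_linear_svm(positives + negatives, [1, 1, -1, -1])
```

After training, `score(w_star, phi)` is positive for the object examples and negative for the background examples in this toy set.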

  8. Dalal-Triggs detector • HOG visualization of the SVM weights for a pedestrian detector:

  9. Dalal-Triggs detector
     • Applying the detector at each location x leads to a confidence map
     • Non-maxima suppression can be used to obtain the final detections
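One simple way to turn a confidence map into detections is greedy non-maxima suppression: repeatedly keep the highest remaining score and discard everything close to it. The sketch below assumes a confidence map stored as a list of rows, a score `threshold`, and a square suppression neighbourhood of size `radius`; these choices and the function name are illustrative and not taken from the slides.

```python
def non_maximum_suppression(conf_map, threshold, radius):
    """Greedy NMS on a 2-D confidence map.

    Returns (score, row, col) tuples, strongest first, such that no two kept
    detections lie within `radius` pixels of each other (Chebyshev distance).
    """
    candidates = [
        (s, y, x)
        for y, row in enumerate(conf_map)
        for x, s in enumerate(row)
        if s >= threshold
    ]
    candidates.sort(reverse=True)  # strongest candidates first
    kept = []
    for s, y, x in candidates:
        far_from_all = all(
            max(abs(y - ky), abs(x - kx)) > radius for _, ky, kx in kept
        )
        if far_from_all:
            kept.append((s, y, x))
    return kept

conf = [[0.0] * 5 for _ in range(5)]
conf[1][1] = 0.9  # strong detection
conf[1][2] = 0.8  # neighbouring response to the same object, suppressed
conf[4][4] = 0.7  # separate detection
detections = non_maximum_suppression(conf, threshold=0.5, radius=1)
# detections == [(0.9, 1, 1), (0.7, 4, 4)]
```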

  10. Dalal-Triggs detector • Example of pedestrian detections using Dalal-Triggs detector:

  11. Pictorial structures • What can we do when a part of the object to be detected is occluded?

  12. Pictorial structures • What can we do when a part of the object to be detected is occluded? • Exploit the fact that other parts of the object are still visible!

  13. Pictorial structures
     • What can we do when a part of the object to be detected is occluded?
     • Exploit the fact that other parts of the object are still visible!
     • Pictorial structures does this by modeling objects as a constellation of parts
     * Fischler & Elschlager, 1973

  14. Deformable template models
     • Defines a score function that involves parts and part deformations:
       $s(I; x_0, y_0, \ldots, x_{|V|}, y_{|V|}) = w_0^T \phi(I; x_0, y_0) + \sum_{i \in V} w_i^T \phi(I; x_i, y_i) + \sum_{(i,j) \in E} d_{ij}\, \phi_d(x_i - x_j, y_i - y_j)$
       First term: global object model
     * Felzenszwalb et al., 2010

  15. Deformable template models
     • Defines a score function that involves parts and part deformations:
       $s(I; x_0, y_0, \ldots, x_{|V|}, y_{|V|}) = w_0^T \phi(I; x_0, y_0) + \sum_{i \in V} w_i^T \phi(I; x_i, y_i) + \sum_{(i,j) \in E} d_{ij}\, \phi_d(x_i - x_j, y_i - y_j)$
       First term: global object model
       Second term: object part models
     * Felzenszwalb et al., 2010

  16. Deformable template models
     • Defines a score function that involves parts and part deformations:
       $s(I; x_0, y_0, \ldots, x_{|V|}, y_{|V|}) = w_0^T \phi(I; x_0, y_0) + \sum_{i \in V} w_i^T \phi(I; x_i, y_i) + \sum_{(i,j) \in E} d_{ij}\, \phi_d(x_i - x_j, y_i - y_j)$
       First term: global object model
       Second term: object part models
       Third term: deformation model
     * Felzenszwalb et al., 2010

  17. Deformable template models
     • Defines a score function that involves parts and part deformations:
       $s(I; x_0, y_0, \ldots, x_{|V|}, y_{|V|}) = w_0^T \phi(I; x_0, y_0) + \sum_{i \in V} w_i^T \phi(I; x_i, y_i) + \sum_{(i,j) \in E} d_{ij}\, \phi_d(x_i - x_j, y_i - y_j)$
       First term: global object model
       Second term: object part models
       Third term: deformation model
     • Deformable template models are much more robust against partial occlusions and deformations of non-rigid objects
     * Felzenszwalb et al., 2010
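The three terms of the score function can be evaluated directly for one fixed configuration of part locations. The sketch below is only an illustration: `feature_at` is a hypothetical callback returning the feature vector φ at a location, index 0 denotes the root (global) model, and the deformation features (dx, dy, dx², dy²) are an assumption in the spirit of Felzenszwalb et al., since the slide does not spell out φ_d. Because the slide adds the deformation term, the quadratic entries of `d` are negative here so that large displacements lower the score.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def deformation_features(dx, dy):
    """Assumed form of phi_d: linear and quadratic displacement terms."""
    return [dx, dy, dx * dx, dy * dy]

def configuration_score(feature_at, locations, w, d, edges):
    """Score of one configuration: root term + part terms + deformation terms.

    feature_at(x, y) -> feature vector (hypothetical helper)
    locations[i]     -> (x_i, y_i), with i = 0 the root
    w[i]             -> appearance weights of model i
    d[(i, j)]        -> deformation weights for edge (i, j)
    """
    total = dot(w[0], feature_at(*locations[0]))  # global object model
    for i in locations:
        if i != 0:
            total += dot(w[i], feature_at(*locations[i]))  # object part models
    for i, j in edges:
        dx = locations[i][0] - locations[j][0]
        dy = locations[i][1] - locations[j][1]
        total += dot(d[(i, j)], deformation_features(dx, dy))  # deformation model
    return total

# Toy example: the "features" at a location are just its coordinates.
toy_score = configuration_score(
    feature_at=lambda x, y: [x, y],
    locations={0: (2, 3), 1: (4, 5)},
    w={0: [1, 0], 1: [0, 1]},
    d={(1, 0): [0, 0, -1, -1]},
    edges=[(1, 0)],
)
# toy_score == 2 + 5 - 8 == -1
```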

  18. Pictorial structures
     • Find the optimal configuration of a pictorial structures model (detection) as follows:
       $\max_{x_0, y_0, \ldots, x_{|V|}, y_{|V|}} s(I; x_0, y_0, \ldots, x_{|V|}, y_{|V|})$

  19. Pictorial structures
     • Find the optimal configuration of a pictorial structures model (detection) as follows:
       $\max_{x_0, y_0, \ldots, x_{|V|}, y_{|V|}} s(I; x_0, y_0, \ldots, x_{|V|}, y_{|V|})$
     • For squared-error deformation models, this can be done very efficiently:
       $g(x_i) = \min_{x_j} \left( f(x_j) + (x_i - x_j)^2 \right)$

  20. Pictorial structures
     • Find the optimal configuration of a pictorial structures model (detection) as follows:
       $\max_{x_0, y_0, \ldots, x_{|V|}, y_{|V|}} s(I; x_0, y_0, \ldots, x_{|V|}, y_{|V|})$
     • For squared-error deformation models, this can be done very efficiently:
       $g(x_i) = \min_{x_j} \left( f(x_j) + (x_i - x_j)^2 \right)$
       $g(x_i)$: final score with deformations

  21. Pictorial structures
     • Find the configuration of a pictorial structures model by maximizing over part locations:
       $\max_{x_0, y_0, \ldots, x_{|V|}, y_{|V|}} s(I; x_0, y_0, \ldots, x_{|V|}, y_{|V|})$
     • For squared-error deformation models, this can be done very efficiently:
       $g(x_i) = \min_{x_j} \left( f(x_j) + (x_i - x_j)^2 \right)$
       $g(x_i)$: final score with deformations
       $f(x_j)$: negative part model score

  22. Pictorial structures
     • Find the optimal configuration of a pictorial structures model (detection) as follows:
       $\max_{x_0, y_0, \ldots, x_{|V|}, y_{|V|}} s(I; x_0, y_0, \ldots, x_{|V|}, y_{|V|})$
     • For squared-error deformation models, this can be done very efficiently:
       $g(x_i) = \min_{x_j} \left( f(x_j) + (x_i - x_j)^2 \right)$
       $g(x_i)$: final score with deformations
       $f(x_j)$: negative part model score
       $(x_i - x_j)^2$: deformation penalty

  23. Pictorial structures
     • Find the optimal configuration of a pictorial structures model (detection) as follows:
       $\max_{x_0, y_0, \ldots, x_{|V|}, y_{|V|}} s(I; x_0, y_0, \ldots, x_{|V|}, y_{|V|})$
     • For squared-error deformation models, this can be done very efficiently:
       $g(x_i) = \min_{x_j} \left( f(x_j) + (x_i - x_j)^2 \right)$
       $g(x_i)$: final score with deformations
       $f(x_j)$: negative part model score
       $(x_i - x_j)^2$: deformation penalty
     • Hence, we have a parabola for every pixel $x_j$, rooted at $(x_j, f(x_j))$
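Read literally, the minimization above can be evaluated by trying every source pixel for every target pixel. The short sketch below does exactly that for a 1-D signal, which costs time quadratic in the number of pixels and is useful mainly as a reference to check faster methods against. The function name and the toy values of f are illustrative assumptions.

```python
def min_convolution_brute_force(f):
    """g[i] = min over j of f[j] + (i - j)^2, evaluated directly (quadratic time)."""
    n = len(f)
    return [min(f[j] + (i - j) ** 2 for j in range(n)) for i in range(n)]

# f holds negative part scores; the low value at index 1 dominates its neighbours.
g = min_convolution_brute_force([4, 0, 5, 9])
# g == [1, 0, 1, 4]
```

Each entry of g is the lowest point reached at that pixel by any of the parabolas rooted at (j, f[j]).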

  24. Pictorial structures
     [Figure: parabolas rooted at (0, f(0)), (1, f(1)), (2, f(2)), ..., (n-1, f(n-1)) above the pixel axis 0, 1, 2, ..., n-1]
     * Felzenszwalb & Huttenlocher, 2004

  25. Pictorial structures
     [Figure: parabolas rooted at (0, f(0)), (1, f(1)), (2, f(2)), ..., (n-1, f(n-1)) above the pixel axis 0, 1, 2, ..., n-1]
     • It is straightforward to compute the intersection between two parabolas:
       $i = \dfrac{\left(f(x_i) + x_i^2\right) - \left(f(x_j) + x_j^2\right)}{2 x_i - 2 x_j}$
     * Felzenszwalb & Huttenlocher, 2004
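The intersection formula follows from setting the two parabolas equal and solving for the horizontal coordinate, written x here (the slide calls the resulting location i). The quadratic terms in x cancel, which is why two parabolas with different roots meet in exactly one point:

```latex
\begin{align*}
f(x_i) + (x - x_i)^2 &= f(x_j) + (x - x_j)^2 \\
f(x_i) + x_i^2 - 2 x x_i &= f(x_j) + x_j^2 - 2 x x_j \\
x \,(2 x_i - 2 x_j) &= \left(f(x_i) + x_i^2\right) - \left(f(x_j) + x_j^2\right) \\
x &= \frac{\left(f(x_i) + x_i^2\right) - \left(f(x_j) + x_j^2\right)}{2 x_i - 2 x_j}
\end{align*}
```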

  26. Pictorial structures
     • If $x_j < x_i$: the parabola corresponding to $x_j$ is below that of $x_i$ left of the intersection, and above it right of the intersection
     [Figure: parabolas rooted at (0, f(0)), (1, f(1)), (2, f(2)), ..., (n-1, f(n-1)) above the pixel axis 0, 1, 2, ..., n-1]
     * Felzenszwalb & Huttenlocher, 2004

  27. Pictorial structures
     • Maintain the lower envelope of the parabolas (parabolas and intersections)
     • When adding a new parabola, there are two possibilities:
       • New intersection left of last intersection: remove last parabola from the envelope
       • New intersection right of last intersection: maintain last parabola in the envelope
     [Figure: the two cases, showing envelope parabolas v[k-1] and v[k], the last intersection z[k], the new parabola q, and the new intersection s]

  28. Pictorial structures
     • This suggests a simple algorithm that is linear in the number of pixels:
       • Maintain a list with the lower envelope of the parabolas (indices and intersections)
       • Move from left to right through all parabolas, and do for each parabola:
         • Find the intersection of the parabola with the last parabola in the lower envelope
         • If the intersection is left of the last intersection in the lower envelope: remove the last parabola from the lower envelope, and go back one step
         • Add the parabola to the lower envelope, starting from the intersection
     * Felzenszwalb & Huttenlocher, 2004
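The steps above can be sketched as a 1-D routine in the style of Felzenszwalb & Huttenlocher. The array names `v` (indices of parabolas in the envelope) and `z` (intersections bounding each parabola's range) follow the labels used on the previous slide; the function name is an assumption, and the sketch assumes every entry of f is finite.

```python
import math

def distance_transform_1d(f):
    """g[q] = min over p of f[p] + (q - p)^2, via the lower envelope of parabolas."""
    n = len(f)
    v = [0] * n            # v[k]: root index of the k-th parabola in the envelope
    z = [0.0] * (n + 1)    # parabola v[k] is lowest on the interval [z[k], z[k+1]]
    k = 0
    z[0], z[1] = -math.inf, math.inf
    for q in range(1, n):
        while True:
            p = v[k]
            # intersection of the parabola rooted at q with the last envelope parabola
            s = ((f[q] + q * q) - (f[p] + p * p)) / (2 * q - 2 * p)
            if s <= z[k]:
                k -= 1     # last parabola is hidden: remove it and go back one step
            else:
                break
        k += 1             # add the new parabola, starting from the intersection
        v[k] = q
        z[k] = s
        z[k + 1] = math.inf
    g = [0.0] * n
    k = 0
    for q in range(n):     # read the envelope back out, left to right
        while z[k + 1] < q:
            k += 1
        p = v[k]
        g[q] = f[p] + (q - p) ** 2
    return g

g = distance_transform_1d([4, 0, 5, 9])
# g == [1, 0, 1, 4]
```

Each parabola is added once and removed at most once, which is where the linear running time comes from. For images, the same routine is typically applied along rows and then along columns.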

  29. [Figure: detection pipeline. The model is matched against a feature map (root filter) and a feature map at twice the resolution (part filters); the part filter responses are transformed and added to the root filter response, giving the combined score of root locations. Filter response values are color encoded from low value to high value.]
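The pipeline in this figure can be imitated in one dimension for a star-shaped model: each part response is "transformed" by letting the part move away from its ideal position at a squared-distance cost, and the result is added to the root response. The sketch below is a simplified assumption-laden illustration: it ignores the change of resolution between root and parts, uses a brute-force maximization instead of the linear-time transform, and the names `combined_root_scores` and `anchors` (ideal part offsets relative to the root) are hypothetical.

```python
def combined_root_scores(root_response, part_responses, anchors):
    """Combined score per root location for a 1-D star-shaped model.

    For each part, the best placement trades off the part filter response
    against the squared displacement from its ideal position x + anchor.
    Maximizing (response - penalty) equals minimizing (-response + penalty),
    which is why f is the negative part model score in the min formulation.
    """
    combined = list(root_response)
    for part, anchor in zip(part_responses, anchors):
        for x in range(len(combined)):
            ideal = x + anchor
            combined[x] += max(
                part[xp] - (xp - ideal) ** 2 for xp in range(len(part))
            )
    return combined

root = [0, 1, 0, 0]
part = [0, 0, 0, 5]           # the part filter fires strongly at position 3
scores = combined_root_scores(root, [part], anchors=[1])
# scores == [1, 5, 5, 4]
```

Root location 1 has a weaker part match than location 2 (the part sits one pixel off its ideal position) but a stronger root response, so both end up with the same combined score.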

  30. Graph structure
     • One can define different graph structures, as long as they are trees
       [Figure panels: Star-shaped tree; Minimum spanning tree]
     • The tree structure is fixed, but edge lengths and directions are learned

  31. Pictorial structures • Examples of object detections by pictorial-structures models: * Felzenszwalb et al. , 2010

  32. Results
     • Precision / recall curves for car detector on Pascal VOC:
       [Plot: class: car, year 2006; precision (vertical axis) against recall (horizontal axis). Legend: 1 Root (0.48), 2 Root (0.58), 1 Root+Parts (0.55), 2 Root+Parts (0.62), 2 Root+Parts+BB (0.64)]
     * Felzenszwalb et al., 2010

  33. Example detections: person, car, horse, sofa

  34. Pictorial structures • Use pictorial structures to prevent trackers from “switching” objects: * Zhang & van der Maaten, 2013

  35. Questions?
