Structured Regression for Efficient Object Detection


  1. Structured Regression for Efficient Object Detection
     Christoph Lampert, www.christoph-lampert.org
     Max Planck Institute for Biological Cybernetics, Tübingen
     December 3rd, 2009
     • [C.L., Matthew B. Blaschko, Thomas Hofmann. CVPR 2008]
     • [Matthew B. Blaschko, C.L. ECCV 2008]
     • [C.L., Matthew B. Blaschko, Thomas Hofmann. PAMI 2009]

  2. Category-Level Object Localization

  3. Category-Level Object Localization. What objects are present? person, car

  4. Category-Level Object Localization. Where are the objects?

  5. Object Localization ⇒ Scene Interpretation
     A man inside of a car ⇒ He’s driving.
     A man outside of a car ⇒ He’s passing by.

  6. Algorithmic Approach: Sliding Window
     [Figure: candidate windows with scores f(y₁) = 0.2, f(y₂) = 0.8, f(y₃) = 1.5]
     Use a (pre-trained) classifier function f:
     • Place a candidate window on the image.
     • Iterate:
       ◮ Evaluate f and store the result.
       ◮ Shift the candidate window by k pixels.
     • Return the position where f was largest.
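Below is a minimal sketch of this loop in Python. The score function `f`, the window size, and the stride are illustrative placeholders, not values from the talk.

```python
# Minimal sliding-window sketch. The score function `f`, window size,
# and stride are illustrative placeholders, not values from the talk.

def sliding_window(image_w, image_h, f, win_w=64, win_h=128, stride=8):
    """Evaluate f at every grid position; return the best-scoring box."""
    best_score, best_box = float("-inf"), None
    for top in range(0, image_h - win_h + 1, stride):
        for left in range(0, image_w - win_w + 1, stride):
            box = (left, top, left + win_w, top + win_h)  # (l, t, r, b)
            score = f(box)            # classifier score for this window
            if score > best_score:
                best_score, best_box = score, box
    return best_box, best_score
```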

  7. Algorithmic Approach: Sliding Window
     Drawbacks:
     • Single scale, single aspect ratio → repeat with different window sizes/shapes.
     • Search on a grid → speed–accuracy tradeoff.
     • Computationally expensive.

  8. New view: Generalized Sliding Window
     Assumptions:
     • Objects are rectangular image regions of arbitrary size.
     • The score of f is largest at the correct object position.
     Mathematical formulation:
         y_opt = argmax_{y ∈ Y} f(y),   with Y = { all rectangular regions in the image }

  9.–10. New view: Generalized Sliding Window
     Mathematical formulation:
         y_opt = argmax_{y ∈ Y} f(y),   with Y = { all rectangular regions in the image }
     • How to choose/construct/learn the function f?
     • How to do the optimization efficiently and robustly?
       (Exhaustive search is too slow: choosing two of the w horizontal and two of the
       h vertical coordinates gives O(w²h²) candidate regions.)

  11.–13. New view: Generalized Sliding Window
     Use the problem’s geometric structure:
     • Calculate scores for sets of boxes jointly.
     • If no element of the set can contain the maximum, discard the box set.
     • Otherwise, split the box set and iterate.
     → Branch-and-bound optimization: finds the global maximum y_opt.

  14.–19. Representing Sets of Boxes
     • Boxes: [l, t, r, b] ∈ ℝ⁴.  Box sets: [L, T, R, B] ∈ (ℝ²)⁴, where L, T, R, B
       are intervals of candidate coordinates.
     Splitting:
     • Identify the largest interval and split it at its center: R ↦ R₁ ∪ R₂.
     • New box sets: [L, T, R₁, B] and [L, T, R₂, B].
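A box set can be stored as four coordinate intervals. The following sketch (our own naming, assuming integer pixel coordinates) implements the center split described above.

```python
# A box set [L, T, R, B]: four (lo, hi) coordinate intervals, containing
# every box (l, t, r, b) with l in L, t in T, r in R, b in B.
# Sketch of the center split described above; names are ours.

def split_boxset(boxset):
    """Split the largest interval at its center; return two box sets."""
    widths = [hi - lo for (lo, hi) in boxset]
    k = widths.index(max(widths))        # index of the largest interval
    lo, hi = boxset[k]
    mid = (lo + hi) // 2
    first, second = list(boxset), list(boxset)
    first[k] = (lo, mid)                 # e.g. [L, T, R1, B]
    second[k] = (mid + 1, hi)            # e.g. [L, T, R2, B]
    return tuple(first), tuple(second)
```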

  20. Calculating Scores for Box Sets
     Example: linear support vector machine with per-pixel weights, f(y) := Σ_{p_i ∈ y} w_i.
     Upper bound over a box set Y:
         f_upper(Y) = Σ_{p_i ∈ y_∩} min(0, w_i) + Σ_{p_i ∈ y_∪} max(0, w_i),
     where y_∩ is the intersection of all boxes in Y and y_∪ their bounding union;
     since y_∩ ⊆ y ⊆ y_∪ for every y ∈ Y, this never underestimates f.
     Can be computed in O(1) using integral images.
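A sketch of this bound with summed-area tables (NumPy; helper names are ours): positive weights are summed over the largest box any member of the set can cover, negative weights over the smallest box every member must cover.

```python
import numpy as np

# Upper bound on the linear-SVM score over a box set via two
# summed-area tables: one for max(0, w_i), one for min(0, w_i).
# Helper names are ours; coordinates are inclusive pixel indices.

def integral(img):
    """Summed-area table padded with a zero row/column."""
    return np.pad(img, ((1, 0), (1, 0))).cumsum(0).cumsum(1)

def box_sum(ii, l, t, r, b):
    """Sum of the underlying image over the box [l, t, r, b]."""
    return ii[b + 1, r + 1] - ii[t, r + 1] - ii[b + 1, l] + ii[t, l]

def f_upper(boxset, ii_pos, ii_neg):
    """Positive weights over the union box plus negative weights
    over the intersection box, as on the slide above."""
    L, T, R, B = boxset
    bound = box_sum(ii_pos, L[0], T[0], R[1], B[1])   # union box
    if L[1] <= R[0] and T[1] <= B[0]:                 # intersection nonempty
        bound += box_sum(ii_neg, L[1], T[1], R[0], B[0])
    return bound
```

Here `ii_pos = integral(np.maximum(w, 0))` and `ii_neg = integral(np.minimum(w, 0))` for a per-pixel weight image `w`. When all four intervals are singletons, the union and intersection boxes coincide and the bound equals the exact score.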

  21. Calculating Scores for Box Sets
     Histogram intersection similarity: f(y) := Σ_{j=1}^{J} min(h′_j, h^y_j),
     where h^y is the feature histogram of box y and h′ a fixed reference histogram.
     Upper bound: f_upper(Y) = Σ_{j=1}^{J} min(h′_j, h^{y_∪}_j).
     As fast as for a single box: O(J) with integral histograms.
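Putting the pieces together gives the best-first branch-and-bound search (the ESS scheme). This sketch assumes `split_boxset` and a valid upper-bound function as sketched above; function names are ours.

```python
import heapq

# Best-first branch-and-bound over box sets (the ESS scheme).
# Assumes `split_boxset` and an upper-bound function as sketched above.

def ess(image_w, image_h, bound):
    """Return the globally best box for a valid upper-bound function."""
    full = ((0, image_w - 1), (0, image_h - 1),    # L, T
            (0, image_w - 1), (0, image_h - 1))    # R, B
    heap = [(-bound(full), full)]                   # max-heap via negation
    while heap:
        neg_bound, boxset = heapq.heappop(heap)
        if all(lo == hi for (lo, hi) in boxset):    # single box left:
            L, T, R, B = boxset                     # its bound is exact
            return (L[0], T[0], R[0], B[0]), -neg_bound
        for child in split_boxset(boxset):
            L, T, R, B = child
            if L[0] > R[1] or T[0] > B[1]:          # only degenerate boxes
                continue
            heapq.heappush(heap, (-bound(child), child))
```

Because the best-bounded set is always expanded first, the first singleton popped from the queue is the global maximum; box sets with hopeless bounds simply stay in the queue and are never split.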

  22. Evaluation: Speed (on PASCAL VOC 2006)
     Sliding window runtime: always O(w²h²).
     Branch-and-bound (ESS) runtime: worst case O(w²h²), empirically no more than O(wh).

  23. Extensions: Action Classification
     (y, t)_opt = argmax_{(y, t) ∈ Y × T} f_x(y, t)
     • J. Yuan: Discriminative 3D Subvolume Search for Efficient Action Detection, CVPR 2009.

  24. Extensions: Localized Image Retrieval
     (x, y)_opt = argmax_{y ∈ Y, x ∈ D} f_x(y)
     • C.L.: Detecting Objects in Large Image Collections and Videos by Efficient Subimage Retrieval, ICCV 2009.

  25. Extensions: Hybrid – Branch-and-Bound with Implicit Shape Model
     • A. Lehmann, B. Leibe, L. van Gool: Feature-Centric Efficient Subwindow Search, ICCV 2009.

  26. Generalized Sliding Window
     y_opt = argmax_{y ∈ Y} f(y),   with Y = { all rectangular regions in the image }
     • How to choose/construct/learn f?
     • How to do the optimization efficiently and robustly?

  27. Traditional Approach: Binary Classifier
     Training images:
     • x₁⁺, ..., xₙ⁺ show the object
     • x₁⁻, ..., xₘ⁻ show something else
     Train a classifier, e.g.
     • support vector machine,
     • boosted cascade,
     • artificial neural network, ...
     Decision function f : { images } → ℝ
     • f > 0 means “the image shows the object.”
     • f < 0 means “the image does not show the object.”

  28. Traditional Approach: Binary Classifier
     Drawbacks:
     • Training distribution ≠ test distribution.
     • No control over partial detections.
     • No guarantee to even find the training examples again.

  29. Object Localization as Structured Output Regression
     Ideal setup:
     • A function g : { all images } → { all boxes } to predict object boxes from images.
     • Train and test in the same way, end-to-end.
     [Figure: g(image) = bounding box around the car]

  30. Object Localization as Structured Output Regression
     Ideal setup:
     • A function g : { all images } → { all boxes } to predict object boxes from images.
     • Train and test in the same way, end-to-end.
     Regression problem:
     • Training examples (x₁, y₁), ..., (xₙ, yₙ) ∈ X × Y
       ◮ x_i are images, y_i are bounding boxes.
     • Learn a mapping g : X → Y that generalizes from the given examples:
       ◮ g(x_i) ≈ y_i, for i = 1, ..., n.

  31. Structured Support Vector Machine
     SVM-like framework by Tsochantaridis et al.:
     • Positive definite kernel k : (X × Y) × (X × Y) → ℝ;
       φ : X × Y → H is the (implicit) feature map induced by k.
     • Δ : Y × Y → ℝ: loss function.
     • Solve the convex optimization problem
         min_{w,ξ}  ½‖w‖² + C Σ_{i=1}^{n} ξ_i
       subject to margin constraints for i = 1, ..., n:
         ∀ y ∈ Y \ {y_i}:  Δ(y, y_i) + ⟨w, φ(x_i, y)⟩ − ⟨w, φ(x_i, y_i)⟩ ≤ ξ_i
     • Unique solution: w* ∈ H.
     • I. Tsochantaridis, T. Joachims, T. Hofmann, Y. Altun: Large Margin Methods for
       Structured and Interdependent Output Variables, Journal of Machine Learning
       Research (JMLR), 2005.

  32. Structured Support Vector Machine
     • w* defines the compatibility function F(x, y) = ⟨w*, φ(x, y)⟩.
     • The best prediction for x is the most compatible y:
         g(x) := argmax_{y ∈ Y} F(x, y).
     • Evaluating g : X → Y is like the generalized sliding window:
       ◮ for fixed x, evaluate the quality function for every box y ∈ Y,
       ◮ for example, using the previous branch-and-bound procedure!

  33. Joint Image/Box Kernel: Example
     Joint kernel: how to compare one (image, box) pair (x, y) with another
     (image, box) pair (x′, y′)?
     [Figure: k_joint reduces to a kernel k on the image contents inside the boxes:
      large for two boxes showing similar objects, small for dissimilar ones, while a
      global k_image on the full images could also be large even when the boxes differ.]

  34. Loss Function: Example
     Loss function: how to compare two boxes y and y′?
         Δ(y, y′) := 1 − (area overlap between y and y′)
                   = 1 − area(y ∩ y′) / area(y ∪ y′)
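In code, this loss is a few lines. A minimal sketch, with boxes written as (l, t, r, b) tuples (our convention):

```python
# Overlap loss from the slide: 1 minus intersection-over-union.
# Boxes are (l, t, r, b) with r >= l and b >= t (our convention).

def box_loss(y, y2):
    il, it = max(y[0], y2[0]), max(y[1], y2[1])
    ir, ib = min(y[2], y2[2]), min(y[3], y2[3])
    inter = max(0, ir - il) * max(0, ib - it)
    area_y = (y[2] - y[0]) * (y[3] - y[1])
    area_y2 = (y2[2] - y2[0]) * (y2[3] - y2[1])
    union = area_y + area_y2 - inter
    return (1.0 - inter / union) if union > 0 else 1.0
```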

  35.–36. Structured Support Vector Machine
     S-SVM optimization:
         min_{w,ξ}  ½‖w‖² + C Σ_{i=1}^{n} ξ_i
       subject to, for i = 1, ..., n:
         ∀ y ∈ Y \ {y_i}:  Δ(y, y_i) + ⟨w, φ(x_i, y)⟩ − ⟨w, φ(x_i, y_i)⟩ ≤ ξ_i
     Solve via constraint generation. Iterate:
       ◮ Solve the minimization with a working set of constraints.
       ◮ Identify argmax_{y ∈ Y} Δ(y, y_i) + ⟨w, φ(x_i, y)⟩.
       ◮ Add violated constraints to the working set and repeat.
     • Polynomial-time convergence to any precision ε.
     • Similar to bootstrap training, but with a margin.
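A schematic of this constraint-generation loop. Every callable here (`solve_qp`, `loss`, `score`, `find_most_violated`) is a placeholder we assume, not an API from the paper; in this setting the loss-augmented argmax over all boxes can itself reuse the branch-and-bound search from the first part.

```python
# Constraint-generation sketch for S-SVM training. All callables are
# placeholders: `solve_qp` solves the QP restricted to the working set,
# `score(w, x, y)` computes <w, phi(x, y)>, and `find_most_violated`
# solves argmax_y loss(y, y_i) + score(w, x_i, y).

def train_ssvm(samples, solve_qp, loss, score, find_most_violated,
               epsilon=1e-3, max_rounds=100):
    working_set = []                          # constraints (i, y) found so far
    w = None
    for _ in range(max_rounds):
        w, xi = solve_qp(working_set)         # optimize over current constraints
        added = False
        for i, (x_i, y_i) in enumerate(samples):
            y_hat = find_most_violated(w, x_i, y_i)
            slack = (loss(y_hat, y_i)
                     + score(w, x_i, y_hat) - score(w, x_i, y_i))
            if slack > xi[i] + epsilon:       # violated beyond tolerance
                working_set.append((i, y_hat))
                added = True
        if not added:                         # epsilon-optimal: stop
            break
    return w
```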

  37. Evaluation: PASCAL VOC 2006
     [Figures: example detections and precision–recall curves for VOC 2006
      bicycle, bus and cat.]
     • Structured regression improves detection accuracy.
     • New best scores (at that time) in 6 of 10 classes.
