Pictorial Structures Revisited: People Detection and Articulated Pose Estimation - PowerPoint PPT Presentation

Pictorial Structures Revisited: People Detection and Articulated Pose Estimation. Mykhaylo Andriluka, Stefan Roth, Bernt Schiele. Department of Computer Science, TU Darmstadt.


  1. Pictorial Structures Revisited: People Detection and Articulated Pose Estimation. Mykhaylo Andriluka, Stefan Roth, Bernt Schiele. Department of Computer Science, TU Darmstadt. CVPR 2009.

  2. Generic model for human detection and pose estimation
     • Human pose estimation [Felzenszwalb&Huttenlocher, IJCV’05], [Ren et al., ICCV’05], [Sigal&Black, CVPR’06], [Zhang et al., CVPR’06], [Jiang&Martin, CVPR’08], [Ramanan, NIPS’06], [Ferrari et al., CVPR’08], [Ferrari et al., CVPR’09]
       - often rather simple appearance model
       - focus on finding optimal assembly of parts
     • People detection [Viola et al., ICCV’03], [Dalal&Triggs, CVPR’05], [Leibe et al., CVPR’05], [Andriluka et al., CVPR’08]
       - complex appearance model
       - no pose model, or limited to walking motion

  4. Can we make pictorial structures model effective for these tasks? [Fischler&Elschlager, 1973]

  5. Can we make pictorial structures model effective for these tasks? Yes... if the model components are chosen right.

  6. Pictorial Structures Model
     • Body is represented as a flexible configuration of body parts
       - configuration of parts: $L = \{l_0, l_1, \ldots, l_N\}$
       - part evidence: $D = \{d_0, d_1, \ldots, d_N\}$
     • Posterior over body poses: $p(L \mid D) \propto p(D \mid L)\, p(L)$, where $p(D \mid L)$ is the likelihood of observations and $p(L)$ is the prior on body poses

  7. Pictorial Structures Model
     Pictorial structures allow exact and efficient inference.
       - tree-structured prior
       - Gaussian pairwise part relationships
       - independent part appearance model
       - discretized part locations
     Posterior marginals via sum-product BP: $p(l_i \mid D) \propto \sum_{L \setminus l_i} p(L \mid D)$
     [Figure: tree-structured model over parts $l_1, \ldots, l_{10}$]
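
The posterior marginals on slide 7 can be computed with one upward and one downward message pass over the tree. The following is a minimal sketch in Python, not the authors' implementation: the four-part toy tree, the three candidate locations per part, and all numeric values are made up for illustration, and the function names are hypothetical. A brute-force enumeration is included only to check the result.

```python
from itertools import product
from math import exp


def tree_marginals(parent, unary, pairwise):
    """Sum-product marginals on a rooted tree with discrete states.

    parent[i]         : parent index of node i (None for the root, node 0);
                        nodes are assumed ordered so that parent[i] < i
    unary[i][a]       : evidence term for node i in state a
    pairwise[i][a][b] : potential linking node i (state a) to its parent (state b)
    """
    n, k = len(parent), len(unary[0])
    children = [[] for _ in range(n)]
    for i in range(1, n):
        children[parent[i]].append(i)

    # Upward pass (leaves to root): up[i][b] is the message node i sends
    # to its parent when the parent is in state b.
    subtree = [list(u) for u in unary]
    up = [None] * n
    for i in range(n - 1, 0, -1):
        up[i] = [sum(pairwise[i][a][b] * subtree[i][a] for a in range(k))
                 for b in range(k)]
        for b in range(k):
            subtree[parent[i]][b] *= up[i][b]

    # Downward pass (root to leaves): down[i][a] collects everything
    # outside the subtree of node i.
    down = [[1.0] * k for _ in range(n)]
    for i in range(1, n):
        p = parent[i]
        rest = []
        for b in range(k):
            value = unary[p][b] * down[p][b]
            for c in children[p]:
                if c != i:
                    value *= up[c][b]
            rest.append(value)
        down[i] = [sum(pairwise[i][a][b] * rest[b] for b in range(k))
                   for a in range(k)]

    marginals = []
    for i in range(n):
        belief = [subtree[i][a] * down[i][a] for a in range(k)]
        total = sum(belief)
        marginals.append([v / total for v in belief])
    return marginals


def brute_force_marginals(parent, unary, pairwise):
    """Reference implementation: enumerate every configuration L."""
    n, k = len(parent), len(unary[0])
    sums = [[0.0] * k for _ in range(n)]
    for config in product(range(k), repeat=n):
        weight = 1.0
        for i in range(n):
            weight *= unary[i][config[i]]
            if parent[i] is not None:
                weight *= pairwise[i][config[i]][config[parent[i]]]
        for i in range(n):
            sums[i][config[i]] += weight
    return [[v / sum(row) for v in row] for row in sums]


# Toy model: 4 parts, 3 candidate locations each (all numbers are made up).
parent = [None, 0, 0, 1]
unary = [[0.2, 0.5, 0.3], [0.6, 0.3, 0.1], [0.1, 0.1, 0.8], [0.3, 0.4, 0.3]]
gaussian_like = [[exp(-0.5 * (a - b) ** 2) for b in range(3)] for a in range(3)]
pairwise = [None, gaussian_like, gaussian_like, gaussian_like]

fast = tree_marginals(parent, unary, pairwise)
slow = brute_force_marginals(parent, unary, pairwise)
```

The two passes visit each edge twice, whereas the enumeration grows exponentially with the number of parts, which is why the tree structure matters for exact inference.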

  8. Can we make pictorial structures model effective for these tasks? So... what are the right components?

  9. Model Components
     [Figure: overview diagram. Appearance Model: local features, AdaBoost, part likelihoods for orientation 1 of part 1 through orientation K of part N. Prior and Inference: part posteriors, estimated pose.]

  11. Likelihood Model
      • Build on recent advances in object detection:
        ‣ state-of-the-art image descriptor: Shape Context [Belongie et al., PAMI’02; Mikolajczyk&Schmid, PAMI’05]
        ‣ dense representation
        ‣ discriminative model: AdaBoost classifier for each body part
      - Shape Context: 96 dimensions (4 angular, 3 radial, 8 gradient orientations)
      - Feature Vector: concatenate the descriptors inside part bounding box
        - head: 4032 dimensions
        - torso: 8448 dimensions
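
The dimensions quoted on slide 11 are consistent with each other, which a short calculation makes explicit. This sketch only does the arithmetic; the slide does not say how the descriptors are laid out inside the bounding box, so the code derives a count and nothing more. The variable names are illustrative.

```python
# Bin counts of one Shape Context descriptor, as listed on the slide.
ANGULAR_BINS, RADIAL_BINS, GRADIENT_ORIENTATIONS = 4, 3, 8
descriptor_dim = ANGULAR_BINS * RADIAL_BINS * GRADIENT_ORIENTATIONS

# Feature vector sizes per part, as listed on the slide.
part_dims = {"head": 4032, "torso": 8448}

# Number of concatenated descriptors implied by each feature vector size.
descriptors_per_part = {}
for part, dim in part_dims.items():
    count, remainder = divmod(dim, descriptor_dim)
    assert remainder == 0, "feature size should be a multiple of 96"
    descriptors_per_part[part] = count

print(descriptor_dim, descriptors_per_part)
# prints: 96 {'head': 42, 'torso': 88}
```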

  12. Likelihood Model
      • Part likelihood derived from the boosting score:
        $\tilde{p}(d_i \mid l_i) = \max\left( \frac{\sum_t \alpha_{i,t}\, h_t(x(l_i))}{\sum_t \alpha_{i,t}},\ \varepsilon_0 \right)$
        - $\alpha_{i,t}$: decision stump weight
        - $h_t(x(l_i))$: decision stump output
        - $l_i$: part location
        - $\varepsilon_0$: small constant to deal with part occlusions
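
The formula on slide 12 normalizes the weighted stump votes by the total weight and clips the result from below. Here is a minimal sketch in Python under stated assumptions: the stumps are taken to threshold a single feature and output +1 or -1, the value of `eps0` is illustrative, and the stump parameters and feature vectors are made up. None of these specifics come from the slides.

```python
def stump_output(x, feature_index, threshold, polarity):
    """Decision stump on one feature; returns +1.0 or -1.0 (assumed form)."""
    return 1.0 if polarity * (x[feature_index] - threshold) > 0 else -1.0


def part_likelihood(x, stumps, eps0=1e-4):
    """Normalized boosting score, clipped from below by eps0.

    stumps: list of (alpha, feature_index, threshold, polarity) tuples,
            where alpha > 0 is the stump weight.
    eps0:   illustrative value, not taken from the slides.
    """
    weighted_votes = sum(alpha * stump_output(x, n, thr, pol)
                         for alpha, n, thr, pol in stumps)
    total_weight = sum(alpha for alpha, _, _, _ in stumps)
    return max(weighted_votes / total_weight, eps0)


# Hypothetical classifier with three stumps and three test feature vectors.
stumps = [(2.0, 0, 0.5, +1), (1.0, 1, 0.2, -1), (1.0, 2, 0.0, +1)]
confident = part_likelihood([0.9, 0.1, 0.3], stumps)   # all stumps vote +1
mixed = part_likelihood([0.9, 0.1, -1.0], stumps)      # votes: +2 +1 -1
rejected = part_likelihood([0.1, 0.9, -1.0], stumps)   # all stumps vote -1
```

With these inputs the normalized score ranges from 1.0 (all stumps agree) down to the floor `eps0`, so a part rejected by its detector still keeps a small nonzero likelihood.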

  13. Likelihood Model
      [Figure: input image with part likelihood maps for head, torso and upper leg; our part likelihoods compared with [Ramanan, NIPS’06]]

  16. Model Components
      [Figure: same overview diagram as slide 9]

  17. Kinematic Tree Prior
      • Represent pairwise part relations [Felzenszwalb & Huttenlocher, IJCV’05]:
        $p(L) = p(l_0) \prod_{(i,j) \in E} p(l_i \mid l_j)$
        $p(l_2 \mid l_1) = \mathcal{N}\left( T_{12}(l_2) \mid T_{21}(l_1), \Sigma_{12} \right)$
        - $T_{12}(l_2)$, $T_{21}(l_1)$: transformed part locations (part locations relative to the joint)
      [Figure: two panels showing parts $l_1$ and $l_2$ and their joint]
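
The pairwise term on slide 17 compares two parts after mapping each one to the joint they share. The sketch below is a simplified illustration, not the model from the paper: it assumes a part location is `(x, y, theta)`, that each part stores a fixed joint offset in its own coordinate frame, and that the covariance is diagonal. It also ignores angle wrap-around. All names and numbers are hypothetical.

```python
import math


def to_joint(location, joint_offset):
    """Map a part location (x, y, theta) to the joint position it predicts."""
    x, y, theta = location
    dx, dy = joint_offset
    return (x + dx * math.cos(theta) - dy * math.sin(theta),
            y + dx * math.sin(theta) + dy * math.cos(theta),
            theta)


def log_pairwise(l_child, l_parent, off_child, off_parent, variances):
    """Log of a diagonal Gaussian on the difference of the two joint predictions."""
    a = to_joint(l_child, off_child)
    b = to_joint(l_parent, off_parent)
    logp = 0.0
    for a_k, b_k, var in zip(a, b, variances):
        logp += -0.5 * math.log(2 * math.pi * var) - 0.5 * (a_k - b_k) ** 2 / var
    return logp


# Made-up example: a torso with a hip joint 20 px below its center and an
# upper leg whose hip joint is 20 px above its center.
torso = (0.0, 0.0, 0.0)
variances = (25.0, 25.0, 0.1)
aligned = log_pairwise((0.0, 40.0, 0.0), torso, (0.0, -20.0), (0.0, 20.0), variances)
displaced = log_pairwise((10.0, 40.0, 0.0), torso, (0.0, -20.0), (0.0, 20.0), variances)
```

When both parts predict the same joint position the density is at its maximum; shifting the leg by 10 px along an axis with variance 25 lowers the log density by 2.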

  18. Kinematic Tree Prior
      • Prior parameters: $\{T_{ij}, \Sigma_{ij}\}$
      • Parameters of the prior are estimated with maximum likelihood
      [Figure: mean pose and several independent samples. Caption fragment: "Figure 2. (left) Kinematic prior learned on the multi-view and"]
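
For a Gaussian, maximum likelihood estimation reduces to the sample mean and the sample covariance normalized by the number of samples. The sketch below shows that step on made-up 2D offsets between two connected parts. How the training annotations map to $T_{ij}$ and $\Sigma_{ij}$ in the actual model is not specified on the slide, so treat the data and the function name as assumptions.

```python
def fit_gaussian_ml(samples):
    """Maximum likelihood mean and covariance (normalized by n, not n - 1)."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    cov = [[sum((s[a] - mean[a]) * (s[b] - mean[b]) for s in samples) / n
            for b in range(d)]
           for a in range(d)]
    return mean, cov


# Hypothetical training data: child position relative to its parent, in pixels.
relative_offsets = [(0.0, 38.0), (2.0, 40.0), (-2.0, 42.0), (0.0, 40.0)]
mean, cov = fit_gaussian_ml(relative_offsets)
# mean is [0.0, 40.0]; cov is [[2.0, -1.0], [-1.0, 2.0]]
```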

  19. Evaluation Scenarios
      1. Human Pose Estimation: “People” dataset [Ramanan, NIPS’06]
      2. Upper-body Pose Estimation: “Buffy” dataset [Ferrari et al., CVPR’08]
      3. Pedestrian Detection: “TUD Pedestrians” dataset [Andriluka et al., CVPR’08]

  21. Scenario 1: Qualitative Results
      Image                 (g)    (a)    (d)
      Our model             8/10   8/10   7/10
      [Ramanan, NIPS’06]    7/10   0/10   3/10

      Image                 (l)    (k)    (i)
      Our model             6/10   7/10   8/10
      [Ramanan, NIPS’06]    3/10   3/10   4/10
      [Caption fragment: "(bottom). The numbers on the left of"]

  22. Scenario 1: Quantitative Results
      Method                                                Torso  Upper legs  Lower legs  Upper arm  Forearm  Head  Total
      [Ramanan, NIPS’06] 2nd parse                          52     30          29          17         13       37    27
      Our inference, edge features from [Ramanan, NIPS’06]  63     48          37          26         20       45    37
      Our part detectors (SC)                               29     12          18          3          4        40    14
      Our prior, our part detectors (SC)                    81     63          55          47         31       75    55
      Our prior, our part detectors (SIFT)                  78     58          54          44         31       66    52

  23. Scenario 1: Quantitative Results (same table as slide 22, with the second row labeled “Our prior, edge features from [Ramanan, NIPS’06]”)
