People-Tracking-by-Detection and People-Detection-by-Tracking

Mykhaylo Andriluka, Stefan Roth, Bernt Schiele. Department of Computer Science, TU Darmstadt. CVPR 2008.


  1. People-Tracking-by-Detection and People-Detection-by-Tracking. Mykhaylo Andriluka, Stefan Roth, Bernt Schiele. Department of Computer Science, TU Darmstadt. CVPR 2008.

  2. Motivation
     • Goal: Detection and tracking of people in complex scenes
     • Challenges for detection:
       ‣ Partial occlusions
       ‣ Appearance variation
       ‣ Data association difficult
     • Challenges for tracking:
       ‣ Dynamic backgrounds
       ‣ Multiple people
       ‣ Frequent long-term occlusions

  6. Overview
     Three stages of our multi-person detection and tracking system:
     1. Single-frame detection
     2. Tracklet detection
     3. Tracking through occlusion

  7. Previous Work
     • People Detection & Tracking:
       ‣ [Fossati et al., CVPR 2007]: 3D articulated tracking aided by detection; single person, ground plane needed.
       ‣ [Leibe et al., ICCV 2007]: Detection and tracking of multiple people; high viewpoint → no full-body occlusions.
       ‣ [Ramanan et al., PAMI 2007]: Appearance model learned from people detection, then used for tracking and data association.
       ‣ [Wu & Nevatia, IJCV 2007]: Use detection for tracking; works for multiple people → no articulations, detector not aided by tracking.
     • Here:
       ‣ More people
       ‣ Significant, long-term full-body occlusions
       ‣ However: more restricted scenario (2-D, people in side views)

  15. Single-frame Detector: partISM
      • Appearance of parts: Implicit Shape Model (ISM) [Leibe, Seemann & Schiele, CVPR 2005]
      • Part decomposition and inference: Pictorial structures model [Felzenszwalb & Huttenlocher, IJCV 2005]
      • Posterior over body-part positions $L$ given image evidence $E$: $p(L \mid E) \propto p(E \mid L)\, p(L)$
      [Figure: body model with object center $x^o$ and parts $x^1, \ldots, x^8$]
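
The proportionality on this slide can be illustrated with a small numeric sketch. The three candidate configurations and their likelihood and prior values below are made up for illustration and do not come from the talk; the code only shows that the posterior over a finite set of candidates is the normalized product of likelihood and prior, computed in log space for numerical stability.

```python
import math

def posterior_from_logs(log_likelihoods, log_priors):
    """Normalized p(L | E) over a finite list of candidate configurations L."""
    # log p(E | L) + log p(L) is the unnormalized log posterior.
    log_joint = [ll + lp for ll, lp in zip(log_likelihoods, log_priors)]
    # Subtract the maximum before exponentiating to avoid underflow.
    m = max(log_joint)
    weights = [math.exp(v - m) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical values for three candidate body configurations.
log_lik = [math.log(0.2), math.log(0.6), math.log(0.2)]       # p(E | L)
log_prior = [math.log(0.5), math.log(0.25), math.log(0.25)]   # p(L)

post = posterior_from_logs(log_lik, log_prior)
best = max(range(len(post)), key=post.__getitem__)  # index of the MAP candidate
print(post, best)
```

In this toy example the second candidate wins even though its prior is lower than the first candidate's, because its likelihood is three times higher.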

  16. Part Decomposition
      • $L = \{x^o, x^1, \ldots, x^8\}$: configuration of body parts
      • Structure of the prior distribution $p(L)$:
        ‣ Articulation variable $a$ models correlations between part positions.
        ‣ Given articulation, prior on configuration becomes a star model.
      [Figure: graphical model with articulation $a$, part position $x^i$, and object center $x^o$]

  19. Part Decomposition
      • $L = \{x^o, x^1, \ldots, x^8\}$: configuration of body parts
      • Structure of the prior distribution $p(L)$:
        ‣ Articulation variable $a$ models correlations between part positions.
        ‣ Given articulation, prior on configuration becomes a star model.
      [Figure: covariance and mean part positions for $p(x^i \mid x^o)$; legend: articulation $a$, part position $x^i$, object center $x^o$]
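
A minimal sketch of what a star-model prior with Gaussian part terms could look like for one fixed articulation. The slide only states that each part has a mean and covariance relative to the object center; the function names, the two-part toy model, and the choice to drop the $p(x^o)$ term are assumptions made here for illustration.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log density of a multivariate Gaussian N(mean, cov) evaluated at x."""
    d = np.asarray(x, dtype=float) - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)  # squared Mahalanobis distance
    return -0.5 * (len(mean) * np.log(2.0 * np.pi) + logdet + maha)

def star_log_prior(center, parts, means, covs):
    """Sum over parts of log p(x^i | x^o) for one fixed articulation.

    Each part depends only on the object center (the star structure), so the
    prior factorizes into independent Gaussian terms on the offsets
    x^i - x^o. The p(x^o) term is omitted (assumed uniform).
    """
    center = np.asarray(center, dtype=float)
    return sum(
        gaussian_logpdf(np.asarray(p, dtype=float) - center, m, c)
        for p, m, c in zip(parts, means, covs)
    )

# Hypothetical two-part model: mean offsets from the center, unit covariances.
means = [np.array([0.0, -40.0]), np.array([0.0, 40.0])]
covs = [np.eye(2), np.eye(2)]
center = [100.0, 200.0]

at_mean = star_log_prior(center, [[100.0, 160.0], [100.0, 240.0]], means, covs)
shifted = star_log_prior(center, [[103.0, 160.0], [100.0, 240.0]], means, covs)
print(at_mean, shifted)
```

Because the parts are conditionally independent given the center, the cost of evaluating the prior grows linearly with the number of parts.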

  20. Single Frame Detection
      • Detections at equal error rate
      [Figure: example detections for HOG, 4D-ISM, and partISM]
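
The slide does not define "equal error rate". In detection evaluations it is commonly read as the operating point where precision equals recall, and the sketch below assumes that reading. The scores, match flags, and ground-truth count are invented toy values, and the function name is hypothetical.

```python
def equal_error_point(scores, is_true_positive, num_ground_truth):
    """Return (threshold, precision, recall) where precision and recall are closest.

    Detections are swept from the highest score downwards; at each rank the
    threshold is the score of the last detection accepted so far.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = 0
    best = None
    for rank, i in enumerate(order, start=1):
        tp += int(is_true_positive[i])
        precision = tp / rank
        recall = tp / num_ground_truth
        gap = abs(precision - recall)
        if best is None or gap < best[0]:
            best = (gap, scores[i], precision, recall)
    return best[1:]

# Toy data: five detections, four annotated people in total.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
matched = [True, True, False, True, False]

threshold, precision, recall = equal_error_point(scores, matched, num_ground_truth=4)
print(threshold, precision, recall)
```

Tied scores are not handled specially in this sketch.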

  21. Single-frame Detection Results
      [Plot: TUD pedestrians data, no occlusions]
      • partISM clearly outperforms 4D-ISM [Seemann et al., DAGM’06].
      • Outperforms HOG [Dalal & Triggs, CVPR’05] with much less training data (note: we only use side views).

  26. Tracklet Detection in Short Subsequences
      • Given: $E = [E_1, \ldots, E_m]$
      • Want: body positions $X^{o*} = [x^{o*}_1, \ldots, x^{o*}_m]$ and body configurations $Y^* = [y^*_1, \ldots, y^*_m]$
      • Posterior over positions and configurations: $p(X^{o*}, Y^* \mid E) \propto p(E \mid X^{o*}, Y^*)\, p(X^{o*})\, p(Y^*)$.
      [Figure: frames 1 to m of overlapping subsequences, with the body model fitted in each frame]
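
The factorized posterior above can be scored in log space as a likelihood term plus two independent prior terms. The sketch below is only an illustration of that structure: it assumes the likelihood decomposes into per-frame terms and substitutes simple Gaussian random-walk smoothness priors for $p(X^{o*})$ and $p(Y^*)$, since these slides do not specify the priors used in the talk. All names, the scalar configuration values, and the parameter values are hypothetical.

```python
import numpy as np

def smoothness_log_prior(seq, sigma):
    """Unnormalized Gaussian random-walk log prior on frame-to-frame changes."""
    diffs = np.diff(np.asarray(seq, dtype=float), axis=0)
    return -0.5 * float(np.sum(diffs ** 2)) / sigma ** 2

def tracklet_log_posterior(frame_log_liks, positions, configs,
                           sigma_x=5.0, sigma_y=0.1):
    """log p(E | X, Y) + log p(X) + log p(Y), up to an additive constant.

    frame_log_liks: per-frame log-likelihoods (assumed to factorize over frames)
    positions:      (m, 2) body positions, one per frame
    configs:        (m,) scalar stand-ins for body configurations
    """
    return (float(np.sum(frame_log_liks))
            + smoothness_log_prior(positions, sigma_x)
            + smoothness_log_prior(configs, sigma_y))

# Toy tracklet of m = 3 frames with identical image evidence scores.
log_liks = [-1.0, -1.2, -0.9]
configs = [0.0, 0.1, 0.2]
smooth = tracklet_log_posterior(log_liks, [[0, 0], [2, 0], [4, 0]], configs)
jerky = tracklet_log_posterior(log_liks, [[0, 0], [20, 0], [0, 0]], configs)
static = tracklet_log_posterior(log_liks, [[0, 0], [0, 0], [0, 0]], [0.0, 0.0, 0.0])
print(smooth, jerky, static)
```

With identical image evidence, the smoothly moving hypothesis scores higher than the jerky one, which is the role the position prior plays in the factorization.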
