
CS 4495 Computer Vision: Activity Recognition (Aaron Bobick, School of Interactive Computing)



  1. Activity Recognition 1 (CS 4495 Computer Vision – A. Bobick)
     CS 4495 Computer Vision: Activity Recognition. Aaron Bobick, School of Interactive Computing

  2. Administrivia
     • PS6: you should be working on it! Due Sunday, Nov 24th.
     • Exam: Tuesday, November 26th. Short answer and multiple choice (mostly short answer).
     • The study guide is posted in the calendar.
     • PS7: we hope to have it out by 11/26. It will be a straightforward implementation of Motion History Images.

  3. Video
     • A video is a sequence of frames captured over time.
     • Now our image data is a function of space (x, y) and time (t).

  4. Video as an “Image Stack”
     [Figure: frames stacked along the time axis t]
     • We can look at video data as a spatio-temporal volume.
     • If the camera is stationary, each line through time corresponds to a single ray in space.
     Credit: Alyosha Efros, CMU
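The "image stack" view can be sketched in a few lines of NumPy. This is only an illustration: the synthetic frames, the array layout `volume[t, y, x]`, and all sizes are assumptions, not anything taken from the slides.

```python
import numpy as np

# Hypothetical synthetic video: T grayscale frames of H x W pixels.
T, H, W = 20, 32, 48
frames = [np.zeros((H, W), dtype=np.uint8) for _ in range(T)]
for t, frame in enumerate(frames):
    # A bright 4x4 block that moves right by one pixel per frame.
    frame[10:14, t:t + 4] = 255

# Stack the frames along a new leading time axis: volume[t, y, x].
volume = np.stack(frames, axis=0)

# With a stationary camera, one pixel's history is a line through time.
pixel_history = volume[:, 12, 5]

# Fixing a scanline y instead gives an x-t cross-section of the volume.
xt_slice = volume[:, 12, :]

print(volume.shape, pixel_history.shape, xt_slice.shape)
```

In the x-t slice the moving block appears as a slanted band, since its x position changes linearly with t.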

  5. Aside: Epipolar Plane (“EPI”) images

  6. Aside: Epipolar Plane (“EPI”) images

  7. EPI images and activity

  8. EPI images and activity

  9. Processing video: object detection
     • If the goal of “activity recognition” is to recognize the activity of the objects…
     • …you (may) have to find the objects.

  10. Background subtraction
      Slide credit: Birgi Tamersoy

  11. Background subtraction
      • Simple techniques can do OK with a static camera…
      • …but it is hard to do perfectly.
      • Widely used in:
        • Traffic monitoring (counting vehicles, detecting and tracking vehicles and pedestrians)
        • Human action recognition (run, walk, jump, squat)
        • Human-computer interaction
        • Object tracking

  12. Simple approach: background subtraction
      Slide credit: Birgi Tamersoy

  13. Frame differencing
      Slide credit: Birgi Tamersoy

  14. Frame differencing
      Slide credit: Birgi Tamersoy

  15. Mean filtering
      Slide credit: Birgi Tamersoy

  16. Frame differences vs. background subtraction
      • Toyama et al. 1999

  17. Median filtering
      Slide credit: Birgi Tamersoy

  18. Average/Median Image
      Credit: Alyosha Efros, CMU

  19. Background Subtraction
      [Figure: a pictorial subtraction, image - image = image]
      Credit: Alyosha Efros, CMU

  20. Pros and cons
      Advantages:
      • Extremely easy to implement and use!
      • All pretty fast.
      • The corresponding background models need not be constant; they change over time.
      Disadvantages:
      • Accuracy of frame differencing depends on object speed and frame rate.
      • Median background model: relatively high memory requirements.
      • Setting a global threshold Th…
      When will this basic approach fail?
      Slide credit: Birgi Tamersoy
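A minimal sketch of the two simple schemes compared above, assuming grayscale frames stored as NumPy arrays. The function names, the synthetic scene, and the threshold value are all illustrative assumptions rather than the course's reference code.

```python
import numpy as np

def frame_difference(prev_frame, frame, th):
    """Foreground = pixels where consecutive frames differ by more than th."""
    diff = np.abs(frame.astype(np.int32) - prev_frame.astype(np.int32))
    return diff > th

def median_background(frames):
    """Per-pixel median over a buffer of frames (the whole buffer is held in memory)."""
    return np.median(np.stack(frames, axis=0), axis=0)

def subtract_background(frame, background, th):
    """Foreground = pixels that differ from the background model by more than th."""
    return np.abs(frame.astype(np.float64) - background) > th

def make_frame(t, h=16, w=32):
    """Hypothetical scene: background of 50, a 4x4 object of 200 moving 1 px per frame."""
    f = np.full((h, w), 50, dtype=np.uint8)
    f[6:10, t:t + 4] = 200
    return f

frames = [make_frame(t) for t in range(12)]
background = median_background(frames)

fg_bs = subtract_background(frames[-1], background, th=30)  # whole object
fg_fd = frame_difference(frames[-2], frames[-1], th=30)     # only leading/trailing edges

print(int(fg_bs.sum()), int(fg_fd.sum()))
```

In this toy scene the object moves slowly relative to its size, so frame differencing marks only the columns it enters and leaves, while subtraction against the median background recovers the full object. That is one concrete way the speed and frame-rate dependence shows up.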

  21. Background mixture models
      Idea: model each background pixel with a mixture of Gaussians; update its parameters over time.
      • “Adaptive Background Mixture Models for Real-Time Tracking”, Chris Stauffer & W.E.L. Grimson

  22. Background subtraction with depth
      How can we select foreground pixels based on depth information?

  23. Human activity in video
      No universal terminology, but approximately:
      • “Event”: a detection at a single instant in time.
      • “Actions” or “Movements”: atomic motion patterns, often gesture-like, with a single clear-cut trajectory and a single nameable behavior (e.g., sit, wave arms).
      • “Activity”: a series or composition of actions (e.g., interactions between people).
      Adapted from Venu Govindaraju and A. Bobick

  24. Surveillance
      http://users.isr.ist.utl.pt/~etienne/mypubs/Auvinetal06PETS.pdf

  25. Human activity in video: basic approaches
      • Model-based action recognition:
        • Use human body tracking and pose estimation techniques, relate to action descriptions (or learn).
        • Major challenge: accurate tracks in spite of occlusion, ambiguity, and low resolution.
      • Model-based activity recognition:
        • Given some lower-level detection of actions (or events), recognize the activity by comparing to some structural representation of the activity.
        • Needs to handle uncertainty.
      • Activity as motion, space-time appearance patterns:
        • Describe overall patterns, but with no explicit body tracking.
        • Typically learn a classifier.
      • Recently: “activity recognition” from a static image.
        • Imagine a picture of a person holding a flute. What are they doing?

  26. Motion and perceptual organization
      • Even “impoverished” motion data can evoke a strong percept.

  27. Motion and perceptual organization
      • Even “impoverished” motion data can evoke a strong percept.

  28. Example
      • Even “impoverished” motion data can evoke a strong percept.
      Video from Davis & Bobick

  29. Motion energy images
      • Spatial accumulation of motion.
      • Collapse over a specific time window.
      • The motion measurement method is not critical (e.g., motion differencing).
      [Figure axis: Time]
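One way to read "collapse over a specific time window" is as a per-pixel union of binary motion masks. The sketch below assumes that reading; the helper name `motion_energy_image`, the synthetic frames, and the differencing threshold are illustrative assumptions.

```python
import numpy as np

def motion_energy_image(motion_masks):
    """Union (logical OR) of binary motion masks over the time window."""
    return np.any(np.stack(motion_masks, axis=0), axis=0)

# Hypothetical clip: a 2x2 bright block moving right by one pixel per frame.
frames = []
for t in range(5):
    f = np.zeros((8, 10), dtype=np.uint8)
    f[3:5, t:t + 2] = 255
    frames.append(f)

# Motion measurement by simple differencing of consecutive frames.
masks = [
    np.abs(b.astype(np.int32) - a.astype(np.int32)) > 30
    for a, b in zip(frames, frames[1:])
]

mei = motion_energy_image(masks)
print(mei.astype(int))
```

The resulting image marks where motion occurred during the window but carries no information about when it occurred.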

  30. Motion history images
      • Motion history images are a different function of the temporal volume.
      • The pixel operator is replacement and decay:
        $$I_\tau(x,y,t) = \begin{cases} \tau & \text{if moving} \\ \max\big(I_\tau(x,y,t-1) - 1,\ 0\big) & \text{otherwise} \end{cases}$$
      • It is trivial to construct $I_{\tau-k}(x,y,t)$ from $I_\tau(x,y,t)$, so multiple time-window lengths can be processed without more search.
      • The MEI is a thresholded MHI.
      [Figure labels: moved at t-15; moved at t-1]
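The replacement-and-decay update can be sketched directly from the operator on this slide. The function names and the one-row synthetic masks are assumptions for illustration; the shorter-window shortcut `max(I_tau - k, 0)` is one way to realize the "trivial to construct" remark, and it follows from the update rule when the history starts at zero.

```python
import numpy as np

def update_mhi(mhi, moving, tau):
    """One step: moving pixels are set to tau, all others decay by 1 (floored at 0)."""
    decayed = np.maximum(mhi - 1, 0)
    return np.where(moving, tau, decayed)

def motion_history_image(motion_masks, tau):
    """Run the update over a sequence of binary motion masks, starting from zero."""
    mhi = np.zeros(motion_masks[0].shape, dtype=np.int32)
    for moving in motion_masks:
        mhi = update_mhi(mhi, moving, tau)
    return mhi

# Hypothetical motion: in mask t, only column t of a 1x6 image is moving.
masks = []
for t in range(5):
    m = np.zeros((1, 6), dtype=bool)
    m[0, t] = True
    masks.append(m)

mhi = motion_history_image(masks, tau=3)
mei = mhi > 0                        # MEI as a thresholded MHI
mhi_short = np.maximum(mhi - 1, 0)   # I_{tau-1} derived from I_tau without reprocessing

print(mhi.tolist())  # [[0, 0, 1, 2, 3, 0]]
```

More recent motion is brighter, so the MHI encodes the direction of movement as an intensity ramp.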

  31. Temporal templates
      • MEI + MHI = temporal template
      [Figure panels: motion energy image; motion history image]

  32. Aerobics examples

  33. Motion Energy Images
      Davis & Bobick 1999: The Representation and Recognition of Action Using Temporal Templates

  34. How to recognize these images?
      • These are grayscale, blob-like images.
      • 100 years of computer vision for recognizing gray blobs (for small values of a hundred).
      • Old-style computer vision:
        1. compute some summarization statistics of the pattern
        2. construct a generative model
        3. recognize based upon those statistics.

  35. Image moments
      Moments summarize a shape given image $I(x,y)$:
      $$M_{ij} = \sum_x \sum_y x^i y^j \, I(x,y)$$
      Central moments are translation invariant:
      $$\mu_{pq} = \sum_x \sum_y (x - \bar{x})^p (y - \bar{y})^q \, I(x,y), \qquad \bar{x} = \frac{M_{10}}{M_{00}}, \quad \bar{y} = \frac{M_{01}}{M_{00}}$$
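The two definitions translate almost literally into NumPy. In this sketch the function names and the test image are assumptions; x indexes columns and y indexes rows.

```python
import numpy as np

def raw_moment(img, i, j):
    """M_ij = sum over x, y of x^i * y^j * I(x, y)."""
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    return float(np.sum((xs ** i) * (ys ** j) * img))

def centroid(img):
    """(x_bar, y_bar) = (M_10 / M_00, M_01 / M_00)."""
    m00 = raw_moment(img, 0, 0)
    return raw_moment(img, 1, 0) / m00, raw_moment(img, 0, 1) / m00

def central_moment(img, p, q):
    """mu_pq = sum over x, y of (x - x_bar)^p * (y - y_bar)^q * I(x, y)."""
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    xbar, ybar = centroid(img)
    return float(np.sum(((xs - xbar) ** p) * ((ys - ybar) ** q) * img))

# Hypothetical binary blob: 3 rows by 4 columns of ones.
img = np.zeros((10, 12))
img[2:5, 3:7] = 1.0

# The same blob translated by 3 rows and 2 columns (it stays inside the image).
shifted = np.roll(img, (3, 2), axis=(0, 1))

print(centroid(img), central_moment(img, 2, 0), central_moment(img, 0, 2))
print(raw_moment(img, 1, 0), raw_moment(shifted, 1, 0))
```

The raw moment M_10 changes when the blob is translated, while the central moments do not, which is the translation invariance stated on the slide.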

  36. Hu moments
      • Set of 7 moments: $[h_1, h_2, h_3, h_4, h_5, h_6, h_7]$
      • Apply to the Motion History Image for a global space-time “shape” descriptor
      • Translation, rotation, and scale invariant

  37. Hu moments
      [Equations defining $h_1$ through $h_6$]

  38. [Equation defining $h_7$]
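The equation bodies did not survive in this transcript, so the sketch below uses the standard textbook definitions of the seven Hu invariants, written in terms of normalized central moments eta_pq = mu_pq / mu_00^(1 + (p+q)/2). It is not recovered from the slides; the function name `hu_moments` and the test image are assumptions.

```python
import numpy as np

def hu_moments(img):
    """Seven Hu invariants of a grayscale image (standard textbook formulas)."""
    img = np.asarray(img, dtype=np.float64)
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    dx = xs - (xs * img).sum() / m00  # x - x_bar
    dy = ys - (ys * img).sum() / m00  # y - y_bar

    def eta(p, q):
        # Normalized central moment: mu_pq / mu_00^(1 + (p + q) / 2).
        mu = (dx ** p * dy ** q * img).sum()
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03
    c, d = n30 - 3 * n12, 3 * n21 - n03
    return np.array([
        n20 + n02,                                                         # h1
        (n20 - n02) ** 2 + 4 * n11 ** 2,                                   # h2
        c ** 2 + d ** 2,                                                   # h3
        a ** 2 + b ** 2,                                                   # h4
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),     # h5
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,                 # h6
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),     # h7
    ])

# Hypothetical asymmetric gray blob standing in for an MHI.
img = np.zeros((20, 20))
img[4:12, 5:8] = 1.0
img[10:12, 5:15] = 2.0

h = hu_moments(img)
h_shifted = hu_moments(np.roll(img, (3, 2), axis=(0, 1)))  # translated copy
h_rotated = hu_moments(np.rot90(img))                      # rotated by 90 degrees
print(np.allclose(h, h_shifted), np.allclose(h, h_rotated))
```

Translation and 90-degree rotation are exact on a pixel grid, so the descriptors match to floating-point precision in this example. For arbitrary rotations and rescalings of sampled images, the invariance holds only approximately because of resampling.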

  39. Build a classifier
      • Generative or discriminative?
        • Generative: builds a model of each class; compare all.
        • Discriminative: builds a model of the boundary between classes.
      • How would you build decent generative models of each class of action?
        • Use a Gaussian in Hu-moment feature space.
        • Compare likelihoods $p(\text{data} \mid \text{model of action } i)$.
        • If you have priors, use them via Bayes' rule: $p(\text{model}_i \mid \text{data}) \propto p(\text{data} \mid \text{model}_i)\, p(\text{model}_i)$
        • Otherwise just use the likelihood.
      • Or use NN? (Problem Set!)
      • More on classification on Dec 3
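A minimal sketch of the generative option, assuming one Gaussian per action class and a maximum-likelihood (or maximum-posterior) decision. The class names, the 2-D synthetic features standing in for Hu-moment vectors, and the small covariance regularizer are all assumptions made for illustration.

```python
import numpy as np

def fit_gaussian(features):
    """Fit mean and covariance to an (n_samples, n_dims) array of training features."""
    mean = features.mean(axis=0)
    # A small diagonal term keeps the covariance invertible (an added assumption).
    cov = np.cov(features, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    return mean, cov

def log_likelihood(x, mean, cov):
    """log p(x | model) for a multivariate Gaussian."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)
    return -0.5 * (maha + logdet + len(x) * np.log(2 * np.pi))

def classify(x, models, priors=None):
    """Pick the class maximizing log p(x | model_i), plus log p(model_i) if priors are given."""
    scores = {}
    for name, (mean, cov) in models.items():
        score = log_likelihood(x, mean, cov)
        if priors is not None:
            score += np.log(priors[name])
        scores[name] = score
    return max(scores, key=scores.get)

# Hypothetical training data: two well-separated action classes.
rng = np.random.default_rng(0)
train = {
    "sit": rng.normal(loc=[0.0, 0.0], scale=1.0, size=(50, 2)),
    "wave": rng.normal(loc=[5.0, 5.0], scale=1.0, size=(50, 2)),
}
models = {name: fit_gaussian(feats) for name, feats in train.items()}

print(classify(np.array([0.2, -0.1]), models))
print(classify(np.array([4.8, 5.1]), models, priors={"sit": 0.5, "wave": 0.5}))
```

Working in log space avoids underflow, and adding the log prior implements the proportionality in Bayes' rule without needing the normalizing constant.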
