Keypoint-Based Action Recognition
Presenter: Jianchao Yang
Course Instructor: Prof. Derek Hoiem
Papers to discuss
- Behavior recognition via sparse spatio-temporal features.
- Learning realistic human actions from movies.
Behavior Recognition via Sparse Spatio-Temporal Features
- Motivated by the successful application of keypoints in object recognition
- Designed a spatio-temporal feature for behavior recognition
Approach
- Similar to what is seen in object recognition
Key Points Detection → Feature Extraction → Prototypes → Histogram → Classifier
Keypoints detection
- Extension from 2D
- Localization proceeds along the spatial dimensions x and y, as well as the temporal dimension t.
- 3D corners are too rare
Keypoints detection (cont’)
- Response function:
– Spatial kernel is 2D Gaussian
– Temporal kernel
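The formula on this slide did not survive extraction. The following is a minimal sketch, assuming the form usually quoted from the paper: R = (I * g * h_ev)^2 + (I * g * h_od)^2, where g is a 2D spatial Gaussian and h_ev, h_od are a quadrature pair of 1D temporal Gabor filters. The function names, the default values of sigma, tau and omega, and the zero-padded boundary handling are illustrative assumptions, not the authors' settings.

```python
import numpy as np

def _conv_along(volume, kernel, axis):
    """Convolve a 1D kernel along one axis of a 3D array (zero-padded, same size)."""
    if volume.shape[axis] < len(kernel):
        raise ValueError("axis is shorter than the kernel")
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="same"), axis, volume
    )

def response_function(video, sigma=2.0, tau=3.0, omega=0.25):
    """Response R for a grayscale video of shape (T, H, W).

    Assumed form: R = (I * g * h_ev)^2 + (I * g * h_od)^2.
    sigma, tau, omega are hypothetical defaults chosen for illustration.
    """
    video = np.asarray(video, dtype=float)

    # Spatial kernel: separable 2D Gaussian, applied along y and then x.
    r = int(np.ceil(3 * sigma))
    xs = np.arange(-r, r + 1)
    g = np.exp(-xs**2 / (2 * sigma**2))
    g /= g.sum()
    smoothed = _conv_along(_conv_along(video, g, axis=1), g, axis=2)

    # Temporal kernel: quadrature pair of 1D Gabor filters along t.
    rt = int(np.ceil(2 * tau))
    ts = np.arange(-rt, rt + 1)
    envelope = np.exp(-ts**2 / tau**2)
    h_ev = -np.cos(2 * np.pi * ts * omega) * envelope
    h_od = -np.sin(2 * np.pi * ts * omega) * envelope

    even = _conv_along(smoothed, h_ev, axis=0)
    odd = _conv_along(smoothed, h_od, axis=0)
    return even**2 + odd**2
```

Because the two temporal filters are in quadrature, the squared sum responds to intensity that oscillates near frequency omega regardless of phase, while a temporally constant signal yields a response close to zero.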
Keypoints detection (cont’)
- Keypoints
– Found by pooling maxima of the filter responses
– Emphasizes temporal information over spatial information
– Strong response to periodic motions
– Does not respond to pure translational motion
– Totally unsupervised
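Selecting keypoints as local maxima of a response volume can be sketched as below. This is an illustration under assumptions: the neighborhood radius, the threshold, and the name `detect_keypoints` are hypothetical, and the slides do not state which non-maximum suppression scheme was actually used.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def detect_keypoints(response, radius=1, threshold=0.0):
    """Return (t, y, x) indices of local maxima in a 3D response volume.

    A voxel is kept when it equals the maximum of its cubic neighborhood
    of side 2 * radius + 1 and exceeds `threshold` (both are assumptions).
    """
    response = np.asarray(response, dtype=float)
    size = 2 * radius + 1
    # Pad with -inf so that border voxels can still be compared.
    padded = np.pad(response, radius, mode="constant", constant_values=-np.inf)
    windows = sliding_window_view(padded, (size, size, size))
    local_max = windows.max(axis=(-3, -2, -1))
    is_peak = (response == local_max) & (response > threshold)
    return np.argwhere(is_peak)
```

No labels are involved at this stage, which is consistent with the detector being unsupervised.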
Cuboid descriptor
Key points → Cuboids → Spatio-temporal descriptor
- Descriptor: Normalized pixel values; Gradients; Windowed optical flow, etc.
- Transform: Vectorize directly; Histogram (global or local).
Cuboid descriptor (cont’)
Gradient is best! Vectorizing directly is best! ??
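The gradient descriptor with direct vectorization can be sketched as follows. The cuboid half-size and both function names are hypothetical. The sketch assumes the keypoint lies far enough from the video borders for the slice to be complete, and it omits any later dimensionality reduction.

```python
import numpy as np

def extract_cuboid(video, center, half_size=(2, 2, 2)):
    """Cut a cuboid around a (t, y, x) keypoint; assumes it fits inside the video."""
    t, y, x = center
    ht, hy, hx = half_size
    return video[t - ht:t + ht + 1, y - hy:y + hy + 1, x - hx:x + hx + 1]

def gradient_descriptor(cuboid):
    """Compute gradients along t, y, x and flatten them into one vector."""
    cuboid = np.asarray(cuboid, dtype=float)
    gt, gy, gx = np.gradient(cuboid)  # one gradient array per axis
    return np.concatenate([gx.ravel(), gy.ravel(), gt.ravel()])
```

Flattening keeps the position of every gradient value inside the cuboid, whereas a histogram transform discards that layout in exchange for some robustness to small shifts.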
Behavior descriptor
Key points → Cuboids → (Transform) → Spatio-temporal descriptor → (Prototypes) → Behavior descriptor
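One way to turn cuboid descriptors into a behavior descriptor is to cluster them into prototypes and then count how often each prototype occurs in a clip. The sketch below assumes plain k-means for the clustering and an L1-normalized histogram. The function names, the iteration count, and the random initialization are illustrative choices, not details taken from the slides.

```python
import numpy as np

def assign_prototypes(descriptors, centers):
    """Index of the nearest prototype (squared Euclidean distance) per descriptor."""
    X = np.asarray(descriptors, dtype=float)
    C = np.asarray(centers, dtype=float)
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def fit_prototypes(descriptors, k, n_iter=20, seed=0):
    """Plain k-means; prototypes are initialized from randomly chosen descriptors."""
    rng = np.random.default_rng(seed)
    X = np.asarray(descriptors, dtype=float)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign_prototypes(X, centers)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def behavior_descriptor(descriptors, centers):
    """Normalized histogram of prototype occurrences for one clip."""
    labels = assign_prototypes(descriptors, centers)
    hist = np.bincount(labels, minlength=len(centers)).astype(float)
    return hist / max(hist.sum(), 1.0)
```

The histogram ignores where and when each cuboid occurred, so two clips with the same mix of cuboid types get the same descriptor.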
Experiment results
- Datasets: facial expression, mouse, human actions
Experiment results (cont’)
Mouse Database. Human Facial Expression Database.
Learning realistic human actions from movies
- Automatic annotation of human actions in video.
- Video classification by space-time features.
Bag-of-features approach
- Extension of recent advances in bag-of-features approaches
– Spatial pyramid → more general spatial grids
– Fixed weights for each pyramid level → optimized
– Spatial grid → space-time grids
Space-time features
- Interest point detection: Harris operator
Key points → Cuboids → Space-time feature (local histogram)
– Histogram of oriented gradient (HoG)
– Histogram of optical flow (HoF)
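Both HoG and HoF reduce a 2D vector field (image gradients or optical flow) to a histogram over orientations. The following is a minimal sketch of that idea, assuming magnitude-weighted votes and equal-width orientation bins over the full circle. The bin count and function names are hypothetical, the flow field is taken as a given input rather than computed, and the subdivision of a cuboid into cells is omitted.

```python
import numpy as np

def orientation_histogram(vx, vy, n_bins=4):
    """Magnitude-weighted, L1-normalized histogram of vector orientations."""
    vx = np.asarray(vx, dtype=float).ravel()
    vy = np.asarray(vy, dtype=float).ravel()
    magnitude = np.hypot(vx, vy)
    angle = np.arctan2(vy, vx) % (2 * np.pi)  # map to [0, 2*pi)
    # The trailing modulo guards against angle rounding up to exactly 2*pi.
    bins = np.floor(angle / (2 * np.pi) * n_bins).astype(int) % n_bins
    hist = np.bincount(bins, weights=magnitude, minlength=n_bins)
    total = hist.sum()
    return hist / total if total > 0 else hist

def hog_descriptor(cuboid, n_bins=4):
    """HoG-style histogram from the spatial gradients of a (T, H, W) cuboid."""
    _, gy, gx = np.gradient(np.asarray(cuboid, dtype=float))
    return orientation_histogram(gx, gy, n_bins)
```

A HoF-style histogram would call `orientation_histogram` on the horizontal and vertical flow components instead of the spatial gradients.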
Spatio-temporal bag-of-features
- Hierarchical structure
Key points → Cuboids → Space-time feature (local histogram) → Spatio-temporal feature
Spatio-temporal grids
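A space-time grid splits the video volume into cells and builds one bag-of-features histogram per cell, then concatenates them. The sketch below assumes keypoint positions already normalized to [0, 1] along (t, y, x) and visual-word indices already assigned. The function name and the regular, non-overlapping grid are illustrative assumptions and do not cover every grid layout evaluated in the paper.

```python
import numpy as np

def grid_bag_of_features(positions, words, vocab_size, grid=(1, 1, 1)):
    """Concatenated per-cell visual-word histograms over a (nt, ny, nx) grid.

    positions: (N, 3) array of normalized (t, y, x) coordinates in [0, 1].
    words:     (N,) array of visual-word indices in [0, vocab_size).
    """
    positions = np.asarray(positions, dtype=float)
    words = np.asarray(words, dtype=int)
    grid = np.asarray(grid, dtype=int)
    # Cell index per axis; clip so that a coordinate of exactly 1.0 stays inside.
    cell = np.minimum((positions * grid).astype(int), grid - 1)
    cell_index = np.ravel_multi_index(tuple(cell.T), tuple(grid))
    flat = cell_index * vocab_size + words
    n_entries = int(grid.prod()) * vocab_size
    hist = np.bincount(flat, minlength=n_entries).astype(float)
    return hist / max(hist.sum(), 1.0)
```

With `grid=(1, 1, 1)` this reduces to an ordinary bag-of-features histogram; a grid such as `(2, 1, 1)` separates features from the first and second half of the clip.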
Experiment results
- Evaluation of spatio-temporal grids
Experiment results (cont’)
Experiment results (cont’)
- KTH action database
Experiment results (cont’)
Comments
- The two methods are extensions of keypoint-based image classification. Will dense descriptors be better?
- Keypoint-based methods work surprisingly well for image and sequence classification. Why?
- Issues that need to be addressed: