Human Action Recognition using Pose-based Discriminant Embedding - PowerPoint PPT Presentation



SLIDE 1

Human Action Recognition using Pose-based Discriminant Embedding

Behrouz Saghafi

SLIDE 2

Advantages of Silhouettes

  • Informative features for describing actions
  • Capture the spatio-temporal characteristics of motion with lower computational cost
  • No need for an explicit human body model

SLIDE 3

General frameworks for using silhouettes in action recognition

Frame recognition framework

  • Classify sequences on a frame-by-frame basis
  • The label for a query sequence is obtained by a voting scheme
  • Ignores the temporal information and kinematics

Sequence recognition framework

  • Classify the sequence as a whole
  • Compare actions based on distances defined between sequences of points
  • Kinematics is involved
SLIDE 4

Why embedding into lower dimensional space?

  • Recognition methods operating in high-dimensional space suffer from the curse of dimensionality
  • The information provided in the high-dimensional image space is more than is needed to describe an action
  • The structure of the human body imposes a constraint on the possible postures

SLIDE 5

Examples of postures for run and its trajectory in a possible action space

SLIDE 6

Embeddings used to find the underlying action space

  • PCA (Principal Components Analysis)
  • LDA (Linear Discriminant Analysis)
  • LPP (Locality Preserving Projections)
  • LLE (Locally Linear Embedding)
  • LE (Laplacian Eigenmaps)
  • Kernel PCA
  • LSTDE (Local Spatio-Temporal Discriminant Embedding)

> In all these methods, the embedding is defined based on distances between points rather than between sequences.
> Thus they are not guaranteed to give optimal results in the sequence recognition framework.

SLIDE 7

Distances between sets of points

  • Median Hausdorff Distance (MHD)
  • Spatiotemporal Correlation Distance (SCD)
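The MHD and SCD formulas on this slide were rendered as images and did not survive extraction. As a hedged illustration only, here is a sketch of one common definition of the symmetric median Hausdorff distance between two pose sequences; the function name and the exact symmetrization are assumptions, not taken from the slides:

```python
import numpy as np

def median_hausdorff(A, B):
    """Symmetric median Hausdorff distance between two sequences of pose
    vectors, A of shape (n_a, d) and B of shape (n_b, d). This is one
    common MHD variant; the slides' exact formula may differ."""
    # pairwise Euclidean distances between every frame of A and every frame of B
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    forward = np.median(D.min(axis=1))   # median over A of nearest frame in B
    backward = np.median(D.min(axis=0))  # median over B of nearest frame in A
    return max(forward, backward)
```

Taking the median instead of the maximum (as in the classical Hausdorff distance) makes the measure robust to a few outlier postures.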

SLIDE 8

Optimal Embedding Computation

We propose an embedding such that in the embedded space (action space), based on SCD as the distance metric:

  • The intra-class sequences are as close as possible
  • The inter-class sequences are as far apart as possible
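Stated informally, the two bullets correspond to a joint optimization over the linear projection. A hedged LaTeX sketch, where the symbols A (projection), X_i (sequence), and c(i) (class label) are assumed notation rather than taken from the slides:

```latex
% Sketch: find a projection A that shrinks intra-class SCDs
% and stretches inter-class SCDs (notation assumed).
\min_{A}\ \sum_{c(i)=c(j)} \mathrm{SCD}\!\left(A^{\top}X_i,\ A^{\top}X_j\right)
\qquad\text{while}\qquad
\max_{A}\ \sum_{c(i)\neq c(j)} \mathrm{SCD}\!\left(A^{\top}X_i,\ A^{\top}X_j\right)
```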

SLIDE 9

Optimal Embedding Computation

The intra-class sequences should be as close as possible in the action space: the sum of all pairwise SCDs between embedded intra-class sequences is minimized with respect to the projection A.

SLIDE 10

Optimal Embedding Computation

SLIDE 11

Optimal Embedding Computation

SLIDE 12

Optimal Embedding Computation

  • For the inter-class sequences: the sum of all pairwise SCDs between embedded inter-class sequences is maximized with respect to A
SLIDE 13

Optimal Embedding Computation

Generalized eigenvalue problem
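With the inter- and intra-class scatter-like matrices in hand, the optimal projection follows from a generalized eigenvalue problem. A minimal SciPy sketch; the matrix names, and the assumption that both matrices are symmetric with the intra-class one positive definite, are mine rather than the slides':

```python
import numpy as np
from scipy.linalg import eigh

def optimal_projection(S_inter, S_intra, dim):
    """Solve the generalized eigenvalue problem S_inter a = lambda * S_intra a
    and return the eigenvectors with the `dim` largest eigenvalues as the
    columns of the projection matrix A. Assumes S_inter is symmetric and
    S_intra is symmetric positive definite."""
    w, V = eigh(S_inter, S_intra)   # eigenvalues returned in ascending order
    order = np.argsort(w)[::-1]     # largest eigenvalues first
    return V[:, order[:dim]]
```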

SLIDE 14

Overview of Approach

> Action recognition is done by comparing the similarity between test and train sequences in the low-dimensional action space in the nearest neighbor framework.
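The nearest-neighbor step can be sketched as follows; `seq_dist` stands for whichever sequence-to-sequence distance is used (SCD in the slides), and all names here are illustrative:

```python
import numpy as np

def classify_sequence(test_seq, train_seqs, train_labels, seq_dist):
    """1-nearest-neighbor classification of an embedded test sequence
    against embedded training sequences, using any sequence-to-sequence
    distance such as SCD or MHD."""
    distances = [seq_dist(test_seq, s) for s in train_seqs]
    return train_labels[int(np.argmin(distances))]
```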

SLIDE 15

Period Estimation (1)

  • Actions can be considered semantically periodic.
  • Using a single period is more computationally efficient than using the entire sequence length.
  • To estimate the action period, we have used a method based on the absolute correlation between frames and improved it.

The object's self-similarity is computed by:
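The self-similarity formula itself was an image on the slide. As a stand-in, here is a hedged sketch that builds a frame-to-frame matrix from summed absolute pixel differences; the slides describe a measure based on absolute correlation between frames, so treat this as an approximation rather than the authors' exact definition:

```python
import numpy as np

def self_similarity(frames):
    """Frame-to-frame self-similarity matrix S for a silhouette sequence
    of shape (T, H, W), using the sum of absolute pixel differences
    between every pair of frames. Low values mean similar postures."""
    T = frames.shape[0]
    F = frames.reshape(T, -1).astype(float)
    # S[i, j] = sum over all pixels of |frame_i - frame_j|
    return np.abs(F[:, None, :] - F[None, :, :]).sum(axis=2)
```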

SLIDE 16

Period Estimation (2)

A column z of the self-similarity matrix S is linearly detrended to obtain ẑ. The autocorrelation of ẑ is then computed as

R_ẑẑ(m) = (1 / (N − m)) Σ_n ẑ_n ẑ_{n+m}

and the action period is estimated from the peaks of R_ẑẑ(m).
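The detrend-then-autocorrelate pipeline can be sketched in a few lines. The peak picking here is a plain local-maximum (zero-derivative) search, which is exactly the step the next slide shows can produce false peaks:

```python
import numpy as np

def estimate_period(z):
    """Estimate the action period from one column z of the self-similarity
    matrix: linearly detrend z, compute the autocorrelation
    R(m) = 1/(N - m) * sum_n zhat[n] * zhat[n + m], and return the lag of
    the first local maximum after lag 0."""
    z = np.asarray(z, dtype=float)
    n = np.arange(len(z))
    slope, intercept = np.polyfit(n, z, 1)   # best-fit line
    zhat = z - (slope * n + intercept)       # linear detrend
    N = len(zhat)
    R = np.array([np.dot(zhat[:N - m], zhat[m:]) / (N - m) for m in range(N // 2)])
    for m in range(1, len(R) - 1):           # first interior local maximum
        if R[m] > R[m - 1] and R[m] >= R[m + 1]:
            return m
    return None
```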

SLIDE 17

Period Estimation (3)

False peak detections by the zero-derivative method are marked by red vertical lines.

SLIDE 18

Warping

Bicubic interpolation technique
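How the bicubic interpolation is applied is not spelled out in the extracted text; one plausible reading is warping each single-period sequence to a common length along the time axis. A hedged sketch using SciPy's order-3 (cubic) spline zoom, where the function name and the time-only warping are assumptions:

```python
import numpy as np
from scipy.ndimage import zoom

def warp_sequence(frames, target_len):
    """Warp a silhouette sequence of shape (T, H, W) to `target_len`
    frames using cubic (order-3) spline interpolation along the
    temporal axis only; spatial dimensions are left unchanged."""
    T = frames.shape[0]
    return zoom(frames.astype(float), (target_len / T, 1, 1), order=3)
```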

SLIDE 19

Aligning

SLIDE 20

Experimental Results (Datasets)

  • Weizmann database
  • Maryland database
  • KTH database

SLIDE 21

Weizmann database

9 subjects. 10 actions:

bending (bend), jumping jack (jack), jumping-forward-on-two-legs (jump), jumping-in-place-on-two-legs (pjump), running (run), galloping sideways (side), skipping (skip), walking (walk), waving-one-hand (wave1), and waving-two-hands (wave2)

SLIDE 22

Recognition accuracy vs. dimension for different values of T for SCD (Weizmann)

SLIDE 23

Studying the effect of T: comparing the maximum and mean recognition rates for different values of T for SCD (Weizmann)

SLIDE 24

Recognition accuracy vs. dimension for different values of T for MHD (Weizmann)

SLIDE 25

Studying the effect of T: comparing the maximum and mean recognition rates for different values of T for MHD (Weizmann)

SLIDE 26

Recognition accuracy vs. dimension, using test sequences without warping (Weizmann)

T = 6, using MHD. Best accuracy: 98.89%

SLIDE 27

Comparison of different dimension reduction methods (Weizmann)

SLIDE 28

Experimental Results (Weizmann)

Comparison with different dimension reduction methods

[Scatter plots of the embedded action sequences for LDA, PCA, Supervised LPP (SLPP), and PDE]

SLIDE 29

Comparison with other results on Weizmann dataset

SLIDE 30

Results of Robustness to Noise (Weizmann)

SLIDE 31

Weizmann’s robustness database for deformations

SLIDE 32

Results of Weizmann’s Deformation robustness test

SLIDE 33

Weizmann’s robustness database for viewpoint

SLIDE 34

Results of Weizmann’s viewpoint robustness test

SLIDE 35

Experimental Results (Maryland)

10 actions: pick up object, jog in place, push, squat, wave, kick, bend to the side, throw, turn around, and talk on cell phone

Result: 100% recognition rate

SLIDE 36

Comparison of different dimension reduction methods (Maryland)

SLIDE 37

KTH dataset

6 actions, 25 subjects, 4 scenarios (s1–s4)

SLIDE 38

Examples of computed edge maps for in-place actions of the KTH dataset

SLIDE 39

Comparison with other methods on KTH dataset for the in-place actions