Human Action Recognition using Pose-based Discriminant Embedding

Oct 06, 2022

SLIDE 1

Human Action Recognition using Pose-based Discriminant Embedding

Behrouz Saghafi

SLIDE 2

Advantages of Silhouettes

Informative features for describing actions
Capture the spatio-temporal characteristics of motion with lower computational cost

No need for an explicit human body model

SLIDE 3

General frameworks for using silhouettes in action recognition

Frame recognition framework

Classify sequences on a frame-by-frame basis
Label for the query sequence is obtained by a voting scheme
Ignores temporal information and kinematics
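
The voting step of the frame recognition framework can be sketched as follows. This is a minimal illustration, assuming each frame has already been classified on its own; the function name `vote_sequence_label` and the tie-breaking rule are hypothetical and do not come from the slides.

```python
from collections import Counter


def vote_sequence_label(frame_labels):
    """Return the most frequent per-frame label.

    Assumption: ties are broken in favour of the label seen first,
    which is how Counter.most_common orders equal counts.
    """
    if not frame_labels:
        raise ValueError("need at least one frame label")
    return Counter(frame_labels).most_common(1)[0][0]


# Hypothetical per-frame predictions for one query sequence.
predicted = ["run", "walk", "run", "skip", "run"]
sequence_label = vote_sequence_label(predicted)
```

Because only label counts are used, shuffling the frames gives the same answer, which is why this framework ignores temporal information.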

Sequence recognition framework

Classify the sequence as a whole
Compare actions based on distances defined between sequences of points
Kinematics are taken into account

SLIDE 4

Why embedding into lower dimensional space?

Recognition methods operating in high-dimensional space suffer from the curse of dimensionality
The information provided in the high-dimensional image space is far more than is needed to describe an action
The structure of the human body imposes a constraint on possible postures

SLIDE 5

Examples of postures for run and its trajectory in a possible action space

SLIDE 6

Embeddings used to find the underlying action space

PCA (Principal Components Analysis)
LDA (Linear Discriminant Analysis)
LPP (Locality Preserving Projections)
LLE (Locally Linear Embedding)
LE (Laplacian Eigenmaps)
Kernel PCA
LSTDE (Local Spatio-Temporal Discriminant Embedding)

In all these methods, the embedding is defined based on the distance between points rather than between sequences. They are therefore not guaranteed to give optimal results in the sequence recognition framework.

SLIDE 7

Distances between sets of points

Median Hausdorff Distance (MHD)
Spatiotemporal Correlation Distance (SCD)
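
The formulas for these two distances did not survive in this transcript, so the sketch below rests on stated assumptions rather than on the slide's definitions. MHD is taken to be the median of nearest-neighbor distances from one point set to the other, symmetrized with `max`. SCD is approximated by a sum of squared distances between time-corresponding frames of two equal-length sequences. The function names are hypothetical.

```python
import math
from statistics import median


def directed_median_distance(seq_a, seq_b):
    """Median over points of seq_a of the distance to the nearest point of seq_b."""
    return median(min(math.dist(p, q) for q in seq_b) for p in seq_a)


def median_hausdorff_distance(seq_a, seq_b):
    """Assumed symmetric form: the larger of the two directed distances."""
    return max(directed_median_distance(seq_a, seq_b),
               directed_median_distance(seq_b, seq_a))


def aligned_frame_distance(seq_a, seq_b):
    """Assumed stand-in for SCD: squared distances between same-index frames."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same number of frames")
    return sum(math.dist(p, q) ** 2 for p, q in zip(seq_a, seq_b))


# Two toy trajectories in a 2-D action space.
a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
b_reversed = list(reversed(b))
```

In this toy case the set-based distance is the same for `b` and `b_reversed`, while the frame-aligned distance grows when the order is reversed, which illustrates why only the latter is sensitive to kinematics.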

SLIDE 8

Optimal Embedding Computation

We propose an embedding such that, in the embedded space (action space) and with SCD as the distance metric:

The intra-class sequences are as close as possible
The inter-class sequences are as far apart as possible

SLIDE 9

Optimal Embedding Computation

Intra-class sequences should be as close as possible in the action space: the sum of all pairwise SCDs between embedded intra-class sequences should be minimized with respect to A.

SLIDE 10

Optimal Embedding Computation

SLIDE 11

Optimal Embedding Computation

SLIDE 12

Optimal Embedding Computation

For the optimization of inter-class sequences:

SLIDE 13

Optimal Embedding Computation

Generalized eigenvalue problem
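
The equations on these slides are not present in this transcript. The block below is only a sketch of how objectives of this kind commonly reduce to a generalized eigenvalue problem. It assumes that SCD is a sum of squared distances between time-aligned frames and that the embedding is linear, $y = A^{\top}x$. The symbols $x_t^{(i)}$, $c_i$, $S_w$ and $S_b$ are introduced here for illustration and are not taken from the slides.

```latex
% Assumed setup: sequence i has frames x_t^{(i)}, t = 1..T, and class label c_i.
\begin{align}
J_w(A) &= \sum_{c_i = c_j} \sum_{t=1}^{T}
          \bigl\| A^{\top} x_t^{(i)} - A^{\top} x_t^{(j)} \bigr\|^2
        = \operatorname{tr}\!\bigl( A^{\top} S_w A \bigr), \\
S_w    &= \sum_{c_i = c_j} \sum_{t=1}^{T}
          \bigl( x_t^{(i)} - x_t^{(j)} \bigr)\bigl( x_t^{(i)} - x_t^{(j)} \bigr)^{\top}, \\
J_b(A) &= \operatorname{tr}\!\bigl( A^{\top} S_b A \bigr),
          \quad \text{with } S_b \text{ defined as } S_w
          \text{ but over pairs with } c_i \neq c_j, \\
\max_{A}\; & \operatorname{tr}\!\bigl( A^{\top} S_b A \bigr)
          \quad \text{s.t.}\quad A^{\top} S_w A = I
          \;\;\Longrightarrow\;\; S_b\, a_k = \lambda_k\, S_w\, a_k .
\end{align}
```

Under these assumptions, the columns $a_k$ of $A$ would be the generalized eigenvectors with the largest eigenvalues $\lambda_k$.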

SLIDE 14

Overview of Approach

Action recognition is performed by comparing test and training sequences in the low-dimensional action space within a nearest-neighbor framework.
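
A minimal sketch of nearest-neighbor classification over embedded sequences is given below. It assumes sequences are already embedded, warped to the same length and aligned, and it uses a sum of squared frame-wise distances as a placeholder for the sequence distance. All names are hypothetical.

```python
import math


def sequence_distance(seq_a, seq_b):
    """Placeholder sequence distance: squared distances between same-index frames."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same number of frames")
    return sum(math.dist(p, q) ** 2 for p, q in zip(seq_a, seq_b))


def nearest_neighbor_label(query, train_sequences, train_labels,
                           distance=sequence_distance):
    """Return the label of the training sequence closest to the query."""
    if not train_sequences or len(train_sequences) != len(train_labels):
        raise ValueError("need one label per training sequence")
    best = min(range(len(train_sequences)),
               key=lambda i: distance(query, train_sequences[i]))
    return train_labels[best]


# Toy embedded training sequences (2 frames, 2 dimensions each).
train = [[(0.0, 0.0), (1.0, 0.0)], [(5.0, 5.0), (6.0, 5.0)]]
labels = ["walk", "run"]
query = [(4.8, 5.1), (6.2, 4.9)]
predicted = nearest_neighbor_label(query, train, labels)
```

The `distance` argument is left pluggable so that a different sequence distance, such as MHD, can be swapped in.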

SLIDE 15

Period Estimation (1)

Actions can be considered semantically periodic.
Using a single period is more computationally efficient than using the entire length.

To estimate the action period, we used a method based on the absolute correlation between frames and improved it:

The object’s self-similarity is computed by:

SLIDE 16

Period Estimation (2)

A column → linear detrending → $\hat{z}$ → autocorrelation → $R_{\hat{z}\hat{z}}(m)$

$$R_{\hat{z}\hat{z}}(m) = \frac{1}{N-m}\, E\left[\hat{z}_{n+m}\,\hat{z}_{n}\right]$$
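
The detrend-then-autocorrelate idea can be sketched in plain Python as follows. This is an illustration under assumptions, not the presented method: the input is a hypothetical 1-D similarity signal, the autocorrelation is the sample estimate with 1/(N - m) normalization, and the period is taken as the first positive local maximum, a naive peak rule that can produce false peaks on noisy data.

```python
import math


def linear_detrend(z):
    """Subtract the least-squares straight line from a 1-D signal."""
    n = len(z)
    if n < 2:
        raise ValueError("need at least two samples")
    t_mean = (n - 1) / 2
    z_mean = sum(z) / n
    denom = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (v - z_mean) for t, v in enumerate(z)) / denom
    return [v - (z_mean + slope * (t - t_mean)) for t, v in enumerate(z)]


def autocorrelation(z, max_lag):
    """Sample autocorrelation for lags 0..max_lag, normalized by N - m."""
    n = len(z)
    return [sum(z[k + m] * z[k] for k in range(n - m)) / (n - m)
            for m in range(max_lag + 1)]


def estimate_period(signal, max_lag=None):
    """Return the first lag that is a positive local maximum, or None."""
    z = linear_detrend(signal)
    if max_lag is None:
        max_lag = len(z) // 2
    r = autocorrelation(z, max_lag)
    for m in range(1, max_lag):
        if r[m] > 0 and r[m] > r[m - 1] and r[m] >= r[m + 1]:
            return m
    return None


# Synthetic signal: period of 8 frames plus a linear drift that detrending removes.
signal = [math.sin(2 * math.pi * n / 8) + 0.05 * n for n in range(64)]
period = estimate_period(signal)
```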

SLIDE 17

Period Estimation (3)

False peak detections by zero-derivative method specified by red vertical lines

SLIDE 18

Warping

Bicubic interpolation technique
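
The sketch below illustrates warping a sequence to a fixed number of frames. It assumes that warping here means resampling one action period to a target length T, which the transcript does not state explicitly, and it uses linear interpolation as a simpler stand-in for the bicubic interpolation named on the slide. The function name is hypothetical.

```python
def warp_sequence(frames, target_length):
    """Resample a list of feature vectors to target_length frames.

    Stand-in technique: linear interpolation along the time axis,
    not the bicubic interpolation mentioned on the slide.
    """
    if not frames or target_length < 1:
        raise ValueError("need at least one frame and a positive target length")
    n = len(frames)
    if n == 1 or target_length == 1:
        return [list(frames[0]) for _ in range(target_length)]
    warped = []
    for i in range(target_length):
        pos = i * (n - 1) / (target_length - 1)  # fractional source index
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        warped.append([(1 - w) * a + w * b
                       for a, b in zip(frames[lo], frames[hi])])
    return warped


# Stretch a 3-frame sequence of 1-D features to 5 frames.
stretched = warp_sequence([[0.0], [2.0], [4.0]], 5)
```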

SLIDE 19

Aligning

SLIDE 20

Experimental Results (Datasets)

Weizmann database
Maryland database
KTH database

SLIDE 21

Weizmann database

9 subjects. 10 actions:

bending (bend), jumping jack (jack), jumping-forward-on-two-legs (jump), jumping-in-place-on-two-legs (pjump), running (run), galloping sideways (side), skipping (skip), walking (walk), waving-one-hand (wave1), and waving-two-hands (wave2)

SLIDE 22

Recognition accuracy vs. dimension for different values of T for SCD (Weizmann)

SLIDE 23

Studying the effect of T: comparing the maximum and mean recognition rates for different values of T for SCD (Weizmann)

SLIDE 24

Recognition accuracy vs. dimension for different values of T for MHD (Weizmann)

SLIDE 25

Studying the effect of T: comparing the maximum and mean recognition rates for different values of T for MHD (Weizmann)

SLIDE 26

Recognition accuracy vs. dimension, using test sequences without warping (Weizmann)

T = 6, using MHD
Best accuracy: 98.89%

SLIDE 27

Comparison of different dimension reduction methods (Weizmann)

SLIDE 28

Experimental Results (Weizmann)

Comparison with different dimension reduction methods

[Figure: four panels, one per method: LDA, PCA, Supervised LPP (SLPP), PDE]

SLIDE 29

Comparison with other results on Weizmann dataset

SLIDE 30

Results of Robustness to Noise (Weizmann)

SLIDE 31

Weizmann’s robustness database for deformations

SLIDE 32

Results of Weizmann’s Deformation robustness test

SLIDE 33

Weizmann’s robustness database for viewpoint

SLIDE 34

Results of Weizmann’s viewpoint robustness test

SLIDE 35

Experimental Results (Maryland)

Result: 100% recognition rate

10 actions: pick up object, jog in place, push, squat, wave, kick, bend to the side, throw, turn around and talk on cell phone

SLIDE 36

Comparison of different dimension reduction methods (Maryland)

SLIDE 37

KTH dataset

6 actions, 25 subjects, 4 scenarios (s1, s2, s3, s4)

SLIDE 38

Examples of computed edge maps for in-place actions of KTH dataset

SLIDE 39

Comparison with other methods on KTH dataset for the in-place actions