Learning Transferable Distance Functions For Human Action Recognition and Detection



SLIDE 1

Learning Transferable Distance Functions For Human Action Recognition and Detection

Weilong Yang, Simon Fraser University

SLIDE 2

Action Recognition and Detection

[Figure: example actions Walking, Running, Jogging, Boxing, Waving; video volume with axes X, Y, T]

SLIDE 3

Applications

Action-related video search

Sports and dancing video search

Event detection

Automatic abnormality detection in surveillance videos

SLIDE 4

Motivation

On the KTH and Weizmann action datasets, almost 100% accuracy has been achieved [Jhuang et al. ICCV07, Fathi & Mori CVPR08].

Most methods rely on a large amount of training data.

  • Half-half split or leave-one-out cross-validation

It is unrealistic to collect this many training samples for some actions.

[Figure labels: One Clip, Many Clips]

SLIDE 5

[Figure: a query action and an action template set (Templates A, B, C, D) with labels Throwing, R-dancing, M-dancing, Kicking]

SLIDE 6

Related Work

One-shot learning of object categories [Fei-Fei et al. ICCV03]

Visual object identification [Ferencz et al. IJCV07]

Transfer learning:

The ability of a system to recognize and apply knowledge and skills learned in previous tasks to novel tasks [Pan & Yang, TKDE 2009].

SLIDE 7

[Overview diagram. Labels: Query; Templates A, B, C; Hyper-Features Fq; Learning; Distance; Action Recognition; Action Detection]

SLIDE 8

Patch-based Action Comparison

[Figure: query and template frames compared with a frame-to-frame distance]

Motion descriptor [Efros et al. ICCV 03]

SLIDE 9

Patch-based Action Comparison

[Figure: query and template clips, showing frame correspondence, the frame-to-frame distance, and the elementary patch-to-patch distance]
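The comparison on these two slides builds a clip-level distance out of patch-level distances. The following is a minimal sketch of that nesting, not the author's implementation. It assumes Euclidean distance between patch descriptors, and assumes each patch (or frame) is matched to its closest counterpart; the slides do not state the matching rule. All function names are hypothetical.

```python
from math import sqrt


def patch_distance(p, q):
    """Elementary patch-to-patch distance (assumed here to be Euclidean)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def frame_distance(query_frame, template_frame):
    """Frame-to-frame distance: each query patch is matched to its closest
    template patch, and the matching costs are summed (an assumption)."""
    return sum(
        min(patch_distance(p, t) for t in template_frame) for p in query_frame
    )


def clip_distance(query_clip, template_clip):
    """Frame correspondence: each query frame is matched to its closest
    template frame, and the frame distances are summed (an assumption)."""
    return sum(
        min(frame_distance(f, g) for g in template_clip) for f in query_clip
    )
```

A clip here is a list of frames, a frame is a list of patch descriptors, and a descriptor is a tuple of numbers.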

SLIDE 10

[Overview diagram repeated from Slide 7]

SLIDE 11

Local Distance Function

[Frome et al. NIPS06]

SLIDE 12

Local Distance Function

[Frome et al. NIPS06]

Triplet

Large training set required
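A triplet pairs a query with one template of the same action and one template of a different action, and asks that the different-action template be farther away. The sketch below illustrates that constraint as a hinge loss. It assumes the distance is a weighted sum of elementary patch-to-patch distances, in the spirit of local distance function learning; the margin of 1 and all names are assumptions for illustration.

```python
def weighted_distance(weights, elementary):
    """Distance as a weighted sum of elementary patch-to-patch distances."""
    return sum(w * d for w, d in zip(weights, elementary))


def triplet_hinge(weights, d_same, d_diff, margin=1.0):
    """Hinge loss for one triplet.

    d_same: elementary distances from the query to a same-action template.
    d_diff: elementary distances from the query to a different-action template.
    The loss is zero once the different-action template is farther away
    than the same-action template by at least `margin`.
    """
    gap = weighted_distance(weights, d_diff) - weighted_distance(weights, d_same)
    return max(0.0, margin - gap)
```

Because one weight vector is learned per exemplar in this setting, many triplets are needed, which is the "large training set" concern noted on the slide.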

SLIDE 13

Transferable Distance Function

SLIDE 14

Transferable Distance Function

SLIDE 15

Transferable Distance Function

[Figure labels: Hyper-Feature, Transferable]
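One way to make patch weights transferable is to compute them from each patch's hyper-feature instead of learning them per template. The sketch below assumes a linear map from hyper-feature to weight, with a parameter vector `P` shared across all actions. That linear form, and every name in the code, is an assumption for illustration; the slide's own formula did not survive.

```python
def patch_weight(P, hyper_feature):
    """Weight of one patch as a linear function of its hyper-feature."""
    return sum(p * f for p, f in zip(P, hyper_feature))


def transferable_distance(P, hyper_features, elementary):
    """Weighted sum of elementary distances, with weights predicted from
    hyper-features rather than stored per template."""
    return sum(
        patch_weight(P, f) * d for f, d in zip(hyper_features, elementary)
    )
```

Since `P` does not depend on any particular action, it can be learned on one dataset and applied to templates of unseen actions.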

SLIDE 16

Max-Margin Formulation

  • Triplet

  • It is convex and similar to the primal problem of SVM
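A triplet-based max-margin problem of this kind can be written as follows. This is a sketch under assumed notation, not a transcription of the slide: $d_i$ are elementary patch-to-patch distances, $f_i$ is the hyper-feature of patch $i$, $P$ is the parameter vector being learned, $C$ is a trade-off constant, and each triplet $(q, s, r)$ consists of a query $q$, a same-action template $s$, and a different-action template $r$. The non-negativity constraint on the weights is also an assumption.

```latex
% Assumed notation; see the lead-in paragraph for the meaning of each symbol.
\[
  w_i = P^{\top} f_i ,
  \qquad
  D(q, t) = \sum_i w_i \, d_i(q, t)
\]
\[
\begin{aligned}
  \min_{P,\;\xi}\quad & \tfrac{1}{2}\,\lVert P \rVert^{2} + C \sum_{(q,s,r)} \xi_{qsr} \\
  \text{s.t.}\quad    & D(q, r) - D(q, s) \;\ge\; 1 - \xi_{qsr}, \\
                      & \xi_{qsr} \ge 0, \qquad w_i \ge 0 .
\end{aligned}
\]
```

The objective is quadratic in $P$ and every constraint is linear in $(P, \xi)$, which is why the problem is convex and has the same shape as the soft-margin SVM primal.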

SLIDE 17

Hyper-Features

Codebook representation

Descriptor for each patch

○ HOG + Positions

Obtaining codebook with the size of

○ K-means clustering

Hyper-feature for each patch

○ A dimensional vector
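The codebook steps above can be sketched as follows. This is a minimal illustration that assumes patch descriptors are plain numeric tuples (HOG plus position values would be concatenated upstream). The soft assignment with a Gaussian kernel and the `sigma` parameter are assumptions; the slide states only that the hyper-feature is a vector derived from the codebook. All names are hypothetical.

```python
import random
from math import exp


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means: returns k codewords (cluster centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old center if a cluster goes empty
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers


def hyper_feature(descriptor, centers, sigma=1.0):
    """One entry per codeword: normalized Gaussian similarity (assumed form)."""
    sims = [exp(-sq_dist(descriptor, c) / (2 * sigma ** 2)) for c in centers]
    total = sum(sims)
    return [s / total for s in sims]
```

The hyper-feature has one dimension per codeword, so its length equals the codebook size.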

SLIDE 18

Summary of Features

Motion cue: patch matching

Shape cue & positions: hyper-feature, patch weighting

SLIDE 19

[Overview diagram repeated from Slide 7]

SLIDE 20

Recognizing an Action

[Figure: the query is compared against Template A, Template B, and Template C using hyper-features Fq]
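Recognition with one template per action reduces to picking the template closest to the query. The sketch below assumes that rule (nearest template under a weighted distance); the data layout and all names are hypothetical.

```python
def weighted_distance(weights, elementary):
    """Weighted sum of elementary patch-to-patch distances."""
    return sum(w * d for w, d in zip(weights, elementary))


def recognize(templates):
    """Return the label of the closest template.

    templates: dict mapping an action label to a (weights, elementary) pair,
    where `elementary` holds the query-to-template patch distances and
    `weights` holds the corresponding patch weights.
    """
    return min(
        templates,
        key=lambda label: weighted_distance(*templates[label]),
    )
```

For example, `recognize({"A": ([1.0, 1.0], [2.0, 2.0]), "B": ([1.0, 1.0], [0.5, 0.5])})` returns `"B"`.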

SLIDE 21

[Overview diagram repeated from Slide 7]

SLIDE 22

Experiments on Action Recognition

Train the transferable distance function on Weizmann, and test on KTH.

  • The source training set does not contain the actions of the template set

  • Each action in the testing set has only one clip as template

[Figure: source training set (Weizmann): skip, jack, jump, side, bend, wave1, pjump; transfer to testing set (KTH): walk, run, jog, clap, wave2]

SLIDE 23

Visualization

[Figure panels: Codeword Ranking; Learnt Weights on Testing Actions]

SLIDE 24

Five Rounds of Experiments

For each round, we randomly select one actor, then choose one clip per action from this actor as the template.

5% improvement

Dc: Direct Comparison (W = 1)

Tr: Transferable Distance Function

SLIDE 25

Confusion Matrix of Round 2

Direct Comparison Avg: 70.9%

Transfer Avg: 76.7%

Clapping vs. Waving; Jogging vs. Running

SLIDE 26

Efficiency

With the learnt distance function, we can sort the patches on each frame by their saliency.

Instead of using all patches, we can choose the top N patches with high weights for matching.

10 Patches on Each Frame
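The selection step can be sketched as follows, assuming each patch already has a learnt weight and an elementary distance. The function names are hypothetical.

```python
def top_n_patches(weights, n):
    """Indices of the n patches with the highest weights, best first."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return order[:n]


def pruned_distance(weights, elementary, n):
    """Weighted distance computed over the top-n patches only."""
    keep = top_n_patches(weights, n)
    return sum(weights[i] * elementary[i] for i in keep)
```

With `n = 10` this matches the "10 patches on each frame" setting on the slide; the cost of matching then scales with N instead of the total number of patches.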

SLIDE 27

[Overview diagram repeated from Slide 7]

SLIDE 28

Human Action Detection

SLIDE 29

Cascade Structure

[Figure: Cascade Stage 1, Cascade Stage 2, ..., Cascade Stage N; each stage can reject]

SLIDE 30

Cascade Structure

[Figure: all sub-windows pass through the cascade stages, each of which can reject, before a final decision; hyper-features Fq feed the stages]
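A rejection cascade of this shape can be sketched as below. It assumes each stage computes a distance-like score for a sub-window and rejects the window when the score exceeds that stage's threshold. How the stages are actually built (for example, cheap early stages using only a few high-weight patches) is not stated on the slide, so the scoring functions are left as hypothetical inputs.

```python
def passes_cascade(window, stages):
    """Return True only if the window survives every stage.

    stages: list of (score_fn, threshold) pairs, cheapest first.
    A window is rejected as soon as one stage scores it above threshold,
    so later (more expensive) stages never run on it.
    """
    for score_fn, threshold in stages:
        if score_fn(window) > threshold:
            return False
    return True


def detect(windows, stages):
    """Keep the sub-windows that no stage rejects."""
    return [w for w in windows if passes_cascade(w, stages)]
```

Ordering stages from cheap to expensive means most sub-windows are discarded after very little computation.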

SLIDE 31

Efficient Action Detection

SLIDE 32

Contributions

Transferable distance function learning

  • Hyper-features based on appearance and positions

  • Max-margin learning framework

Action recognition from one clip

  • Template matching based on motion

Efficient action detection from one clip

  • Cascade structure

SLIDE 33

Thank You!