Actom Sequence Models for Efficient Action Detection



slide-1
SLIDE 1

Actom Sequence Models for Efficient Action Detection

LEAR – INRIA Grenoble Adrien Gaidon Zaid Harchaoui Cordelia Schmid Presentation by Benoit Massé

slide-2
SLIDE 2

Introduction

  • Video: Big Data
  • Automation?

– Semantic analysis
– Retrieval

Problem:

Determine whether and when a specific action happens in a video

slide-3
SLIDE 3

State of the art

  • Training

– Define the action
– Choose the features
– Train

  • Retrieval

– Classification
– Detection

slide-4
SLIDE 4

State of the art

  • Training

– Define the action
– Choose the features
– Train

  • Retrieval

– Classification
– Detection

=> Spatio-temporal extent
=> HoG, HoF, SP interest points
=> Bag-of-Features
=> SVM, Bayesian network
=> ?

slide-5
SLIDE 5

Actoms

  • Actom: a short atomic action
slide-6
SLIDE 6

Actoms

An actom has

– A temporal location t
– A radius r

Actom descriptor: a set of visual words

– Bag of Features applied to HoG, HoF, Harris interest points...
– Weighted sum over the frames from t − r to t + r
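The weighted sum above can be illustrated with a minimal Python sketch. The slides do not give the weighting function, so the triangular weight used here is an assumption, as are the function name `actom_descriptor` and the `(frame, word_id)` input format.

```python
def actom_descriptor(features, t, r, vocab_size):
    """Weighted visual-word histogram for one actom centred at frame t.

    features: list of (frame, word_id) pairs, one per quantised local feature.
    The triangular weight below is an assumption, not taken from the slides.
    """
    hist = [0.0] * vocab_size
    for frame, word in features:
        distance = abs(frame - t)
        if distance <= r:
            # Weight is 1 at the actom centre and decreases towards t - r and t + r.
            hist[word] += 1.0 - distance / (r + 1)
    total = sum(hist)
    # L1-normalise so that actoms of different radii are comparable.
    return [h / total for h in hist] if total > 0 else hist
```

Features outside [t − r, t + r] contribute nothing, and features near the centre count more than those near the boundary.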

slide-7
SLIDE 7

Benefits of Actoms

  • An action is composed of several actoms

– New goal: find an ordered sequence of actoms
– No temporal dependence inside an action

  • Gap between actoms
  • Overlap
  • An action can be composed of very different parts

=> Classic methods average over these parts

slide-8
SLIDE 8

Actom Sequence Model (ASM)

One Action = One Actom Sequence

– The radius r_i of actom i depends on its distance to the closest neighbouring actom: min(t_i − t_{i−1}, t_{i+1} − t_i)

– ASM: concatenation of the actoms' visual-word descriptors

(x_{1,1}, …, x_{1,k}, x_{2,1}, …, x_{2,k}, x_{3,1}, …, x_{3,k})
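As a rough illustration of this slide, the sketch below computes the nearest-neighbour distance for each actom and concatenates per-actom descriptors into one vector. The slide only says the radius "depends on" this distance, so using it directly as the radius is an assumption; the function names are hypothetical.

```python
def actom_radii(times):
    """Distance from each actom to its closest neighbouring actom.

    times: sorted temporal locations t_1 < t_2 < ... of the actoms.
    Assumption: this distance is used directly as the radius r_i.
    """
    if len(times) < 2:
        raise ValueError("need at least two actoms")
    radii = []
    for i, t in enumerate(times):
        gaps = []
        if i > 0:
            gaps.append(t - times[i - 1])   # gap to the previous actom
        if i < len(times) - 1:
            gaps.append(times[i + 1] - t)   # gap to the next actom
        radii.append(min(gaps))
    return radii


def asm_vector(descriptors):
    """Concatenate the k-dimensional descriptors of all actoms, in temporal order."""
    return [x for descriptor in descriptors for x in descriptor]
```

With three actoms and a vocabulary of k words, `asm_vector` returns a vector of length 3k, matching the layout shown on the slide.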

slide-9
SLIDE 9

Classification

  • Given a new ASM (x_{1,1}, …, x_{n,k}), does it correspond to the trained action? (for instance: "drinking")

– Classic machine learning problem
– Chosen solution: SVM
– Including negative examples improves the classifier

slide-10
SLIDE 10

Detection

  • Given a video, find all the occurrences of the trained action (for instance: "drinking").

For every 5 frames
    Set the current frame as the middle actom
    Generate candidates for the other actoms
    Apply classification on the result
End
Delete non-maximal overlapping actions
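The loop above could be sketched in Python as follows. This is only an outline under stated assumptions: `candidates_for` and `score` are hypothetical callables standing in for the candidate generator and the trained classifier, and the greedy suppression rule is one common way to delete non-maximal overlapping detections, not necessarily the one used by the authors.

```python
def overlaps(a, b):
    """True if the temporal intervals a and b (start, end, ...) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def non_max_suppression(detections):
    """Greedily keep the best-scoring detection among overlapping ones.

    detections: list of (start, end, score) tuples.
    """
    kept = []
    for det in sorted(detections, key=lambda d: d[2], reverse=True):
        if all(not overlaps(det, k) for k in kept):
            kept.append(det)
    return sorted(kept)


def detect(num_frames, candidates_for, score, stride=5, threshold=0.0):
    """Slide the middle actom over the video and score each candidate sequence.

    candidates_for(middle) -> iterable of actom time tuples (hypothetical).
    score(times) -> classifier score for that actom sequence (hypothetical).
    """
    detections = []
    for middle in range(0, num_frames, stride):
        for times in candidates_for(middle):
            s = score(times)
            if s > threshold:
                detections.append((times[0], times[-1], s))
    return non_max_suppression(detections)
```

In this sketch a detection spans from the first to the last actom of the candidate sequence, which is also an assumption.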

slide-11
SLIDE 11

Detection

Tricky step: generating the other actoms. We must estimate the distance between actoms.

– Training: build the multivariate distribution of the gaps {t_{i+1} − t_i}, then remove the outliers

– Estimation: try all the possible combinations (starting from the middle actom limits error propagation)
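A minimal sketch of these two steps, assuming each training example is annotated with its actom times. Outlier removal is left out because the slides do not say how it is done; `gap_vectors` and `place_from_middle` are hypothetical names.

```python
def gap_vectors(annotated_sequences):
    """Collect the distinct gap vectors (t_2 - t_1, t_3 - t_2, ...) seen in training.

    annotated_sequences: iterable of actom time tuples, one per training example.
    Outlier removal is not sketched here.
    """
    vectors = set()
    for times in annotated_sequences:
        vectors.add(tuple(b - a for a, b in zip(times, times[1:])))
    return sorted(vectors)


def place_from_middle(middle_time, gaps):
    """Place an actom sequence around a fixed middle actom using one gap vector."""
    n = len(gaps) + 1
    mid = n // 2
    times = [0] * n
    times[mid] = middle_time
    # Walk outwards from the middle so each time is derived from a
    # neighbour that is at most n // 2 steps away from the anchor.
    for i in range(mid - 1, -1, -1):
        times[i] = times[i + 1] - gaps[i]
    for i in range(mid + 1, n):
        times[i] = times[i - 1] + gaps[i - 1]
    return tuple(times)
```

Calling `place_from_middle` once per stored gap vector yields the candidate sequences to classify for a given middle frame.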

slide-12
SLIDE 12

Experiments

Four kinds of actions

Drinking

Smoking

Open a door

Sit down

Criteria

OV20 (20 % Overlap)

OVAA (All Actoms Overlap)
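The OV20 criterion can be illustrated with a short Python sketch. It assumes that "overlap" means temporal intersection over union and that a detection matches when this value exceeds 20 %; the slides do not spell out either point. OVAA is not sketched.

```python
def temporal_iou(a, b):
    """Intersection over union of two temporal intervals (start, end)."""
    intersection = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - intersection
    return intersection / union if union > 0 else 0.0


def matches_ov20(detection, ground_truth):
    """Assumed OV20 rule: the detection overlaps the annotation by more than 20 %."""
    return temporal_iou(detection, ground_truth) > 0.2
```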

Comparison with the state of the art

Bag of Features

Bag of Features with a grid

Other published methods

slide-13
SLIDE 13

Results

slide-14
SLIDE 14

Conclusion

ASM gives better results than the state of the art on the same data sets.
=> Actoms are particularly well suited to representing the temporal structure of actions in videos

slide-15
SLIDE 15

Questions?