Motion Capturing and Machine Learning for Gesture Recognition - - PowerPoint PPT Presentation




slide-1
SLIDE 1

Motion Capturing and Machine Learning for Gesture Recognition

Sotiris Manitsaris

Centre for Robotics | MINES ParisTech | PSL Research University

slide-2
SLIDE 2

Interactive Systems Gestural interaction

Interaction

Perception

Gesture

Knowledge

slide-3
SLIDE 3

Methodology Overview Capturing-Modelling-Recognition

tracking body joints or segments motion description

modelling

stochastic modelling HMMs GMMs DTW machine learning

recognition & alignment

gesture 2

gesture recognition temporal alignment between input and reference distance gesture learning-recognition gesture inertial sensors or accelerometers

capturing & analysis

  • optical or depth camera

sensorimotor feedback

colocalisation affordances sound

slide-4
SLIDE 4

Motion Capture

slide-5
SLIDE 5

Motion Capture Computer Vision – Sensors

slide-6
SLIDE 6

Motion Capture Wearable or embedded sensors

Sensors

  • Inertial sensors
  • Magnetometers
  • Gyroscopes
  • Accelerometers
  • Electromyographs (EMG)

Gestural descriptors

  • Rotations
  • Euler angles
  • Axis/Angle
  • Quaternions
  • Exponential map
  • Rotation matrices
  • Accelerations
slide-7
SLIDE 7

Motion Capture Wearable or embedded sensors

slide-8
SLIDE 8

Sensors

  • Retroreflective markers
  • Light emitting diodes
  • Overlapping projections

Gestural descriptors

  • Cartesian coordinates

Motion Capture Wearable or embedded sensors

slide-9
SLIDE 9

Motion Capture Markerless computer vision

Sensors

  • RGB cameras
  • Depth cameras

Gestural descriptors

  • Cartesian coordinates
slide-10
SLIDE 10

Feature Extraction & Tracking

slide-11
SLIDE 11

Finger Tracking with RGB Cameras (Musical Interaction) Skin model and mathematical morphology

Skin modelling, mathematical morphology and contour detection

BIP

Sampling: acquisition of skin and nail colour samples $P_i$

$P_i(p_j) = [R_j, G_j, B_j]^T$

Determination of the RI: creation of an image $RI_{RGB}[m,n]$ from the samples $P_i$

Normalisation of the RI:

$N : RI_{RGB} \to RI_{rg}, \quad N([R_j, G_j, B_j]^T) = [r_j, g_j]^T, \quad \forall p_j^{RI} \in RI_{RGB}$

Result: $RI_{rg}[m,n]$

Graphical (r, g) skin model:

$r_{skin} = [r_{min}, r_{max}], \quad g_{skin} = [g_{min}, g_{max}], \quad \forall p_j^{RI} \in RI_{rg}$
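The skin-model pipeline can be sketched in a few lines. This is a minimal illustration, not the presenter's implementation: it assumes that the normalisation N is the usual normalised chromaticity r = R/(R+G+B), g = G/(R+G+B), which the slide does not state explicitly, and the sample colours and function names are hypothetical.

```python
import numpy as np

def normalise_rg(roi_rgb):
    """Map an (m, n, 3) RGB region to (m, n, 2) normalised (r, g) values.
    Assumption: r = R/(R+G+B), g = G/(R+G+B)."""
    roi = roi_rgb.astype(np.float64)
    total = roi.sum(axis=2, keepdims=True)
    total[total == 0] = 1.0  # avoid division by zero on black pixels
    return roi[..., :2] / total

def ranges_from_samples(samples_rgb):
    """Derive [r_min, r_max] and [g_min, g_max] from (k, 3) colour samples."""
    rg = normalise_rg(samples_rgb.reshape(-1, 1, 3)).reshape(-1, 2)
    return (rg[:, 0].min(), rg[:, 0].max()), (rg[:, 1].min(), rg[:, 1].max())

def skin_mask(roi_rgb, r_range, g_range):
    """Boolean mask of the pixels whose (r, g) falls inside both intervals."""
    rg = normalise_rg(roi_rgb)
    r, g = rg[..., 0], rg[..., 1]
    return ((r >= r_range[0]) & (r <= r_range[1]) &
            (g >= g_range[0]) & (g <= g_range[1]))

# Hypothetical skin samples P_i and a two-pixel test image
samples = np.array([[200, 140, 120], [180, 120, 100], [220, 160, 140]], dtype=np.uint8)
r_range, g_range = ranges_from_samples(samples)
image = np.array([[[210, 150, 130], [30, 60, 200]]], dtype=np.uint8)
mask = skin_mask(image, r_range, g_range)
```

The resulting binary mask is the kind of image on which morphology and contour detection would then operate.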

slide-12
SLIDE 12

Finger Tracking with RGB Cameras (Musical Interaction) Fingertip Detection

slide-13
SLIDE 13

Finger Tracking with RGB Cameras (Musical Interaction) Real-time finger tracking

slide-14
SLIDE 14

Thresholding to extract the torso and the position of the head. Construction of a 2D graph connecting the torso pixels. The weight of each edge is equal to the depth difference between the two pixels.

For each torso point, we compute the "shortest" path linking the pixel to the head. Dijkstra's algorithm: find the shortest path, i.e. the path with the lowest possible weight. Path weight = sum of the weights of the edges traversed by the path.

Geodesic distance from a torso point to the head = weight of the shortest path linking this point to the head

Thresholding to obtain the parts farthest from the head. Hand positions and shortest paths linking the head to the hands.

Body Tracking with Depth Cameras (Human-Robot Collaboration) Geodesic distances
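The geodesic-distance computation can be sketched as follows. This is an illustrative reading of the slide, assuming a 4-connected pixel graph and a pre-computed body mask; the depth values, the threshold and the function name are hypothetical.

```python
import heapq
import numpy as np

def geodesic_distances(depth, mask, source):
    """Dijkstra over a 4-connected pixel graph restricted to `mask`.
    Edge weight = absolute depth difference between neighbouring pixels."""
    rows, cols = depth.shape
    dist = np.full(depth.shape, np.inf)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, (i, j) = heapq.heappop(heap)
        if d > dist[i, j]:
            continue  # stale queue entry
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < rows and 0 <= nj < cols and mask[ni, nj]:
                nd = d + abs(float(depth[ni, nj]) - float(depth[i, j]))
                if nd < dist[ni, nj]:
                    dist[ni, nj] = nd
                    heapq.heappush(heap, (nd, (ni, nj)))
    return dist

# Toy depth map; the centre pixel is treated as background
depth = np.array([[1.0, 1.1, 1.5],
                  [1.0, 9.0, 1.6],
                  [1.0, 1.2, 1.7]])
mask = np.ones(depth.shape, dtype=bool)
mask[1, 1] = False
head = (0, 0)
dist = geodesic_distances(depth, mask, head)

# Thresholding keeps the parts farthest from the head (candidate hands)
far = np.isfinite(dist) & (dist > 0.55)
```

Pixels outside the mask keep an infinite distance, so they never appear among the far candidates.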

slide-15
SLIDE 15

Body Tracking with Depth Cameras (Human-Robot Collaboration) Real-time body tracking with geodesic distances

slide-16
SLIDE 16

Machine Learning

slide-17
SLIDE 17

Machine Learning in Gesture Recognition Introduction

Credits: Jules Françoise

slide-18
SLIDE 18

Machine Learning in Gesture Recognition Introduction

Credits: Jules Françoise

slide-19
SLIDE 19

Machine Learning in Gesture Recognition Introduction

Credits: Jules Françoise

slide-20
SLIDE 20

Example of pre-planned questions of a decision tree: How does the depth at that pixel compare to this pixel?

Random Decision Forest

  • Use a random selection of questions each time
  • Learn multiple trees
  • Add probability distributions as outputs of the trees to classify

Tracking the body parts: Depth images, Body parts, 3D joint proposals

Training the RDF with synthetic images

Feature Extraction & Tracking using Machine Learning Random Decision Forest
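A small sketch of the two ingredients named on this slide: a depth-comparison question and the combination of the trees' probability distributions. The pixel offset, the per-tree distributions and the function names are hypothetical, and averaging is used here as one way of "adding" the distributions.

```python
import numpy as np

def depth_comparison(depth, pixel, offset):
    """Answer 'how does the depth at that pixel compare to this pixel?'
    as a signed depth difference; the offset is clamped to the image."""
    i, j = pixel
    oi = min(max(i + offset[0], 0), depth.shape[0] - 1)
    oj = min(max(j + offset[1], 0), depth.shape[1] - 1)
    return float(depth[oi, oj]) - float(depth[i, j])

def forest_predict(tree_distributions):
    """Average the per-tree class distributions and return the most
    probable label together with the averaged distribution."""
    mean = np.mean(tree_distributions, axis=0)
    return int(np.argmax(mean)), mean

depth = np.array([[1.0, 2.0],
                  [3.0, 5.0]])
feature = depth_comparison(depth, (0, 0), (1, 1))

# Hypothetical outputs of three trees over three body-part labels
tree_distributions = np.array([[0.7, 0.2, 0.1],
                               [0.4, 0.5, 0.1],
                               [0.6, 0.3, 0.1]])
label, distribution = forest_predict(tree_distributions)
```

In a trained forest each tree would route the pixel through a sequence of such questions before returning its leaf distribution.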

slide-21
SLIDE 21

Body Tracking with Depth Cameras (Musical Interaction) Random Decision Forest

slide-22
SLIDE 22

Body Tracking with Depth Cameras (Professional Gestures) Hierarchical Random Decision Forests

Purpose & Challenges

  • Classification of complex scene segments based on machine learning
  • The object is moving, revolving and deformable
slide-23
SLIDE 23

Body Tracking with Depth Cameras (Professional Gestures) Hierarchical Random Decision Forests

Training: Training Set, Pre-processing, RDF Training, RDF Model

Testing: Testing Set, Pre-processing, RDF Model, Scene Segmentation

slide-24
SLIDE 24

Body Tracking with Depth Cameras (Professional Gestures) Hierarchical Random Decision Forests

Labels of Parent RDF Maximum probabilities of labels Tracking of segments

slide-25
SLIDE 25

Full Upper-Body Tracking with Depth Cameras (Intangible Musical Instrument) Interactive Space & Surface

Purpose & Challenges

  • Natural-User Interfacing the gestural expression and emotion elicitation in music
  • Learning, performing and composing with gestures as a first-person experience
  • Augmenting the music score to facilitate the access to musical ICH
slide-26
SLIDE 26

MICRO BB: The Leap Motion's bounding box (red) is associated with finger interaction

MACRO BB: The Kinect's bounding box (blue) is associated with upper-body interaction

Full Upper-Body Tracking with Depth Cameras (Intangible Musical Instrument) Gestures & Embodiment

slide-27
SLIDE 27

Full Upper-Body Tracking with Depth Cameras (Intangible Musical Instrument) Explicit Gesture Sonification – Deterministic Modelling

slide-28
SLIDE 28

Full Upper-Body Tracking with Depth Cameras (Intangible Musical Instrument) Explicit Gesture Sonification – Deterministic Modelling

slide-29
SLIDE 29

Full Upper-Body Tracking with Depth Cameras (Intangible Musical Instrument) Explicit Gesture Sonification – Deterministic Modelling

Kite-flying control:

The orientation of the triangle plane (green) vs. the Kinect's xy plane provides a sense of how far left or right your body is rotating (red arrow).

The xz plane vs. the triangle plane reacts if the body is going backward or forward and/or the hands are going higher or lower (yellow arrow).

[Figure: triangle formed by Head, Right Hand and Left Hand, with its normal n]

$\vec{n} = \overrightarrow{RightHandHead} \times \overrightarrow{LeftHandHead} = [a, b, c]$
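The normal of the head-hands triangle and its orientation relative to the sensor planes can be computed as below. This is a sketch under assumptions: the vectors are taken from each hand to the head, the xy and xz planes are represented by the normals (0, 0, 1) and (0, 1, 0), and the joint coordinates are hypothetical.

```python
import numpy as np

def triangle_normal(head, right_hand, left_hand):
    """Unit normal n = RightHandHead x LeftHandHead of the triangle plane."""
    n = np.cross(head - right_hand, head - left_hand)
    return n / np.linalg.norm(n)

def angle_between_planes(normal, plane_normal):
    """Angle in degrees between two planes, computed from their normals."""
    c = abs(np.dot(normal, plane_normal))
    c /= np.linalg.norm(normal) * np.linalg.norm(plane_normal)
    return float(np.degrees(np.arccos(np.clip(c, 0.0, 1.0))))

# Hypothetical joint positions in sensor coordinates
head = np.array([0.0, 1.0, 0.0])
right_hand = np.array([1.0, 0.0, 0.0])
left_hand = np.array([-1.0, 0.0, 0.0])

n = triangle_normal(head, right_hand, left_hand)
angle_xy = angle_between_planes(n, np.array([0.0, 0.0, 1.0]))  # left/right rotation
angle_xz = angle_between_planes(n, np.array([0.0, 1.0, 0.0]))  # forward/backward lean
```

With the performer facing the sensor and both hands level, the triangle lies in the xy plane, so the first angle is zero and the second is a right angle.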

slide-30
SLIDE 30

« The future is independent of the past, given the present »

Andreï Andreïevitch Markov Андрей Андреевич Марков 2 June 1856 - 20 July 1921

The concept of Hidden Markov Models Introduction

slide-31
SLIDE 31

The concept of Hidden Markov Models Introduction

Credits: Lane Votapka

slide-32
SLIDE 32
  • We want to reason about a sequence of observations
  • Gesture recognition in Human-Robot Collaboration
  • Visual-speech recognition
  • Gesture control of robots
  • We need to introduce time or space into our models

The concept of Hidden Markov Models Reasoning over time and space

slide-33
SLIDE 33
  • Set of N States, {S1, S2,… SN}
  • Sequence of states Q ={q1, q2,…}
  • Initial probabilities π={π1, π2,… πN}
  • πi=P(q1=Si)

Markov Chains Model definition

  • Transition matrix A NxN
  • aij=P(qt+1=Sj | qt=Si)
slide-34
SLIDE 34

Weather model:

  • 3 states {sunny, rainy, cloudy}

S1 S1 S2 S1 S2

Markov Chains Example in weather forecasting

Problem:

  • Forecast the weather state, based on the current weather state

slide-35
SLIDE 35

Markov Chain Example in musical gestures

Let’s assume a set of 5 musical states, {S1, S2, S3, S4, S5}

S1 = fingering_1, S2 = fingering_2, S3 = fingering_3, S4 = fingering_4, S5 = fingering_5

[Figure: states S1, S2, S3, S4, S5]

slide-36
SLIDE 36

Markov Chain Example in musical gestures

slide-37
SLIDE 37

Markov Chain Example in musical gestures

[Figure: transition diagram over the states S1, S2, S3, S4, S5, with edges labelled 0,2 or 0,4]

Question 1: Given that the performer is now playing an S2, what’s the probability that his/her next fingering is an S3 and the fingering after that is an S4?

Question 2: Given that the performer is now playing an S2, what’s the probability that s/he will be playing an S4 three fingerings from now?

slide-38
SLIDE 38

Markov Chain Example in musical gestures

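Question 1

This translates into:

$P(q_2 = S_3, q_3 = S_4 \mid q_1 = S_2) = P(q_2 = S_3 \mid q_1 = S_2)\, P(q_3 = S_4 \mid q_2 = S_3) = a_{23}\, a_{34}$

You can also think of this as moving through the automaton, multiplying the probabilities along the path S2, S3, S4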

slide-39
SLIDE 39

Markov Chain Example in musical gestures

Question 2 S2 S3 S4 S2 S3 S4

we need observations to update our beliefs

This translates into:
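Both questions can be checked numerically. The sketch below uses a hypothetical transition matrix, because the edge assignments of the slide's diagram are not recoverable from the text: every state keeps a self-loop of 0.2 and moves to each of the next two states with probability 0.4. It also assumes that "three fingerings from now" means three transitions.

```python
import numpy as np

# HYPOTHETICAL transition matrix: the real edge layout is in the slide figure
n_states = 5
A = np.zeros((n_states, n_states))
for i in range(n_states):
    A[i, i] = 0.2
    A[i, (i + 1) % n_states] = 0.4
    A[i, (i + 2) % n_states] = 0.4

S2, S3, S4 = 1, 2, 3  # zero-based indices of the states

# Question 1: follow one fixed path and multiply its probabilities
q1_answer = A[S2, S3] * A[S3, S4]

# Question 2: sum over every path of three transitions, i.e. the
# (S2, S4) entry of the third power of the transition matrix
q2_answer = np.linalg.matrix_power(A, 3)[S2, S4]
```

Question 1 fixes the intermediate state, whereas Question 2 marginalises over all intermediate states, which is why the matrix power appears.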

slide-40
SLIDE 40

λ=(A, B, π): Hidden Markov Model

  • A={aij}: Transition probability distribution
  • aij=P(qt+1=Sj | qt=Si)
  • Β={bi(x)}: Emission probability distribution
  • bi(x)=P(Οt=x | qt=Si)
  • π={πi}: Initial state probability distribution
  • πi=P(q1=Si)

Hidden

Observed

Hidden Markov Model Model definition

slide-41
SLIDE 41
  • Basic conditional independence:
  • Past and future are independent given the present
  • Each time step depends only on the previous one
  • This is called the first-order Markov property

Hidden Markov Model Conditional independence

slide-42
SLIDE 42

Hidden Markov Model Model representation – Trellis graph

slide-43
SLIDE 43

S1 S2 S3 Left to right (A) Left to right (B) Left to right (C) Ergodic

Hidden Markov Model Model topologies

slide-44
SLIDE 44

Weather model:

  • 2 “hidden” states
  • {rainy, cloudy}
  • Measure weather-related variables (e.g. humidity)

Problem: Forecast the weather state, given the current weather variables

[Figure: humidity over time t, with levels 10% and 70%]

Hidden Markov Model Example in weather forecasting

slide-45
SLIDE 45

Hidden Markov Model Example in human-robot collaboration

Suppose that you want to program a robot to provide the worker with components for assembling motor hoses.

The only input to the robot is whether there are components available in the box or not.

The possible states of the worker's technical gestures are: S1 = Take two components, S2 = Join the components, S3 = Screw the components

Probability of having available components in the box: Take 0,8; Join 0,1; Screw 0,4

[Figure: transition diagram over S1, S2, S3 with edge probabilities 0,1 0,85 0,05 0,1 0,75 0,6 0,2 0,15 0,2]

slide-46
SLIDE 46

Hidden Markov Model Example in human-robot collaboration

where Oi is true if the box has a component inside at the moment i and false if not

slide-47
SLIDE 47

Hidden Markov Model Example in human-robot collaboration

Question 1 Suppose that the worker is currently joining the components and that, at the next time stamp, there are components available in the box. Assuming that the prior probability of having components available in the box at any time is 0,5, what’s the probability that the worker is screwing the components at the next time stamp?
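Question 1 is an application of Bayes' rule, which can be sketched as below. The emission probabilities and the prior of 0.5 come from the slides; the Join-to-Screw transition probability is HYPOTHETICAL, because the edge assignments of the transition diagram are not recoverable from the text, and the function name is illustrative.

```python
def posterior_next_state(p_obs_given_state, p_transition, p_obs_prior):
    """Bayes' rule for one step:
    P(q_next = S | q_now, O_next) = P(O_next | S) * P(S | q_now) / P(O_next)."""
    return p_obs_given_state * p_transition / p_obs_prior

# From the slide: probability that components are available in each state
p_available = {"take": 0.8, "join": 0.1, "screw": 0.4}

# HYPOTHETICAL value for a(Join -> Screw); read the real one off the diagram
a_join_to_screw = 0.75

# From the question: prior probability of available components
p_available_prior = 0.5

answer = posterior_next_state(p_available["screw"], a_join_to_screw, p_available_prior)
```

Dividing by the stated prior follows the wording of the question; an alternative is to normalise by the sum of emission times transition over all possible next states.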

slide-48
SLIDE 48

Hidden Markov Model Example in human-robot collaboration

slide-49
SLIDE 49

Hidden Markov Model Example in human-robot collaboration

Question 2 Suppose that the worker is currently joining the components, and that there are components available in the box at time stamp 2 but not at time stamp 3. Assuming that the prior probability of having components available in the box at any time is 0,5, what’s the probability that the worker is screwing the components at time stamp 3?

slide-50
SLIDE 50

Hidden Markov Model Example in human-robot collaboration

slide-51
SLIDE 51
  • Evaluation
  • O, λ → P(O|λ)
  • Uncover the hidden part
  • O, λ → Q that P(Q|O, λ) is maximum
  • Learning
  • {Ο} → λ that P(O|λ) is maximum

Hidden Markov Model Basic problems

slide-52
SLIDE 52

Hidden Markov Model Basic problems

Credits: Aaron Bobick

slide-53
SLIDE 53

O, λ → P(O|λ)

  • Solved by the Forward algorithm

Applications

  • Find some likely samples
  • Evaluation of a sequence of observations
  • Change detection

conditionally independent

Hidden Markov Model Basic problems- Evaluation

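Initialisation: $\alpha_1(i) = \pi_i\, b_i(O_1)$

Induction: $\alpha_{t+1}(j) = \left[ \sum_{i=1}^{N} \alpha_t(i)\, a_{ij} \right] b_j(O_{t+1})$

Termination: $P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i)$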
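The three steps of the Forward algorithm map directly onto a few lines of code. This is a minimal sketch for discrete observations; the two-state model and its parameter values are hypothetical.

```python
import numpy as np

def forward(pi, A, B, obs):
    """Return P(O | lambda) for a discrete HMM lambda = (A, B, pi).
    B[i, x] is the probability of emitting symbol x in state i."""
    alpha = pi * B[:, obs[0]]              # initialisation
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]      # induction
    return float(alpha.sum())              # termination

# Hypothetical two-state, two-symbol model
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])

likelihood = forward(pi, A, B, [0, 1])
```

For long sequences the alpha values underflow, so practical implementations rescale alpha at each step or work with logarithms.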

slide-54
SLIDE 54

O, λ → Q that P(Q|O, λ) is maximum

  • Solved by Viterbi algorithm
  • No « correct » sequence to be found

How to solve it:

  • Use an optimality criterion that depends on the use of the uncovered state sequence

Possible uses:

  • Learn about the structure of the model
  • Get average statistics of the states

Applications

  • Find the real states by maximising the likelihood until a given state
  • Find some recursion given an arbitrary state
  • Used in the learning problem

Hidden Markov Model Basic problems – Uncover the hidden path

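Initialisation: $\delta_1(i) = \pi_i\, b_i(O_1), \quad \psi_1(i) = 0$

Induction: $\delta_t(j) = \max_i \left[ \delta_{t-1}(i)\, a_{ij} \right] b_j(O_t), \quad \psi_t(j) = \arg\max_i \left[ \delta_{t-1}(i)\, a_{ij} \right]$

Termination: $P^* = \max_i \delta_T(i), \quad q_T^* = \arg\max_i \delta_T(i)$

Backtracking: $q_t^* = \psi_{t+1}(q_{t+1}^*)$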
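The four steps of the Viterbi algorithm can be sketched as follows for discrete observations. The two-state model and its parameter values are hypothetical.

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Return the most likely state path and its joint probability."""
    n_states, T = len(pi), len(obs)
    delta = np.zeros((T, n_states))
    psi = np.zeros((T, n_states), dtype=int)
    delta[0] = pi * B[:, obs[0]]                    # initialisation
    for t in range(1, T):                           # induction
        scores = delta[t - 1][:, None] * A          # scores[i, j] = delta_{t-1}(i) * a_ij
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    path = [int(delta[-1].argmax())]                # termination
    for t in range(T - 1, 0, -1):                   # backtracking
        path.append(int(psi[t, path[-1]]))
    path.reverse()
    return path, float(delta[-1].max())

# Hypothetical two-state, two-symbol model
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])

best_path, best_prob = viterbi(pi, A, B, [0, 1])
```

The only difference from the Forward recursion is that the sum over previous states is replaced by a maximum, with the argmax stored for backtracking.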

slide-55
SLIDE 55
  • {Ο} → λ that P(O|λ) is maximum
  • No analytic solution
  • Solved by Baum-Welch (EM variation) when some data is missing (the states)

  • Applications
  • Unsupervised Learning (single HMM)
  • Supervised Learning (multiple HMM)


Hidden Markov Model Basic problems - Learning

slide-56
SLIDE 56

K-Means Model definition

K-Means is a clustering algorithm based on Euclidean distance

  • Select initial centroids at random
  • Assign each object to the cluster with the nearest centroid
  • Compute each centroid as the mean of the objects assigned to it
  • Repeat the previous 2 steps until no change
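The K-Means steps listed above can be sketched as follows. The data points, the seed and the iteration cap are hypothetical choices made for the illustration.

```python
import numpy as np

def k_means(points, k, seed=0, max_iter=100):
    """Plain K-Means with Euclidean distance."""
    rng = np.random.default_rng(seed)
    # Select initial centroids at random among the data points
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iter):
        # Assign each object to the cluster with the nearest centroid
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Recompute each centroid as the mean of its objects
        new_centroids = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else centroids[c]
            for c in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # no change: converged
        centroids = new_centroids
    return centroids, labels

points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
centroids, labels = k_means(points, k=2)
```

In an HMM pipeline, a clustering of this kind is one common way to initialise the emission distributions or to quantise continuous observations into discrete symbols.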

slide-57
SLIDE 57

Weather model:

  • 3 “hidden” states
  • {rainy, cloudy, sunny}
  • Measure weather-related variables (e.g. temperature, humidity, barometric pressure)

Problem:

  • Given the values of the weather variables, what is the state?

[Figure: hidden states above, observed variables below]

Continuous Hidden Markov Model Example in weather forecasting

slide-58
SLIDE 58
  • n states observed through an observation x
  • Model parameters Θ={Θ1, Θ2, …, Θn}

[Figure: hidden states above, observed variable below]

Gaussian Mixture Model Model definition

Model

slide-59
SLIDE 59

  • Let’s consider a gesture dictionary GD with the following gestures: ascending scale, descending scale, ascending arpeggio, descending arpeggio
  • A set of ergodic HMMs, one per gesture:
  • The parameters λi = (Ai , Bi , πi ) of all the HMMs

Example in Gesture Recognition Case study

slide-60
SLIDE 60
  • We want to recognize a gesture
  • It is an ascending arpeggio with its inversion

Example in Gesture Recognition What to recognize

slide-61
SLIDE 61

  • We consider an alphabet of fingerings:
  • State S1: DO with 1st fingering
  • State S2: MI with 2nd fingering
  • State S3: SOL with 3rd fingering
  • State S4: DO with 5th fingering
  • We assume:
  • A={aij} and
  • That Q={q1, q2, q3, q4, q5, q6, q7} constitutes the ascending arpeggio with its inversion
  • π1=P(q1)=1

S4 S2 S3 S1

Example in Gesture Recognition How to model the gesture

slide-62
SLIDE 62

Could another modelling lead to a better physical meaning?

Rest state Start state Attack state

Example in Gesture Recognition How to model the gesture

slide-63
SLIDE 63

With Gaussian distributions. How many for M3?

Example in Gesture Recognition How to model the observations

slide-64
SLIDE 64
  • That the sequence of observations O(t)1:7 (visible sequence) is the following:
  • We assume that M3 has the maximum likelihood since it is the only ergodic model
  • That S(t)1:7 is the state sequence (hidden sequence) that generated O(t)1:7 :

Example in Gesture Recognition How to model the observations

slide-65
SLIDE 65

q1 q2 q3 q4 x1 x2 x7 x6

O(t)1:7 S(t)1:7

P(q2=S2 | q1=S1) P(Ο6=x6 | q2=S6)

Example in Gesture Recognition How to represent the model

slide-66
SLIDE 66

We know:

  • the model M3
  • the sequence O(t)1:7

Which are:

  • the λ=(A, B, π) of M3 that maximize P(O|λ)?

Example in Gesture Recognition How to learn the model

slide-67
SLIDE 67

Viterbi q1 q2 q3 q4 x1 x2 x7 x6

O(t)1:7 Q(t)1:7 We know:

  • the model M3
  • the sequence O(t)1:7

Which are:

  • the Q(t)1:7 that generated O(t)1:7

and maximizes P(Q|O, λ)?

Example in Gesture Recognition How to uncover the hidden path

slide-68
SLIDE 68

q1 q2 q3 q4 x1 x2 x7 x6

O(t)1:7 Q(t)1:7

Forward-Backward

We know:

  • the model M3
  • the sequence O(t)1:7

How to:

  • calculate P(O(t)1:7 | M3)?

Example in Gesture Recognition How to evaluate a sequence of observations

slide-69
SLIDE 69

Sequence of observations

Μ1 Μ2 Μ4

…. ….

Gesture recognition Likelihood computation Maximum likelihood computation Likelihood computation Likelihood computation

O(t)1:7

Example in Gesture Recognition How to recognize the gestures
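Recognition by maximum likelihood can be sketched as follows: the observation sequence is scored by every gesture model with the Forward algorithm, and the model with the highest likelihood gives the recognised gesture. The two models and their parameters are hypothetical stand-ins for the trained HMMs M1, M2, ….

```python
import numpy as np

def likelihood(model, obs):
    """P(O | lambda) by the Forward algorithm for a discrete HMM."""
    pi, A, B = model
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return float(alpha.sum())

def recognise(models, obs):
    """Score the sequence with every model and keep the most likely one."""
    scores = {name: likelihood(m, obs) for name, m in models.items()}
    return max(scores, key=scores.get), scores

# HYPOTHETICAL left-to-right models over two observation symbols
pi = np.array([1.0, 0.0])
A = np.array([[0.5, 0.5],
              [0.0, 1.0]])
models = {
    "ascending": (pi, A, np.array([[0.9, 0.1], [0.1, 0.9]])),
    "descending": (pi, A, np.array([[0.1, 0.9], [0.9, 0.1]])),
}

gesture, scores = recognise(models, [0, 0, 1, 1])
```

A rejection threshold on the best likelihood is a common extension when the input may not belong to any gesture in the dictionary.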

slide-70
SLIDE 70

Repeat n times, leaving out Setk, k=1, 2, …, n: learning on the remaining sets, recognition on the left-out set

  • t*1 = t(Set2, …, Setn)
  • t*2 = t(Set1, Set3, …, Setn)
  • t*n = t(Set1, …, Setn-1)

The statistic t(Set1, Set2, …, Setn) is estimated by the statistics t*1, t*2, …, t*n

Example in Gesture Recognition How to evaluate the system
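The leave-one-set-out procedure can be sketched generically. The `learn` and `recognise` functions below are hypothetical placeholders (a nearest-mean classifier on one-dimensional features) standing in for HMM training and recognition, and the data are invented for the illustration.

```python
def leave_one_out(sets, learn, recognise):
    """For each k, learn on all sets except Set_k, then test on Set_k.
    Returns the per-fold statistics and their mean."""
    stats = []
    for k in range(len(sets)):
        training = [s for i, s in enumerate(sets) if i != k]
        model = learn(training)
        stats.append(recognise(model, sets[k]))
    return stats, sum(stats) / len(stats)

def learn(training_sets):
    """Placeholder learner: mean feature value per label."""
    sums, counts = {}, {}
    for s in training_sets:
        for x, label in s:
            sums[label] = sums.get(label, 0.0) + x
            counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def recognise(model, test_set):
    """Placeholder recogniser: accuracy of nearest-mean classification."""
    correct = 0
    for x, label in test_set:
        predicted = min(model, key=lambda c: abs(model[c] - x))
        correct += predicted == label
    return correct / len(test_set)

# Hypothetical data: three sets of (feature, gesture label) pairs
sets = [
    [(0.1, "a"), (0.9, "b")],
    [(0.2, "a"), (1.1, "b")],
    [(0.0, "a"), (1.0, "b")],
]
fold_stats, mean_stat = leave_one_out(sets, learn, recognise)
```

Each set is used exactly once for testing, so no gesture sample is ever recognised by a model that was trained on it.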

slide-71
SLIDE 71

Example in Gesture Recognition Statistics to be computed

slide-72
SLIDE 72

Example in Gesture Recognition How to create the confusion matrix
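A confusion matrix can be built by counting (true gesture, recognised gesture) pairs. In the sketch below the recognition results are invented, and precision and recall are shown as examples of derived statistics; the slides do not say which statistics the author computes.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true gestures, columns are recognised gestures."""
    index = {c: i for i, c in enumerate(classes)}
    m = np.zeros((len(classes), len(classes)), dtype=int)
    for t, p in zip(true_labels, predicted_labels):
        m[index[t], index[p]] += 1
    return m

def precision_recall(m):
    """Per-class precision (column-wise) and recall (row-wise)."""
    tp = np.diag(m).astype(float)
    col, row = m.sum(axis=0), m.sum(axis=1)
    precision = np.divide(tp, col, out=np.zeros_like(tp), where=col > 0)
    recall = np.divide(tp, row, out=np.zeros_like(tp), where=row > 0)
    return precision, recall

classes = ["ascending scale", "descending scale",
           "ascending arpeggio", "descending arpeggio"]
# HYPOTHETICAL recognition results, two trials per gesture
true_labels = [classes[0], classes[0], classes[1], classes[1],
               classes[2], classes[2], classes[3], classes[3]]
predicted = [classes[0], classes[2], classes[1], classes[1],
             classes[2], classes[2], classes[3], classes[1]]

m = confusion_matrix(true_labels, predicted, classes)
precision, recall = precision_recall(m)
```

Off-diagonal entries show which gestures are mistaken for one another, which is often more informative than a single accuracy figure.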