

SLIDE 1

Modeling time series with hidden Markov models

Advanced Machine learning 2017

Nadia Figueroa, Jose Medina and Aude Billard

SLIDE 2

Time series data

Modeling time series with HMMs 2

What's going on here?

(plot: Humidity, Barometric pressure, and Temperature data over time)

SLIDE 3

Time series data

What's the problem setting?

We don't care about time …

We have several trajectories with identical duration

Explicit time dependency

We have unstructured trajectory(ies)!

Consider dependency on the past

Too complex!

SLIDE 4

Unstructured time series data


How to simplify this problem?

Consider dependency on the past

Markov assumption

(state diagram: Sunny, Cloudy, Rainy)
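The (first-order) Markov assumption can be written out: the next state depends only on the current state, not on the full history.

```latex
% First-order Markov assumption
P(s_{t+1} \mid s_1, s_2, \dots, s_t) = P(s_{t+1} \mid s_t)
```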

SLIDE 5

Outline

First part (10:15 – 11:00):

  • Recap on Markov chains
  • Hidden Markov Model (HMM)
  • Recognition of time series
  • ML Parameter estimation

Second part (11:15 – 12:00):

  • Time series segmentation
  • Bayesian nonparametrics for HMMs

https://github.com/epfl-lasa/ML_toolbox

SLIDE 6

Outline first part

(state diagram: Sunny, Cloudy, Rainy)

SLIDE 7

Markov chains

(state diagram: Sunny, Cloudy, Rainy)

Transition matrix (rows/columns: Sunny, Cloudy, Rainy)

Initial probabilities (Sunny, Cloudy, Rainy)

SLIDE 8

Example sequence: Sunny → Sunny → Cloudy

Transition matrix (rows/columns: Sunny, Cloudy, Rainy)

Initial probabilities (Sunny, Cloudy, Rainy)
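As a sketch, the likelihood of an observed state sequence is the initial probability times the chain of transition probabilities. The numbers in `pi` and `A` below are illustrative placeholders, not the values from the slides:

```python
# Likelihood of the Markov chain Sunny -> Sunny -> Cloudy.
# pi and A are illustrative placeholders (not the slide's values).
states = ["Sunny", "Cloudy", "Rainy"]
idx = {s: i for i, s in enumerate(states)}

pi = [0.5, 0.3, 0.2]            # initial state probabilities
A = [[0.7, 0.2, 0.1],           # transition matrix: A[from][to]
     [0.3, 0.4, 0.3],
     [0.2, 0.4, 0.4]]

def chain_likelihood(seq):
    """P(s_1, ..., s_T) = pi(s_1) * prod_t A(s_{t-1}, s_t)."""
    p = pi[idx[seq[0]]]
    for prev, cur in zip(seq, seq[1:]):
        p *= A[idx[prev]][idx[cur]]
    return p

print(chain_likelihood(["Sunny", "Sunny", "Cloudy"]))  # 0.5 * 0.7 * 0.2
```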

SLIDE 9

Example sequence: Sunny → Sunny → Cloudy

Topologies: Periodic, Left-to-right, Ergodic
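Learning a fully observed Markov chain reduces to counting: the maximum-likelihood estimate of each transition probability is the normalized transition count. A sketch with made-up weather sequences:

```python
# ML estimation of a Markov chain: count transitions, normalize each row.
# The observed sequences below are made up for illustration.
from collections import Counter

states = ["Sunny", "Cloudy", "Rainy"]
sequences = [
    ["Sunny", "Sunny", "Cloudy", "Rainy", "Rainy"],
    ["Cloudy", "Sunny", "Sunny", "Cloudy"],
]

counts = Counter()
for seq in sequences:
    for prev, cur in zip(seq, seq[1:]):
        counts[(prev, cur)] += 1

A_hat = {}
for s in states:
    total = sum(counts[(s, t)] for t in states)
    A_hat[s] = {t: (counts[(s, t)] / total if total else 0.0) for t in states}

# e.g. A_hat["Sunny"]["Cloudy"] is 0.5 for these sequences
```

A zero row (a state never visited) is left as all zeros here; in practice one would add pseudo-counts (a Dirichlet prior) to avoid zero transition probabilities.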

SLIDE 10

Outline first part

(state diagram: Sunny, Cloudy, Rainy)

SLIDE 11

Hidden Markov model

(state diagram: Sunny, Cloudy, Rainy)

Transition matrix (rows/columns: Sunny, Cloudy, Rainy)

Initial probabilities (Sunny, Cloudy, Rainy)

SLIDE 12

Likelihood of an HMM

Transition matrix (rows/columns: Sunny, Cloudy, Rainy)

Initial probabilities (Sunny, Cloudy, Rainy)

Forward variable

SLIDE 13

Likelihood of an HMM

Transition matrix (rows/columns: Sunny, Cloudy, Rainy)

Initial probabilities (Sunny, Cloudy, Rainy)

Forward variable

SLIDE 14

Likelihood of an HMM

Transition matrix (rows/columns: Sunny, Cloudy, Rainy)

Initial probabilities (Sunny, Cloudy, Rainy)

Backward variable
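A sketch of the forward and backward recursions for a discrete-emission HMM. All probabilities (`pi`, `A`, `B`) are illustrative placeholders, and the two observation symbols ("dry"/"humid") are invented for the example; the point is that both passes yield the same sequence likelihood:

```python
# Forward and backward variables for a 3-state HMM with 2 observation symbols.
# alpha[t][i] = P(o_1..o_t, s_t = i); beta[t][i] = P(o_{t+1}..o_T | s_t = i).
pi = [0.5, 0.3, 0.2]                  # initial probabilities (Sunny, Cloudy, Rainy)
A = [[0.7, 0.2, 0.1],                 # transition matrix A[from][to]
     [0.3, 0.4, 0.3],
     [0.2, 0.4, 0.4]]
B = [[0.8, 0.2],                      # emission probabilities per state,
     [0.5, 0.5],                      # columns = observations ("dry", "humid")
     [0.1, 0.9]]

def forward(obs):
    alpha = [[pi[i] * B[i][obs[0]] for i in range(3)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[j] * A[j][i] for j in range(3)) * B[i][o]
                      for i in range(3)])
    return alpha

def backward(obs):
    beta = [[1.0] * 3]                # beta_T(i) = 1
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[j][o] * nxt[j] for j in range(3))
                        for i in range(3)])
    return beta

obs = [0, 0, 1]                       # dry, dry, humid
likelihood = sum(forward(obs)[-1])    # P(O) = sum_i alpha_T(i)
# the same value, computed from the backward pass:
likelihood_b = sum(pi[i] * B[i][obs[0]] * backward(obs)[0][i] for i in range(3))
```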

SLIDE 15

Learning an HMM

Baum-Welch algorithm (Expectation-Maximization for HMMs)

  • Iterative solution
  • Converges to a local optimum of the likelihood

Starting from an initial model λ, find a new model λ̄ such that P(O | λ̄) ≥ P(O | λ):

  • E-step: Given an observation sequence and a model, find the probabilities of the states to have produced those observations.
  • M-step: Given the output of the E-step, update the model parameters to better fit the observations.

SLIDE 16

Learning an HMM

E-step: compute the probability of being in state i at time k, and the probability of being in state i at time k and transitioning to state j.
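In the standard notation (forward variable α, backward variable β), these two E-step quantities are usually written as:

```latex
% gamma: probability of being in state i at time k
\gamma_k(i) = P(s_k = i \mid O, \lambda)
            = \frac{\alpha_k(i)\,\beta_k(i)}{\sum_{j} \alpha_k(j)\,\beta_k(j)}

% xi: probability of being in state i at time k and transitioning to state j
\xi_k(i,j) = P(s_k = i,\, s_{k+1} = j \mid O, \lambda)
           = \frac{\alpha_k(i)\, a_{ij}\, b_j(o_{k+1})\, \beta_{k+1}(j)}{P(O \mid \lambda)}
```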

SLIDE 17

Learning an HMM

M-step: re-estimate the model parameters from the probability of being in state i at time k and the probability of being in state i at time k and transitioning to state j.
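The corresponding M-step updates, in the same standard notation (γ and ξ from the E-step, discrete emissions b):

```latex
% updated initial probabilities, transition matrix and emission probabilities
\bar{\pi}_i = \gamma_1(i), \qquad
\bar{a}_{ij} = \frac{\sum_{k=1}^{T-1} \xi_k(i,j)}{\sum_{k=1}^{T-1} \gamma_k(i)}, \qquad
\bar{b}_j(v) = \frac{\sum_{k:\, o_k = v} \gamma_k(j)}{\sum_{k=1}^{T} \gamma_k(j)}
```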

SLIDE 18

Learning an HMM

  • HMM is a parametric technique (fixed number of states, fixed topology)
  • Heuristics for determining the optimal number of states:

  • Akaike Information Criterion: AIC = −2 ln L + 2K
  • Bayesian Information Criterion: BIC = −2 ln L + K ln N

L: maximum likelihood of the model given K parameters; X: dataset; N: number of datapoints; K: number of free parameters.

Lower BIC implies either fewer explanatory variables, better fit, or both. As the number of datapoints (observations) increases, BIC assigns more weight to simpler models than AIC does.

Choosing AIC versus BIC depends on the application: is the purpose of the analysis to make predictions, or to decide which model best represents reality? AIC may have better predictive ability than BIC, but BIC finds a computationally more efficient solution.
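A minimal sketch of using AIC/BIC to pick the number of HMM states. The log-likelihoods and the parameter-count formula below are hypothetical examples, not values from the lecture:

```python
from math import log

def aic(log_likelihood, k):
    """Akaike Information Criterion: AIC = -2 ln L + 2K."""
    return -2.0 * log_likelihood + 2.0 * k

def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: BIC = -2 ln L + K ln N."""
    return -2.0 * log_likelihood + k * log(n)

# Hypothetical maximized log-likelihoods of HMMs with different state counts:
log_liks = {2: -1210.0, 3: -1185.0, 5: -1180.0}
n_obs = 1000

def n_free_params(K, d=1):
    # Free parameters of a K-state HMM with d-dim Gaussian emissions
    # (diagonal covariance): initial probs + transitions + means + variances.
    return (K - 1) + K * (K - 1) + K * d + K * d

best_bic = min(log_liks, key=lambda K: bic(log_liks[K], n_free_params(K), n_obs))
# BIC's penalty (K ln N vs 2K) is stronger than AIC's once N > e^2,
# so it leans toward the smaller model when the fits are close.
```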

SLIDE 19

Applications of HMMs

State estimation: What is the most probable state/state sequence of the system?

Prediction: What are the most probable next observations/state of the system?

Model selection: What is the most likely model that represents these observations?

SLIDE 20

Examples

Speech recognition:

  • Left-to-right model
  • States are phonemes
  • Observations in frequency domain

D.B. Paul, Speech Recognition Using Hidden Markov Models, The Lincoln Laboratory Journal, 1990

SLIDE 21

Examples

Motion prediction:

  • Periodic model
  • Observations are observed joints
  • Simulate/predict walking patterns

Karg, Michelle, et al., "Human movement analysis: Extension of the F-statistic to time series using HMM," IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2013

SLIDE 22

Examples

Motion prediction:

  • Left-to-right models
  • Autonomous segmentation
  • Recognition + prediction

SLIDE 23

Examples

Motion prediction:

  • Left-to-right models
  • Autonomous segmentation
  • Recognition + prediction

SLIDE 24

Examples

Motion prediction:

  • Left-to-right model
  • Each state is a dynamical system

SLIDE 25

Examples

Motion recognition:

  • Recognition of most likely motion and prediction of next step.

MATLAB demo

Toy training set:

  • 1 player
  • 7 actions
  • 1 hidden Markov model per action
SLIDE 26

Outline

First part (10:15 – 11:00):

  • Recap on Markov chains
  • Hidden Markov Model (HMM)
  • Recognition of time series
  • ML Parameter estimation

Second part (11:15 – 12:00):

  • Time series segmentation
  • Bayesian non-parametrics for HMMs

https://github.com/epfl-lasa/ML_toolbox

SLIDE 27

Time series Segmentation

Time-series = sequence of discrete segments

Why is this an important problem?

SLIDE 28

Segmentation of Speech Signals


Segmenting a continuous speech signal into sets of distinct words.

SLIDE 29

"I am on a seafood diet. I see food and I eat it!"

Segmentation of Speech Signals

Segmenting a continuous speech signal into sets of distinct words.

SLIDE 30

Segmentation of Human Motion Data


Emily Fox et al., Sharing Features among Dynamical Systems with Beta Processes, NIPS, 2009

Segmentation of continuous motion-capture data from exercise routines into motion categories: Jumping Jacks, Arm Circles, Squats, Knee Raises.

SLIDE 31

Segmentation in Human Motion Data

12 variables:

  • Torso position
  • Waist angles (2)
  • Neck angle
  • Shoulder angles
  • …

Emily Fox et al., Sharing Features among Dynamical Systems with Beta Processes, NIPS, 2009

SLIDE 32

Segmentation in Robotics


Learning Complex Sequential Tasks from Demonstration

7 variables:

  • Position
  • Orientation

(task segments: Reach, Grate, Trash)

SLIDE 33

HMM for Time series Segmentation

Assumptions:

  • The time-series has been generated by a system that transitions between a set of hidden states.
  • At each time step, a sample is drawn from an emission model associated to the current hidden state.
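Written out, the two assumptions correspond to the generative model (with transition matrix entries a_ij and per-state emission parameters θ_i; the exact emission family is left unspecified here):

```latex
% hidden state sequence: first-order Markov chain over K states
s_{t+1} \mid s_t = i \;\sim\; \mathrm{Cat}(a_{i1}, \dots, a_{iK})

% observation drawn from the emission model of the current hidden state
o_t \mid s_t = i \;\sim\; p(o \mid \theta_i)
```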

SLIDE 34

HMM for Time series Segmentation


How do we find these segments?

SLIDE 35

HMM for Time series Segmentation

Steps for segmentation with an HMM:

1. Learn the HMM parameters (initial state probabilities, transition matrix, emission model parameters) by maximizing the HMM likelihood, using the Baum-Welch algorithm (Expectation-Maximization for HMMs):

  • Iterative solution
  • Converges to a local optimum of the likelihood

Hyper-parameter: number of possible states K

SLIDE 36

HMM for Time series Segmentation

Steps for segmentation with an HMM:

2. Find the most probable sequence of states generating the observations through the Viterbi algorithm, which maximizes the HMM joint probability distribution.

SLIDE 37

HMM for Time series Segmentation


SLIDE 38

HMM for Time series Segmentation


SLIDE 39

Model Selection for HMMs


SLIDE 40

Model Selection for HMMs


?

SLIDE 41

Limitations of classical finite HMMs for Segmentation

Cardinality: The choice of the number of hidden states is based on model-selection heuristics; there is little understanding of the strengths and weaknesses of such methods in this setting [1]. (Undefined # of hidden states)

Topology: We assume that all time series share the same set of emission models and switch among them in exactly the same manner [2]. (Fixed transition matrix)

[1] Emily Fox et al., An HDP-HMM for Systems with State Persistence, ICML, 2008
[2] Emily Fox et al., Sharing Features among Dynamical Systems with Beta Processes, NIPS, 2009

Solution: Bayesian Non-Parametrics

SLIDE 42

Bayesian Non-Parametrics

  • Bayesian: Use Bayesian inference to estimate the parameters; i.e. priors on model parameters!

  • Non-parametric: Does NOT mean methods with "no parameters", rather models whose complexity (# of states, # of Gaussians) is inferred from the data:

  • 1. The number of parameters grows with sample size.
  • 2. Infinite-dimensional parameter space!

SLIDE 43

BNP for HMMs: HDP-HMM

✓ Cardinality

  • Hierarchical Dirichlet Process (HDP) prior on the transition matrix!
  • Normal Inverse Wishart (NIW) prior on emission parameters!

Hyper-parameters!

Emily Fox et al., An HDP-HMM for Systems with State Persistence, ICML, 2008

SLIDE 44

BNP for HMMs: HDP-HMM

✓ Cardinality

  • The Dirichlet Process (DP) is a prior distribution over distributions.
  • Used for clustering with infinite mixture models; i.e. instead of setting K in a GMM, K is learned from the data. This only gives an estimate of the K clusters! It cannot be used directly on the transition matrix.
  • The Hierarchical Dirichlet Process (HDP) is a hierarchy of DPs!
SLIDE 45

Segmentation with HDP-HMM


Emily Fox et al., An HDP-HMM for Systems with State Persistence, ICML, 2008

SLIDE 46

Compare to Model Selection with Classical HMMs


SLIDE 47

BNP for HMMs: BP-HMM

✓ Cardinality  ✓ Topology

  • Beta Process (BP) prior on the transition matrix!
  • Normal Inverse Wishart (NIW) prior on emission parameters!

Emily Fox et al., Sharing Features among Dynamical Systems with Beta Processes, NIPS, 2009

SLIDE 48

BNP for HMMs: BP-HMM

✓ Cardinality  ✓ Topology

  • Beta Process (BP) prior on the transition matrix!

(feature matrix: features, i.e. shared HMM states, × time-series)

SLIDE 49

Segmentation in Human Motion Data

12 variables:

  • Torso position
  • Waist angles (2)
  • Neck angle
  • Shoulder angles
  • …

Emily Fox et al., Sharing Features among Dynamical Systems with Beta Processes, NIPS, 2009

SLIDE 50

Applications in Robotics


Learning Complex Sequential Tasks from Demonstration

SLIDE 51
SLIDE 52
SLIDE 53
SLIDE 54