Exact Inference for Hidden Markov Models


  1. Exact Inference for Hidden Markov Models
     Michael Gutmann
     Probabilistic Modelling and Reasoning (INFR11134)
     School of Informatics, University of Edinburgh
     Spring Semester 2020

  2. Recap
     ◮ Assuming a factorisation / set of statistical independencies allowed us to efficiently represent the pdf or pmf of random variables
     ◮ The factorisation can be exploited for inference
       ◮ by using the distributive law
       ◮ by re-using already computed quantities
     ◮ Inference for general factor graphs (variable elimination)
     ◮ Inference for factor trees
       ◮ Sum-product and max-product message passing

  3. Program
     1. Markov models
     2. Inference by message passing

  4. Program
     1. Markov models
        Markov chains
        Transition distribution
        Hidden Markov models
        Emission distribution
        Mixture of Gaussians as special case
     2. Inference by message passing

  5. Applications of (hidden) Markov models
     Markov and hidden Markov models have many applications, e.g.
     ◮ speech modelling (speech recognition)
     ◮ text modelling (natural language processing)
     ◮ gene sequence modelling (bioinformatics)
     ◮ spike train modelling (neuroscience)
     ◮ object tracking (robotics)

  6. Markov chains
     ◮ Chain rule with ordering $x_1, \dots, x_d$:
       $$p(x_1, \dots, x_d) = \prod_{i=1}^{d} p(x_i \mid x_1, \dots, x_{i-1})$$
     ◮ If $p$ satisfies the ordered Markov property, the number of variables in the conditioning set can be reduced to a subset $\pi_i \subseteq \{x_1, \dots, x_{i-1}\}$
     ◮ Not all predecessors but only the subset $\pi_i$ is "relevant" for $x_i$.
     ◮ $L$-th order Markov chain: $\pi_i = \{x_{i-L}, \dots, x_{i-1}\}$
       $$p(x_1, \dots, x_d) = \prod_{i=1}^{d} p(x_i \mid x_{i-L}, \dots, x_{i-1})$$
     ◮ 1st order Markov chain: $\pi_i = \{x_{i-1}\}$
       $$p(x_1, \dots, x_d) = \prod_{i=1}^{d} p(x_i \mid x_{i-1})$$
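(Not part of the slides: a minimal Python sketch of the first-order factorisation above. The 3-state initial distribution and transition matrix are made-up illustrative numbers; the function just evaluates the joint log-probability of a state sequence via the product of conditionals.)

```python
import numpy as np

# Illustrative 3-state first-order (homogeneous) Markov chain; p0 and A are made up.
# p0[k] = p(x_1 = k), A[j, k] = p(x_i = k | x_{i-1} = j); rows of A sum to one.
p0 = np.array([0.5, 0.3, 0.2])
A = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

def chain_log_prob(x, p0, A):
    """log p(x_1, ..., x_d) = log p(x_1) + sum_{i>=2} log p(x_i | x_{i-1})."""
    logp = np.log(p0[x[0]])
    for prev, cur in zip(x[:-1], x[1:]):
        logp += np.log(A[prev, cur])
    return logp

print(chain_log_prob([0, 0, 1, 2], p0, A))  # joint log-probability of the sequence 0, 0, 1, 2
```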

  7. Markov chain — DAGs
     [Figure: DAGs over $x_1, x_2, x_3, x_4$ for the full chain rule (fully connected), a second-order Markov chain, and a first-order Markov chain.]

  8. Vector-valued Markov chains
     ◮ While not explicitly discussed, the graphical models extend to vector-valued variables
     ◮ Chain rule with ordering $x_1, \dots, x_d$:
       $$p(x_1, \dots, x_d) = \prod_{i=1}^{d} p(x_i \mid x_1, \dots, x_{i-1})$$
       [Figure: fully connected DAG over $x_1, \dots, x_4$]
     ◮ 1st order Markov chain:
       $$p(x_1, \dots, x_d) = \prod_{i=1}^{d} p(x_i \mid x_{i-1})$$
       [Figure: chain-structured DAG $x_1 \to x_2 \to x_3 \to x_4$]

  9. Modelling time series
     ◮ The index $i$ may refer to time $t$
     ◮ $L$-th order Markov chain of length $T$:
       $$p(x_1, \dots, x_T) = \prod_{t=1}^{T} p(x_t \mid x_{t-L}, \dots, x_{t-1})$$
       Only the recent past of $L$ time points $x_{t-L}, \dots, x_{t-1}$ is relevant for $x_t$
     ◮ 1st order Markov chain of length $T$:
       $$p(x_1, \dots, x_T) = \prod_{t=1}^{T} p(x_t \mid x_{t-1})$$
       Only the last time point $x_{t-1}$ is relevant for $x_t$.

  10. Transition distribution (consider a 1st order Markov chain)
     ◮ $p(x_i \mid x_{i-1})$ is called the transition distribution
     ◮ For discrete random variables, $p(x_i \mid x_{i-1})$ is defined by a transition matrix $A^i$:
       $$p(x_i = k \mid x_{i-1} = k') = A^i_{k,k'}$$
     ◮ For continuous random variables, $p(x_i \mid x_{i-1})$ is a conditional pdf, e.g.
       $$p(x_i \mid x_{i-1}) = \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left(-\frac{(x_i - f_i(x_{i-1}))^2}{2\sigma_i^2}\right)$$
       for some function $f_i$
     ◮ Homogeneous Markov chain: $p(x_i \mid x_{i-1})$ does not depend on $i$, e.g. $A^i = A$, $\sigma_i = \sigma$, $f_i = f$
     ◮ Inhomogeneous Markov chain: $p(x_i \mid x_{i-1})$ does depend on $i$
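(Not part of the slides: for the continuous homogeneous case, the Gaussian transition density above amounts to generating $x_i$ by applying $f$ to $x_{i-1}$ and adding Gaussian noise. A minimal ancestral-sampling sketch; the choices $f(x) = 0.9x$ and $\sigma = 0.5$ are purely illustrative.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Homogeneous continuous-valued chain: p(x_i | x_{i-1}) = N(x_i; f(x_{i-1}), sigma^2).
# f and sigma below are illustrative choices, not from the slides.
f = lambda x: 0.9 * x
sigma = 0.5

def sample_chain(T, x1, rng):
    """Ancestral sampling: draw x_i given x_{i-1}, one step at a time."""
    x = [x1]
    for _ in range(T - 1):
        x.append(f(x[-1]) + sigma * rng.standard_normal())
    return np.array(x)

print(sample_chain(T=6, x1=0.0, rng=rng))
```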

  11. Hidden Markov model
     DAG: [chain $h_1 \to h_2 \to h_3 \to h_4$ over the hidden variables, with an emission edge $h_i \to v_i$ for each visible variable]
     ◮ 1st order Markov chain on the hidden (latent) variables $h_i$
     ◮ Each visible (observed) variable $v_i$ only depends on the corresponding hidden variable $h_i$
     ◮ Factorisation:
       $$p(h_{1:d}, v_{1:d}) = p(v_1 \mid h_1)\, p(h_1) \prod_{i=2}^{d} p(v_i \mid h_i)\, p(h_i \mid h_{i-1})$$
     ◮ The visibles are d-connected if the hiddens are not observed
     ◮ The visibles are d-separated (independent) given the hiddens
     ◮ The $h_i$ model/explain all dependencies between the $v_i$
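(Not part of the slides: a minimal sketch of ancestral sampling that follows the factorisation above, for a small discrete HMM. The 2 hidden states, 3 observed symbols, and all numbers are made up for illustration.)

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative HMM with 2 hidden states and 3 observed symbols (all numbers made up).
p_h1 = np.array([0.6, 0.4])          # p(h_1)
A = np.array([[0.7, 0.3],            # A[j, k] = p(h_i = k | h_{i-1} = j)
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],       # B[k, m] = p(v_i = m | h_i = k)
              [0.1, 0.3, 0.6]])

def sample_hmm(d, rng):
    """Ancestral sampling following
    p(h_{1:d}, v_{1:d}) = p(v_1|h_1) p(h_1) prod_{i>=2} p(v_i|h_i) p(h_i|h_{i-1})."""
    h = [rng.choice(2, p=p_h1)]
    v = [rng.choice(3, p=B[h[0]])]
    for _ in range(d - 1):
        h.append(rng.choice(2, p=A[h[-1]]))
        v.append(rng.choice(3, p=B[h[-1]]))
    return h, v

print(sample_hmm(5, rng))
```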

  12. Emission distribution
     ◮ $p(v_i \mid h_i)$ is called the emission distribution
     ◮ Discrete-valued $v_i$ and $h_i$: $p(v_i \mid h_i)$ can be represented as a matrix
     ◮ Discrete-valued $v_i$ and continuous-valued $h_i$: $p(v_i \mid h_i)$ is a conditional pmf
     ◮ Continuous-valued $v_i$: $p(v_i \mid h_i)$ is a density
     ◮ As for the transition distribution, the emission distribution $p(v_i \mid h_i)$ may or may not depend on $i$
     ◮ If neither the transition nor the emission distribution depends on $i$, we have a stationary (or homogeneous) hidden Markov model.

  13. Gaussian emission model with discrete-valued latents
     ◮ Special case: $h_i \perp\!\!\!\perp h_{i-1}$, with $v_i \in \mathbb{R}^m$ and $h_i \in \{1, \dots, K\}$:
       $$p(h = k) = p_k \qquad p(v \mid h = k) = \frac{1}{|\det 2\pi\Sigma_k|^{1/2}} \exp\left(-\frac{1}{2}(v - \mu_k)^\top \Sigma_k^{-1}(v - \mu_k)\right)$$
       for all $h_i$ and $v_i$.
     ◮ DAG: [each $h_i \to v_i$, with no edges between the $h_i$]
     ◮ Corresponds to $d$ iid draws from a Gaussian mixture model with $K$ mixture components
     ◮ Mean $\mathbb{E}[v \mid h = k] = \mu_k$
     ◮ Covariance matrix $\mathbb{V}[v \mid h = k] = \Sigma_k$
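(Not part of the slides: a minimal sketch of this special case. With $h_i$ independent of $h_{i-1}$, sampling the model is just drawing $d$ iid points from a Gaussian mixture; the weights, means and covariances below are made up.)

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative K=3 Gaussian mixture in R^2; weights, means and covariances are made up.
p = np.array([0.5, 0.3, 0.2])                                          # p(h = k)
mu = np.array([[0.0, 0.0], [2.0, 2.0], [-2.0, 1.0]])                   # mu_k
Sigma = np.stack([0.3 * np.eye(2), 0.5 * np.eye(2), 0.2 * np.eye(2)])  # Sigma_k

def sample_mixture(d, rng):
    """With h_i independent of h_{i-1}, the model reduces to d iid draws from the mixture."""
    h = rng.choice(len(p), size=d, p=p)                                 # cluster memberships
    v = np.array([rng.multivariate_normal(mu[k], Sigma[k]) for k in h])
    return h, v

print(sample_mixture(4, rng))
```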

  14. Gaussian emission model with discrete-valued latents
     The HMM is a generalisation of the Gaussian mixture model where cluster membership at "time" $i$ (the value of $h_i$) generally depends on cluster membership at "time" $i-1$ (the value of $h_{i-1}$).
     [Figure: example for $v_i \in \mathbb{R}^2$, $h_i \in \{1, 2, 3\}$. Left: $p(v \mid h = k)$ for $k = 1, 2, 3$. Right: samples. (Bishop, Figure 13.8)]

  15. Program
     1. Markov models
        Markov chains
        Transition distribution
        Hidden Markov models
        Emission distribution
        Mixture of Gaussians as special case
     2. Inference by message passing

  16. Program
     1. Markov models
     2. Inference by message passing
        Inference: filtering, prediction, smoothing, Viterbi
        Filtering: sum-product message passing yields the alpha-recursion from the HMM literature
        Smoothing: sum-product message passing yields the alpha-beta recursion from the HMM literature
        Sum-product message passing for prediction, inference of the most likely hidden path, and inference of joint distributions

  17. The classical inference problems (considering the index $i$ to refer to time $t$)

     Filtering (inferring the present):            $p(h_t \mid v_{1:t})$
     Smoothing (inferring the past):               $p(h_t \mid v_{1:u})$, $t < u$
     Prediction (inferring the future):            $p(h_t \mid v_{1:u})$, $t > u$
     Most likely hidden path (Viterbi alignment):  $\operatorname{argmax}_{h_{1:t}} p(h_{1:t} \mid v_{1:t})$

     For prediction, one is also often interested in $p(v_t \mid v_{1:u})$ for $t > u$.
     (slide courtesy of David Barber)
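(Not part of the slides: to make the filtering problem concrete, a minimal sketch of the forward (alpha) recursion that, per slide 16, falls out of sum-product message passing. The derivation comes later in the course; `A`, `B` and `p_h1` are assumed to be given in the same parametrisation as the illustrative HMM above.)

```python
import numpy as np

def filtering(v, p_h1, A, B):
    """Return p(h_t | v_{1:t}) for t = 1, ..., T via the forward (alpha) recursion,
    renormalising at every step for numerical stability.
    A[j, k] = p(h_i = k | h_{i-1} = j), B[k, m] = p(v_i = m | h_i = k)."""
    alpha = p_h1 * B[:, v[0]]             # proportional to p(h_1, v_1)
    alpha /= alpha.sum()
    posteriors = [alpha]
    for t in range(1, len(v)):
        alpha = B[:, v[t]] * (alpha @ A)  # predict with A, then weight by the emission
        alpha /= alpha.sum()
        posteriors.append(alpha)
    return np.array(posteriors)

# Usage with the illustrative parameters from the sampling sketch above:
# print(filtering([0, 2, 1, 0], p_h1, A, B))
```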

  18. The classical inference problems
     [Figure: timelines illustrating filtering, smoothing, and prediction; the shaded bar denotes the extent of the data available relative to the time point $t$ being inferred. (slide courtesy of Chris Williams)]

  19. Factor graph for hidden Markov model (see tutorial 4)
     DAG: [chain $h_1 \to h_2 \to h_3 \to h_4$ with emission edges $h_i \to v_i$]
     Factor graph: [variable nodes $h_1, \dots, h_4$ connected in a chain through the factors $p(h_1)$, $p(h_2 \mid h_1)$, $p(h_3 \mid h_2)$, $p(h_4 \mid h_3)$; each $h_i$ is additionally connected to the variable node $v_i$ through the emission factor $p(v_i \mid h_i)$]
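(Not part of the slides: the smoothing distributions $p(h_t \mid v_{1:T})$ named on slide 16 can be obtained by combining the forward (alpha) messages with backward (beta) messages on this factor graph. A minimal sketch of the alpha-beta recursion under the same assumed parametrisation as above; the derivation is left to the later slides.)

```python
import numpy as np

def smoothing(v, p_h1, A, B):
    """Return p(h_t | v_{1:T}) for all t via the alpha-beta recursion,
    with per-step renormalisation for numerical stability."""
    T, K = len(v), len(p_h1)
    # Forward pass: alpha[t] proportional to p(h_t, v_{1:t})
    alpha = np.zeros((T, K))
    alpha[0] = p_h1 * B[:, v[0]]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = B[:, v[t]] * (alpha[t - 1] @ A)
        alpha[t] /= alpha[t].sum()
    # Backward pass: beta[t] proportional to p(v_{t+1:T} | h_t)
    beta = np.ones((T, K))
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, v[t + 1]] * beta[t + 1])
        beta[t] /= beta[t].sum()
    # Combine and normalise: p(h_t | v_{1:T}) proportional to alpha[t] * beta[t]
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)
```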
