Hidden Markov Model (HMM)



  1. Hidden Markov Models (AIMA Chapter 15, Sections 1-5)

     Time and uncertainty. Consider a target tracking problem:
       X_t = set of unobservable state variables at time t, e.g., Position_t, Appearance_t, etc.
       E_t = set of observable evidence variables at time t, e.g., ImagePixels_t
     This assumes discrete time; the step size depends on the problem.
     Notation: X_{a:b} = X_a, X_{a+1}, ..., X_{b-1}, X_b.

     Markov processes (Markov chains). Construct a Bayes net from these variables.
     Markov assumption: X_t depends on a bounded subset of X_{0:t-1}.
       First-order Markov process:  P(X_t | X_{0:t-1}) = P(X_t | X_{t-1})
       Second-order Markov process: P(X_t | X_{0:t-1}) = P(X_t | X_{t-2}, X_{t-1})
     [Figure: first-order chain X_{t-2} -> X_{t-1} -> X_t -> X_{t+1} -> X_{t+2}; second-order chain with additional arcs skipping one step.]
     Stationary process: the transition model P(X_t | X_{t-1}) is fixed for all t.

     Hidden Markov Model (HMM). Sensor Markov assumption: P(E_t | X_{0:t}, E_{1:t-1}) = P(E_t | X_t).
     Stationary process: the transition model P(X_t | X_{t-1}) and the sensor model P(E_t | X_t) are fixed for all t.
     An HMM is a special type of Bayes net in which X_t is a single discrete random variable, with joint probability distribution
       P(X_{0:t}, E_{1:t}) = P(X_0) Π_{i=1}^{t} P(X_i | X_{i-1}) P(E_i | X_i)

     Example (umbrella world). [Figure: chain Rain_{t-1} -> Rain_t -> Rain_{t+1}, with each Umbrella_t observed from Rain_t.]
       Transition model P(R_t = true | R_{t-1}): 0.7 if R_{t-1} = true, 0.3 if R_{t-1} = false
       Sensor model     P(U_t = true | R_t):     0.9 if R_t = true,     0.2 if R_t = false
     The first-order Markov assumption is not exactly true in the real world! Possible fixes:
       1. Increase the order of the Markov process
       2. Augment the state, e.g., add Temp_t, Pressure_t
     Example: robot motion, augmenting position and velocity with Battery_t.

     Inference tasks.
       Filtering: P(X_t | e_{1:t}), the belief state; input to the decision process of a rational agent
       Prediction: P(X_{t+k} | e_{1:t}) for k > 0; evaluation of possible action sequences, like filtering without the evidence
       Smoothing: P(X_k | e_{1:t}) for 0 ≤ k < t; a better estimate of past states, essential for learning
       Most likely explanation: arg max_{x_{1:t}} P(x_{1:t} | e_{1:t}); speech recognition, decoding with a noisy channel
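     A concrete way to read the joint distribution formula: the sketch below encodes the umbrella model's transition (0.7/0.3) and sensor (0.9/0.2) tables in Python and multiplies out P(X_{0:t}, E_{1:t}). The names TRANSITION, SENSOR, PRIOR and joint_probability are illustrative assumptions, not code from AIMA.

     # Umbrella-world HMM as plain Python dicts: a sketch, not AIMA's code.
     # States and evidence are booleans (Rain_t, Umbrella_t).

     TRANSITION = {True: 0.7, False: 0.3}   # P(R_t = true | R_{t-1})
     SENSOR = {True: 0.9, False: 0.2}       # P(U_t = true | R_t)
     PRIOR = {True: 0.5, False: 0.5}        # P(R_0)

     def joint_probability(states, evidence):
         """P(X_{0:t}, E_{1:t}) = P(X_0) * prod_{i=1..t} P(X_i | X_{i-1}) P(E_i | X_i).

         states   = [x_0, x_1, ..., x_t]  (booleans: raining or not)
         evidence = [e_1, ..., e_t]       (booleans: umbrella seen or not)
         """
         p = PRIOR[states[0]]
         for i in range(1, len(states)):
             p_trans = TRANSITION[states[i - 1]] if states[i] else 1 - TRANSITION[states[i - 1]]
             p_sense = SENSOR[states[i]] if evidence[i - 1] else 1 - SENSOR[states[i]]
             p *= p_trans * p_sense
         return p

     # P(r_0, r_1, r_2, u_1, u_2): rain on all three days, umbrella seen on days 1 and 2.
     print(joint_probability([True, True, True], [True, True]))   # 0.5 * (0.7*0.9)**2 = 0.19845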

  2. Filtering

     Aim: devise a recursive state estimation algorithm: P(X_{t+1} | e_{1:t+1}) = f(e_{t+1}, P(X_t | e_{1:t})).

       P(X_{t+1} | e_{1:t+1}) = P(X_{t+1} | e_{1:t}, e_{t+1})
                              = α P(e_{t+1} | X_{t+1}, e_{1:t}) P(X_{t+1} | e_{1:t})
                              = α P(e_{t+1} | X_{t+1}) P(X_{t+1} | e_{1:t})

     I.e., prediction + estimation. Prediction by summing out X_t:

       P(X_{t+1} | e_{1:t+1}) = α P(e_{t+1} | X_{t+1}) Σ_{x_t} P(X_{t+1}, x_t | e_{1:t})
                              = α P(e_{t+1} | X_{t+1}) Σ_{x_t} P(X_{t+1} | x_t, e_{1:t}) P(x_t | e_{1:t})
                              = α P(e_{t+1} | X_{t+1}) Σ_{x_t} P(X_{t+1} | x_t) P(x_t | e_{1:t})

     Writing f_{1:t} = P(X_t | e_{1:t}), the update is f_{1:t+1} = Forward(f_{1:t}, e_{t+1}).
     Time and space per update are constant (independent of t).

     Filtering example (umbrella world). Starting from P(R_0) = <0.500, 0.500> and observing the umbrella on days 1 and 2:
       prediction P(R_1) = <0.500, 0.500>,        filtered P(R_1 | u_1) = <0.818, 0.182>
       prediction P(R_2 | u_1) = <0.627, 0.373>,  filtered P(R_2 | u_1, u_2) = <0.883, 0.117>
     (Transition model: P(R_t = true | R_{t-1} = true) = 0.7, P(R_t = true | R_{t-1} = false) = 0.3; sensor model: P(U_t = true | R_t = true) = 0.9, P(U_t = true | R_t = false) = 0.2.)

     Smoothing. [Figure: chain X_0, X_1, ..., X_k, ..., X_t with evidence E_1, ..., E_k, ..., E_t.]
     Divide the evidence e_{1:t} into e_{1:k}, e_{k+1:t}:

       P(X_k | e_{1:t}) = P(X_k | e_{1:k}, e_{k+1:t})
                        = α P(X_k | e_{1:k}) P(e_{k+1:t} | X_k, e_{1:k})
                        = α P(X_k | e_{1:k}) P(e_{k+1:t} | X_k)
                        = α f_{1:k} b_{k+1:t}
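     As a concrete companion to the filtering derivation, here is a minimal sketch of the forward update in Python, using the same assumed dict representation as above; forward_step and normalize are illustrative names, not AIMA's code. Run on two umbrella observations, it reproduces the example values <0.818, 0.182> and <0.883, 0.117>.

     # Forward (filtering) update for the umbrella HMM: a sketch under assumed names.
     # Belief vectors are dicts {True: p_rain, False: p_not_rain}.

     TRANSITION = {True: 0.7, False: 0.3}   # P(R_t = true | R_{t-1})
     SENSOR = {True: 0.9, False: 0.2}       # P(U_t = true | R_t)

     def normalize(d):
         z = sum(d.values())
         return {k: v / z for k, v in d.items()}

     def forward_step(f, umbrella):
         """f_{1:t+1} = alpha P(e_{t+1} | X_{t+1}) sum_{x_t} P(X_{t+1} | x_t) f_{1:t}(x_t)."""
         # Prediction: sum out the previous state.
         pred = {r1: sum((TRANSITION[r0] if r1 else 1 - TRANSITION[r0]) * f[r0]
                         for r0 in (True, False))
                 for r1 in (True, False)}
         # Estimation: weight by the sensor model and renormalize.
         upd = {r1: (SENSOR[r1] if umbrella else 1 - SENSOR[r1]) * pred[r1]
                for r1 in (True, False)}
         return normalize(upd)

     f = {True: 0.5, False: 0.5}            # P(R_0)
     for e in [True, True]:                  # umbrella observed on days 1 and 2
         f = forward_step(f, e)
         print(f)                            # day 1: ~0.818 / 0.182, day 2: ~0.883 / 0.117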

  3. Smoothing (continued)

     The backward message b_{k+1:t} = P(e_{k+1:t} | X_k) is computed by a backward recursion:

       P(e_{k+1:t} | X_k) = Σ_{x_{k+1}} P(e_{k+1:t} | X_k, x_{k+1}) P(x_{k+1} | X_k)
                          = Σ_{x_{k+1}} P(e_{k+1:t} | x_{k+1}) P(x_{k+1} | X_k)
                          = Σ_{x_{k+1}} P(e_{k+1} | x_{k+1}) P(e_{k+2:t} | x_{k+1}) P(x_{k+1} | X_k)

     Smoothing example (umbrella observed on days 1 and 2):
       forward messages:  <0.500, 0.500>, <0.818, 0.182>, <0.883, 0.117>
       backward messages: <0.690, 0.410>, <1.000, 1.000>
       smoothed estimate for day 1: P(R_1 | u_1, u_2) = <0.883, 0.117>
     The umbrella on day 2 raises the estimate of rain on day 1 from the filtered 0.818 to 0.883.
     Forward-backward algorithm: cache the forward messages along the way.
     Time linear in t (polytree inference), space O(t |f|).

     Most likely explanation. The most likely sequence ≠ the sequence of most likely states!
     The most likely path to each x_{t+1} is the most likely path to some x_t plus one more step:

       max_{x_1 ... x_t} P(x_1, ..., x_t, X_{t+1} | e_{1:t+1})
         = P(e_{t+1} | X_{t+1}) max_{x_t} ( P(X_{t+1} | x_t) max_{x_1 ... x_{t-1}} P(x_1, ..., x_{t-1}, x_t | e_{1:t}) )

     Identical to filtering, except f_{1:t} is replaced by

       m_{1:t} = max_{x_1 ... x_{t-1}} P(x_1, ..., x_{t-1}, X_t | e_{1:t}),

     i.e., m_{1:t}(i) gives the probability of the most likely path to state i.
     The update has the sum replaced by max, giving the Viterbi algorithm:

       m_{1:t+1} = P(e_{t+1} | X_{t+1}) max_{x_t} ( P(X_{t+1} | x_t) m_{1:t} )

     Viterbi example (umbrella observations true, true, false, true, true over days 1-5).
     [Figure: state space Rain_1 ... Rain_5 with values true/false; bold arcs mark the most likely path to each state.]
       m_{1:1} = <.8182, .1818>, m_{1:2} = <.5155, .0491>, m_{1:3} = <.0361, .1237>,
       m_{1:4} = <.0334, .0173>, m_{1:5} = <.0210, .0024>

     Example umbrella problems. Recall:
       Filtering: P(X_{t+1} | e_{1:t+1}) = α P(e_{t+1} | X_{t+1}) Σ_{x_t} P(X_{t+1} | x_t) P(x_t | e_{1:t}) =: f_{1:t+1}
       Smoothing: P(X_k | e_{1:t}) = α f_{1:k} b_{k+1:t}, with
                  P(e_{k+1:t} | X_k) = Σ_{x_{k+1}} P(e_{k+1} | x_{k+1}) P(e_{k+2:t} | x_{k+1}) P(x_{k+1} | X_k) =: b_{k+1:t}
     With the same transition (0.7/0.3) and sensor (0.9/0.2) models, and evidence ¬u_1, u_2, ¬u_3:
       P(R_3 | ¬u_1, u_2, ¬u_3) = ?
       P(R_2 | ¬u_1, u_2, ¬u_3) = ?
       arg max_{R_{1:3}} P(R_{1:3} | ¬u_1, u_2, ¬u_3) = ?
     (The code sketches following this item can be used to check the answers.)
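     A minimal sketch of the backward recursion and forward-backward smoothing, again under the assumed dict representation used above; backward_step and smooth are illustrative names, not AIMA's reference code. On the two-day umbrella example it reproduces the smoothed day-1 estimate <0.883, 0.117>.

     # Forward-backward smoothing for the umbrella HMM: a sketch with assumed names.

     TRANSITION = {True: 0.7, False: 0.3}   # P(R_t = true | R_{t-1})
     SENSOR = {True: 0.9, False: 0.2}       # P(U_t = true | R_t)

     def normalize(d):
         z = sum(d.values())
         return {k: v / z for k, v in d.items()}

     def forward_step(f, umbrella):
         pred = {r1: sum((TRANSITION[r0] if r1 else 1 - TRANSITION[r0]) * f[r0]
                         for r0 in (True, False)) for r1 in (True, False)}
         return normalize({r1: (SENSOR[r1] if umbrella else 1 - SENSOR[r1]) * pred[r1]
                           for r1 in (True, False)})

     def backward_step(b, umbrella):
         """b_{k+1:t}(x_k) = sum_{x_{k+1}} P(e_{k+1}|x_{k+1}) b_{k+2:t}(x_{k+1}) P(x_{k+1}|x_k)."""
         return {r0: sum((SENSOR[r1] if umbrella else 1 - SENSOR[r1]) * b[r1] *
                         (TRANSITION[r0] if r1 else 1 - TRANSITION[r0])
                         for r1 in (True, False))
                 for r0 in (True, False)}

     def smooth(prior, evidence):
         """Return P(X_k | e_{1:t}) for k = 1..t via the forward-backward algorithm."""
         fs, f = [], dict(prior)
         for e in evidence:                    # cache forward messages along the way
             f = forward_step(f, e)
             fs.append(f)
         b = {True: 1.0, False: 1.0}           # b_{t+1:t} = <1, 1>
         smoothed = [None] * len(evidence)
         for k in range(len(evidence) - 1, -1, -1):
             smoothed[k] = normalize({r: fs[k][r] * b[r] for r in (True, False)})
             b = backward_step(b, evidence[k])
         return smoothed

     print(smooth({True: 0.5, False: 0.5}, [True, True]))
     # both days smooth to ~{True: 0.883, False: 0.117}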

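     Finally, a sketch of the Viterbi update: the same recursion as filtering, but with the sum over x_t replaced by a max and back pointers kept to recover the path. The names (viterbi, normalize) are again assumptions for illustration; on the observation sequence true, true, false, true, true it reproduces the m-vectors listed above and returns the path true, true, false, true, true.

     # Viterbi (most likely explanation) for the umbrella HMM: a sketch with assumed names.

     TRANSITION = {True: 0.7, False: 0.3}   # P(R_t = true | R_{t-1})
     SENSOR = {True: 0.9, False: 0.2}       # P(U_t = true | R_t)

     def normalize(d):
         z = sum(d.values())
         return {k: v / z for k, v in d.items()}

     def viterbi(prior, evidence):
         """Return (most likely state sequence x_{1:t}, final message m_{1:t})."""
         # m_{1:1} = P(X_1 | e_1), obtained by a single filtering step from the prior.
         e1 = evidence[0]
         pred = {r1: sum((TRANSITION[r0] if r1 else 1 - TRANSITION[r0]) * prior[r0]
                         for r0 in (True, False)) for r1 in (True, False)}
         m = normalize({r1: (SENSOR[r1] if e1 else 1 - SENSOR[r1]) * pred[r1]
                        for r1 in (True, False)})
         back = []                              # back[k][x] = best predecessor of state x
         for umbrella in evidence[1:]:
             new_m, pointers = {}, {}
             for r1 in (True, False):
                 # max_{x_t} P(X_{t+1} | x_t) m_{1:t}(x_t)
                 best = max((True, False),
                            key=lambda r0: (TRANSITION[r0] if r1 else 1 - TRANSITION[r0]) * m[r0])
                 trans = TRANSITION[best] if r1 else 1 - TRANSITION[best]
                 sense = SENSOR[r1] if umbrella else 1 - SENSOR[r1]
                 new_m[r1] = sense * trans * m[best]     # sum replaced by max
                 pointers[r1] = best
             m = new_m
             back.append(pointers)
         # Follow the back pointers from the most probable final state.
         path = [max(m, key=m.get)]
         for pointers in reversed(back):
             path.append(pointers[path[-1]])
         return list(reversed(path)), m

     path, m_final = viterbi({True: 0.5, False: 0.5}, [True, True, False, True, True])
     print(path)     # [True, True, False, True, True]
     print(m_final)  # {True: ~0.0210, False: ~0.0024}, matching m_1:5 above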
