CZECH TECHNICAL UNIVERSITY IN PRAGUE
Faculty of Electrical Engineering, Department of Cybernetics

Hidden Markov Models

Petr Pošík, 2017
Markov Models
Reasoning over Time or Space

In areas like
■ speech recognition,
■ robot localization,
■ medical monitoring,
■ language modeling,
■ DNA analysis,
■ ...,
we want to reason about a sequence of observations.

We need to introduce time (or space) into our models:
■ A static world is modeled using a variable for each of its aspects that is of interest.
■ A changing world is modeled using these variables at each point in time. The world is viewed as a sequence of time slices.
■ Random variables form sequences in time or space.

Notation:
■ X_t is the set of variables describing the world state at time t.
■ X_a^b is the set of variables from X_a to X_b.
■ E.g., X_1^t corresponds to variables X_1, ..., X_t.

We need a way to specify a joint distribution over a large number of random variables, using assumptions suitable for the fields mentioned above.
Markov models

Transition model
■ In general, it specifies the probability distribution over the current state X_t given all the previous states X_0^{t-1}:

    P(X_t | X_0^{t-1})

  [Figure: a chain of states X_0 → X_1 → X_2 → X_3 → ···]

■ Problem 1: X_0^{t-1} is unbounded in size as t increases.
■ Solution: Markov assumption — the current state depends only on a finite, fixed number of previous states. Such processes are called Markov processes or Markov chains.
■ First-order Markov process:

    P(X_t | X_0^{t-1}) = P(X_t | X_{t-1})

■ Second-order Markov process:

    P(X_t | X_0^{t-1}) = P(X_t | X_{t-2}, X_{t-1})

■ Problem 2: Even with the Markov assumption, there are infinitely many values of t. Do we have to specify a different distribution at each time step?
■ Solution: Assume a stationary process, i.e., the transition model does not change over time:

    P(X_t | X_{t-k}^{t-1}) = P(X_{t'} | X_{t'-k}^{t'-1})   for t ≠ t'.

  A stationary first-order transition model is thus a single table reused at every time step, as sketched below.
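A minimal sketch of what such a stationary first-order transition model looks like in code. The two-state space and the probabilities are made up for illustration; only the structure (one conditional distribution per previous state, reused for every t) comes from the slide:

```python
import random

# Stationary first-order transition model P(X_t | X_{t-1}):
# one row per previous state, identical at every time step t.
# (Illustrative states and probabilities, not from the lecture.)
TRANSITION = {
    "a": {"a": 0.8, "b": 0.2},
    "b": {"a": 0.4, "b": 0.6},
}

def sample_next(prev_state, rng=random):
    """Sample X_t from P(X_t | X_{t-1} = prev_state)."""
    states = list(TRANSITION[prev_state])
    weights = [TRANSITION[prev_state][s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]

def sample_chain(x0, steps, rng=random):
    """Sample a trajectory X_0, X_1, ..., X_steps from the chain."""
    chain = [x0]
    for _ in range(steps):
        chain.append(sample_next(chain[-1], rng))
    return chain

print(sample_chain("a", 10))
```

Because the process is stationary, the same TRANSITION table serves every step; a non-stationary process would need a different table for each t.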
Joint distribution of a Markov model

Assuming a stationary first-order Markov chain

    X_0 → X_1 → X_2 → X_3 → ···,

the joint distribution of the chain factorizes as

    P(X_0^T) = P(X_0) ∏_{t=1}^{T} P(X_t | X_{t-1}).

This factorization is possible due to the following assumptions:
■ X_t ⊥ X_0^{t-2} | X_{t-1}: past states are conditionally independent of future states given the present state.
■ In many cases, these assumptions are reasonable.
■ They simplify things a lot: we can do reasoning in polynomial time and space!

The result is just a growing Bayesian network with a very simple structure.
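To make the factorization concrete, here is a sketch of evaluating the joint probability of one concrete state sequence by multiplying the initial probability with one transition factor per step. The function name and dictionary layout are my own choices:

```python
def sequence_probability(seq, initial, transition):
    """Evaluate P(X_0 = seq[0], ..., X_T = seq[-1]) for a stationary
    first-order Markov chain via the factorization
        P(X_0^T) = P(X_0) * prod_{t=1}^{T} P(X_t | X_{t-1}).
    """
    p = initial[seq[0]]               # P(X_0)
    for prev, cur in zip(seq, seq[1:]):
        p *= transition[prev][cur]    # P(X_t | X_{t-1})
    return p
```

With the rain/sun model from the next slide (initial = {"sun": 1.0, "rain": 0.0} and the transition table given there), sequence_probability(["sun", "sun", "rain"], initial, transition) evaluates to 1.0 · 0.9 · 0.1 = 0.09.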
MC Example

■ States: X = {rain, sun} = {r, s}
■ Initial distribution: sun with probability 1 (100 %)
■ Transition model P(X_t | X_{t-1}), as a conditional probability table:

    X_{t-1}   X_t    P(X_t | X_{t-1})
    sun       sun    0.9
    sun       rain   0.1
    rain      sun    0.3
    rain      rain   0.7

The same model can also be drawn as a state transition diagram (an automaton with the two states sun and rain, carrying the four transition probabilities above as edge labels) or as a state trellis (one column of states for X_{t-1}, one for X_t, with weighted edges between them).
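As a sketch of prediction with this concrete model: starting from the given initial distribution (sun with probability 1), repeatedly pushing the distribution through the transition table yields P(X_t) for any t, and the result converges toward the chain's stationary distribution. The helper name predict and the printing loop are my own illustration; the numbers are exactly those from the slide:

```python
# Rain/sun model from the slide.
INITIAL = {"sun": 1.0, "rain": 0.0}
TRANSITION = {
    "sun": {"sun": 0.9, "rain": 0.1},
    "rain": {"sun": 0.3, "rain": 0.7},
}

def predict(dist, transition):
    """One prediction step:
    P(X_t = cur) = sum over prev of P(X_{t-1} = prev) * P(X_t = cur | X_{t-1} = prev).
    """
    return {
        cur: sum(dist[prev] * transition[prev][cur] for prev in dist)
        for cur in transition
    }

dist = INITIAL
for t in range(1, 11):
    dist = predict(dist, TRANSITION)
    print(t, dist)

# The printed distributions approach the stationary distribution of this
# chain, P(sun) = 0.75 and P(rain) = 0.25: it is the fixed point of the
# update, since 0.75 * 0.9 + 0.25 * 0.3 = 0.75.
```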