The Hidden Markov Model (HMM) - PowerPoint PPT Presentation


SLIDE 1

Digital Speech Processing — Lecture 20

The Hidden Markov Model (HMM)

SLIDE 2

Lecture Outline

  • Theory of Markov Models
    – discrete Markov processes
    – hidden Markov processes
  • Solutions to the Three Basic Problems of HMM’s
    – computation of observation probability
    – determination of optimal state sequence
    – optimal training of model
  • Variations of elements of the HMM
    – model types
    – densities
  • Implementation Issues
    – scaling
    – multiple observation sequences
    – initial parameter estimates
    – insufficient training data
  • Implementation of Isolated Word Recognizer Using HMM’s

SLIDE 3

Stochastic Signal Modeling

  • Reasons for Interest:
    – basis for theoretical description of signal processing algorithms
    – can learn about signal source properties
    – models work well in practice in real-world applications
  • Types of Signal Models
    – deterministic, parametric models
    – stochastic models

SLIDE 4

Discrete Markov Processes

System of N distinct states, S_1, S_2, ..., S_N

Time t:  1   2   3   4   5  ...
State:   q_1 q_2 q_3 q_4 q_5 ...

Markov Property:
P[q_t = S_j | q_{t-1} = S_i, q_{t-2} = S_k, ...] = P[q_t = S_j | q_{t-1} = S_i]

SLIDE 5

Properties of State Transition Coefficients

Consider processes where the state transitions are time independent, i.e.,
a_ij = P[q_t = S_j | q_{t-1} = S_i], 1 ≤ i, j ≤ N
with the properties
a_ij ≥ 0, ∀ i, j
Σ_{j=1}^N a_ij = 1, ∀ i

SLIDE 6

Example of Discrete Markov Process

Once each day (e.g., at noon), the weather is observed and classified as being one of the following:
– State 1—Rain (or Snow; e.g., precipitation)
– State 2—Cloudy
– State 3—Sunny
with state transition probabilities:

A = {a_ij} = | 0.4  0.3  0.3 |
             | 0.2  0.6  0.2 |
             | 0.1  0.1  0.8 |

SLIDE 7

Discrete Markov Process

Problem: Given that the weather on day 1 is sunny, what is the probability (according to the model) that the weather for the next 7 days will be “sunny-sunny-rain-rain-sunny-cloudy-sunny”?

Solution: We define the observation sequence, O, as:
O = {S_3, S_3, S_3, S_1, S_1, S_3, S_2, S_3}
and we want to calculate P(O|Model). That is:
P(O|Model) = P[S_3, S_3, S_3, S_1, S_1, S_3, S_2, S_3 | Model]

SLIDE 8

Discrete Markov Process

P(O|Model) = P[S_3, S_3, S_3, S_1, S_1, S_3, S_2, S_3 | Model]
  = P[S_3] · P[S_3|S_3] · P[S_3|S_3] · P[S_1|S_3] · P[S_1|S_1] · P[S_3|S_1] · P[S_2|S_3] · P[S_3|S_2]
  = π_3 · a_33 · a_33 · a_31 · a_11 · a_13 · a_32 · a_23
  = (1)(0.8)(0.8)(0.1)(0.4)(0.3)(0.1)(0.2)
  = 1.536 × 10⁻⁴
where π_i = P[q_1 = S_i], 1 ≤ i ≤ N
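The chain of factors above can be checked numerically; a minimal sketch in plain Python, with states labeled 1–3 as on the slides:

```python
# Weather model from the example: state 1 = rain, 2 = cloudy, 3 = sunny.
A = [[0.4, 0.3, 0.3],
     [0.2, 0.6, 0.2],
     [0.1, 0.1, 0.8]]

def sequence_probability(states, A, pi_first=1.0):
    """P(O|Model): initial probability of the first state times the chain of a_ij."""
    p = pi_first
    for prev, cur in zip(states, states[1:]):
        p *= A[prev - 1][cur - 1]
    return p

O = [3, 3, 3, 1, 1, 3, 2, 3]  # sunny-sunny-sunny-rain-rain-sunny-cloudy-sunny
print(sequence_probability(O, A))  # ≈ 1.536e-4
```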

SLIDE 9

Discrete Markov Process

Problem: Given that the model is in a known state, what is the probability it stays in that state for exactly d days?

Solution:
O = {S_i, S_i, S_i, ..., S_i, S_j ≠ S_i}   (d occurrences of S_i, days 1, 2, ..., d)
p_i(d) = P(O | Model, q_1 = S_i) = (a_ii)^{d-1} (1 - a_ii)
The expected duration in state i is:
d̄_i = Σ_{d=1}^∞ d · p_i(d) = 1 / (1 - a_ii)
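The geometric duration density and its mean can be verified with a short sketch (the infinite sum is truncated; a_ii = 0.8 is the sunny-state self-transition from the weather example):

```python
# p_i(d) = a_ii^(d-1) * (1 - a_ii); expected duration is 1 / (1 - a_ii).
def duration_pmf(a_ii, d):
    return a_ii ** (d - 1) * (1 - a_ii)

a_ii = 0.8
expected = sum(d * duration_pmf(a_ii, d) for d in range(1, 2000))  # truncated sum
print(expected)  # ≈ 5.0 = 1 / (1 - 0.8)
```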

SLIDE 10

Exercise

Given a single fair coin, i.e., P(H=Heads) = P(T=Tails) = 0.5, which you toss once and observe Tails:
a) what is the probability that the next 10 tosses will provide the sequence {H H T H T T H T T H}?

SOLUTION: For a fair coin, with independent coin tosses, the probability of any specific observation sequence of length 10 (10 tosses) is (1/2)^10, since there are 2^10 such sequences and all are equally probable. Thus:
P(H H T H T T H T T H) = (1/2)^10

SLIDE 11

Exercise

b) what is the probability that the next 10 tosses will produce the sequence {H H H H H H H H H H}?

SOLUTION: Similarly:
P(H H H H H H H H H H) = (1/2)^10
Thus a specified run of length 10 is equally as likely as a specified run of interlaced H’s and T’s.

SLIDE 12

Exercise

c) what is the probability that 5 of the next 10 tosses will be tails? What is the expected number of tails over the next 10 tosses?

SOLUTION: The probability of 5 tails in the next 10 tosses is just the number of observation sequences with 5 tails and 5 heads (in any order) times the probability of each sequence, i.e.:
P(5H, 5T) = C(10,5) (1/2)^10 = 252/1024 ≈ 0.25
since there are C(10,5) combinations (ways of getting 5H and 5T) in 10 coin tosses, and each sequence has probability (1/2)^10. The expected number of tails in 10 tosses is:
E(number of T in 10 coin tosses) = Σ_{d=0}^{10} d · C(10,d) (1/2)^10 = 5
Thus, on average, there will be 5H and 5T in 10 tosses, but the probability of exactly 5H and 5T is only about 0.25.
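Both numbers are easy to reproduce with the standard library:

```python
from math import comb

# P(exactly 5 tails in 10 fair tosses) and the expected number of tails.
p_5T = comb(10, 5) * (1 / 2) ** 10
expected_T = sum(d * comb(10, d) * (1 / 2) ** 10 for d in range(11))
print(p_5T, expected_T)  # 0.24609375 5.0
```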

SLIDE 13

Coin Toss Models

A series of coin tossing experiments is performed. The number of coins is unknown; only the results of each coin toss are revealed. Thus a typical observation sequence is:
O = O_1 O_2 O_3 ... O_T = H H T T T H T T H ... H

Problem: Build an HMM to explain the observation sequence.
Issues:
  • 1. What are the states in the model?
  • 2. How many states should be used?
  • 3. What are the state transition probabilities?

SLIDE 14

Coin Toss Models

SLIDE 15

Coin Toss Models

SLIDE 16

Coin Toss Models

Problem: Consider an HMM representation (model λ) of a coin tossing experiment. Assume a 3-state model (corresponding to 3 different coins) with probabilities:

        State 1   State 2   State 3
P(H)    0.5       0.75      0.25
P(T)    0.5       0.25      0.75

and with all state transition probabilities equal to 1/3. (Assume initial state probabilities of 1/3.)
a) You observe the sequence O = H H H H T H T T T T. What state sequence is most likely? What is the probability of the observation sequence and this most likely state sequence?

SLIDE 17

Coin Toss Problem Solution

SOLUTION:
Given O = H H H H T H T T T T, the most likely state sequence is the one for which the probability of each individual observation is maximum. Thus for each H the most likely state is S_2, and for each T the most likely state is S_3. Thus the most likely state sequence is:
S = S_2 S_2 S_2 S_2 S_3 S_2 S_3 S_3 S_3 S_3
The probability of O and S (given the model) is:
P(O, S | λ) = (0.75)^10 (1/3)^10
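This claim can be checked by brute force, enumerating all 3^10 state sequences under the stated model (uniform transition and initial probabilities of 1/3):

```python
from itertools import product

p_heads = [0.5, 0.75, 0.25]   # P(H) in states 1, 2, 3
O = "HHHHTHTTTT"

def joint_prob(states, obs):
    """P(O, S | lambda) with all transition/initial probabilities = 1/3."""
    p = 1.0
    for s, o in zip(states, obs):
        p *= (1 / 3) * (p_heads[s] if o == "H" else 1 - p_heads[s])
    return p

best = max(product(range(3), repeat=10), key=lambda s: joint_prob(s, O))
print([s + 1 for s in best])   # [2, 2, 2, 2, 3, 2, 3, 3, 3, 3]
print(joint_prob(best, O))     # (0.75)**10 * (1/3)**10
```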

SLIDE 18

Coin Toss Models

b) What is the probability that the observation sequence came entirely from state 1?

SOLUTION:
The probability of O given that Ŝ = S_1 S_1 S_1 S_1 S_1 S_1 S_1 S_1 S_1 S_1 is:
P(O, Ŝ | λ) = (0.50)^10 (1/3)^10
The ratio of P(O, S | λ) to P(O, Ŝ | λ) is:
R = P(O, S | λ) / P(O, Ŝ | λ) = (3/2)^10 ≈ 57.67

SLIDE 19

Coin Toss Models

c) Consider the observation sequence:
O = H T T H T H H T T H
How would your answers to parts a and b change?

SOLUTION: Given O, which has the same number of H’s and T’s, the answers to parts a and b would remain the same, as the most likely states occur the same number of times in both cases.

SLIDE 20

Coin Toss Models

d) If the state transition probabilities were of the form:
a_11 = 0.9,  a_21 = 0.45, a_31 = 0.45
a_12 = 0.05, a_22 = 0.1,  a_32 = 0.45
a_13 = 0.05, a_23 = 0.45, a_33 = 0.1
i.e., a new model λ′, how would your answers to parts a–c change? What does this suggest about the type of sequences generated by the models?

SLIDE 21

Coin Toss Problem Solution

SOLUTION: The new probability of O and S becomes:
P(O, S | λ′) = (1/3) (0.75)^10 (0.1)^6 (0.45)^3
The new probability of O and Ŝ becomes:
P(O, Ŝ | λ′) = (1/3) (0.50)^10 (0.9)^9
The ratio is:
R = (3/2)^10 (1/9)^6 (1/2)^3 ≈ 1.36 × 10⁻⁵
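These numbers can be reproduced directly from the modified transition matrix (assuming, as before, initial state probabilities of 1/3):

```python
# Modified model lambda' from part (d); emissions as in the 3-coin table.
A = [[0.9, 0.05, 0.05],
     [0.45, 0.1, 0.45],
     [0.45, 0.45, 0.1]]
p_heads = [0.5, 0.75, 0.25]
O = "HHHHTHTTTT"

def joint_prob(states, obs):
    """P(O, S | lambda') with initial state probabilities of 1/3."""
    p = 1 / 3
    for t, (s, o) in enumerate(zip(states, obs)):
        if t > 0:
            p *= A[states[t - 1]][s]
        p *= p_heads[s] if o == "H" else 1 - p_heads[s]
    return p

S = [1, 1, 1, 1, 2, 1, 2, 2, 2, 2]   # S2 S2 S2 S2 S3 S2 S3 S3 S3 S3 (0-indexed)
S_hat = [0] * 10                      # all state 1
R = joint_prob(S, O) / joint_prob(S_hat, O)
print(R)  # ≈ 1.36e-5
```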

SLIDE 22

Coin Toss Problem Solution

Now the probability of O and S is not the same as the probability of O and Ŝ. We now have:
P(O, S | λ′) = (1/3) (0.75)^10 (0.45)^6 (0.1)^3
P(O, Ŝ | λ′) = (1/3) (0.50)^10 (0.9)^9
with the ratio:
R = (3/2)^10 (1/2)^6 (1/9)^3 ≈ 1.24 × 10⁻³
Model λ, the initial model, clearly favors long runs of H’s or T’s, whereas model λ′, the new model, clearly favors random sequences of H’s and T’s. Thus even a long run of H’s or T’s is more likely to occur in state 1 for model λ′, and a random sequence of H’s and T’s is more likely to occur in states 2 and 3 for model λ.

SLIDE 23

Balls in Urns Model

SLIDE 24

Elements of an HMM

  • 1. N, the number of states in the model
    – states S = {S_1, S_2, ..., S_N}; state at time t is q_t ∈ S
  • 2. M, the number of distinct observation symbols per state
    – observation symbols V = {v_1, v_2, ..., v_M}; observation at time t is O_t ∈ V
  • 3. State transition probability distribution, A = {a_ij}, with
    a_ij = P(q_{t+1} = S_j | q_t = S_i), 1 ≤ i, j ≤ N
  • 4. Observation symbol probability distribution in state j, B = {b_j(k)}, with
    b_j(k) = P(v_k at t | q_t = S_j), 1 ≤ j ≤ N, 1 ≤ k ≤ M
  • 5. Initial state distribution, π = {π_i}, with
    π_i = P[q_1 = S_i], 1 ≤ i ≤ N

SLIDE 25

HMM Generator of Observations

  • 1. Choose an initial state, q_1 = S_i, according to the initial state distribution, π.
  • 2. Set t = 1.
  • 3. Choose O_t = v_k according to the symbol probability distribution in state S_i, namely b_i(k).
  • 4. Transit to a new state, q_{t+1} = S_j, according to the state transition probability distribution for state S_i, namely a_ij.
  • 5. Set t = t + 1; return to step 3 if t ≤ T; otherwise terminate the procedure.

Notation: λ = (A, B, π) denotes the HMM.

t            1   2   3   4   5   6  ...  T
state        q_1 q_2 q_3 q_4 q_5 q_6 ... q_T
observation  O_1 O_2 O_3 O_4 O_5 O_6 ... O_T
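The five steps above can be sketched as a small sampler (a minimal sketch; states and symbols are integer indices, and the coin-toss model from the earlier slides is used as the example):

```python
import random

def generate(A, B, pi, T, seed=0):
    """Generate (states, observations) of length T from an HMM lambda = (A, B, pi)."""
    rng = random.Random(seed)
    q = rng.choices(range(len(pi)), weights=pi)[0]      # step 1: initial state
    states, obs = [], []
    for _ in range(T):                                  # steps 2 and 5: loop over t
        states.append(q)
        obs.append(rng.choices(range(len(B[q])), weights=B[q])[0])  # step 3: emit
        q = rng.choices(range(len(A[q])), weights=A[q])[0]          # step 4: transit
    return states, obs

# Coin-toss model: 3 states, symbols 0 = H, 1 = T.
A = [[1 / 3] * 3] * 3
B = [[0.5, 0.5], [0.75, 0.25], [0.25, 0.75]]
pi = [1 / 3] * 3
states, obs = generate(A, B, pi, T=10)
print(states, obs)
```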

SLIDE 26

Three Basic HMM Problems

  • Problem 1—Given the observation sequence, O = O_1 O_2 ... O_T, and a model λ = (A, B, π), how do we (efficiently) compute P(O|λ), the probability of the observation sequence?
  • Problem 2—Given the observation sequence, O = O_1 O_2 ... O_T, how do we choose a state sequence Q = q_1 q_2 ... q_T which is optimal in some meaningful sense?
  • Problem 3—How do we adjust the model parameters λ = (A, B, π) to maximize P(O|λ)?

Interpretation:
  • Problem 1—Evaluation or scoring problem.
  • Problem 2—Learn structure problem.
  • Problem 3—Training problem.

SLIDE 27

Solution to Problem 1 — P(O|λ)

Consider the fixed state sequence (there are N^T such sequences):
Q = q_1 q_2 ... q_T
Then
P(O | Q, λ) = b_{q_1}(O_1) b_{q_2}(O_2) ... b_{q_T}(O_T)
P(Q | λ) = π_{q_1} a_{q_1 q_2} a_{q_2 q_3} ... a_{q_{T-1} q_T}
and
P(O, Q | λ) = P(O | Q, λ) P(Q | λ)
Finally
P(O | λ) = Σ_{all Q} P(O | Q, λ) P(Q | λ)
         = Σ_{q_1, q_2, ..., q_T} π_{q_1} b_{q_1}(O_1) a_{q_1 q_2} b_{q_2}(O_2) ... a_{q_{T-1} q_T} b_{q_T}(O_T)
Calculations required: ≈ 2T · N^T; for N = 5, T = 100: 2 · 100 · 5^100 ≈ 10^72 computations!

SLIDE 28

The “Forward” Procedure

Consider the forward variable, α_t(i), defined as the probability of the partial observation sequence (until time t) and state S_i at time t, given the model, i.e.,
α_t(i) = P(O_1 O_2 ... O_t, q_t = S_i | λ)
Inductively solve for α_t(i) as:
1. Initialization: α_1(i) = π_i b_i(O_1), 1 ≤ i ≤ N
2. Induction: α_{t+1}(j) = [Σ_{i=1}^N α_t(i) a_ij] b_j(O_{t+1}), 1 ≤ t ≤ T-1, 1 ≤ j ≤ N
3. Termination: P(O | λ) = P(O_1 O_2 ... O_T | λ) = Σ_{i=1}^N α_T(i)
Computation: ≈ N²T versus 2T · N^T; for N = 5, T = 100: 2500 versus 10^72
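The three steps above translate directly into code; a minimal sketch on the coin-toss model (symbols 0 = H, 1 = T):

```python
def forward(A, B, pi, O):
    """P(O | lambda) for a discrete HMM, in O(N^2 T) time."""
    N = len(pi)
    # Initialization: alpha_1(i) = pi_i * b_i(O_1)
    alpha = [pi[i] * B[i][O[0]] for i in range(N)]
    # Induction: alpha_{t+1}(j) = [sum_i alpha_t(i) a_ij] * b_j(O_{t+1})
    for obs in O[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(N)) * B[j][obs]
                 for j in range(N)]
    # Termination: P(O | lambda) = sum_i alpha_T(i)
    return sum(alpha)

A = [[1 / 3] * 3] * 3
B = [[0.5, 0.5], [0.75, 0.25], [0.25, 0.75]]
pi = [1 / 3] * 3
O = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]  # H H H H T H T T T T
print(forward(A, B, pi, O))  # (1/2)**10, since this model emits H/T with prob 1/2 overall
```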

SLIDE 29

The “Forward” Procedure

SLIDE 30

The “Backward” Algorithm

Consider the backward variable, β_t(i), defined as the probability of the partial observation sequence from t+1 to the end, given state S_i at time t and the model, i.e.,
β_t(i) = P(O_{t+1} O_{t+2} ... O_T | q_t = S_i, λ)
Inductive Solution:
1. Initialization: β_T(i) = 1, 1 ≤ i ≤ N
2. Induction: β_t(i) = Σ_{j=1}^N a_ij b_j(O_{t+1}) β_{t+1}(j), t = T-1, T-2, ..., 1, 1 ≤ i ≤ N
N²T calculations, same as in forward case
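A sketch of the backward recursion, with the standard consistency check P(O|λ) = Σ_i π_i b_i(O_1) β_1(i):

```python
def backward(A, B, pi, O):
    """Return beta_1(i) for a discrete HMM lambda = (A, B, pi)."""
    N, T = len(pi), len(O)
    beta = [1.0] * N                       # Initialization: beta_T(i) = 1
    for t in range(T - 2, -1, -1):         # Induction, t = T-1, ..., 1
        beta = [sum(A[i][j] * B[j][O[t + 1]] * beta[j] for j in range(N))
                for i in range(N)]
    return beta

A = [[1 / 3] * 3] * 3
B = [[0.5, 0.5], [0.75, 0.25], [0.25, 0.75]]
pi = [1 / 3] * 3
O = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
beta1 = backward(A, B, pi, O)
p = sum(pi[i] * B[i][O[0]] * beta1[i] for i in range(3))
print(p)  # same value the forward procedure gives: (1/2)**10
```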

SLIDE 31

Solution to Problem 2 — Optimal State Sequence

  • 1. Choose states, q_t, which are individually most likely ⇒ maximize expected number of correct individual states.
  • 2. Choose states, q_t, which are pair-wise most likely ⇒ maximize expected number of correct state pairs.
  • 3. Choose states, q_t, which are triple-wise most likely ⇒ maximize expected number of correct state triples.
  • 4. Choose states, q_t, which are T-wise most likely ⇒ find the single best state sequence which maximizes P(Q, O | λ). This solution is often called the Viterbi state sequence because it is found using the Viterbi algorithm.

SLIDE 32

Maximize Individual States

We define γ_t(i) as the probability of being in state S_i at time t, given the observation sequence and the model, i.e.,
γ_t(i) = P(q_t = S_i | O, λ) = P(q_t = S_i, O | λ) / P(O | λ)
       = α_t(i) β_t(i) / P(O | λ) = α_t(i) β_t(i) / Σ_{i=1}^N α_t(i) β_t(i)
with Σ_{i=1}^N γ_t(i) = 1, ∀ t. Then the individually most likely states are
q_t* = argmax_{1≤i≤N} [γ_t(i)], 1 ≤ t ≤ T
Problem: q_t* need not obey state transition constraints.
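The γ_t(i) posteriors can be computed from full forward/backward tables; a minimal sketch on the coin-toss model (0 = H, 1 = T):

```python
def posteriors(A, B, pi, O):
    """gamma_t(i) = alpha_t(i) beta_t(i) / sum_i alpha_t(i) beta_t(i)."""
    N, T = len(pi), len(O)
    al = [[pi[i] * B[i][O[0]] for i in range(N)]]          # forward table
    for t in range(1, T):
        al.append([sum(al[-1][i] * A[i][j] for i in range(N)) * B[j][O[t]]
                   for j in range(N)])
    be = [[1.0] * N for _ in range(T)]                     # backward table
    for t in range(T - 2, -1, -1):
        be[t] = [sum(A[i][j] * B[j][O[t + 1]] * be[t + 1][j] for j in range(N))
                 for i in range(N)]
    gam = []
    for a_t, b_t in zip(al, be):
        w = [a * b for a, b in zip(a_t, b_t)]
        total = sum(w)
        gam.append([x / total for x in w])
    return gam

A = [[1 / 3] * 3] * 3
B = [[0.5, 0.5], [0.75, 0.25], [0.25, 0.75]]
pi = [1 / 3] * 3
g = posteriors(A, B, pi, [0, 0, 1, 1])   # H H T T
print(g[0])  # S2 is individually most likely when O_1 = H
```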

SLIDE 33

Best State Sequence — The Viterbi Algorithm

Define δ_t(i) as the highest probability along a single path, at time t, which accounts for the first t observations and ends in state S_i, i.e.,
δ_t(i) = max_{q_1, q_2, ..., q_{t-1}} P[q_1 q_2 ... q_{t-1}, q_t = S_i, O_1 O_2 ... O_t | λ]
We must keep track of the state sequence which gave the best path, at time t, to state S_i. We do this in the array ψ_t(i).

SLIDE 34

The Viterbi Algorithm

Step 1—Initialization:
δ_1(i) = π_i b_i(O_1), 1 ≤ i ≤ N
ψ_1(i) = 0, 1 ≤ i ≤ N
Step 2—Recursion:
δ_t(j) = max_{1≤i≤N} [δ_{t-1}(i) a_ij] b_j(O_t), 2 ≤ t ≤ T, 1 ≤ j ≤ N
ψ_t(j) = argmax_{1≤i≤N} [δ_{t-1}(i) a_ij], 2 ≤ t ≤ T, 1 ≤ j ≤ N
Step 3—Termination:
P* = max_{1≤i≤N} [δ_T(i)]
q_T* = argmax_{1≤i≤N} [δ_T(i)]
Step 4—Path (State Sequence) Backtracking:
q_t* = ψ_{t+1}(q_{t+1}*), t = T-1, T-2, ..., 1
Calculation: ≈ N²T operations (×, +)
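The four steps above can be sketched directly; the coin-toss model recovers the path found by inspection earlier:

```python
def viterbi(A, B, pi, O):
    """Best state path and its probability P* for a discrete HMM."""
    N = len(pi)
    delta = [pi[i] * B[i][O[0]] for i in range(N)]        # Step 1: initialization
    psi = []
    for obs in O[1:]:                                     # Step 2: recursion
        back, new = [], []
        for j in range(N):
            best_i = max(range(N), key=lambda i: delta[i] * A[i][j])
            back.append(best_i)
            new.append(delta[best_i] * A[best_i][j] * B[j][obs])
        psi.append(back)
        delta = new
    q = max(range(N), key=lambda i: delta[i])             # Step 3: termination
    p_star = delta[q]
    path = [q]
    for back in reversed(psi):                            # Step 4: backtracking
        q = back[q]
        path.append(q)
    return p_star, path[::-1]

A = [[1 / 3] * 3] * 3
B = [[0.5, 0.5], [0.75, 0.25], [0.25, 0.75]]
pi = [1 / 3] * 3
p_star, path = viterbi(A, B, pi, [0, 0, 0, 0, 1, 0, 1, 1, 1, 1])
print([s + 1 for s in path])  # [2, 2, 2, 2, 3, 2, 3, 3, 3, 3]
```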

SLIDE 35

Alternative Viterbi Implementation

Preprocessing:
π̃_i = log(π_i), 1 ≤ i ≤ N
b̃_i(O_t) = log(b_i(O_t)), 1 ≤ i ≤ N, 1 ≤ t ≤ T
ã_ij = log(a_ij), 1 ≤ i, j ≤ N
Step 1—Initialization:
δ̃_1(i) = log(δ_1(i)) = π̃_i + b̃_i(O_1), 1 ≤ i ≤ N
ψ_1(i) = 0, 1 ≤ i ≤ N
Step 2—Recursion:
δ̃_t(j) = log(δ_t(j)) = max_{1≤i≤N} [δ̃_{t-1}(i) + ã_ij] + b̃_j(O_t), 2 ≤ t ≤ T, 1 ≤ j ≤ N
ψ_t(j) = argmax_{1≤i≤N} [δ̃_{t-1}(i) + ã_ij], 2 ≤ t ≤ T, 1 ≤ j ≤ N
Step 3—Termination:
P̃* = max_{1≤i≤N} [δ̃_T(i)]
q_T* = argmax_{1≤i≤N} [δ̃_T(i)]
Step 4—Backtracking:
q_t* = ψ_{t+1}(q_{t+1}*), t = T-1, T-2, ..., 1
Calculation: ≈ N²T additions
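In the log domain, products become sums and underflow is avoided for long sequences; a minimal sketch (assumes no zero probabilities, since log(0) is undefined):

```python
import math

def viterbi_log(A, B, pi, O):
    """Log-domain Viterbi: returns (log P*, best state path)."""
    N = len(pi)
    la = [[math.log(a) for a in row] for row in A]        # preprocessing
    delta = [math.log(pi[i]) + math.log(B[i][O[0]]) for i in range(N)]
    psi = []
    for obs in O[1:]:                                     # recursion (additions only)
        back = [max(range(N), key=lambda i: delta[i] + la[i][j]) for j in range(N)]
        delta = [delta[back[j]] + la[back[j]][j] + math.log(B[j][obs])
                 for j in range(N)]
        psi.append(back)
    q = max(range(N), key=lambda i: delta[i])             # termination
    log_p = delta[q]
    path = [q]
    for back in reversed(psi):                            # backtracking
        q = back[q]
        path.append(q)
    return log_p, path[::-1]

A = [[1 / 3] * 3] * 3
B = [[0.5, 0.5], [0.75, 0.25], [0.25, 0.75]]
pi = [1 / 3] * 3
log_p, path = viterbi_log(A, B, pi, [0, 0, 0, 0, 1, 0, 1, 1, 1, 1])
print([s + 1 for s in path], log_p)
```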
SLIDE 36

Problem

Given the model of the coin toss experiment used earlier (i.e., 3 different coins) with probabilities:

        State 1   State 2   State 3
P(H)    0.5       0.75      0.25
P(T)    0.5       0.25      0.75

with all state transition probabilities equal to 1/3, and with initial state probabilities equal to 1/3. For the observation sequence O = H H H H T H T T T T, find the Viterbi path of maximum likelihood.

SLIDE 37

Problem Solution

SOLUTION: Since all a_ij terms are equal to 1/3, we can omit these terms (as well as the initial state probability term), giving:
δ_1(1) = 0.5, δ_1(2) = 0.75, δ_1(3) = 0.25
The recursion for δ_t(j), 2 ≤ t ≤ 10, gives:
δ_2(1) = (0.75)(0.5),   δ_2(2) = (0.75)²,        δ_2(3) = (0.75)(0.25)
δ_3(1) = (0.75)²(0.5),  δ_3(2) = (0.75)³,        δ_3(3) = (0.75)²(0.25)
δ_4(1) = (0.75)³(0.5),  δ_4(2) = (0.75)⁴,        δ_4(3) = (0.75)³(0.25)
δ_5(1) = (0.75)⁴(0.5),  δ_5(2) = (0.75)⁴(0.25),  δ_5(3) = (0.75)⁵
δ_6(1) = (0.75)⁵(0.5),  δ_6(2) = (0.75)⁶,        δ_6(3) = (0.75)⁵(0.25)
δ_7(1) = (0.75)⁶(0.5),  δ_7(2) = (0.75)⁶(0.25),  δ_7(3) = (0.75)⁷
δ_8(1) = (0.75)⁷(0.5),  δ_8(2) = (0.75)⁷(0.25),  δ_8(3) = (0.75)⁸
δ_9(1) = (0.75)⁸(0.5),  δ_9(2) = (0.75)⁸(0.25),  δ_9(3) = (0.75)⁹
δ_10(1) = (0.75)⁹(0.5), δ_10(2) = (0.75)⁹(0.25), δ_10(3) = (0.75)¹⁰
This leads to a diagram (trellis) of the form:

SLIDE 38

Solution to Problem 3 — the Training Problem

  • no globally optimum solution is known
  • all solutions yield local optima
    – can get solution via gradient techniques
    – can use a re-estimation procedure such as the Baum-Welch or EM method
  • consider re-estimation procedures
    – basic idea: given a current model estimate, λ, compute expected values of model events, then refine the model based on the computed values
    λ⁽⁰⁾ → E[Model Events] → λ⁽¹⁾ → E[Model Events] → λ⁽²⁾ → ···

Define ξ_t(i,j), the probability of being in state S_i at time t, and state S_j at time t+1, given the model and the observation sequence, i.e.,
ξ_t(i,j) = P[q_t = S_i, q_{t+1} = S_j | O, λ]

SLIDE 39

The Training Problem

ξ_t(i,j) = P[q_t = S_i, q_{t+1} = S_j | O, λ]

SLIDE 40

The Training Problem

ξ_t(i,j) = P[q_t = S_i, q_{t+1} = S_j | O, λ]
         = P[q_t = S_i, q_{t+1} = S_j, O | λ] / P(O | λ)
         = α_t(i) a_ij b_j(O_{t+1}) β_{t+1}(j) / P(O | λ)
         = α_t(i) a_ij b_j(O_{t+1}) β_{t+1}(j) / Σ_{i=1}^N Σ_{j=1}^N α_t(i) a_ij b_j(O_{t+1}) β_{t+1}(j)
γ_t(i) = Σ_{j=1}^N ξ_t(i,j)
Σ_{t=1}^{T-1} γ_t(i) = expected number of transitions from S_i
Σ_{t=1}^{T-1} ξ_t(i,j) = expected number of transitions from S_i to S_j

SLIDE 41

Re-estimation Formulas

π̄_i = expected number of times in state S_i at t = 1 = γ_1(i)

ā_ij = expected number of transitions from state S_i to state S_j
       / expected number of transitions from state S_i
     = Σ_{t=1}^{T-1} ξ_t(i,j) / Σ_{t=1}^{T-1} γ_t(i)

b̄_j(k) = expected number of times in state S_j with symbol v_k
         / expected number of times in state S_j
       = Σ_{t=1, O_t = v_k}^{T} γ_t(j) / Σ_{t=1}^{T} γ_t(j)
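One Baum-Welch re-estimation pass using the three formulas above can be sketched as follows (single observation sequence; the small 2-state model at the bottom is a hypothetical example, not from the slides):

```python
def reestimate(A, B, pi, O):
    """One Baum-Welch pass: returns re-estimated (A, B, pi)."""
    N, M, T = len(pi), len(B[0]), len(O)
    al = [[pi[i] * B[i][O[0]] for i in range(N)]]          # forward table
    for t in range(1, T):
        al.append([sum(al[t - 1][i] * A[i][j] for i in range(N)) * B[j][O[t]]
                   for j in range(N)])
    be = [[1.0] * N for _ in range(T)]                     # backward table
    for t in range(T - 2, -1, -1):
        be[t] = [sum(A[i][j] * B[j][O[t + 1]] * be[t + 1][j] for j in range(N))
                 for i in range(N)]
    PO = sum(al[T - 1])
    gamma = [[al[t][i] * be[t][i] / PO for i in range(N)] for t in range(T)]
    xi = [[[al[t][i] * A[i][j] * B[j][O[t + 1]] * be[t + 1][j] / PO
            for j in range(N)] for i in range(N)] for t in range(T - 1)]
    new_pi = gamma[0][:]                                   # pi_bar_i = gamma_1(i)
    new_A = [[sum(xi[t][i][j] for t in range(T - 1))
              / sum(gamma[t][i] for t in range(T - 1)) for j in range(N)]
             for i in range(N)]
    new_B = [[sum(gamma[t][j] for t in range(T) if O[t] == k)
              / sum(gamma[t][j] for t in range(T)) for k in range(M)]
             for j in range(N)]
    return new_A, new_B, new_pi

A = [[0.6, 0.4], [0.3, 0.7]]
B = [[0.7, 0.3], [0.2, 0.8]]
pi = [0.5, 0.5]
A2, B2, pi2 = reestimate(A, B, pi, [0, 0, 1, 1, 0])
print(pi2)  # re-estimated initial state distribution (sums to 1)
```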

SLIDE 42

Re-estimation Formulas

If λ = (A, B, π) is the initial model, and λ̄ = (Ā, B̄, π̄) is the re-estimated model, then it can be proven that either:
  • 1. the initial model, λ, defines a critical point of the likelihood function, in which case λ̄ = λ, or
  • 2. model λ̄ is more likely than model λ in the sense that P(O | λ̄) > P(O | λ), i.e., we have found a new model from which the observation sequence is more likely to have been produced.
Conclusion: Iteratively use λ̄ in place of λ, and repeat the re-estimation until some limiting point is reached. The resulting model is called the maximum likelihood (ML) HMM.

SLIDE 43

Re-estimation Formulas

  • 1. The re-estimation formulas can be derived by maximizing the auxiliary function Q(λ, λ̄) over λ̄, i.e.,
    Q(λ, λ̄) = Σ_q P(O, q | λ) log P(O, q | λ̄)
    It can be proved that:
    max_{λ̄} [Q(λ, λ̄)] ⇒ P(O | λ̄) ≥ P(O | λ)
    Eventually the likelihood function converges to a critical point.
  • 2. Relation to EM algorithm:
    – E (Expectation) step is the calculation of the auxiliary function, Q(λ, λ̄)
    – M (Modification) step is the maximization over λ̄

SLIDE 44

Notes on Re-estimation

  • 1. Stochastic constraints on π̄_i, ā_ij, b̄_j(k) are automatically met, i.e.,
    Σ_{i=1}^N π̄_i = 1,  Σ_{j=1}^N ā_ij = 1,  Σ_{k=1}^M b̄_j(k) = 1
  • 2. At the critical points of P = P(O | λ), then
    π̄_i = π_i (∂P/∂π_i) / Σ_{k=1}^N π_k (∂P/∂π_k)
    ā_ij = a_ij (∂P/∂a_ij) / Σ_{k=1}^N a_ik (∂P/∂a_ik)
    b̄_j(k) = b_j(k) (∂P/∂b_j(k)) / Σ_{l=1}^M b_j(l) (∂P/∂b_j(l))
    ⇒ at critical points, the re-estimation formulas are exactly correct.

SLIDE 45

Variations on HMM’s

  • 1. Types of HMM—model structures
  • 2. Continuous observation density models—mixtures
  • 3. Autoregressive HMM’s—LPC links
  • 4. Null transitions and tied states
  • 5. Inclusion of explicit state duration density in HMM’s
  • 6. Optimization criterion—ML, MMI, MDI

SLIDE 46

Types of HMM

  • 1. Ergodic models--no transient states
  • 2. Left-right models--all transient states (except the last state), with the constraints:
    π_i = 1, i = 1;  π_i = 0, i ≠ 1
    a_ij = 0, j < i
    Controlled transitions imply:
    a_ij = 0, j > i + Δ (Δ = 1 or 2, typically)
  • 3. Mixed forms of ergodic and left-right models (e.g., parallel branches)
Note: Constraints of left-right models don't affect the re-estimation formulas (i.e., a parameter initially set to 0 remains at 0 during re-estimation).

SLIDE 47

Types of HMM

Ergodic Model / Left-Right Model / Mixed Model

SLIDE 48

Continuous Observation Density HMM’s

Most general form of pdf with a valid re-estimation procedure is:
b_j(x) = Σ_{m=1}^M c_jm N(x, μ_jm, U_jm), 1 ≤ j ≤ N
where
x = observation vector = (x_1, x_2, ..., x_D)
M = number of mixture densities
c_jm = gain of m-th mixture in state j
N = any log-concave or elliptically symmetric density (e.g., a Gaussian)
μ_jm = mean vector for mixture m, state j
U_jm = covariance matrix for mixture m, state j
c_jm ≥ 0, 1 ≤ j ≤ N, 1 ≤ m ≤ M
Σ_{m=1}^M c_jm = 1, 1 ≤ j ≤ N
∫_{-∞}^{∞} b_j(x) dx = 1, 1 ≤ j ≤ N
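A one-dimensional Gaussian-mixture instance of b_j(x) can be sketched as follows (the mixture parameters are hypothetical; the numerical integral checks that b_j integrates to 1):

```python
import math

def b_j(x, c, mu, var):
    """b_j(x) = sum_m c_jm * N(x; mu_jm, U_jm), scalar Gaussian case."""
    total = 0.0
    for c_m, mu_m, v_m in zip(c, mu, var):
        total += (c_m * math.exp(-((x - mu_m) ** 2) / (2 * v_m))
                  / math.sqrt(2 * math.pi * v_m))
    return total

c, mu, var = [0.6, 0.4], [0.0, 3.0], [1.0, 2.0]   # hypothetical parameters
# Normalization check by Riemann sum over a wide grid:
dx = 0.01
area = sum(b_j(-20 + k * dx, c, mu, var) for k in range(4000)) * dx
print(area)  # ≈ 1.0
```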

SLIDE 49

State Equivalence Chart

Equivalence of a state with mixture density to the multi-state, single-mixture case

SLIDE 50

Re-estimation for Mixture Densities

c̄_jk = Σ_{t=1}^T γ_t(j,k) / Σ_{t=1}^T Σ_{k=1}^M γ_t(j,k)
μ̄_jk = Σ_{t=1}^T γ_t(j,k) · O_t / Σ_{t=1}^T γ_t(j,k)
Ū_jk = Σ_{t=1}^T γ_t(j,k) · (O_t − μ_jk)(O_t − μ_jk)′ / Σ_{t=1}^T γ_t(j,k)
where γ_t(j,k) is the probability of being in state j at time t with the k-th mixture component accounting for O_t:
γ_t(j,k) = [α_t(j) β_t(j) / Σ_{j=1}^N α_t(j) β_t(j)] · [c_jk N(O_t, μ_jk, U_jk) / Σ_{m=1}^M c_jm N(O_t, μ_jm, U_jm)]

SLIDE 51

Autoregressive HMM

Consider an observation vector O = (x_0, x_1, ..., x_{K-1}) where each x_k is a waveform sample, and O represents a frame of the signal (e.g., K = 256 samples). We assume x_k is related to the p previous samples of O by a Gaussian autoregressive process of order p, i.e.,
x_k = −Σ_{i=1}^p a_i x_{k−i} + e_k, 0 ≤ k ≤ K−1
where the e_k are Gaussian, independent, identically distributed random variables with zero mean and variance σ², and a_i, 1 ≤ i ≤ p, are the autoregressive or predictor coefficients. As K → ∞, then
f(O) = (2πσ²)^{−K/2} exp{ −(1/2σ²) δ(O, a) }
where
δ(O, a) = r_a(0) r(0) + 2 Σ_{i=1}^p r_a(i) r(i)

SLIDE 52

Autoregressive HMM

r_a(i) = Σ_{n=0}^{p−i} a_n a_{n+i} (a_0 = 1), 1 ≤ i ≤ p (autocorrelation of the predictor coefficients)
r(i) = Σ_{n=0}^{K−i−1} x_n x_{n+i}, 0 ≤ i ≤ p (autocorrelation of the observation samples)
a = (1, a_1, a_2, ..., a_p)′
The prediction residual is:
E = E[ Σ_k (e_k)² ] = K σ²
Consider the normalized observation vector:
Ô = O / √E = O / √(K σ²)
f(Ô) = (2π/K)^{−K/2} exp{ −(K/2) δ(Ô, a) }
In practice, K is replaced by K̂, the effective frame length, e.g., K̂ = K/3 for frame overlap of 3 to 1.

SLIDE 53

Application of Autoregressive HMM

b_j(O) = Σ_{m=1}^M c_jm b_jm(O)
b_jm(O) = (2π)^{−K/2} exp{ −(K/2) δ(O, a_jm) }
Each mixture is characterized by a predictor vector, or by an autocorrelation vector from which the predictor vector can be derived. The re-estimation formulas for r̄_jk are:
r̄_jk = Σ_{t=1}^T γ_t(j,k) · r_t / Σ_{t=1}^T γ_t(j,k)
γ_t(j,k) = [α_t(j) β_t(j) / Σ_{j=1}^N α_t(j) β_t(j)] · [c_jk b_jk(O_t) / Σ_{m=1}^M c_jm b_jm(O_t)]

SLIDE 54

Null Transitions and Tied States

Null Transitions: transitions which produce no output, and take no time, denoted by φ

Tied States: sets up an equivalence relation between HMM parameters in different states
– number of independent parameters of the model reduced
– parameter estimation becomes simpler
– useful in cases where there is insufficient training data for reliable estimation of all model parameters

SLIDE 55

Null Transitions

SLIDE 56

Inclusion of Explicit State Duration Density

For standard HMM's, the duration density is:
p_i(d) = probability of exactly d observations in state S_i = (a_ii)^{d−1} (1 − a_ii)
With an arbitrary state duration density, p_i(d), observations are generated as follows:
  • 1. an initial state, q_1 = S_i, is chosen according to the initial state distribution, π_i
  • 2. a duration d_1 is chosen according to the state duration density p_{q_1}(d_1)
  • 3. observations O_1 O_2 ... O_{d_1} are chosen according to the joint density b_{q_1}(O_1 O_2 ... O_{d_1}); generally we assume independence, so b_{q_1}(O_1 O_2 ... O_{d_1}) = Π_{t=1}^{d_1} b_{q_1}(O_t)
  • 4. the next state, q_2 = S_j, is chosen according to the state transition probabilities, a_{q_1 q_2}, with the constraint that a_{q_1 q_1} = 0, i.e., no transition back to the same state can occur.

SLIDE 57

Explicit State Duration Density

Standard HMM vs. HMM with explicit state duration density

SLIDE 58

Explicit State Duration Density

state durations:  d_1              d_2                        d_3
states:           q_1              q_2                        q_3
observations:     O_1 ... O_{d_1}  O_{d_1+1} ... O_{d_1+d_2}  O_{d_1+d_2+1} ... O_{d_1+d_2+d_3}

Assume:
  • 1. the first state, q_1, begins at t = 1
  • 2. the last state, q_r, ends at t = T, i.e., entire duration intervals are included within the observation sequence O_1 O_2 ... O_T
Modified α: α_t(i) = P(O_1 O_2 ... O_t, S_i ending at t | λ)
Assume r states in the first t observations, i.e., Q = q_1 q_2 ... q_r with q_r = S_i, with durations D = d_1 d_2 ... d_r such that Σ_{s=1}^r d_s = t

SLIDE 59

Explicit State Duration Density

Then we have
α_t(i) = Σ_Q Σ_D π_{q_1} p_{q_1}(d_1) P(O_1 ... O_{d_1} | q_1) · a_{q_1 q_2} p_{q_2}(d_2) P(O_{d_1+1} ... O_{d_1+d_2} | q_2) · ... · a_{q_{r−1} q_r} p_{q_r}(d_r) P(O_{d_1+...+d_{r−1}+1} ... O_t | q_r)
By induction:
α_t(j) = Σ_{i=1}^N Σ_{d=1}^D α_{t−d}(i) a_ij p_j(d) Π_{s=t−d+1}^t b_j(O_s)
Initialization of α_t(i):
α_1(i) = π_i p_i(1) b_i(O_1)
α_2(i) = π_i p_i(2) Π_{s=1}^2 b_i(O_s) + Σ_{j=1, j≠i}^N α_1(j) a_ji p_i(1) b_i(O_2)
α_3(i) = π_i p_i(3) Π_{s=1}^3 b_i(O_s) + Σ_{j=1, j≠i}^N Σ_{d=1}^2 α_{3−d}(j) a_ji p_i(d) Π_{s=4−d}^3 b_i(O_s)
Finally:
P(O | λ) = Σ_{i=1}^N α_T(i)

SLIDE 60

Explicit State Duration Density

re-estimation formulas for ā_ij, b̄_i(k), and p̄_i(d) can be formulated and appropriately interpreted; modifications to Viterbi scoring are also required, i.e.,
δ_t(i) = P(O_1 O_2 ... O_t, q_1 q_2 ... q_r ending at t in S_i | λ)
Basic Recursion:
δ_t(i) = max_{1≤j≤N, j≠i} max_{1≤d≤D} [ δ_{t−d}(j) a_ji p_i(d) Π_{s=t−d+1}^t b_i(O_s) ]
⇒ storage required for δ_{t−1} ... δ_{t−D}: N·D locations
⇒ maximization involves all D·N terms--not just the old N δ's and a_ji as in the previous case
⇒ significantly larger computational load: ≈ (D²/2) N²T computations involving b_j(O)
Example: N = 5, D = 20, T = 100:
                implicit duration    explicit duration
storage         5                    100
computation     2500                 500,000

SLIDE 61

Issues with Explicit State Duration Density

  • 1. quality of signal modeling is often improved significantly
  • 2. significant increase in the number of parameters per state (D duration estimates)
  • 3. significant increase in the computation associated with the probability calculation (≈ D²/2)
  • 4. insufficient data to give good p_i(d) estimates
Alternatives:
  • 1. use a parametric state duration density, e.g.,
    p_i(d) = N(d, μ_i, σ_i²) -- Gaussian
    p_i(d) = η_i^{ν_i} d^{ν_i−1} e^{−η_i d} / Γ(ν_i) -- Gamma
  • 2. incorporate state duration information after the probability calculation, e.g., in a post-processor

SLIDE 62

Alternatives to ML Estimation

Assume we wish to design V different HMM’s, λ_1, λ_2, ..., λ_V. Normally we design each HMM, λ_v, based on a training set of observations, O^v, using a maximum likelihood (ML) criterion, i.e.,
P_v* = max_{λ_v} P(O^v | λ_v)
Consider the mutual information, I_v, between the observation sequence, O^v, and the complete set of models λ = (λ_1, λ_2, ..., λ_V):
I_v = log P(O^v | λ_v) − log Σ_{w=1}^V P(O^v | λ_w)
Consider maximizing I_v over λ, giving
I_v* = max_λ [ log P(O^v | λ_v) − log Σ_{w=1}^V P(O^v | λ_w) ]
i.e., choose λ so as to separate the correct model, λ_v, from all other models, as much as possible, for the training set, O^v.

SLIDE 63

Alternatives to ML Estimation

Sum over all training sets to give models designed according to an MMI criterion, i.e.,
I* = max_λ Σ_{v=1}^V { log P(O^v | λ_v) − log Σ_{w=1}^V P(O^v | λ_w) }
with solution via steepest descent methods.

SLIDE 64

Comparison of Comparison of HMM’s HMM’s

λ λ

1 2

: given two HMM's, and , is it possible to give a measure of how similar the two models are Problem Example :

( ) ( )

ν ν ⇔ = + − − = + − − − − = = = + − − = − = =

1 1 2 2

For , , we require ( ) to be the same for both models and for all symbols . Thus we require (1 )(1 ) (1 )(1 ) 2 2 1 2 1 2 Let 0.6, 0.7,

equivalent t k k

A B A B P O pq p q rs r s pq p q rs r s p pq r s r p q r = =

  • 0.2, then

13/30 0.433 s
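The algebra in this example is easy to check numerically. A small sketch (Python; mine, not the lecture's) solving pq + (1-p)(1-q) = rs + (1-r)(1-s) for s:

```python
def equivalent_s(p, q, r):
    """Solve pq + (1-p)(1-q) = rs + (1-r)(1-s) for s:
    s = (p + q - 2pq - r) / (1 - 2r)."""
    return (p + q - 2 * p * q - r) / (1 - 2 * r)

s = equivalent_s(0.6, 0.7, 0.2)
# s = 13/30 = 0.4333..., matching the slide's value
```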

SLIDE 65

Comparison of HMM's

Thus the two models have very different A and B matrices, but are equivalent in the sense that all symbol probabilities (averaged over time) are the same. We generalize the concept of model distance (dissimilarity) by defining a distance measure, D(\lambda_1, \lambda_2), between two Markov sources, \lambda_1 and \lambda_2, as

D(\lambda_1, \lambda_2) = \frac{1}{T} \left[ \log P(O^{(2)} \mid \lambda_1) - \log P(O^{(2)} \mid \lambda_2) \right]

where O^{(2)} is a sequence of observations generated by model \lambda_2 and scored by both models. We symmetrize D by using the relation:

D_S(\lambda_1, \lambda_2) = \frac{1}{2} \left[ D(\lambda_1, \lambda_2) + D(\lambda_2, \lambda_1) \right]
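Given log-likelihood scores of each model on sequences generated by the other, the distance and its symmetrized form are direct to compute. A sketch (Python; the example scores are invented placeholders):

```python
def model_distance(log_p_cross, log_p_self, T):
    """D(lam1, lam2) = (1/T) * [log P(O2|lam1) - log P(O2|lam2)],
    where O2 (length T) was generated by lam2 and scored by both models."""
    return (log_p_cross - log_p_self) / T

def symmetric_distance(d12, d21):
    """D_S(lam1, lam2) = [D(lam1, lam2) + D(lam2, lam1)] / 2."""
    return 0.5 * (d12 + d21)

# invented example scores: each model assigns lower likelihood to the
# other model's data than to its own
d = symmetric_distance(model_distance(-110.0, -100.0, 100),
                       model_distance(-105.0, -100.0, 100))
```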

SLIDE 66

Implementation Issues for HMM's

  • 1. Scaling—to prevent underflow and/or overflow.
  • 2. Multiple Observation Sequences—to train left-right models.
  • 3. Initial Estimates of HMM Parameters—to provide robust models.
  • 4. Effects of Insufficient Training Data
SLIDE 67

Scaling

\alpha_t(i) is a sum of a large number of terms, each of the form:

\left[ \prod_{s=1}^{t-1} a_{q_s q_{s+1}} \right] \left[ \prod_{s=1}^{t} b_{q_s}(O_s) \right]

Since each a and b term is less than 1, as t gets larger, \alpha_t(i) exponentially heads to 0; thus scaling is required to prevent underflow.

Consider scaling \alpha_t(i) by the factor

c_t = \frac{1}{\sum_{i=1}^{N} \alpha_t(i)}

independent of i. We denote the scaled \alpha's as:

\hat{\alpha}_t(i) = c_t \, \alpha_t(i) = \frac{\alpha_t(i)}{\sum_{i=1}^{N} \alpha_t(i)}, \qquad \sum_{i=1}^{N} \hat{\alpha}_t(i) = 1

SLIDE 68

Scaling

For fixed t, we compute

\hat{\alpha}_t(i) = \frac{\sum_{j=1}^{N} \hat{\alpha}_{t-1}(j) \, a_{ji} \, b_i(O_t)}{\sum_{i=1}^{N} \sum_{j=1}^{N} \hat{\alpha}_{t-1}(j) \, a_{ji} \, b_i(O_t)}

By induction we get

\hat{\alpha}_{t-1}(j) = \left[ \prod_{\tau=1}^{t-1} c_\tau \right] \alpha_{t-1}(j)

giving

\hat{\alpha}_t(i) = \frac{\sum_{j=1}^{N} \alpha_{t-1}(j) \, a_{ji} \, b_i(O_t)}{\sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_{t-1}(j) \, a_{ji} \, b_i(O_t)} = \frac{\alpha_t(i)}{\sum_{i=1}^{N} \alpha_t(i)}

SLIDE 69

Scaling

For scaling the \beta_t(i) terms we use the same scale factors as for the \alpha_t(i) terms, i.e.,

\hat{\beta}_t(i) = c_t \, \beta_t(i)

since the magnitudes of the \alpha and \beta terms are comparable. The re-estimation formula for a_{ij} in terms of the scaled \alpha's and \beta's is:

\bar{a}_{ij} = \frac{\sum_{t=1}^{T-1} \hat{\alpha}_t(i) \, a_{ij} \, b_j(O_{t+1}) \, \hat{\beta}_{t+1}(j)}{\sum_{j=1}^{N} \sum_{t=1}^{T-1} \hat{\alpha}_t(i) \, a_{ij} \, b_j(O_{t+1}) \, \hat{\beta}_{t+1}(j)}

We have

\hat{\alpha}_t(i) = \left[ \prod_{\tau=1}^{t} c_\tau \right] \alpha_t(i) = C_t \, \alpha_t(i)

\hat{\beta}_{t+1}(j) = \left[ \prod_{\tau=t+1}^{T} c_\tau \right] \beta_{t+1}(j) = D_{t+1} \, \beta_{t+1}(j)

SLIDE 70

Scaling

giving

\bar{a}_{ij} = \frac{\sum_{t=1}^{T-1} C_t \, \alpha_t(i) \, a_{ij} \, b_j(O_{t+1}) \, D_{t+1} \, \beta_{t+1}(j)}{\sum_{j=1}^{N} \sum_{t=1}^{T-1} C_t \, \alpha_t(i) \, a_{ij} \, b_j(O_{t+1}) \, D_{t+1} \, \beta_{t+1}(j)}

Since C_t D_{t+1} = \prod_{\tau=1}^{T} c_\tau, independent of t, the scale factors cancel out of the re-estimation formula.

Notes on Scaling:

  • 1. the scaling procedure works equally well on the \pi or B coefficients
  • 2. scaling need not be performed each iteration; set c_t = 1 whenever scaling is skipped
  • 3. we can solve for P(O \mid \lambda) from the scaled coefficients: since

\sum_{i=1}^{N} \hat{\alpha}_T(i) = C_T \sum_{i=1}^{N} \alpha_T(i) = \left[ \prod_{t=1}^{T} c_t \right] P(O \mid \lambda) = 1

we get

P(O \mid \lambda) = \frac{1}{\prod_{t=1}^{T} c_t}, \qquad \log P(O \mid \lambda) = -\sum_{t=1}^{T} \log c_t
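The scaling recursion and the log-probability identity can be combined in a single forward pass. A minimal sketch (Python; mine, not the lecture's, with a_{ji} stored as A[j][i] and b_i(k) as B[i][k]):

```python
import math

def scaled_forward(pi, A, B, obs):
    """Scaled forward pass.  pi[i] = initial probs, A[j][i] = a_ji,
    B[i][k] = b_i(k), obs = list of symbol indices.
    At each t the alpha's are normalized to sum to 1, and
    log P(O|lambda) is accumulated as -sum_t log c_t."""
    N = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(N)]
    log_prob = 0.0
    for t in range(1, len(obs) + 1):
        c = 1.0 / sum(alpha)                 # scale factor c_t
        alpha = [c * a for a in alpha]       # alpha-hat_t sums to 1
        log_prob -= math.log(c)              # accumulate -log c_t
        if t < len(obs):                     # induction step to t+1
            alpha = [sum(alpha[j] * A[j][i] for j in range(N)) * B[i][obs[t]]
                     for i in range(N)]
    return log_prob
```

On sequences short enough that the unscaled forward pass does not underflow, exp(log_prob) matches the unscaled forward probability exactly.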

SLIDE 71

Multiple Observation Sequences

For left-right models, we need to use multiple sequences of observations for training. Assume a set of K observation sequences (i.e., training utterances):

O = [O^{(1)}, O^{(2)}, ..., O^{(K)}], where O^{(k)} = [O_1^{(k)} O_2^{(k)} ... O_{T_k}^{(k)}]

We wish to maximize the probability

P(O \mid \lambda) = \prod_{k=1}^{K} P(O^{(k)} \mid \lambda) = \prod_{k=1}^{K} P_k

The re-estimation formula becomes

\bar{a}_{ij} = \frac{\sum_{k=1}^{K} \frac{1}{P_k} \sum_{t=1}^{T_k - 1} \alpha_t^{(k)}(i) \, a_{ij} \, b_j(O_{t+1}^{(k)}) \, \beta_{t+1}^{(k)}(j)}{\sum_{k=1}^{K} \frac{1}{P_k} \sum_{t=1}^{T_k - 1} \alpha_t^{(k)}(i) \, \beta_t^{(k)}(i)}

Scaling requires:

\bar{a}_{ij} = \frac{\sum_{k=1}^{K} \sum_{t=1}^{T_k - 1} \hat{\alpha}_t^{(k)}(i) \, a_{ij} \, b_j(O_{t+1}^{(k)}) \, \hat{\beta}_{t+1}^{(k)}(j)}{\sum_{k=1}^{K} \sum_{t=1}^{T_k - 1} \hat{\alpha}_t^{(k)}(i) \, \hat{\beta}_t^{(k)}(i)}

where all scaling factors cancel out.

SLIDE 72

Initial Estimates of HMM Parameters

  • N -- choose based on physical considerations
  • M -- choose based on model fits
  • \pi_i -- random or uniform (\pi_i \neq 0)
  • a_{ij} -- random or uniform (a_{ij} \neq 0)
  • b_j(k) -- random or uniform (b_j(k) \geq \epsilon)
  • b_j(O) (continuous densities) -- need good initial estimates of the mean vectors; need reasonable estimates of the covariance matrices

SLIDE 73

Effects of Insufficient Training Data

Insufficient training data leads to poor estimates of model parameters. Possible solutions:

  • 1. use more training data--often this is impractical
  • 2. reduce the size of the model--often there are physical reasons for keeping a chosen model size
  • 3. add extra constraints to model parameters: b_j(k) \geq \epsilon, U_{jk}(r, r) \geq \delta; often the model performance is relatively insensitive to the exact choice of \epsilon, \delta
  • 4. method of deleted interpolation: \bar{\lambda} = \epsilon \lambda + (1 - \epsilon) \lambda', combining a full model \lambda with a reduced model \lambda'
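Item 4 can be sketched for a single discrete density row, where the models being interpolated are just probability vectors. A minimal illustration (Python; mine, with a fixed illustrative ε, whereas in practice ε is chosen on the deleted, i.e. held-out, data):

```python
def deleted_interpolation(b_full, b_reduced, eps):
    """b_bar(k) = eps * b_full(k) + (1 - eps) * b_reduced(k).
    Both inputs are probability vectors over the same symbol set."""
    return [eps * x + (1.0 - eps) * y for x, y in zip(b_full, b_reduced)]

# interpolating a sparse ML estimate with a uniform fallback removes zeros
b = deleted_interpolation([0.8, 0.2, 0.0], [1/3, 1/3, 1/3], eps=0.9)
```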

SLIDE 74

Methods for Insufficient Data

Performance insensitivity to ε

SLIDE 75

Deleted Interpolation

SLIDE 76

Isolated Word Recognition Using HMM's

Assume a vocabulary of V words, with K occurrences of each spoken word in a training set. Observation vectors are spectral characterizations of the word. For isolated word recognition, we do the following:

  • 1. for each word, v, in the vocabulary, we must build an HMM, \lambda_v, i.e., we must re-estimate model parameters (A, B, \Pi) that optimize the likelihood of the training set observation vectors for the v-th word. (TRAINING)
  • 2. for each unknown word which is to be recognized, we do the following:
  • a. measure the observation sequence O = [O_1 O_2 ... O_T]
  • b. calculate model likelihoods, P(O \mid \lambda_v), 1 \leq v \leq V
  • c. select the word whose model likelihood score is highest:

v^* = \arg\max_{1 \leq v \leq V} P(O \mid \lambda_v)

Computation on the order of V \cdot N^2 \cdot T is required; for V = 100, N = 5, T = 40, this is about 10^5 computations.
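Steps 2a-2c amount to an argmax over per-word model scores. A sketch (Python; mine, with a toy scoring function standing in for e.g. a scaled forward pass over real word HMMs):

```python
def recognize(obs, word_models, score):
    """Return v* = argmax_v score(obs, lambda_v), where word_models maps
    word -> model and score returns a (log-)likelihood."""
    return max(word_models, key=lambda v: score(obs, word_models[v]))

# toy stand-in: each "model" is a favorite symbol; score counts matches
models = {"yes": 0, "no": 1}
score = lambda obs, m: sum(1 for o in obs if o == m)
best = recognize([0, 0, 1, 0], models, score)
# best == "yes"
```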

SLIDE 77

Isolated Word HMM Recognizer

SLIDE 78

Choice of Model Parameters

1. Left-right model preferable to ergodic model (speech is a left-right process)
2. Number of states in range 2-40 (from sounds to frames)

  • Order of number of distinct sounds in the word
  • Order of average number of observations in word

3. Observation vectors

  • Cepstral coefficients (and their second and third order derivatives) derived from LPC (1-9 mixtures), diagonal covariance matrices
  • Vector quantized discrete symbols (16-256 codebook sizes)

4. Constraints on b_j(O) densities

  • b_j(k) > ε for discrete densities
  • c_{jm} > δ, U_{jm}(r,r) > δ for continuous densities
SLIDE 79

Performance vs. Number of States in Model

SLIDE 80

HMM Feature Vector Densities

SLIDE 81

Segmental K-Means Segmentation into States

Motivation: derive good estimates of the b_j(O) densities as required for rapid convergence of the re-estimation procedure.

Initially: training set of multiple sequences of observations; initial model estimate.

Procedure: segment each observation sequence into states using a Viterbi procedure. For discrete observation densities, code all observations in state j using the M-codeword codebook, giving

b_j(k) = (number of vectors with codebook index k in state j) / (number of vectors in state j)

For continuous observation densities, cluster the observations in state j into a set of M clusters, giving

SLIDE 82

Segmental K-Means Segmentation into States

c_{jm} = (number of vectors assigned to cluster m of state j) / (number of vectors in state j)

μ_{jm} = sample mean of the vectors assigned to cluster m of state j

U_{jm} = sample covariance of the vectors assigned to cluster m of state j

Use as the estimate of the state transition probabilities:

a_{ii} = (number of vectors in state i minus the number of observation sequences for the training word) / (number of vectors in state i)

a_{i,i+1} = 1 - a_{ii}

The segmenting HMM is updated and the procedure is iterated until a converged model is obtained.
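The b_j(k) estimate above is a simple relative-frequency count over the Viterbi state labels. A sketch (Python; mine, with invented variable names):

```python
def estimate_discrete_b(state_labels, codebook_indices, N, M):
    """b_j(k) = (# vectors in state j with codebook index k)
              / (# vectors in state j).
    state_labels[n] is the Viterbi state of vector n; codebook_indices[n]
    is its VQ codeword; N states, M codewords."""
    counts = [[0] * M for _ in range(N)]
    totals = [0] * N
    for j, k in zip(state_labels, codebook_indices):
        counts[j][k] += 1
        totals[j] += 1
    return [[counts[j][k] / totals[j] for k in range(M)] for j in range(N)]

b = estimate_discrete_b([0, 0, 1, 1], [0, 1, 1, 1], N=2, M=2)
# b = [[0.5, 0.5], [0.0, 1.0]]
```

The zero entry in b illustrates why the b_j(k) ≥ ε constraint (slide 73) matters when training data are sparse.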

SLIDE 83

Segmental K-Means Training

SLIDE 84

HMM Segmentation for /SIX/

SLIDE 85

Digit Recognition Using HMM's

[Figure: unknown digit utterance -- log energy, frame likelihood scores, frame cumulative scores, and state segmentation against the model for "nine"]

SLIDE 86

Digit Recognition Using HMM's

[Figure: unknown digit utterance -- log energy, frame likelihood scores, frame cumulative scores, and state segmentation against the models for "seven" and "six"]

SLIDE 87