Pattern Recognition
Part 8: Hidden Markov Models (HMMs)
Gerhard Schmidt
Christian-Albrechts-Universität zu Kiel, Faculty of Engineering, Institute of Electrical and Information Engineering, Digital Signal Processing and System Theory
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 2
❑ Motivation
❑ Fundamentals
   ❑ The "hidden" part of the model
   ❑ The inner family of random processes
❑ Fundamental problems of hidden Markov models
   ❑ Efficient calculation of sequence probabilities
   ❑ Efficient calculation of the most probable sequence
   ❑ Calculation (estimation) of the model parameters
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 3
❑ In the previous approaches (vector quantization, Gaussian mixture models), only the probability distribution of multi-dimensional data vectors was analyzed and used. Their temporal progression was assumed to be uncorrelated.
❑ If the temporal progression of the observed data vectors is also to be analyzed, the previous models can be extended by a temporal component. This new component will again be derived on a statistical basis.
❑ In hidden Markov models, two (or three) statistical components are nested.
❑ While both discrete and continuous probability distributions can be used for the multivariate amplitude distributions, the temporal modeling will be done discretely.
Modeling of temporal dependencies
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 4
Hidden Markov Models
❑ B. Pfister, T. Kaufmann: Sprachverarbeitung, Springer, 2008 (in German)
❑ C. M. Bishop: Pattern Recognition and Machine Learning, Springer, 2006
❑ L. Rabiner, B.-H. Juang: Fundamentals of Speech Recognition, Prentice Hall, 1993
❑ B. Gold, N. Morgan: Speech and Audio Signal Processing, Wiley, 2000
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 5
❑ The hidden part of the model is assumed to be a Markov process with N states S_1, …, S_N. These states are not observable. For the state transitions from one discrete state to another, probabilities are specified.
❑ The hidden states govern a second family of random processes, which result in the observable sequence of vectors X = [x(1), x(2), …, x(T)].
❑ The sequence of hidden states is denoted as s = (s(1), s(2), …, s(T)), where the elements each correspond to one of the hidden states, respectively: s(n) ∈ {S_1, …, S_N}.
Hidden part of the model (random process) in the Markov model
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 6
❑ As soon as the model gets into a new state, the model generates an observation vector. Its distribution is only dependent on the new state s(n), but not on previous ones. In the following, this (emission) probability is denoted as p(x(n) | s(n) = S_j).
❑ The state transitions are specified (surprise!) by probabilities. These transition probabilities depend only on the current transition's source and target state, but not on previous states.
Hidden part of the model (random process) in the Markov model
[Diagram: a hidden Markov model with transition probabilities between the states and an emission probability per state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 7
❑ The transition probabilities are abbreviated as follows: a_ij = P(s(n) = S_j | s(n-1) = S_i).
❑ The initial and final states of an HMM are called S_1 (initial state) and S_N (final state). Both states are modeled as "non-emitting". The direct transition from the initial to the final state is forbidden – no observation would be created in this case. I.e., for the transition probabilities, the following holds:
   a_1N = 0 (direct transition from initial to final state),
   a_Nj = 0 for all j (transitions that leave the final state),
   a_i1 = 0 for all i (transitions that enter the initial state).
Hidden part of the model (random process) in the Markov model
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 8
[Diagram: HMM states with transition probabilities at the edges and an emission probability per state]
Hidden part of the model (random process) in the Markov model
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 9
❑ The transition probabilities of the model are combined in a transition matrix A = {a_ij}, with i, j = 1, …, N.
❑ The constraints are: a_ij ≥ 0 and Σ_j a_ij = 1 for every state S_i that has outgoing transitions.
Hidden part of the model (random process) in the Markov model
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 10
Hidden Markov models of the type “left to right”
[Figure: structure of a left-to-right Markov model and its transition matrix]
❑ Initial, final, and three emitting states are shown.
❑ Transitions from right to left are not possible.
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 11
Linear hidden Markov models
[Figure: structure of a linear hidden Markov model and its transition matrix]
❑ Initial, final, and three emitting states are shown.
❑ Only transitions to the state itself and to the right neighbor are possible. Consequently, a state sequence passes through every emitting state at least once. (A small sketch of building such transition matrices follows below.)
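To make the structure concrete, here is a minimal numpy sketch (state indices and probability values are made up for illustration) of a left-to-right transition matrix with non-emitting initial and final states; a linear model would additionally set the "skip" entries to zero.

```python
# Minimal sketch (illustrative values): left-to-right transition matrix with a
# non-emitting initial state (index 0), three emitting states (1..3), and a
# non-emitting final state (index 4). Forbidden transitions keep probability 0.
import numpy as np

N = 5
A = np.zeros((N, N))
A[0, 1] = 1.0                                  # initial state always enters the first emitting state
A[1, 1], A[1, 2], A[1, 3] = 0.6, 0.3, 0.1      # self-loop, right neighbor, and a "skip" transition
A[2, 2], A[2, 3] = 0.7, 0.3
A[3, 3], A[3, 4] = 0.8, 0.2                    # last emitting state may enter the final state

# For a linear model, the skip transition A[1, 3] would also be zero.
# Every state with outgoing transitions must have a row sum of one.
assert np.allclose(A[:4].sum(axis=1), 1.0)
```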
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 12
❑ In order to generate the observation vectors, another random process is assigned to each state. It can be modeled either as a discrete or as a continuous process.
❑ If the generation of the observations is modeled as N-2 discrete processes and each process may have K discrete output symbols o_1, …, o_K, the emission probabilities can be written as b_j(k) = P(x(n) = o_k | s(n) = S_j). Again, the following constraints hold: b_j(k) ≥ 0 and Σ_k b_j(k) = 1.
Generation of observations by a random process
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 13
❑ If the generation of observations is modeled as continuous processes using multivariate Gaussian densities (GMMs), then the applied probabilities can be defined as follows,
   p(x(n) | s(n) = S_j) = Σ_{k=1…K} c_jk · N(x(n) | μ_jk, Σ_jk),  with c_jk ≥ 0 and Σ_k c_jk = 1,
assuming that K Gaussian distributions are used per state. The Gaussian distributions are defined as in the GMM lecture, with
   N(x | μ, Σ) = (2π)^(−D/2) · |Σ|^(−1/2) · exp{ −(x − μ)^T Σ^(−1) (x − μ) / 2 }.
Generation of observations by a random process
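As an illustration, a minimal numpy sketch of such a state-wise GMM emission probability (function names and example values are assumptions, not the lecture's code):

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Multivariate Gaussian density, defined as in the GMM lecture."""
    d = len(mean)
    diff = x - mean
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm

def gmm_emission(x, weights, means, covs):
    """Emission probability of observation x for one state with K Gaussian components."""
    return sum(c * gaussian_pdf(x, m, S) for c, m, S in zip(weights, means, covs))

# Tiny example: one state with two components in a 2-dimensional feature space.
weights = np.array([0.4, 0.6])
means = [np.zeros(2), np.ones(2)]
covs = [np.eye(2), 0.5 * np.eye(2)]
print(gmm_emission(np.array([0.5, 0.5]), weights, means, covs))
```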
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 14
Generation of observations by a random process
[Figure: HMM with a non-emitting initial state, a non-emitting final state, and one Gaussian mixture model per emitting state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 15
We assume an HMM of this structure. The initial state always leads to the first (non-initial) state. [Trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 16
Based on state 1, only transitions to the states 1, 2, and 3 are possible. [Trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 17
All possible transitions based … [truncated caption; trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 18
All possible transitions based … [truncated caption; trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 19
All possible transitions based … [truncated caption; trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 20
All possible transitions from time index 2 to time index 3 are plotted. [Trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 21
Now, all possible transitions of an observation sequence of length 10 are plotted. [Trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 22
❑ The transition probabilities are usually denoted at the edges.
❑ The emission probability that the observed vector is produced by the corresponding state is denoted at the nodes.
Meaning of edges and nodes
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 23
❑ The probability that the hidden Markov model creates the (given) observation sequence is to be calculated.
❑ In order to calculate this probability, all possible state sequences have to be taken into account. The direct calculation (summing over all possible state sequences) would thus be very time consuming.
Evaluation problem
❑ Besides the probability calculated above, the state sequence that creates the observation sequence with the highest probability is also of interest.
Decoding problem
❑ Based on a large database, all parameters of the hidden Markov model are to be estimated.
Estimation problem
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 24
❑ The probability that the hidden Markov model creates the (given) observation sequence is to be found.
❑ The desired probability can be calculated by summing up the conditional production probabilities of all possible state sequences.
❑ This can be written as follows: P(X) = Σ_{all s} P(X | s) · P(s).
❑ In the following, we will try to calculate the two conditional probabilities separately.
Evaluation problem
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 25
❑ In a first step, the production probability is calculated that results from the assumption that the state sequence s is known. We use the fact that the probability of an observation only depends on the current state of the HMM – but not on previous or subsequent states:
   P(X | s) = Π_{n=1…T} p(x(n) | s(n)).
❑ The probability that the sequence s has been selected can be evaluated as follows:
   P(s) = Π_{n=1…T+1} a_{s(n-1) s(n)},  where s(0) denotes the (non-emitting) initial state S_1 and s(T+1) the (non-emitting) final state S_N.
Evaluation problem
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 26
❑ The production probability results in
   P(X) = Σ_{all s} P(X | s) · P(s) = Σ_{all s} Π_{n=1…T} [ a_{s(n-1) s(n)} · p(x(n) | s(n)) ] · a_{s(T) s(T+1)}.
❑ The problem when directly calculating the production probability is the fact that per time index, there are N-2 possible states. As a result, for the overall sequence, (N-2)^T possible paths exist, so the number of summands is no longer manageable.
❑ As a remedy, the so-called forward algorithm is used. For this purpose, the so-called forward probability is defined in a first step:
   α_i(n) = P(X^(n), s(n) = S_i).
This is the probability that at time index n, the state S_i is active and the "shortened" observation sequence X^(n) could be observed up to now.
Evaluation problem
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 27
❑ The upper indices specify the shortened versions of the observation matrix and of the state sequence, respectively:
   X^(n) = [x(1), x(2), …, x(n)],   s^(n) = (s(1), s(2), …, s(n)).
❑ The forward probability can be determined by summing over all possible shortened state sequences that end in state S_i at time index n:
   α_i(n) = Σ_{all s^(n) with s(n) = S_i} P(X^(n), s^(n)).
Evaluation problem
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 28
Illustration of the forward probabilities. [Trellis diagram; axes: time index vs. state]
Evaluation problem
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 29
❑ Because of the independence from the previous states, the forward probabilities can be calculated recursively as follows:
   α_j(n) = [ Σ_i α_i(n-1) · a_ij ] · p(x(n) | S_j).
❑ The initialization is done as follows:
   α_j(1) = a_1j · p(x(1) | S_j).
❑ Thereby, the production probability of the observed sequence can be determined by summation of the final forward probabilities, weighted with the transitions into the final state:
   P(X) = Σ_i α_i(T) · a_iN.
❑ Note that the computational complexity now grows only linearly with the sequence length (instead of growing exponentially as for the direct calculation).
Evaluation problem
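A minimal numpy sketch of this recursion (variable names and the handling of the non-emitting initial/final states are assumptions made for illustration):

```python
import numpy as np

def forward(a_in, A, a_out, B):
    """Forward algorithm for an HMM with non-emitting initial and final states.

    a_in : (Ne,)     transition probabilities from the initial state into each emitting state
    A    : (Ne, Ne)  transition probabilities between the emitting states
    a_out: (Ne,)     transition probabilities from each emitting state into the final state
    B    : (T, Ne)   B[n, j] = emission probability p(x(n) | S_j)
    Returns the production probability P(X).
    """
    T, Ne = B.shape
    alpha = np.zeros((T, Ne))
    alpha[0] = a_in * B[0]                    # initialization
    for n in range(1, T):                     # recursion: sum over all predecessor states
        alpha[n] = (alpha[n - 1] @ A) * B[n]
    return alpha[-1] @ a_out                  # termination: transition into the final state
```

Each loop iteration costs O(Ne²), so the total effort grows linearly with the sequence length T.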
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 30
❑ Besides the probability that the hidden Markov model created the observation vector sequence X, some applications require the most probable state sequence. The latter can be defined as follows:
   s_opt = argmax_s P(s | X).
❑ The conditional probability mentioned above can be rewritten (Bayes' rule):
   P(s | X) = P(X | s) · P(s) / P(X).
❑ Because P(X) only depends on the (given) observation sequence, P(X | s) · P(s) = P(X, s) can be optimized instead. By this reformulation of the cost function, quantities similar to those of the previous problem can be used.
Decoding problem
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 31
❑ The most probable state sequence can be calculated efficiently using the so-called Viterbi algorithm. In analogy to the evaluation problem, the joint probability of the shortened observation vector sequence and the most probable shortened state sequence ending in state S_j is defined:
   δ_j(n) = max_{s^(n-1)} P(X^(n), s^(n-1), s(n) = S_j).
❑ The calculation of this probability can again be performed in a recursive way:
   δ_j(n) = max_i [ δ_i(n-1) · a_ij ] · p(x(n) | S_j).
❑ For each time index and each state, the index of the state that induced the maximum probability has to be stored, so that the optimal path can be tracked later on.
Decoding problem
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 32
❑ Initialization
❑ Recursion (iteration)
❑ Termination
❑ Backtracking of the optimal state sequence
Summary of the Viterbi algorithm
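These four steps can be sketched in a few lines (a minimal numpy sketch with assumed variable names, using the same model layout as the forward() sketch above; log-probabilities avoid numerical underflow for long sequences):

```python
import numpy as np

def viterbi(a_in, A, a_out, B):
    """Most probable state sequence for the model layout used in forward() above."""
    T, Ne = B.shape
    log = lambda p: np.log(np.where(p > 0, p, 1e-300))   # guard against log(0)
    delta = np.zeros((T, Ne))                 # best log-probability of any path ending in state j
    psi = np.zeros((T, Ne), dtype=int)        # back-pointer to the best predecessor state

    delta[0] = log(a_in) + log(B[0])                      # initialization
    for n in range(1, T):                                 # recursion
        cand = delta[n - 1][:, None] + log(A)             # cand[i, j]: best path ending in i, then i -> j
        psi[n] = cand.argmax(axis=0)
        delta[n] = cand.max(axis=0) + log(B[n])

    last = int(np.argmax(delta[-1] + log(a_out)))         # termination: include the exit transition
    path = [last]
    for n in range(T - 1, 0, -1):                         # backtracking of the optimal state sequence
        path.append(int(psi[n][path[-1]]))
    return path[::-1]
```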
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 33
Initialization. [Trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 34
Recursion for the first (non-initial) state. [Trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 35
Recursion for the first (non-initial) state (continued). [Trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 36
Recursion for the second state. [Trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 37
Recursion for the second state (continued). [Trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 38
Recursion for the third state. [Trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 39
Recursion for the third state (continued). [Trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 40
Recursion for the fourth state. [Trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 41
Recursion for the fourth state (continued). [Trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 42
Complete recursion. [Trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 43
Termination. [Trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 44
Backtracking of the optimal state sequence. [Trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 45
Basics
[Figure: HMM with non-emitting initial and final states, one Gaussian mixture model per emitting state, transition probabilities at the edges, and emission probabilities at the states]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 46
[Diagram: the generation example starts in the initial state; the result generated so far is still empty]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 47
Determining the first transition. [Diagram: transition probabilities leaving the initial state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 48
Generating the first observation vector. [Diagram: emission from the Gaussian mixture model of the current state; the result generated so far]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 49
Determining the second transition. [Diagram: transition probabilities of the current state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 50
Generating the second observation vector. [Diagram: emission from the Gaussian mixture model of the current state; the result generated so far]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 51
Determining the third transition. [Diagram: transition probabilities of the current state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 52
Generating the third observation vector. [Diagram: emission from the Gaussian mixture model of the current state; the result generated so far]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 53
Determining the fourth transition. [Diagram: transition probabilities of the current state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 54
Reaching the final state. [Diagram: the overall generated observation sequence]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 55
❑ After the model topology has been defined, the model parameters are to be estimated (main subject of the next slides).
[Diagram: initial state, first and second model states, final state, with transition and emission probabilities]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 56
❑ After the model topology has been defined, the model parameters are to be estimated.
❑ The probability that a model generates an observed feature sequence has to be calculated in an efficient way (subject of the previous slides).
[Diagram: an observation sequence evaluated against Model 1 and Model 2]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 57
❑ After the model topology has been defined, the model parameters are to be estimated.
❑ The probability that a model generates an observed feature sequence has to be calculated in an efficient way.
❑ The state sequence that generates the observed feature sequence with the highest probability has to be calculated efficiently (also subject of the previous slides!).
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 58
❑ Please help to improve the lecture by filling out our survey …
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 59
❑ For one or more given observation sequences, the parameters (transition and emission probabilities) are to be found in such a way that the probability of the model generating these observation sequences is maximized.
❑ To do so, we assume that an initial HMM already exists. This model is optimized iteratively until a certain convergence criterion is fulfilled.
❑ The known iteration methods are only able to find local maxima.
❑ The most common method is based on a maximum likelihood estimation and is called the Baum-Welch or forward-backward algorithm.
Estimation problem
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 60
❑ In analogy to the forward probability (see previous slides),
   α_i(n) = P(X^(n), s(n) = S_i),
we now introduce the backward probability
   β_i(n) = P(x(n+1), x(n+2), …, x(T) | s(n) = S_i).
The partial observation sequence comprises all observations following the nth time index up to the end of the sequence.
❑ The backward probability, similar to the forward probability, can be calculated recursively:
   β_i(n) = Σ_j a_ij · p(x(n+1) | S_j) · β_j(n+1).
❑ The initialization is done as follows:
   β_i(T) = a_iN.
Backward probability
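A minimal numpy sketch of this backward recursion (assumed variable names, matching the forward() sketch above):

```python
import numpy as np

def backward(A, a_out, B):
    """Backward probabilities for the model layout used in forward() above.

    beta[n, i] = probability of observing x(n+1), ..., x(T) and reaching the final
    state, given that state i is active at time index n.
    """
    T, Ne = B.shape
    beta = np.zeros((T, Ne))
    beta[-1] = a_out                               # initialization: transition into the final state
    for n in range(T - 2, -1, -1):                 # recursion, backwards in time
        beta[n] = A @ (B[n + 1] * beta[n + 1])
    return beta
```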
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 61
[Trellis diagram illustrating the forward and the backward probability; axes: time index vs. state]
Forward and backward probability
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 62
Probability distribution over states
❑ Using the forward and backward probabilities, we can calculate the probability that the state S_i is active at time index n:
   γ_i(n) = P(s(n) = S_i | X) = α_i(n) · β_i(n) / P(X).
❑ The "normalization" P(X) can be calculated either using the forward or the backward probabilities:
   P(X) = Σ_i α_i(T) · a_iN = Σ_j a_1j · p(x(1) | S_j) · β_j(1).
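In code this is essentially a one-liner per time index; a minimal sketch (assumed variable names, using the alpha and beta arrays from the sketches above):

```python
import numpy as np

def state_posteriors(alpha, beta):
    """gamma[n, i] = probability that state i is active at time index n, given the data."""
    gamma = alpha * beta                               # proportional to P(X, s(n) = S_i)
    return gamma / gamma.sum(axis=1, keepdims=True)    # the per-time-index sum equals P(X)
```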
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 63
Probability distribution over states
The state S_i is active at time index n. [Trellis diagram; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 64
Transition probabilities
❑ Using the forward and backward probabilities, we can also easily calculate the probability that the state of the hidden Markov model changes from state S_i to state S_j at time index n:
   ξ_ij(n) = P(s(n-1) = S_i, s(n) = S_j | X) = α_i(n-1) · a_ij · p(x(n) | S_j) · β_j(n) / P(X).
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 65
Transition probabilities
[Trellis diagram: a transition from state S_i to state S_j; axes: time index vs. state]
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 66
Estimation of the Markov transition probabilities
❑ For the next iteration, the following transition probabilities are used:
   â_ij = [ Σ_n ξ_ij(n) ] / [ Σ_n Σ_j' ξ_ij'(n) ],
i.e., the expected average number of transitions from state S_i to state S_j, divided by the expected average number of state transitions that start in state S_i.
❑ Additionally, the parameters mentioned above are to be calculated based on multiple observation sequences X and averaged before being used in the next step.
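A minimal numpy sketch of this re-estimation step for the transitions between the emitting states (assumed variable names; here xi[n] describes the transition between time indices n and n+1, and the exit transitions into the final state are omitted):

```python
import numpy as np

def reestimate_transitions(alpha, beta, A, B):
    """Baum-Welch update of the transition probabilities between emitting states."""
    T, Ne = alpha.shape
    xi = np.zeros((T - 1, Ne, Ne))       # xi[n, i, j]: transition i -> j between time indices n and n+1
    for n in range(T - 1):
        xi[n] = alpha[n][:, None] * A * (B[n + 1] * beta[n + 1])[None, :]
        xi[n] /= xi[n].sum()             # normalization by P(X)

    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)

    # expected number of i -> j transitions / expected number of transitions leaving state i
    return xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
```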
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 67
Emission probabilities
❑ In order to determine the individual parameters of the Gaussian densities, in a first step the states with multiple Gaussians are split into multiple parallel states with just one Gaussian each.
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 68
Emission probabilities
❑ In analogy to the first approach, individual transition probabilities can be calculated for this extended model: the probability that a transition from state S_i into state S_j was performed at time index n while the k-th Gaussian of state S_j was creating the observation vector.
❑ These can again be expressed by forward and backward probabilities:
   ξ_ijk(n) = α_i(n-1) · a_ij · c_jk · N(x(n) | μ_jk, Σ_jk) · β_j(n) / P(X).
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 69
❑ Summing these transition probabilities over all predecessor states results in the probability that the k-th Gaussian of the j-th state generated the observed vector at time index n:
   γ_jk(n) = Σ_i ξ_ijk(n).
❑ Now, analogously to the "main" transition probabilities, the GMM parameters can also be determined by iteration.
Emission probabilities
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 70
❑ The emission probability was defined as follows:
   p(x(n) | s(n) = S_j) = Σ_{k=1…K} c_jk · N(x(n) | μ_jk, Σ_jk).
❑ The adaptation of the weights is done as follows:
   ĉ_jk = [ Σ_n γ_jk(n) ] / [ Σ_n Σ_k' γ_jk'(n) ],
i.e., the expected average number of observations generated by the k-th Gaussian of state S_j, divided by the expected average number of observations generated by state S_j at all.
❑ The adaptation of the mean vectors is done as follows:
   μ̂_jk = [ Σ_n γ_jk(n) · x(n) ] / [ Σ_n γ_jk(n) ].
Adaptation of the GMM parameters
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 71
❑ The adaptation of the covariance matrices is performed as follows:
   Σ̂_jk = [ Σ_n γ_jk(n) · (x(n) − μ̂_jk)(x(n) − μ̂_jk)^T ] / [ Σ_n γ_jk(n) ].
Adaptation of the GMM parameters
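A minimal numpy sketch of these three update equations for one state (assumed variable names; gamma_jk would be obtained from the transition probabilities of the previous slides):

```python
import numpy as np

def update_gmm(X, gamma_jk):
    """Re-estimate weights, means, and covariances of the GMM of one state j.

    X        : (T, D)  observation vectors
    gamma_jk : (T, K)  probability that component k of this state generated x(n)
    """
    occ = gamma_jk.sum(axis=0)                          # expected occupation of each component
    weights = occ / occ.sum()                           # new mixture weights c_jk
    means = (gamma_jk.T @ X) / occ[:, None]             # new mean vectors mu_jk
    covs = []
    for k in range(gamma_jk.shape[1]):                  # new covariance matrices Sigma_jk
        diff = X - means[k]
        covs.append((gamma_jk[:, k, None] * diff).T @ diff / occ[k])
    return weights, means, np.array(covs)
```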
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 72
Viterbi training
❑ The method to estimate the model parameters that was described above is called the Baum-Welch algorithm. It is a special case of the EM algorithm that was described in the GMM lecture.
❑ Alternatively, the so-called Viterbi training can be applied. To do so, in a first step the state sequence s_opt with the highest probability is computed.
❑ Then it is assumed that this path was taken with certainty, i.e., it holds:
   γ_i(n) = 1 if s_opt(n) = S_i, and γ_i(n) = 0 otherwise.
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 73
Viterbi training
❑ For the internal transitions, the following consequently holds: â_ij is the number of transitions from S_i to S_j along the optimal path, divided by the number of transitions along the optimal path that start in S_i.
❑ The subsequent iterations to optimize the model parameters are performed as described for the Baum-Welch algorithm.
❑ Similar to the Baum-Welch algorithm, the iterations are performed until the probability that the model generates the training sequences no longer increases significantly.
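A minimal sketch of the hard state assignment used here (assumed names; the path would come from the viterbi() sketch above):

```python
import numpy as np

def hard_posteriors(path, n_states):
    """Replace the soft state posteriors by the indicator of the Viterbi path."""
    T = len(path)
    gamma = np.zeros((T, n_states))
    gamma[np.arange(T), path] = 1.0    # the optimal path is assumed to be taken with certainty
    return gamma
```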
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 74
Initializing a hidden Markov model
❑ In a first step, the number of states and their topology are defined (forbidden transitions are marked, i.e. their probability is set to zero).
❑ Per state, just one Gaussian distribution is used at first.
❑ While the training is running, the number of Gaussian distributions is gradually increased. For example, the number of Gaussian distributions is doubled and the new components are initialized by splitting the existing ones (a common splitting scheme is sketched below).
❑ This is repeated until the probability that the model generates the training sequences no longer increases significantly.
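One common splitting scheme (an assumption for illustration; the slide's exact initialization formula is not reproduced here): each Gaussian is duplicated, its weight halved, and the two copies of the mean are perturbed in opposite directions by a fraction of the standard deviation.

```python
import numpy as np

def split_gaussians(weights, means, covs, eps=0.2):
    """Double the number of Gaussians by splitting each component (one common heuristic)."""
    new_w, new_m, new_c = [], [], []
    for c, mu, S in zip(weights, means, covs):
        offset = eps * np.sqrt(np.diag(S))     # perturbation proportional to the standard deviation
        new_w += [0.5 * c, 0.5 * c]            # the weight is shared between the two copies
        new_m += [mu + offset, mu - offset]    # the two mean copies are pushed apart
        new_c += [S.copy(), S.copy()]          # the covariance matrices are copied unchanged
    return np.array(new_w), np.array(new_m), np.array(new_c)
```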
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 75
Partner exercise:
❑ Please answer (in groups of two people) the questions that you will get during the lecture!
Digital Signal Processing and System Theory | Pattern Recognition | Hidden Markov Models (HMMs) Slide 76
Summary:
❑ Motivation
❑ Basics
   ❑ The "hidden" part of the model
   ❑ The "inner" random processes
❑ Basic problems of hidden Markov models
   ❑ Efficient computation of sequence probabilities
   ❑ Efficient computation of the most probable sequence
   ❑ Computation (estimation) of the parameters of the model
Next week:
❑ Speaker and speech recognition