

SLIDE 1

Hidden Markov Models (Ch. 15)

SLIDE 2

Announcements

Homework 2 posted

Programming:

  • Python (preferred)
  • Java
  • Matlab
SLIDE 3

Markov... chains?

Recap, Markov property: the next state only depends on the current state. For Gibbs sampling, we made a lot of samples using the Markov property (since this is one-dimensional, it looks like a “chain”).

[Diagram: a chain of Gibbs samples x0 → x1 → x2, e.g. (a, b, ¬c, d) → (a, ¬b, ¬c, d) → (a, ¬b, ¬c, d), resampling “pick B” then “pick C”]
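In symbols, the Markov property stated above is just:

  P(xt+1|x0:t) = P(xt+1|xt)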

SLIDE 4

Markov... chains?

For the next bit, we will still have the “Markov” property and uncertainty (i.e. probabilities). However, we will add partial observability (some things we cannot see). These are often called Hidden Markov Models (not “chains”, because they won’t be 1D... w/e).

SLIDE 5

Hidden Markov Models

For Hidden Markov Models (HMMs), often:
(1) E = the evidence
(2) X = the hidden/not-observable part

We assume the hidden information is what causes the evidence (otherwise this would be quite easy).

[Diagram: chain x0 → x1 → x2 → x3 → x4 → ..., with evidence e1, e2, e3, e4, ... hanging off each state; we only know the e’s]
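Written out, the picture above encodes the standard HMM factorization (same assumption: the hidden states cause the evidence):

  P(x0:t, e1:t) = P(x0) · P(x1|x0) P(e1|x1) · P(x2|x1) P(e2|x2) · ... · P(xt|xt−1) P(et|xt)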

SLIDE 6

Hidden Markov Models

If you squint a bit, this is actually a Bayesian network as well (though it can go on for a while). For simplicity’s sake, we will assume the probabilities of going to the right (next state) and down (seeing evidence) are the same for all subscripts (typically “time”).

[Diagram: chain x0 → x1 → x2 → x3 → x4 → ... with evidence e1, e2, e3, e4, ...]

P(xt+1|xt) = 0.6,  P(xt+1|¬xt) = 0.9,  P(et|xt) = 0.3,  P(et|¬xt) = 0.8

SLIDE 7

Hidden Markov Models

Our example will be sleep deprivation. So variable Xt will be whether a person got enough sleep on day t. This person is not you, but you see them every day, and you can tell if their eyes are bloodshot (this is Et).

SLIDE 8

Hidden Markov Models

As we will be dealing with quite a few variables, we will introduce some notation: E1:t = E1, E2, E3, ..., Et (similarly for X0:t). So P(E1:t) = P(E1, E2, E3, ..., Et), which is the normal definition of commas, as in P(a,b). We will assume we only know E1:t (and X0), and want to figure out Xk for various k.

SLIDE 9

Hidden Markov Models

Quick Bayesian network recap:

[Diagram: a small Bayesian network over variables a, b, c, d]

We used this fact tons in our sampling...
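The fact in question is presumably the usual Bayes-net factorization (the structure of the little a, b, c, d network isn’t shown here, so written generally):

  P(X1, ..., Xn) = Π i P(Xi | Parents(Xi))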

SLIDE 10

Hidden Markov Models

So in this (bigger) Bayesian network, we typically use the above to compute four things:

  • Filtering
  • Prediction
  • Smoothing
  • Most Likely Explanation (MLE)

[Diagram: chain x0 → x1 → x2 → x3 → x4 → ... with evidence e1, e2, e3, e4, ...]

P(xt+1|xt) = 0.6,  P(xt+1|¬xt) = 0.9,  P(et|xt) = 0.3,  P(et|¬xt) = 0.8

SLIDE 11

Filtering in HMMs

All four of these are actually quite similar, and you can probably already find them. The only difficulty is the size of the Bayesian network, so let’s start small to get intuition:

[Diagram: x0 → x1, with evidence e1 observed]

P(xt+1|xt) = 0.6,  P(xt+1|¬xt) = 0.9,  P(et|xt) = 0.3,  P(et|¬xt) = 0.8,  P(x0) = 0.5

How can you find P(x1|¬e1)? (this is a simple Bayes net)

SLIDE 12

Filtering in HMMs

[Diagram: x0 → x1, with evidence e1 observed]

P(xt+1|xt) = 0.6,  P(xt+1|¬xt) = 0.9,  P(et|xt) = 0.3,  P(et|¬xt) = 0.8,  P(x0) = 0.5

P(x1|¬e1) ≈ 0.525α, and similarly P(¬x1|¬e1) ≈ 0.05α. So normalizing gives: P(x1|¬e1) ≈ 0.913

91% chance I slept last night, given today I didn’t have bloodshot eyes
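Spelling out the computation behind those numbers (sum out x0, then condition on ¬e1; the values match the slide):

  P(x1|¬e1)  = α P(¬e1|x1) [ P(x1|x0)P(x0) + P(x1|¬x0)P(¬x0) ]
             = α · 0.7 · (0.6·0.5 + 0.9·0.5) ≈ 0.525α
  P(¬x1|¬e1) = α P(¬e1|¬x1) [ P(¬x1|x0)P(x0) + P(¬x1|¬x0)P(¬x0) ]
             = α · 0.2 · (0.4·0.5 + 0.1·0.5) = 0.05α
  Normalizing: 0.525 / (0.525 + 0.05) ≈ 0.913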

SLIDE 13

Filtering in HMMs

[Diagram: x0 → x1 → x2, with evidence e1 and e2 observed]

P(xt+1|xt) = 0.6,  P(xt+1|¬xt) = 0.9,  P(et|xt) = 0.3,  P(et|¬xt) = 0.8,  P(x0) = 0.5

Find: P(x2|¬e1,¬e2)

SLIDE 14

Filtering in HMMs

P(x2|¬e1,¬e2) = α P(¬e2|x2) Σx1 P(x2|x1) P(x1|¬e1)

The last factor is what we just computed: it is P(x1|¬e1). (Expanding it out as well gives a double sum?!?! Double... for loop..?)

... after normalizing you should get: ≈0.854

SLIDE 15

Filtering in HMMs

Changing x2 to ¬x2 in the same equation: P(¬x2|¬e1,¬e2) ≈ 0.075α

... after normalizing you should get: P(x2|¬e1,¬e2) ≈ 0.854
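Plugging in the numbers (using P(x1|¬e1) ≈ 0.913 from before; just arithmetic, matching the slide’s ≈0.075α and ≈0.854):

  P(x2|¬e1,¬e2)  ∝ P(¬e2|x2) [ P(x2|x1)·0.913 + P(x2|¬x1)·0.087 ]
                 = 0.7 · (0.6·0.913 + 0.9·0.087) ≈ 0.438
  P(¬x2|¬e1,¬e2) ∝ 0.2 · (0.4·0.913 + 0.1·0.087) ≈ 0.075
  Normalizing:     0.438 / (0.438 + 0.075) ≈ 0.854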

SLIDE 16

Filtering in HMMs

In general:
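The general step referred to here is the standard filtering update, written in the same notation as the example:

  P(xt+1|e1:t+1) = α P(et+1|xt+1) Σxt P(xt+1|xt) P(xt|e1:t)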

SLIDE 17

Filtering in HMMs

In general: ... the last factor is the same expression again, just with a different ‘t’

SLIDE 18

Filtering in HMMs

In general: actually, this is just a recursive function

SLIDE 19

Filtering in HMMs

So we can compute f(t) = P(xt|e1:t). Of course, we don’t actually want to do this recursively... rather with dynamic programming: start with f(0) = P(x0), then use this to find f(1)... and so on (can either store them in an array, or just have a single variable... like Fibonacci).

(Strictly, you need both the T and F entries for the sum, so keep both... or keep one and use 1 minus it.)
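As a rough sketch of that dynamic-programming loop in Python (the homework’s preferred language); the model numbers are the slides’, but the function name and interface here are made up for illustration:

  # Model parameters from the slides
  P_X0 = 0.5                          # P(x0)
  P_TRANS = {True: 0.6, False: 0.9}   # P(x_{t+1} | x_t) and P(x_{t+1} | ¬x_t)
  P_EVID  = {True: 0.3, False: 0.8}   # P(e_t | x_t)     and P(e_t | ¬x_t)

  def filter_hmm(evidence):
      """Return f(t) = P(x_t | e_{1:t}) for t = len(evidence),
      where evidence is a list of booleans e_1..e_t (True = bloodshot eyes)."""
      f = P_X0  # one number is enough: the 'false' entry is just 1 - f
      for e in evidence:
          # predict: sum over the previous state
          pred = P_TRANS[True] * f + P_TRANS[False] * (1 - f)
          # weight by how likely this evidence is under each state, then normalize
          w_true  = (P_EVID[True]  if e else 1 - P_EVID[True])  * pred
          w_false = (P_EVID[False] if e else 1 - P_EVID[False]) * (1 - pred)
          f = w_true / (w_true + w_false)
      return f

  print(filter_hmm([False]))         # ≈ 0.913, as on the earlier slide
  print(filter_hmm([False, False]))  # ≈ 0.854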

SLIDE 20

Prediction in HMMs

How would you find “prediction”?

  • Filtering
  • Prediction
  • Smoothing
  • Most likely explanation
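Prediction just drops the evidence-update step: push the current filtered distribution forward through the transition model, one day at a time (the standard form, matching the worked examples that follow):

  P(xt+k+1|e1:t) = Σxt+k P(xt+k+1|xt+k) P(xt+k|e1:t)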

SLIDE 21

Prediction in HMMs

Probably best to go back to the example: what is the chance I got enough sleep on day 3, given you saw me without bloodshot eyes on days 1 & 2? P(x3|¬e1,¬e2) = ???

[Diagram: chain x0 → x1 → x2 → x3, with evidence e1 and e2 observed]

P(xt+1|xt) = 0.6,  P(xt+1|¬xt) = 0.9,  P(et|xt) = 0.3,  P(et|¬xt) = 0.8,  P(x0) = 0.5
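Using the filtered value P(x2|¬e1,¬e2) ≈ 0.854 from before, one transition step gives (just arithmetic):

  P(x3|¬e1,¬e2) = 0.6·0.854 + 0.9·0.146 ≈ 0.644

so P(¬x3|¬e1,¬e2) ≈ 0.356, as the next slide states.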

SLIDE 22

Prediction in HMMs

Turns out that P(¬x3|¬e1,¬e2) ≈ 0.356α, so α=1

whew...

SLIDE 23

Prediction in HMMs

Day 4? P(x4|¬e1,¬e2) = ???

[Diagram: chain x0 → x1 → x2 → x3 → x4, with evidence e1 and e2 observed]

P(xt+1|xt) = 0.6,  P(xt+1|¬xt) = 0.9,  P(et|xt) = 0.3,  P(et|¬xt) = 0.8,  P(x0) = 0.5

SLIDE 24

Prediction in HMMs

Turns out that P(¬x4|¬e1,¬e2) ≈ 0.293α, so α=1 (α is always 1 now: the normalization can be absorbed into the already-normalized filtered distribution, so the prediction sums need no renormalizing)

...think I see a pattern here
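The pattern: with no new evidence, each extra day is just another pass through the transition model, so the prediction settles toward a fixed point. As a quick side computation (not on the slides), solving p = 0.6p + 0.9(1−p) gives p = 0.9/1.3 ≈ 0.692, so the day-by-day predictions 0.854, 0.644, 0.707, ... converge to P(xk|¬e1,¬e2) ≈ 0.692 (and P(¬xk|¬e1,¬e2) ≈ 0.308), the stationary distribution of the transition model: prediction eventually forgets the observations.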

SLIDE 25

Prediction in HMMs

We’ll save the other two for next time...

  • Filtering
  • Prediction
  • Smoothing
  • Most likely explanation