slide-1
SLIDE 1: HMM Tutorial / © Tapas Kanungo

Hidden Markov Models

Tapas Kanungo
Center for Automation Research
University of Maryland
Web: www.cfar.umd.edu/~kanungo
Email: kanungo@cfar.umd.edu
slide-2
SLIDE 2: Outline

1. Markov models
2. Hidden Markov models
3. Forward/Backward algorithm
4. Viterbi algorithm
5. Baum-Welch estimation algorithm
slide-3
SLIDE 3: Markov Models

  • Observable states: 1, 2, ..., N
  • Observed sequence: q_1, q_2, ..., q_t, ..., q_T
  • First-order Markov assumption:
      P(q_t = j | q_{t-1} = i, q_{t-2} = k, ...) = P(q_t = j | q_{t-1} = i)
  • Stationarity:
      P(q_t = j | q_{t-1} = i) = P(q_{t+l} = j | q_{t+l-1} = i)
slide-4
SLIDE 4: Markov Models

  • State transition matrix A:

      A = [ a_11  a_12  ...  a_1j  ...  a_1N ]
          [ a_21  a_22  ...  a_2j  ...  a_2N ]
          [  .     .          .          .   ]
          [ a_i1  a_i2  ...  a_ij  ...  a_iN ]
          [  .     .          .          .   ]
          [ a_N1  a_N2  ...  a_Nj  ...  a_NN ]

    where a_ij = P(q_t = j | q_{t-1} = i), 1 <= i, j <= N
  • Constraints on a_ij:
      a_ij >= 0 for all i, j;   sum_{j=1}^{N} a_ij = 1 for all i
slide-5
SLIDE 5: Markov Models: Example

  • States:
      1. Rainy (R)
      2. Cloudy (C)
      3. Sunny (S)
  • State transition probability matrix:

      A = [ 0.4  0.3  0.3 ]
          [ 0.2  0.6  0.2 ]
          [ 0.1  0.1  0.8 ]

  • Compute the probability of observing S S R R S C S given that today is S.
slide-6
SLIDE 6: Markov Models: Example

Basic conditional probability rule: P(A, B) = P(A|B) P(B)

The Markov chain rule:
    P(q_1, q_2, ..., q_T)
      = P(q_T | q_1, q_2, ..., q_{T-1}) P(q_1, q_2, ..., q_{T-1})
      = P(q_T | q_{T-1}) P(q_1, q_2, ..., q_{T-1})
      = P(q_T | q_{T-1}) P(q_{T-1} | q_{T-2}) P(q_1, q_2, ..., q_{T-2})
      = P(q_T | q_{T-1}) P(q_{T-1} | q_{T-2}) ... P(q_2 | q_1) P(q_1)
slide-7
SLIDE 7: Markov Models: Example

  • Observation sequence O:
      O = (S, S, S, R, R, S, C, S)
  • Using the chain rule we get:
      P(O | model)
        = P(S, S, S, R, R, S, C, S | model)
        = P(S) P(S|S) P(S|S) P(R|S) P(R|R) P(S|R) P(C|S) P(S|C)
        = pi_3 a_33 a_33 a_31 a_11 a_13 a_32 a_23
        = (1)(0.8)^2 (0.1)(0.4)(0.3)(0.1)(0.2)
        = 1.536 × 10^-4
  • The prior probability pi_i = P(q_1 = i)
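The chain-rule computation above can be checked numerically. A minimal sketch in Python (the function name `markov_sequence_prob` and the 0-based state indexing R=0, C=1, S=2 are my own conventions, not the slides'):

```python
import numpy as np

# Transition matrix from slide 5; state indices: 0 = Rainy, 1 = Cloudy, 2 = Sunny
A = np.array([[0.4, 0.3, 0.3],
              [0.2, 0.6, 0.2],
              [0.1, 0.1, 0.8]])

def markov_sequence_prob(states, A, prior):
    """Probability of a state sequence under a first-order Markov chain:
    prior[q_1] * product of A[q_{t-1}, q_t] along the sequence."""
    p = prior[states[0]]
    for prev, cur in zip(states, states[1:]):
        p *= A[prev, cur]
    return p

# Observed: S S S R R S C S, with P(q_1 = S) = 1 since today is Sunny.
R, C, S = 0, 1, 2
seq = [S, S, S, R, R, S, C, S]
prior = np.array([0.0, 0.0, 1.0])
print(markov_sequence_prob(seq, A, prior))  # 1.536e-4, as on the slide
```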
slide-8
SLIDE 8: Markov Models: Example

  • What is the probability that the sequence remains in state i for exactly d
    time units?
      p_i(d) = P(q_1 = i, q_2 = i, ..., q_d = i, q_{d+1} != i, ...)
             = (a_ii)^{d-1} (1 - a_ii)
  • Exponential Markov chain duration density.
  • What is the expected value of the duration d in state i?
      d_i = sum_{d=1}^{inf} d p_i(d)
          = sum_{d=1}^{inf} d (a_ii)^{d-1} (1 - a_ii)
          = (1 - a_ii) sum_{d=1}^{inf} d (a_ii)^{d-1}
          = (1 - a_ii) (d/da_ii) [ sum_{d=1}^{inf} (a_ii)^d ]
          = (1 - a_ii) (d/da_ii) [ a_ii / (1 - a_ii) ]
          = 1 / (1 - a_ii)
slide-9
SLIDE 9: Markov Models: Example

  • Avg. number of consecutive sunny days  = 1 / (1 - a_33) = 1 / (1 - 0.8) = 5
  • Avg. number of consecutive cloudy days = 2.5
  • Avg. number of consecutive rainy days  = 1.67
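The closed form 1/(1 - a_ii) can be checked against a truncated version of the defining sum (the helper name `expected_duration` is my own):

```python
def expected_duration(a_ii, max_d=10_000):
    """Expected duration in a state with self-transition a_ii, computed by
    truncating the sum  d_i = sum_{d>=1} d * (a_ii)^(d-1) * (1 - a_ii)."""
    return sum(d * a_ii ** (d - 1) * (1 - a_ii) for d in range(1, max_d + 1))

# Self-transitions from slide 5: sunny 0.8, cloudy 0.6, rainy 0.4.
for a in (0.8, 0.6, 0.4):
    print(a, expected_duration(a), 1 / (1 - a))  # sums match 1/(1-a): 5, 2.5, 1.67
```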
slide-10
SLIDE 10: Hidden Markov Models

  • States are not observable
  • Observations are probabilistic functions of state
  • State transitions are still probabilistic
slide-11
SLIDE 11: Urn and Ball Model

  • N urns containing colored balls
  • M distinct colors of balls
  • Each urn has a (possibly) different distribution of colors
  • Sequence generation algorithm:
      1. Pick initial urn according to some random process.
      2. Randomly pick a ball from the urn and then replace it.
      3. Select another urn according to a random selection process
         associated with the urn.
      4. Repeat steps 2 and 3.
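The generation algorithm above can be sketched as a small sampler. This is an illustrative implementation only: the function name `sample_hmm` and the (pi, A, B) parameterization are assumptions (B[i] plays the role of urn i's color distribution):

```python
import random

def sample_hmm(pi, A, B, T, rng=random):
    """Generate a length-T observation sequence from an urn-and-ball HMM.

    pi: initial urn distribution; A[i]: urn-selection distribution after urn i;
    B[i][k]: probability of drawing color k from urn i.
    """
    states, obs = [], []
    q = rng.choices(range(len(pi)), weights=pi)[0]          # step 1: initial urn
    for _ in range(T):
        states.append(q)
        o = rng.choices(range(len(B[q])), weights=B[q])[0]  # step 2: draw a ball, replace it
        obs.append(o)
        q = rng.choices(range(len(A[q])), weights=A[q])[0]  # step 3: pick the next urn
    return states, obs                                      # step 4: loop handled above

# Example: 2 urns, 3 colors.
states, obs = sample_hmm([0.5, 0.5],
                         [[0.9, 0.1], [0.2, 0.8]],
                         [[0.8, 0.1, 0.1], [0.1, 0.2, 0.7]], T=10)
```

An outside observer sees only `obs` (the ball colors), never `states` (the urns), which is exactly what makes the model "hidden".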
slide-12
SLIDE 12: The Trellis

[Figure: the trellis — states 1, 2, ..., N laid out vertically, unrolled over
time steps 1, 2, ..., t-1, t, t+1, t+2, ..., T-1, T, with the observation
sequence o_1, ..., o_T along the time axis.]
slide-13
SLIDE 13: Elements of Hidden Markov Models

  • N – the number of hidden states
  • Q – set of states, Q = {1, 2, ..., N}
  • M – the number of symbols
  • V – set of symbols, V = {1, 2, ..., M}
  • A – the state-transition probability matrix:
      a_ij = P(q_{t+1} = j | q_t = i), 1 <= i, j <= N
  • B – observation probability distribution:
      b_j(k) = P(o_t = k | q_t = j), 1 <= k <= M
  • pi – the initial state distribution:
      pi_i = P(q_1 = i), 1 <= i <= N
  • lambda – the entire model, lambda = (A, B, pi)
slide-14
SLIDE 14: Three Basic Problems

1. Given observation O = (o_1, o_2, ..., o_T) and model lambda = (A, B, pi),
   efficiently compute P(O | lambda).
     • Hidden states complicate the evaluation.
     • Given two models lambda_1 and lambda_2, this can be used to choose
       the better one.
2. Given observation O = (o_1, o_2, ..., o_T) and model lambda, find the
   optimal state sequence q = (q_1, q_2, ..., q_T).
     • Optimality criterion has to be decided (e.g. maximum likelihood).
     • "Explanation" for the data.
3. Given O = (o_1, o_2, ..., o_T), estimate model parameters
   lambda = (A, B, pi) that maximize P(O | lambda).
slide-15
SLIDE 15: Solution to Problem 1

  • Problem: Compute P(o_1, o_2, ..., o_T | lambda)
  • Algorithm:
      – Let q = (q_1, q_2, ..., q_T) be a state sequence.
      – Assume the observations are independent given the states:
          P(O | q, lambda) = prod_{t=1}^{T} P(o_t | q_t, lambda)
                           = b_{q_1}(o_1) b_{q_2}(o_2) ... b_{q_T}(o_T)
      – Probability of a particular state sequence is:
          P(q | lambda) = pi_{q_1} a_{q_1 q_2} a_{q_2 q_3} ... a_{q_{T-1} q_T}
      – Also, P(O, q | lambda) = P(O | q, lambda) P(q | lambda)
      – Enumerate paths and sum probabilities:
          P(O | lambda) = sum_q P(O | q, lambda) P(q | lambda)
  • N^T state sequences and O(T) calculations per sequence.
    Complexity: O(T N^T) calculations.
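The brute-force enumeration is simple to write down, which makes its O(T N^T) cost concrete. A minimal sketch (the function name `brute_force_likelihood` is my own; it is only practical for tiny N and T):

```python
import itertools
import numpy as np

def brute_force_likelihood(obs, pi, A, B):
    """P(O | lambda) by summing P(O | q, lambda) P(q | lambda) over all
    N^T state sequences q — the direct enumeration from slide 15."""
    N, T = len(pi), len(obs)
    total = 0.0
    for q in itertools.product(range(N), repeat=T):   # all N^T paths
        p = pi[q[0]] * B[q[0], obs[0]]                # initial state and emission
        for t in range(1, T):
            p *= A[q[t - 1], q[t]] * B[q[t], obs[t]]  # transition and emission
        total += p
    return total
```

Each of the N^T paths costs O(T) multiplications, which is where the O(T N^T) figure on the slide comes from; the forward procedure on the next slides reduces this to O(N^2 T).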
slide-16
SLIDE 16: Forward Procedure: Intuition

[Figure: states 1, 2, 3, ..., N at time t each feed state k at time t+1
through transition probabilities a_1k, a_3k, ..., a_Nk.]
slide-17
SLIDE 17: Forward Algorithm

  • Define the forward variable alpha_t(i) as:
      alpha_t(i) = P(o_1, o_2, ..., o_t, q_t = i | lambda)
  • alpha_t(i) is the probability of observing the partial sequence
    (o_1, o_2, ..., o_t) such that the state q_t is i.
  • Induction:
      1. Initialization:
           alpha_1(i) = pi_i b_i(o_1)
      2. Induction:
           alpha_{t+1}(j) = [ sum_{i=1}^{N} alpha_t(i) a_ij ] b_j(o_{t+1})
      3. Termination:
           P(O | lambda) = sum_{i=1}^{N} alpha_T(i)
  • Complexity: O(N^2 T)
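The three induction steps translate almost line-for-line into code. A minimal sketch (the function name `forward` and the NumPy array layout are my own choices):

```python
import numpy as np

def forward(obs, pi, A, B):
    """Forward algorithm: P(O | lambda) in O(N^2 T) time.
    obs: symbol indices; pi: initial distribution; A: transitions;
    B[i, k] = b_i(k), the emission probabilities."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                      # 1. alpha_1(i) = pi_i b_i(o_1)
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]  # 2. induction over i
    return alpha[-1].sum()                            # 3. termination
```

On the coin model of slide 18 (uniform 1/3 transitions and priors, P(H) = 0.5, 0.75, 0.25), every length-10 sequence comes out to 0.5^10, since the average head probability across states is exactly 0.5.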
slide-18
SLIDE 18: Example

Consider the following coin-tossing experiment:

             State 1   State 2   State 3
    P(H)       0.5       0.75      0.25
    P(T)       0.5       0.25      0.75

    – state-transition probabilities equal to 1/3
    – initial state probabilities equal to 1/3

1. You observe O = (H, H, H, H, T, H, T, T, T, T). What state sequence q is
   most likely? What is the joint probability, P(O, q | lambda), of the
   observation sequence and the state sequence?
2. What is the probability that the observation sequence came entirely from
   state 1?
slide-19
SLIDE 19: Example (continued)

3. Consider the observation sequence
       O~ = (H, T, T, H, T, H, H, T, T, H).
   How would your answers to parts 1 and 2 change?
4. If the state transition probabilities were:

       A = [ 0.9   0.45  0.45 ]
           [ 0.05  0.1   0.45 ]
           [ 0.05  0.45  0.1  ],

   how would the new model change your answers to parts 1-3?
slide-20
SLIDE 20: Backward Algorithm

[Figure: trellis with state i at time t connected to states 1, ..., N at time
t+1 through a_i1, ..., a_iN; observations o_1, ..., o_T along the time axis.]
slide-21
SLIDE 21: Backward Algorithm

  • Define the backward variable beta_t(i) as:
      beta_t(i) = P(o_{t+1}, o_{t+2}, ..., o_T | q_t = i, lambda)
  • beta_t(i) is the probability of observing the partial sequence
    (o_{t+1}, o_{t+2}, ..., o_T) such that the state q_t is i.
  • Induction:
      1. Initialization:
           beta_T(i) = 1
      2. Induction:
           beta_t(i) = sum_{j=1}^{N} a_ij b_j(o_{t+1}) beta_{t+1}(j),
           1 <= i <= N,  t = T-1, ..., 1
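The backward recursion mirrors the forward one but runs right-to-left. A minimal sketch (the function name `backward` is my own; B uses the same B[i, k] = b_i(k) layout assumed earlier):

```python
import numpy as np

def backward(obs, A, B):
    """Backward algorithm: beta[t, i] = P(o_{t+1}, ..., o_T | q_t = i, lambda)."""
    T, N = len(obs), A.shape[0]
    beta = np.ones((T, N))                               # 1. beta_T(i) = 1
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])   # 2. induction over j
    return beta
```

As a consistency check, combining the first backward column with the initial distribution, sum_i pi_i b_i(o_1) beta_1(i), recovers the same P(O | lambda) that the forward algorithm terminates with.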
slide-22
SLIDE 22: Solution to Problem 2

  • Choose the most likely path
  • Find the path (q_1, q_2, ..., q_T) that maximizes the likelihood:
      P(q_1, q_2, ..., q_T | O, lambda)
  • Solution by dynamic programming
  • Define:
      delta_t(i) = max_{q_1, q_2, ..., q_{t-1}}
                   P(q_1, q_2, ..., q_t = i, o_1, o_2, ..., o_t | lambda)
  • delta_t(i) is the probability of the highest-probability path ending in
    state i
  • By induction we have:
      delta_{t+1}(j) = [ max_i delta_t(i) a_ij ] b_j(o_{t+1})
slide-23
SLIDE 23: Viterbi Algorithm

[Figure: trellis with states 1, ..., N at time t feeding state k at time t+1
through a_1k, ..., a_Nk; observations o_1, ..., o_T along the time axis.]
slide-24
SLIDE 24: Viterbi Algorithm

  • Initialization:
      delta_1(i) = pi_i b_i(o_1), 1 <= i <= N
      psi_1(i) = 0
  • Recursion:
      delta_t(j) = max_{1<=i<=N} [ delta_{t-1}(i) a_ij ] b_j(o_t)
      psi_t(j) = argmax_{1<=i<=N} [ delta_{t-1}(i) a_ij ]
      2 <= t <= T, 1 <= j <= N
  • Termination:
      P* = max_{1<=i<=N} [ delta_T(i) ]
      q*_T = argmax_{1<=i<=N} [ delta_T(i) ]
  • Path (state sequence) backtracking:
      q*_t = psi_{t+1}(q*_{t+1}), t = T-1, T-2, ..., 1
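The four stages above (initialization, recursion, termination, backtracking) map directly onto code. A minimal sketch (the function name `viterbi` and the 0-based indexing are my own):

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Viterbi algorithm: most likely state path q* and P* = P(O, q* | lambda)."""
    T, N = len(obs), len(pi)
    delta = np.zeros((T, N))            # delta_t(j): best path probability ending in j
    psi = np.zeros((T, N), dtype=int)   # psi_t(j): best predecessor of j at time t
    delta[0] = pi * B[:, obs[0]]                  # initialization
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A        # delta_{t-1}(i) a_ij for all i, j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    path = [int(delta[-1].argmax())]              # termination: q*_T
    for t in range(T - 1, 0, -1):                 # backtracking
        path.append(int(psi[t][path[-1]]))
    return path[::-1], float(delta[-1].max())     # (q*, P*)
```

On the coin model of slide 18, the uniform transitions make the best path pick, at every step, whichever state emits the observed symbol with the highest probability (state 2 for H, state 3 for T, in the slide's 1-based numbering).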
slide-25
SLIDE 25: Solution to Problem 3

  • Estimate lambda = (A, B, pi) to maximize P(O | lambda)
  • No analytic method because of complexity – iterative solution.
  • Baum-Welch Algorithm:
      1. Let the initial model be lambda_0.
      2. Compute a new model lambda based on lambda_0 and observation O.
      3. If log P(O | lambda) - log P(O | lambda_0) < DELTA, stop.
      4. Else set lambda_0 <- lambda and go to step 2.
slide-26
SLIDE 26: Baum-Welch: Preliminaries

  • Define xi_t(i, j) as the probability of being in state i at time t and in
    state j at time t+1:
      xi_t(i, j) = alpha_t(i) a_ij b_j(o_{t+1}) beta_{t+1}(j) / P(O | lambda)
                 = alpha_t(i) a_ij b_j(o_{t+1}) beta_{t+1}(j) /
                   [ sum_{i=1}^{N} sum_{j=1}^{N}
                     alpha_t(i) a_ij b_j(o_{t+1}) beta_{t+1}(j) ]
  • Define gamma_t(i) as the probability of being in state i at time t, given
    the observation sequence:
      gamma_t(i) = sum_{j=1}^{N} xi_t(i, j)
  • sum_{t=1}^{T} gamma_t(i) is the expected number of times state i is
    visited.
  • sum_{t=1}^{T-1} xi_t(i, j) is the expected number of transitions from
    state i to state j.
slide-27
SLIDE 27: Baum-Welch: Update Rules

  • pi-bar_i = expected frequency in state i at time t = 1
             = gamma_1(i)
  • a-bar_ij = (expected number of transitions from state i to state j) /
               (expected number of transitions from state i):
        a-bar_ij = sum_t xi_t(i, j) / sum_t gamma_t(i)
  • b-bar_j(k) = (expected number of times in state j observing symbol k) /
                 (expected number of times in state j):
        b-bar_j(k) = sum_{t : o_t = k} gamma_t(j) / sum_t gamma_t(j)
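A single Baum-Welch re-estimation step can be sketched by combining the forward/backward variables with the update rules above. This is an illustrative implementation under stated simplifications: a single observation sequence, no scaling (so it underflows for long sequences), and my own function name `baum_welch_step` and B[i, k] = b_i(k) layout:

```python
import numpy as np

def baum_welch_step(obs, pi, A, B):
    """One Baum-Welch re-estimation step; returns updated (pi, A, B)."""
    T, N = len(obs), len(pi)
    # Forward and backward variables (as on slides 17 and 21).
    alpha = np.zeros((T, N)); beta = np.ones((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    pO = alpha[-1].sum()                               # P(O | lambda)
    # xi_t(i, j) and gamma_t(i); alpha*beta/P(O) equals sum_j xi_t(i, j)
    # for t < T and also extends gamma to t = T.
    xi = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        xi[t] = alpha[t][:, None] * A * B[:, obs[t + 1]] * beta[t + 1] / pO
    gamma = alpha * beta / pO
    # Update rules from slide 27.
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros_like(B)
    for k in range(B.shape[1]):
        mask = np.array(obs) == k
        new_B[:, k] = gamma[mask].sum(axis=0) / gamma.sum(axis=0)
    return new_pi, new_A, new_B
```

By construction the re-estimated pi, the rows of A, and the rows of B each remain valid probability distributions; iterating the step until the log-likelihood gain falls below DELTA gives the loop of slide 25.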
slide-28
SLIDE 28: Properties

  • Covariance of the estimated parameters
  • Convergence rates
slide-29
SLIDE 29: Types of HMM

  • Continuous density
  • Ergodic
  • State duration
slide-30
SLIDE 30: Implementation Issues

  • Scaling
  • Initial parameters
  • Multiple observation sequences
slide-31
SLIDE 31: Comparison of HMMs

  • What is a natural distance function?
  • If D(lambda_1, lambda_2) is large, does it mean that the models are
    really different?