PGM 2014, Utrecht Cedric De Boom, Jasper De Bock, Arthur Van Camp, Gert de Cooman
Robustifying the Viterbi Algorithm
A Toy Example
Weather estimation
Hidden Markov Model
Local models
$$p_1(X_1), \quad p_2(X_2|X_1), \quad p_3(X_3|X_2), \quad q_1(O_1|X_1), \quad q_2(O_2|X_2), \quad q_3(O_3|X_3)$$
Global model
$$p(X_{1:3}, O_{1:3}) = p_1(X_1)\, q_1(O_1|X_1) \prod_{i=2}^{3} p_i(X_i|X_{i-1})\, q_i(O_i|X_i)$$
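The global model is just a product of local models, which can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `joint_probability`, the nested-list layout and all numbers are assumptions made up for the example.

```python
from itertools import product

def joint_probability(x, o, p1, A, B):
    """p(x_{1:n}, o_{1:n}) = p1(x1) q1(o1|x1) * prod_{i=2..n} p_i(x_i|x_{i-1}) q_i(o_i|x_i).

    Assumed layout: A[i-1][prev][cur] is the transition model of step i,
    B[i][state][obs] is the emission model of step i (0-based steps).
    """
    prob = p1[x[0]] * B[0][x[0]][o[0]]
    for i in range(1, len(x)):
        prob *= A[i - 1][x[i - 1]][x[i]] * B[i][x[i]][o[i]]
    return prob

# Made-up numbers: two hidden states, two observation symbols, n = 3.
p1 = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emis = [[0.8, 0.2], [0.3, 0.7]]
A = [trans, trans]        # p2 and p3, taken equal here only for brevity
B = [emis, emis, emis]    # q1, q2 and q3

joint = joint_probability((0, 0, 1), (0, 0, 1), p1, A, B)

# Sanity check: the global model sums to one over all (x, o) pairs.
total = sum(joint_probability(x, o, p1, A, B)
            for x in product(range(2), repeat=3)
            for o in product(range(2), repeat=3))
print(joint, total)
```

The sum over all state and observation sequences is a quick way to confirm that the local models were multiplied in the right order.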
Hidden Markov Model
Estimating the hidden sequence
$$\arg\max_{x_{1:3}} p(x_{1:3}|o_{1:3}) = \arg\max_{x_{1:3}} \frac{p(x_{1:3}, o_{1:3})}{p(o_{1:3})} = \arg\max_{x_{1:3}} p(x_{1:3}, o_{1:3})$$
Viterbi algorithm (1967)
Hidden Markov Model
Viterbi algorithm
Viterbi algorithm (1967)
Recursive
Complexity $O(nm^2)$
n: length of the sequence
m: size of state space
Extendible to k-best Viterbi
Imprecise Hidden Markov Model
Local models
$$\mathcal{M}_{X_1}, \quad \mathcal{M}_{X_2|X_1}, \quad \mathcal{M}_{X_3|X_2}, \quad \mathcal{M}_{O_1|X_1}, \quad \mathcal{M}_{O_2|X_2}, \quad \mathcal{M}_{O_3|X_3}$$
Global model
$$\mathcal{M} = \left\{ \prod_{i=1}^{3} p_i(X_i|X_{i-1})\, q_i(O_i|X_i) : (\forall k \in \{1, 2, 3\})\ p_k(\cdot|X_{k-1}) \in \mathcal{M}_{X_k|X_{k-1}},\ q_k(\cdot|X_k) \in \mathcal{M}_{O_k|X_k} \right\}$$
(for $k = 1$, $p_1(\cdot|X_0)$ is read as $p_1(\cdot)$ and $\mathcal{M}_{X_1|X_0}$ as $\mathcal{M}_{X_1}$)
May contain infinitely many precise models!
Imprecise Hidden Markov Model
Estimating the hidden sequence
Partial order
$$x_{1:3} \succ \hat{x}_{1:3} \;:\Leftrightarrow\; (\forall p \in \mathcal{M})\ p(x_{1:3}|o_{1:3}) > p(\hat{x}_{1:3}|o_{1:3})$$
Set of maximal solutions
$$\mathrm{opt}_{\max}(\mathcal{X}_{1:3}) = \left\{ \hat{x}_{1:3} \in \mathcal{X}_{1:3} : (\forall x_{1:3} \in \mathcal{X}_{1:3})\ x_{1:3} \not\succ \hat{x}_{1:3} \right\}$$
Indecision
There may be multiple maximal solutions.
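To make the partial order and the maximal set concrete, here is a brute-force sketch. It assumes, purely for illustration, that the set of models is a small finite list of precise stationary HMMs; the real set may be infinite, and the numbers and function names below are invented.

```python
from itertools import product

def joint(x, o, model):
    """Joint probability p(x, o) under one precise stationary HMM (p1, A, B)."""
    p1, A, B = model
    prob = p1[x[0]] * B[x[0]][o[0]]
    for k in range(1, len(x)):
        prob *= A[x[k - 1]][x[k]] * B[x[k]][o[k]]
    return prob

def dominates(x, x_hat, o, models):
    """x > x_hat in the partial order: x is strictly more probable under every model.

    Comparing joints instead of posteriors is harmless because both sides
    are divided by the same p(o) > 0.
    """
    return all(joint(x, o, mdl) > joint(x_hat, o, mdl) for mdl in models)

def maximal_sequences(o, models, m):
    """All sequences that are not dominated by any other sequence (exponential cost)."""
    candidates = list(product(range(m), repeat=len(o)))
    return [x_hat for x_hat in candidates
            if not any(dominates(x, x_hat, o, models) for x in candidates)]

# Two made-up precise models standing in for the set M.
model_a = ([0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.8, 0.2], [0.3, 0.7]])
model_b = ([0.3, 0.7], [[0.5, 0.5], [0.3, 0.7]], [[0.7, 0.3], [0.4, 0.6]])
o = (0, 0, 1)

single = maximal_sequences(o, [model_a], 2)
both = maximal_sequences(o, [model_a, model_b], 2)
print(single, both)
```

With one model the maximal set collapses to the usual most probable sequence. With two models that disagree, both of their favourites survive, which is the indecision mentioned on the slide.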
Imprecise Hidden Markov Model
Rewriting the solution set
Partial order
$$\begin{aligned}
x_{1:n} \succ \hat{x}_{1:n} \;&:\Leftrightarrow\; (\forall p \in \mathcal{M})\ p(x_{1:n}|o_{1:n}) > p(\hat{x}_{1:n}|o_{1:n}) \\
&\Leftrightarrow\; (\forall p \in \mathcal{M})\ p(x_{1:n}, o_{1:n}) > p(\hat{x}_{1:n}, o_{1:n}) \\
&\Leftrightarrow\; (\forall p \in \mathcal{M})\ \frac{p(x_{1:n}, o_{1:n})}{p(\hat{x}_{1:n}, o_{1:n})} > 1 \\
&\Leftrightarrow\; \min_{p \in \mathcal{M}} \frac{p(x_{1:n}, o_{1:n})}{p(\hat{x}_{1:n}, o_{1:n})} > 1
\end{aligned}$$
What if $p(\hat{x}_{1:n}, o_{1:n})$ becomes zero?
$$\begin{aligned}
x_{1:n} \succ \hat{x}_{1:n} \;&\Leftrightarrow\; \min_{p \in \mathcal{M}} \frac{p(x_{1:n}, o_{1:n})}{p(\hat{x}_{1:n}, o_{1:n})} > 1 \\
&\Leftrightarrow\; \min_{p \in \mathcal{M}} \prod_{k=1}^{n} \frac{p_k(x_k|x_{k-1})}{p_k(\hat{x}_k|\hat{x}_{k-1})}\, \frac{q_k(o_k|x_k)}{q_k(o_k|\hat{x}_k)} > 1 \\
&\Leftrightarrow\; \prod_{k=1}^{n} \min_{p_k(\cdot|X_{k-1}) \in \mathcal{M}_{X_k|X_{k-1}}} \frac{p_k(x_k|x_{k-1})}{p_k(\hat{x}_k|\hat{x}_{k-1})}\, \min_{q_k(\cdot|X_k) \in \mathcal{M}_{O_k|X_k}} \frac{q_k(o_k|x_k)}{q_k(o_k|\hat{x}_k)} > 1 \\
&\Leftrightarrow\; \prod_{k=1}^{n} \chi_k(x_k, x_{k-1}, \hat{x}_k, \hat{x}_{k-1})\, \omega_k(x_k, \hat{x}_k, o_k) > 1
\end{aligned}$$
$\chi_k$ and $\omega_k$ can be calculated in advance.
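How the local minima $\chi_k$ and $\omega_k$ are computed depends on how the local sets are represented, which the slides do not specify. The sketch below makes strong assumptions that are mine, not the authors': each conditional set is a finite list of candidate mass functions, the model is stationary, rows for different conditioning states are chosen independently, and all probabilities are strictly positive (so the zero-denominator question does not arise).

```python
def chi(x, x_prev, xh, xh_prev, trans_sets):
    """min over candidate transition models of p(x | x_prev) / p(xh | xh_prev).

    trans_sets[i] is an assumed finite list of candidate pmfs for p(. | previous state i).
    """
    if x_prev == xh_prev:
        # Numerator and denominator come from the same conditional pmf.
        return min(p[x] / p[xh] for p in trans_sets[x_prev])
    # Different conditioning states: the two rows vary independently, so the
    # ratio is smallest for the lowest numerator and the highest denominator.
    lower = min(p[x] for p in trans_sets[x_prev])
    upper = max(p[xh] for p in trans_sets[xh_prev])
    return lower / upper

def omega(x, xh, o, emis_sets):
    """min over candidate emission models of q(o | x) / q(o | xh)."""
    if x == xh:
        return 1.0  # same pmf above and below the fraction bar
    lower = min(q[o] for q in emis_sets[x])
    upper = max(q[o] for q in emis_sets[xh])
    return lower / upper

# Made-up candidate sets for two hidden states and two observation symbols.
trans_sets = [
    [[0.7, 0.3], [0.6, 0.4]],   # candidates for p(. | previous state 0)
    [[0.4, 0.6], [0.5, 0.5]],   # candidates for p(. | previous state 1)
]
emis_sets = [
    [[0.8, 0.2], [0.7, 0.3]],   # candidates for q(. | state 0)
    [[0.3, 0.7], [0.4, 0.6]],   # candidates for q(. | state 1)
]

# Precompute every chi value once, as the slide suggests.
chi_table = {(x, xp, xh, xhp): chi(x, xp, xh, xhp, trans_sets)
             for x in range(2) for xp in range(2)
             for xh in range(2) for xhp in range(2)}
print(chi_table[(0, 0, 1, 0)], omega(0, 1, 0, emis_sets))
```

Because the tables depend only on the local sets (and, for $\omega_k$, on the observed symbol), they can be filled before any sequence is examined.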
Imprecise Hidden Markov Model
Rewriting the solution set
Set of maximal solutions
$$\mathrm{opt}_{\max}(\mathcal{X}_{1:n}) = \left\{ \hat{x}_{1:n} \in \mathcal{X}_{1:n} : (\forall x_{1:n} \in \mathcal{X}_{1:n})\ x_{1:n} \not\succ \hat{x}_{1:n} \right\}$$
$$\hat{x}_{1:n} \in \mathrm{opt}_{\max}(\mathcal{X}_{1:n}) \;\Leftrightarrow\; \max_{x_{1:n} \in \mathcal{X}_{1:n}} \prod_{k=1}^{n} \chi_k(x_k, x_{k-1}, \hat{x}_k, \hat{x}_{k-1})\, \omega_k(x_k, \hat{x}_k, o_k) \le 1$$
How do we calculate this set?
MaxiHMM algorithm
MaxiHMM algorithm
General overview
We recursively define two parameters:
$$\gamma_k(x_k, \hat{x}_k) := \min_{\hat{x}_{k+1} \in \mathcal{X}_{k+1}} \max_{x_{k+1} \in \mathcal{X}_{k+1}} \chi_{k+1}(x_{k+1}, x_k, \hat{x}_{k+1}, \hat{x}_k)\, \omega_{k+1}(x_{k+1}, \hat{x}_{k+1}, o_{k+1})\, \gamma_{k+1}(x_{k+1}, \hat{x}_{k+1}), \qquad \gamma_n(x_n, \hat{x}_n) := 1$$
… and …
$$\delta_k(x_k, \hat{x}_{1:k}) := \max_{x_{k-1} \in \mathcal{X}_{k-1}} \chi_k(x_k, x_{k-1}, \hat{x}_k, \hat{x}_{k-1})\, \omega_k(x_k, \hat{x}_k, o_k)\, \delta_{k-1}(x_{k-1}, \hat{x}_{1:k-1}), \qquad \delta_1(x_1, \hat{x}_1) := \chi_1(x_1, \hat{x}_1)\, \omega_1(x_1, \hat{x}_1, o_1)$$
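The forward parameter $\delta_k$ is a Viterbi-style maximisation over the competitor $x_{1:k}$ for a fixed candidate $\hat{x}_{1:k}$, so maximising $\delta_n$ over $x_n$ tests whether a complete candidate is maximal. The sketch below illustrates that reading. To stay self-contained it builds $\chi_k \omega_k$ from a single precise model (singleton local sets), which is an assumption for the demo only; `make_factors` and `is_maximal` are hypothetical names.

```python
from itertools import product

def make_factors(p1, A, B, o):
    """Hypothetical helper: chi_k * omega_k when every local set holds one precise model."""
    def first(x, xh):
        return (p1[x] * B[x][o[0]]) / (p1[xh] * B[xh][o[0]])
    def step(k, x, x_prev, xh, xh_prev):
        return (A[x_prev][x] * B[x][o[k]]) / (A[xh_prev][xh] * B[xh][o[k]])
    return first, step

def is_maximal(x_hat, m, first, step, tol=1e-12):
    """Check max over x_{1:n} of prod_k chi_k * omega_k <= 1 with the delta recursion."""
    # delta[x] holds delta_k(x_k = x, x_hat_{1:k}) for the current k
    delta = [first(x, x_hat[0]) for x in range(m)]
    for k in range(1, len(x_hat)):
        delta = [max(step(k, x, xp, x_hat[k], x_hat[k - 1]) * delta[xp]
                     for xp in range(m))
                 for x in range(m)]
    # Small tolerance guards against floating-point noise around 1.
    return max(delta) <= 1.0 + tol

# Made-up precise model; with singleton sets the maximal set is the Viterbi answer.
p1 = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.8, 0.2], [0.3, 0.7]]
o = (0, 0, 1)
first, step = make_factors(p1, A, B, o)

maximal = [x_hat for x_hat in product(range(2), repeat=len(o))
           if is_maximal(x_hat, 2, first, step)]
print(maximal)
```

Each test costs $O(nm^2)$, but looping over all $m^n$ candidates as done here is still exponential; avoiding that enumeration is what the backward parameter $\gamma_k$ is for.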
MaxiHMM algorithm
General overview
We want to be able to check, for some initial segment $\hat{x}^*_{1:k}$, whether:
$$(\forall \hat{x}_{1:n} \in \mathrm{opt}_{\max}(\mathcal{X}_{1:n}))\ \hat{x}_{1:k} \neq \hat{x}^*_{1:k}$$
$$\max_{x_k \in \mathcal{X}_k} \delta_k(x_k, \hat{x}^*_{1:k})\, \gamma_k(x_k, \hat{x}^*_k) > 1$$
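One way to read this check: if the product of $\delta_k$ and $\gamma_k$ exceeds 1 for some $x_k$, then every completion of the initial segment is dominated, so the segment can be discarded without enumerating its completions. The sketch below follows that interpretation, which is mine rather than a statement from the slides. As before, $\chi_k \omega_k$ comes from a single made-up precise model and all function names are hypothetical.

```python
def make_factors(p1, A, B, o):
    """Hypothetical helper: chi_k * omega_k for one precise model (singleton local sets)."""
    def first(x, xh):
        return (p1[x] * B[x][o[0]]) / (p1[xh] * B[xh][o[0]])
    def step(k, x, x_prev, xh, xh_prev):
        return (A[x_prev][x] * B[x][o[k]]) / (A[xh_prev][xh] * B[xh][o[k]])
    return first, step

def gamma_tables(n, m, step):
    """Backward recursion: gamma[k][x][xh] with gamma at the last position equal to 1."""
    gamma = [[[1.0] * m for _ in range(m)] for _ in range(n)]
    for k in range(n - 2, -1, -1):
        for x in range(m):
            for xh in range(m):
                gamma[k][x][xh] = min(
                    max(step(k + 1, xn, x, xhn, xh) * gamma[k + 1][xn][xhn]
                        for xn in range(m))
                    for xhn in range(m))
    return gamma

def prefix_can_be_pruned(prefix, m, first, step, gamma, tol=1e-12):
    """True if max over x_k of delta_k * gamma_k exceeds 1 for this initial segment."""
    delta = [first(x, prefix[0]) for x in range(m)]
    for k in range(1, len(prefix)):
        delta = [max(step(k, x, xp, prefix[k], prefix[k - 1]) * delta[xp]
                     for xp in range(m))
                 for x in range(m)]
    k = len(prefix) - 1
    return max(delta[x] * gamma[k][x][prefix[k]] for x in range(m)) > 1.0 + tol

# Made-up precise model; its only maximal sequence is (0, 0, 1).
p1 = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.8, 0.2], [0.3, 0.7]]
o = (0, 0, 1)
first, step = make_factors(p1, A, B, o)
gamma = gamma_tables(len(o), 2, step)

for prefix in [(0,), (1,), (0, 0), (0, 1)]:
    print(prefix, prefix_can_be_pruned(prefix, 2, first, step, gamma))
```

In this toy run the segments `(1,)` and `(0, 1)` are discarded while `(0,)` and `(0, 0)` are kept, so the search only follows prefixes that can still lead to a maximal sequence.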
MaxiHMM algorithm
Properties
Recursive
Heuristic complexity $O(Snm^2)$
S: number of solutions
n: length of the sequence
m: size of state space
MaxiHMM algorithm
Applications
OCR
Optical Character Recognition
Willem, die vele bouke maecte,
Daer hi dicken omme waecte,
Hem vernoyde so haerde
Dat die avonture van Reynaerde
In Dietsche onghemaket bleven

Dat hi die vijte van Reynaerde soucken
Ende hise na den Walschen boucken
In Dietsche dus hevet begonnen.
OCR
Hidden Markov models
OCR
Textual data
Van den Vos Reynaerde
Medieval Dutch (13th century)
Often little data on hand
Building very accurate precise models is impossible
Use imprecise models
OCR
Results
BEWAENT read as BEWAEHT
3-best Viterbi solutions: BEWAENT, BEWAERT, BEWAEHT
MaxiHMM solutions: BEWAENT

UP read as UF
3-best Viterbi solutions: VE, OF, UP
MaxiHMM solutions: OF, UF, UP, VE
Multiple solutions are typically returned in cases where the Viterbi algorithm fails to return the correct result

COMT read as COMT
3-best Viterbi solutions: CONT, COMI, CONI
MaxiHMM solutions: COMI, COMT, CONT
Imprecision takes care of problems with small data sets
Questions?