Robustifying the Viterbi Algorithm



slide-1
SLIDE 1

Robustifying the Viterbi Algorithm

Cedric De Boom, Jasper De Bock, Arthur Van Camp, Gert de Cooman

PGM 2014, Utrecht

slide-2
SLIDE 2

A Toy Example

Weather estimation

[Figure label: ?]

slide-4
SLIDE 4

A Toy Example

Weather estimation

[Figure labels: H, L, ?]

slide-5
SLIDE 5

A Toy Example

Weather estimation

[Figure labels: H, ?]

slide-6
SLIDE 6

A Toy Example

Weather estimation

Observations: O1 = H, O2 = L, O3 = L

Hidden states: X1 = ?, X2 = ?, X3 = ?

slide-7
SLIDE 7

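Hidden Markov Model

Local models

Observations: O1 = H, O2 = L, O3 = L

Hidden states: X1 = ?, X2 = ?, X3 = ?

Local models: $p_1(X_1)$, $p_2(X_2 \mid X_1)$, $p_3(X_3 \mid X_2)$, $q_1(O_1 \mid X_1)$, $q_2(O_2 \mid X_2)$, $q_3(O_3 \mid X_3)$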

slide-8
SLIDE 8

Hidden Markov Model

Local models

$p_1(X_1)$, $p_2(X_2 \mid X_1)$, $p_3(X_3 \mid X_2)$, $q_1(O_1 \mid X_1)$, $q_2(O_2 \mid X_2)$, $q_3(O_3 \mid X_3)$

slide-9
SLIDE 9

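Hidden Markov Model

Global model

Local models: $p_1(X_1)$, $p_2(X_2 \mid X_1)$, $p_3(X_3 \mid X_2)$, $q_1(O_1 \mid X_1)$, $q_2(O_2 \mid X_2)$, $q_3(O_3 \mid X_3)$

Global model:

$$p(X_{1:3}, O_{1:3}) = p_1(X_1)\, q_1(O_1 \mid X_1) \prod_{i=2}^{3} p_i(X_i \mid X_{i-1})\, q_i(O_i \mid X_i)$$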
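The global model is just a product of local factors, which a few lines of Python can illustrate. This is a minimal sketch: the state names, the probability values, and the helper `joint_probability` are assumptions made for illustration, and the local models are taken to be the same at every time step, whereas the slides allow them to depend on the step index.

```python
def joint_probability(states, observations, initial, transition, emission):
    """Return p(x_{1:n}, o_{1:n}) for an HMM given as nested dicts."""
    # First factor: p1(x1) * q1(o1 | x1)
    prob = initial[states[0]] * emission[states[0]][observations[0]]
    # Remaining factors: p_i(x_i | x_{i-1}) * q_i(o_i | x_i) for i = 2..n
    for i in range(1, len(states)):
        prob *= transition[states[i - 1]][states[i]]
        prob *= emission[states[i]][observations[i]]
    return prob


# Hypothetical numbers, not taken from the slides.
initial = {"sunny": 0.6, "rainy": 0.4}
transition = {
    "sunny": {"sunny": 0.7, "rainy": 0.3},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}
emission = {
    "sunny": {"H": 0.8, "L": 0.2},
    "rainy": {"H": 0.3, "L": 0.7},
}

p = joint_probability(
    ["sunny", "rainy", "rainy"], ["H", "L", "L"], initial, transition, emission
)
print(p)
```

With these assumed numbers the product is 0.6 · 0.8 · (0.3 · 0.7) · (0.6 · 0.7).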

slide-10
SLIDE 10

Hidden Markov Model

Estimating the hidden sequence

Global model:

$$p(X_{1:3}, O_{1:3}) = p_1(X_1)\, q_1(O_1 \mid X_1) \prod_{i=2}^{3} p_i(X_i \mid X_{i-1})\, q_i(O_i \mid X_i)$$

$$\arg\max_{x_{1:3}} p(x_{1:3} \mid o_{1:3}) = \arg\max_{x_{1:3}} \frac{p(x_{1:3}, o_{1:3})}{p(o_{1:3})} = \arg\max_{x_{1:3}} p(x_{1:3}, o_{1:3})$$

Viterbi algorithm (1967)

slide-14
SLIDE 14

Hidden Markov Model

Viterbi algorithm

Viterbi algorithm (1967)

Recursive

Complexity $O(nm^2)$

n: length of the sequence

m: size of state space

Extendible to k-best Viterbi
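The recursion and its $O(nm^2)$ cost can be seen in a textbook implementation: n time steps, each comparing m predecessors for each of m states. The sketch below is a standard Viterbi decoder, not code from the presentation; the model values are hypothetical and the local models are assumed to be identical at every step.

```python
def viterbi(observations, states, initial, transition, emission):
    """Return the most probable hidden sequence and its joint probability."""
    # delta[s]: highest joint probability of any state sequence ending in s
    delta = {s: initial[s] * emission[s][observations[0]] for s in states}
    backpointers = []
    for obs in observations[1:]:          # n - 1 steps
        new_delta, pointers = {}, {}
        for s in states:                  # m states
            # m candidate predecessors per state
            best_prev = max(states, key=lambda r: delta[r] * transition[r][s])
            new_delta[s] = delta[best_prev] * transition[best_prev][s] * emission[s][obs]
            pointers[s] = best_prev
        delta = new_delta
        backpointers.append(pointers)
    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: delta[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, delta[last]


# Hypothetical model, for illustration only.
states = ["sunny", "rainy"]
initial = {"sunny": 0.6, "rainy": 0.4}
transition = {
    "sunny": {"sunny": 0.7, "rainy": 0.3},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}
emission = {
    "sunny": {"H": 0.8, "L": 0.2},
    "rainy": {"H": 0.3, "L": 0.7},
}

best_path, best_prob = viterbi(["H", "L", "L"], states, initial, transition, emission)
print(best_path, best_prob)
```

For long sequences one would normally work with log-probabilities to avoid underflow; plain products are kept here to stay close to the formulas on the slides.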

slide-17
SLIDE 17

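Imprecise Hidden Markov Model

Local models

Observations: O1 = H, O2 = L, O3 = L

Hidden states: X1 = ?, X2 = ?, X3 = ?

Local models: $\mathcal{M}_{X_1}$, $\mathcal{M}_{X_2 \mid X_1}$, $\mathcal{M}_{X_3 \mid X_2}$, $\mathcal{M}_{O_1 \mid X_1}$, $\mathcal{M}_{O_2 \mid X_2}$, $\mathcal{M}_{O_3 \mid X_3}$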

slide-18
SLIDE 18

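Imprecise Hidden Markov Model

Local models

$\mathcal{M}_{X_1}$, $\mathcal{M}_{X_2 \mid X_1}$, $\mathcal{M}_{X_3 \mid X_2}$, $\mathcal{M}_{O_1 \mid X_1}$, $\mathcal{M}_{O_2 \mid X_2}$, $\mathcal{M}_{O_3 \mid X_3}$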

slide-19
SLIDE 19

Imprecise Hidden Markov Model

Global model

Local models: $\mathcal{M}_{X_1}$, $\mathcal{M}_{X_2 \mid X_1}$, $\mathcal{M}_{X_3 \mid X_2}$, $\mathcal{M}_{O_1 \mid X_1}$, $\mathcal{M}_{O_2 \mid X_2}$, $\mathcal{M}_{O_3 \mid X_3}$

Global model:

$$\mathcal{M} = \left\{ \prod_{i=1}^{3} p_i(X_i \mid X_{i-1})\, q_i(O_i \mid X_i) \;:\; (\forall k \in \{1, 2, 3\})\ p_k(\cdot \mid X_{k-1}) \in \mathcal{M}_{X_k \mid X_{k-1}},\ q_k(\cdot \mid X_k) \in \mathcal{M}_{O_k \mid X_k} \right\}$$

For $k = 1$, $p_1(X_1 \mid X_0)$ is read as $p_1(X_1)$ and $\mathcal{M}_{X_1 \mid X_0}$ as $\mathcal{M}_{X_1}$.

May contain infinitely many precise models!

slide-21
SLIDE 21

Imprecise Hidden Markov Model

Estimating the hidden sequence

Partial order

$$x_{1:3} \succ \hat{x}_{1:3} \;:\Leftrightarrow\; (\forall p \in \mathcal{M})\ p(x_{1:3} \mid o_{1:3}) > p(\hat{x}_{1:3} \mid o_{1:3})$$

Set of maximal solutions

$$\mathrm{opt}_{\max}(\mathcal{X}_{1:3}) := \{\hat{x}_{1:3} \in \mathcal{X}_{1:3} : (\forall x_{1:3} \in \mathcal{X}_{1:3})\ x_{1:3} \not\succ \hat{x}_{1:3}\}$$

Indecision

There may be multiple maximal solutions.
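To make the partial order concrete, the sketch below enumerates all hidden sequences by brute force and keeps those that no other sequence dominates. It is an illustration under strong assumptions: the set of models is replaced by a finite list of two hypothetical precise HMMs (the slides allow infinitely many), dominance is tested on joint probabilities (equivalent to the posterior comparison when $p(o_{1:n}) > 0$), and the names `joint` and `maximal_sequences` are made up for this example.

```python
from itertools import product


def joint(xs, obs, model):
    """Joint probability p(x_{1:n}, o_{1:n}) under one precise model."""
    init, trans, emit = model
    p = init[xs[0]] * emit[xs[0]][obs[0]]
    for prev, cur, o in zip(xs, xs[1:], obs[1:]):
        p *= trans[prev][cur] * emit[cur][o]
    return p


def maximal_sequences(states, obs, models):
    """Return all sequences that are not dominated by any other sequence."""
    candidates = list(product(states, repeat=len(obs)))

    def dominates(x, x_hat):
        # x dominates x_hat if it is strictly more probable under EVERY model
        return all(joint(x, obs, m) > joint(x_hat, obs, m) for m in models)

    return [
        x_hat for x_hat in candidates
        if not any(dominates(x, x_hat) for x in candidates)
    ]


# Two hypothetical precise models that differ only in the initial distribution.
states = ["sunny", "rainy"]
trans = {"sunny": {"sunny": 0.7, "rainy": 0.3}, "rainy": {"sunny": 0.4, "rainy": 0.6}}
emit = {"sunny": {"H": 0.8, "L": 0.2}, "rainy": {"H": 0.3, "L": 0.7}}
model_a = ({"sunny": 0.6, "rainy": 0.4}, trans, emit)
model_b = ({"sunny": 0.3, "rainy": 0.7}, trans, emit)

obs = ["H", "L", "L"]
solutions = maximal_sequences(states, obs, [model_a, model_b])
print(solutions)
```

With a single model the result collapses to the ordinary most probable sequence; with both models, each model's own optimum survives, which is the indecision described above. The enumeration is exponential in the sequence length, so it only serves as a reference for tiny examples.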

slide-24
SLIDE 24

Imprecise Hidden Markov Model

Rewriting the solution set

Partial order

$$\begin{aligned}
x_{1:n} \succ \hat{x}_{1:n} \;&:\Leftrightarrow\; (\forall p \in \mathcal{M})\ p(x_{1:n} \mid o_{1:n}) > p(\hat{x}_{1:n} \mid o_{1:n}) \\
&\Leftrightarrow\; (\forall p \in \mathcal{M})\ p(x_{1:n}, o_{1:n}) > p(\hat{x}_{1:n}, o_{1:n}) \\
&\Leftrightarrow\; (\forall p \in \mathcal{M})\ \frac{p(x_{1:n}, o_{1:n})}{p(\hat{x}_{1:n}, o_{1:n})} > 1 \\
&\Leftrightarrow\; \min_{p \in \mathcal{M}} \frac{p(x_{1:n}, o_{1:n})}{p(\hat{x}_{1:n}, o_{1:n})} > 1
\end{aligned}$$

What if $p(\hat{x}_{1:n}, o_{1:n})$ becomes zero?

slide-29
SLIDE 29

Imprecise Hidden Markov Model

Rewriting the solution set

Partial order

$$\begin{aligned}
x_{1:n} \succ \hat{x}_{1:n} \;&\Leftrightarrow\; \min_{p \in \mathcal{M}} \frac{p(x_{1:n}, o_{1:n})}{p(\hat{x}_{1:n}, o_{1:n})} > 1 \\
&\Leftrightarrow\; \min_{p \in \mathcal{M}} \prod_{k=1}^{n} \frac{p_k(x_k \mid x_{k-1})}{p_k(\hat{x}_k \mid \hat{x}_{k-1})}\, \frac{q_k(o_k \mid x_k)}{q_k(o_k \mid \hat{x}_k)} > 1 \\
&\Leftrightarrow\; \prod_{k=1}^{n} \min_{p_k(\cdot \mid X_{k-1}) \in \mathcal{M}_{X_k \mid X_{k-1}}} \frac{p_k(x_k \mid x_{k-1})}{p_k(\hat{x}_k \mid \hat{x}_{k-1})}\ \min_{q_k(\cdot \mid X_k) \in \mathcal{M}_{O_k \mid X_k}} \frac{q_k(o_k \mid x_k)}{q_k(o_k \mid \hat{x}_k)} > 1 \\
&\Leftrightarrow\; \prod_{k=1}^{n} \chi_k(x_k, x_{k-1}, \hat{x}_k, \hat{x}_{k-1})\, \omega_k(x_k, \hat{x}_k, o_k) > 1
\end{aligned}$$

The factors $\chi_k$ and $\omega_k$ can be calculated in advance.
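As an illustration of why the factors can be tabulated beforehand, the sketch below computes one transition factor and one emission factor for a single time step. Everything about the representation is an assumption: each local model is a finite list of candidate mass functions (hypothetical values), all probabilities are strictly positive, and the mass functions attached to different conditioning states are assumed to vary independently of each other. The function names `chi` and `omega` simply mirror the symbols above.

```python
def chi(x, x_prev, x_hat, x_hat_prev, transition_sets):
    """Lower bound of p(x | x_prev) / p(x_hat | x_hat_prev) over the local models."""
    if x_prev == x_hat_prev:
        # Numerator and denominator come from the same mass function.
        return min(p[x] / p[x_hat] for p in transition_sets[x_prev])
    # Different conditioning states: minimise the numerator and
    # maximise the denominator separately.
    lowest_numerator = min(p[x] for p in transition_sets[x_prev])
    highest_denominator = max(p[x_hat] for p in transition_sets[x_hat_prev])
    return lowest_numerator / highest_denominator


def omega(x, x_hat, obs, emission_sets):
    """Lower bound of q(obs | x) / q(obs | x_hat) over the local models."""
    if x == x_hat:
        return 1.0  # identical numerator and denominator
    lowest_numerator = min(q[obs] for q in emission_sets[x])
    highest_denominator = max(q[obs] for q in emission_sets[x_hat])
    return lowest_numerator / highest_denominator


# Hypothetical local models: two candidate mass functions per conditioning state.
transition_sets = {
    "sunny": [{"sunny": 0.7, "rainy": 0.3}, {"sunny": 0.6, "rainy": 0.4}],
    "rainy": [{"sunny": 0.4, "rainy": 0.6}, {"sunny": 0.5, "rainy": 0.5}],
}
emission_sets = {
    "sunny": [{"H": 0.8, "L": 0.2}, {"H": 0.7, "L": 0.3}],
    "rainy": [{"H": 0.3, "L": 0.7}, {"H": 0.4, "L": 0.6}],
}

print(chi("rainy", "sunny", "sunny", "sunny", transition_sets))
print(chi("sunny", "sunny", "sunny", "rainy", transition_sets))
print(omega("sunny", "rainy", "H", emission_sets))
```

Because these values depend only on the local models and the observation at step k, they can be stored in tables before any sequence is examined.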

slide-34
SLIDE 34

Imprecise Hidden Markov Model

Rewriting the solution set

Set of maximal solutions

$$\mathrm{opt}_{\max}(\mathcal{X}_{1:n}) := \{\hat{x}_{1:n} \in \mathcal{X}_{1:n} : (\forall x_{1:n} \in \mathcal{X}_{1:n})\ x_{1:n} \not\succ \hat{x}_{1:n}\}$$

$$\hat{x}_{1:n} \in \mathrm{opt}_{\max}(\mathcal{X}_{1:n}) \;\Leftrightarrow\; \max_{x_{1:n} \in \mathcal{X}_{1:n}} \prod_{k=1}^{n} \chi_k(x_k, x_{k-1}, \hat{x}_k, \hat{x}_{k-1})\, \omega_k(x_k, \hat{x}_k, o_k) \le 1$$

How do we calculate this set?

MaxiHMM algorithm
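Before turning to an efficient algorithm, the criterion itself can be checked by exhaustive enumeration. The sketch below is a brute-force reference, not the MaxiHMM algorithm. To keep it self-contained, the factors are derived from a single hypothetical precise model, in which case they reduce to plain probability ratios; the helper names and all numbers are assumptions.

```python
from itertools import product


def make_precise_ratios(init, trans, emit):
    """Build chi/omega for one precise, time-homogeneous model (a stand-in)."""
    def chi(x, x_prev, x_hat, x_hat_prev):
        if x_prev is None:  # first step: ratio of initial probabilities
            return init[x] / init[x_hat]
        return trans[x_prev][x] / trans[x_hat_prev][x_hat]

    def omega(x, x_hat, o):
        return emit[x][o] / emit[x_hat][o]

    return chi, omega


def dominance_score(x, x_hat, obs, chi, omega):
    """Product over k of chi_k * omega_k for the pair (x, x_hat)."""
    score = chi(x[0], None, x_hat[0], None) * omega(x[0], x_hat[0], obs[0])
    for k in range(1, len(obs)):
        score *= chi(x[k], x[k - 1], x_hat[k], x_hat[k - 1])
        score *= omega(x[k], x_hat[k], obs[k])
    return score


def maximal_by_criterion(states, obs, chi, omega):
    """Keep x_hat when no x pushes the product above 1."""
    candidates = list(product(states, repeat=len(obs)))
    return [
        x_hat for x_hat in candidates
        if max(dominance_score(x, x_hat, obs, chi, omega) for x in candidates) <= 1
    ]


# Hypothetical precise model.
states = ["sunny", "rainy"]
init = {"sunny": 0.6, "rainy": 0.4}
trans = {"sunny": {"sunny": 0.7, "rainy": 0.3}, "rainy": {"sunny": 0.4, "rainy": 0.6}}
emit = {"sunny": {"H": 0.8, "L": 0.2}, "rainy": {"H": 0.3, "L": 0.7}}

chi, omega = make_precise_ratios(init, trans, emit)
result = maximal_by_criterion(states, ["H", "L", "L"], chi, omega)
print(result)
```

This costs on the order of $m^{2n}$ score evaluations, which is why a recursive method is needed for realistic sequence lengths.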

slide-37
SLIDE 37

MaxiHMM algorithm

General overview

slide-49
SLIDE 49

MaxiHMM algorithm

General overview

We recursively define two parameters:

$$\gamma_k(x_k, \hat{x}_k) := \min_{\hat{x}_{k+1} \in \mathcal{X}_{k+1}} \max_{x_{k+1} \in \mathcal{X}_{k+1}} \chi_{k+1}(x_{k+1}, x_k, \hat{x}_{k+1}, \hat{x}_k)\, \omega_{k+1}(x_{k+1}, \hat{x}_{k+1}, o_{k+1})\, \gamma_{k+1}(x_{k+1}, \hat{x}_{k+1})$$

$$\gamma_n(x_n, \hat{x}_n) := 1$$

... and ...

$$\delta_k(x_k, \hat{x}_{1:k}) := \max_{x_{k-1} \in \mathcal{X}_{k-1}} \chi_k(x_k, x_{k-1}, \hat{x}_k, \hat{x}_{k-1})\, \omega_k(x_k, \hat{x}_k, o_k)\, \delta_{k-1}(x_{k-1}, \hat{x}_{1:k-1})$$

$$\delta_1(x_1, \hat{x}_1) := \chi_1(x_1, \hat{x}_1)\, \omega_1(x_1, \hat{x}_1, o_1)$$

slide-52
SLIDE 52

MaxiHMM algorithm

General overview

We want to be able to check whether, for some initial segment $\hat{x}^*_{1:k}$:

$$(\forall \hat{x}_{1:n} \in \mathrm{opt}_{\max}(\mathcal{X}_{1:n}))\ \hat{x}_{1:k} \neq \hat{x}^*_{1:k}$$

$$\max_{x_k \in \mathcal{X}_k} \delta_k(x_k, \hat{x}^*_{1:k})\, \gamma_k(x_k, \hat{x}^*_k) > 1$$
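The two recursions and the prefix test can be sketched as follows. This is not the authors' implementation: the factors come from one hypothetical precise, time-homogeneous model so that the example stays small, and the names `gamma_tables` and `prefix_is_excluded` are invented here. The test is read as a pruning rule: when the value exceeds 1, no maximal solution starts with the given initial segment.

```python
def make_precise_ratios(init, trans, emit):
    """chi/omega for a single precise model (hypothetical stand-in)."""
    def chi(x, x_prev, x_hat, x_hat_prev):
        if x_prev is None:  # first step
            return init[x] / init[x_hat]
        return trans[x_prev][x] / trans[x_hat_prev][x_hat]

    def omega(x, x_hat, o):
        return emit[x][o] / emit[x_hat][o]

    return chi, omega


def gamma_tables(states, obs, chi, omega):
    """Backward recursion: gamma[k][(x, x_hat)], with the last table set to 1."""
    n = len(obs)
    gamma = [None] * n
    gamma[n - 1] = {(x, xh): 1.0 for x in states for xh in states}
    for k in range(n - 2, -1, -1):
        gamma[k] = {
            (x, xh): min(
                max(
                    chi(x1, x, xh1, xh) * omega(x1, xh1, obs[k + 1])
                    * gamma[k + 1][(x1, xh1)]
                    for x1 in states
                )
                for xh1 in states
            )
            for x in states for xh in states
        }
    return gamma


def prefix_is_excluded(prefix, states, obs, chi, omega, gamma):
    """Forward recursion for delta along the prefix, then the '> 1' test."""
    delta = {
        x: chi(x, None, prefix[0], None) * omega(x, prefix[0], obs[0])
        for x in states
    }
    for k in range(1, len(prefix)):
        delta = {
            x: max(
                chi(x, xp, prefix[k], prefix[k - 1]) * omega(x, prefix[k], obs[k])
                * delta[xp]
                for xp in states
            )
            for x in states
        }
    k = len(prefix) - 1
    return max(delta[x] * gamma[k][(x, prefix[k])] for x in states) > 1


# Hypothetical precise model and observations.
states = ["sunny", "rainy"]
init = {"sunny": 0.6, "rainy": 0.4}
trans = {"sunny": {"sunny": 0.7, "rainy": 0.3}, "rainy": {"sunny": 0.4, "rainy": 0.6}}
emit = {"sunny": {"H": 0.8, "L": 0.2}, "rainy": {"H": 0.3, "L": 0.7}}
obs = ["H", "L", "L"]

chi, omega = make_precise_ratios(init, trans, emit)
gamma = gamma_tables(states, obs, chi, omega)
print(prefix_is_excluded(("rainy",), states, obs, chi, omega, gamma))
print(prefix_is_excluded(("sunny", "rainy"), states, obs, chi, omega, gamma))
```

In a search over prefixes, an excluded prefix need not be extended any further, which is what keeps the number of explored sequences close to the number of solutions.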

slide-54
SLIDE 54

MaxiHMM algorithm

General overview

slide-55
SLIDE 55

MaxiHMM algorithm

General overview

Cedric De Boom

slide-56
SLIDE 56

MaxiHMM algorithm

Properties

Recursive

Heuristic complexity $O(Snm^2)$

S: number of solutions

n: length of the sequence

m: size of state space

slide-58
SLIDE 58

MaxiHMM algorithm

Properties

slide-59
SLIDE 59

MaxiHMM algorithm

Applications

slide-60
SLIDE 60

OCR

Optical Character Recognition

Willem, die vele bouke maecte,
Daer hi dicken omme waecte,
Hem vernoyde so haerde
Dat die avonture van Reynaerde
In Dietsche onghemaket bleven
- Die Willem niet hevet vulscreven -
Dat hi die vijte van Reynaerde soucken
Ende hise na den Walschen boucken
In Dietsche dus hevet begonnen.

slide-61
SLIDE 61

OCR

Hidden Markov models

[Figure labels: D, ?]

slide-62
SLIDE 62

OCR

Hidden Markov models

[Figure labels: M ? M ? E ? D ?]

slide-63
SLIDE 63

OCR

Hidden Markov models

[Figure labels: M M M M E E D O]

slide-64
SLIDE 64

OCR

Textual data

Van den Vos Reynaerde

Medieval Dutch (13th century)

Often little data on hand

Building very accurate precise models is impossible

Use imprecise models

slide-68
SLIDE 68

OCR

Results

BEWAENT read as BEWAEHT

3-best Viterbi solutions: BEWAENT, BEWAERT, BEWAEHT

MaxiHMM solutions: BEWAENT

slide-69
SLIDE 69

OCR

Results

UP read as UF

3-best Viterbi solutions: VE, OF, UP

MaxiHMM solutions: OF, UF, UP, VE

Multiple solutions are typically returned in cases where the Viterbi algorithm fails to return the correct result.

slide-71
SLIDE 71

OCR

Results

COMT read as COMT

3-best Viterbi solutions: CONT, COMI, CONI

MaxiHMM solutions: COMI, COMT, CONT

Imprecision takes care of problems with small data sets.

slide-73
SLIDE 73

Questions?