Graphical Models over Multiple Strings - Markus Dreyer and Jason Eisner (EMNLP 2009 presentation)


slide-1
SLIDE 1

Graphical Models over Multiple Strings

Markus Dreyer and Jason Eisner

Center for Language and Speech Processing (CLSP) Center of Excellence in Language and Speech Processing (COE) Computer Science Department (CS) Johns Hopkins University (JHU)

EMNLP 2009

slide-2
SLIDE 2

Motivation

single prediction vs. joint prediction; simple variables vs. complex variables

in: text, out: topic ID (function)

this talk goes here!

slide-3
SLIDE 3

Motivation

single prediction vs. joint prediction; simple variables vs. complex variables

in: text, out: topic ID (function)
in: text, out: tag sequence (CRF, ...)

this talk goes here!
slide-7
SLIDE 7

Motivation

single prediction vs. joint prediction; simple variables vs. complex variables

in: text, out: topic ID (function)
in: text, out: tag sequence (CRF, ...)
in: word, out: transliteration, ... (FST, ...)

this talk goes here!
slide-8
SLIDE 8

Motivation

single prediction vs. joint prediction; simple variables vs. complex variables

in: text, out: topic ID (function)
in: text, out: tag sequence (CRF, ...)
in: word, out: transliteration, ... (FST, ...)  [variables Y1, Y2, Y3]

this talk goes here!
slide-9
SLIDE 9
  • Motivation. Example tasks

Morphology

[Figure: a morphological paradigm with missing (?) forms to predict.]

slide-15
SLIDE 15
  • Motivation. Example tasks

Morphology

[Figure: arrows between paradigm forms labeled "predict" and "reinforce".]

slide-16
SLIDE 16
  • Motivation. Example tasks

Transliteration

Japanese orthogr. / English orthogr.

ice cream

slide-17
SLIDE 17
  • Motivation. Example tasks

Transliteration

Japanese orthogr. / English orthogr.

ice cream

predict

slide-18
SLIDE 18
  • Motivation. Example tasks

Transliteration

Japanese orthogr. / English orthogr.

English orthography: ice cream
English phonology:  ay s k r iy m
Japanese phonology: ay s u k u l iy m u

hidden pronunciations (Knight & Graehl 1997)

slide-22
SLIDE 22
  • Motivation. Example tasks

Transliteration

Japanese orthogr. / English orthogr.

English orthography: ice cream
English phonology:  ay s k r iy m
Japanese phonology: ay s u k u l iy m u

hidden pronunciations

Add arbitrary piecewise factors!

slide-24
SLIDE 24
  • Motivation. Example tasks
  • Further examples:
  • Cognate modeling
  • Multiple-string alignment
  • System combination
slide-25
SLIDE 25

Overview

  • Motivation
  • Model
  • Inference & Approximations
  • Experiments
  • Conclusions
slide-26
SLIDE 26
  • Model. Getting started: 2 strings
  • Suppose we have a probability

distribution over two string variables S1 and S2

  • Construct weighted finite-state

transducer F that can assign a score to any values of the strings s1, s2.

Dreyer, Smith & Eisner, 2008

Pr(s1, s2) = 1/Z F(s1, s2)   [factor graph: F connects S1 and S2]

slide-27
SLIDE 27
  • Model. 2 strings: An example

S1 = brechen, S2 = bracht   [factor F connects S1 and S2]

slide-28
SLIDE 28
  • Model. 2 strings: An example

S1 = brechen, S2 = bracht; F(S1, S2) = 13.26, from alignments such as b r e c h e n ε / b r a c h ε ε t

slide-29
SLIDE 29
  • Model. 2 strings: An example

S1 = brechen, S2 = bracht

Transducer F computes score by looking at all alignments

F(S1, S2) = 13.26, from alignments such as b r e c h e n ε / b r a c h ε ε t
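The claim that F scores a pair by summing over all character alignments can be sketched with a forward-algorithm chart over edit operations. The operation weights below are invented for illustration; in the actual model they come from a trained finite-state transducer (Dreyer, Smith & Eisner, 2008).

```python
def alignment_score(s1, s2, w_match=1.0, w_sub=0.1, w_ins=0.2, w_del=0.2):
    """Total weight of ALL monotone alignments of s1 with s2, as a
    one-state weighted edit transducer computes it (forward algorithm)."""
    n, m = len(s1), len(s2)
    # chart[i][j] = total weight of aligning s1[:i] with s2[:j]
    chart = [[0.0] * (m + 1) for _ in range(n + 1)]
    chart[0][0] = 1.0  # empty prefix aligned with empty prefix
    for i in range(n + 1):
        for j in range(m + 1):
            if i < n:                        # s1[i] : epsilon (deletion)
                chart[i + 1][j] += chart[i][j] * w_del
            if j < m:                        # epsilon : s2[j] (insertion)
                chart[i][j + 1] += chart[i][j] * w_ins
            if i < n and j < m:              # s1[i] : s2[j] (match/substitution)
                w = w_match if s1[i] == s2[j] else w_sub
                chart[i + 1][j + 1] += chart[i][j] * w
    return chart[n][m]

# Unnormalized factor value; Pr(s1, s2) = alignment_score(s1, s2) / Z.
score = alignment_score("brechen", "bracht")
```

Because the chart sums rather than maximizes, even a one-character pair collects weight from several alignments (match, delete-then-insert, insert-then-delete).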

slide-30
SLIDE 30

Factor Graph:

  • Model. Factor graph examples

Pr(s1, s2) = 1/Z x F1(s1, s2)   [factor graph: F1 over S1, S2]

slide-31
SLIDE 31

Factor Graph:

  • Model. Factor graph examples

Pr(s1, s2, s3) = 1/Z x F1(s1, s2) x F2(s1, s3)

slide-32
SLIDE 32

Factor Graph:

  • Model. Factor graph examples

Pr(s1, s2, s3, s4) = 1/Z x F1(s1, s2) x F2(s1, s3) x F3(s1, s4)

slide-33
SLIDE 33

Factor Graph:

  • Model. Factor graph examples

Pr(s1, s2, s3, s4) = 1/Z x F1(s1, s2) x F2(s1, s3) x F3(s1, s4) x F4(s2, s3)

slide-34
SLIDE 34

Factor Graph:

  • Model. Factor graph examples

Pr(s1, s2, s3, s4) = 1/Z x F1(s1, s2) x F2(s1, s3) x F3(s1, s4) x F4(s2, s3) x F5(s3, s4)

slide-35
SLIDE 35

Factor Graph:

  • Model. Factor graph examples

Pr(s1, s2, s3, s4) = 1/Z x F1(s1, s2) x F2(s1, s3) x F3(s1, s4) x F4(s2, s3) x F5(s3, s4) x F6(s2, s4)
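A brute-force sketch of this factored joint: a made-up prefix-overlap factor stands in for the trained transducers, and a two-string toy domain makes Z computable by enumeration (the real model ranges over infinitely many strings, so Z cannot be enumerated like this).

```python
from itertools import product

def F(x, y):
    """Hypothetical pairwise factor: reward shared prefixes."""
    k = 0
    while k < min(len(x), len(y)) and x[k] == y[k]:
        k += 1
    return 2.0 ** k

# Factor scopes from the slide: F1(s1,s2) F2(s1,s3) F3(s1,s4) F4(s2,s3) F5(s3,s4) F6(s2,s4)
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3), (1, 3)]

def unnormalized(s):
    w = 1.0
    for i, j in EDGES:
        w *= F(s[i], s[j])
    return w

# Brute-force the partition function Z over a toy domain of string values.
DOMAIN = ["brach", "brech"]
Z = sum(unnormalized(s) for s in product(DOMAIN, repeat=4))

def Pr(s):
    return unnormalized(s) / Z
```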

slide-36
SLIDE 36
  • Model. Summary
  • Our model is formally an undirected graphical model, in which the variables are string-valued and the factors (potential functions) are finite-state transducers.

slide-37
SLIDE 37
  • Model. Less formal description

To model multiple strings and their various interactions, we

  • build many finite-state transducers, like the ones we presented last year,
  • have each of them look at a different string pair,
  • plug them together into a big network,
  • and coordinate them to predict all strings jointly.

slide-38
SLIDE 38
  • Model. Comparison with k-tape FSM
  • Model k strings with a k-tape finite-state machine?

[Figure: one 4-tape machine F reading S1-S4 in parallel: b r e ε c h e n ε / b r ε a c h ε ε t / b r ε a c h e n ε / b r ε a c h ε ε ε]

slide-39
SLIDE 39
  • Model. Comparison with k-tape FSM
  • Model k strings with a k-tape finite-state machine?
  • >26^k arc labels, intractable!

[Figure: one 4-tape machine F reading S1-S4 in parallel: b r e ε c h e n ε / b r ε a c h ε ε t / b r ε a c h e n ε / b r ε a c h ε ε ε]

Multiple-sequence alignment

slide-40
SLIDE 40
  • Model. Comparison with k-tape FSM
  • Model k strings with a k-tape finite-state machine?
  • >26^k arc labels, intractable!

[Figure: one 4-tape machine F reading S1-S4 in parallel: b r e ε c h e n ε / b r ε a c h ε ε t / b r ε a c h e n ε / b r ε a c h ε ε ε]

Multiple-sequence alignment

  • Factored model more powerful:
  • ☺ Encode swaps and other useful models
  • ☹ Encode undecidable models
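The blowup can be checked arithmetically: each arc of a k-tape machine carries one symbol or ε per tape, so a 26-letter alphabet allows (26+1)^k - 1 distinct arc labels (excluding the useless all-ε tuple).

```python
def ktape_labels(k, alphabet=26):
    """Number of possible arc labels for a k-tape machine: each tape
    contributes a letter or epsilon, minus the all-epsilon label."""
    return (alphabet + 1) ** k - 1

for k in (2, 3, 4, 5):
    print(k, ktape_labels(k))  # grows exponentially: 728, 19682, 531440, ...
```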

slide-41
SLIDE 41

Overview

  • Motivation
  • Model
  • Inference & Approximations
  • Experiments
  • Conclusions
slide-42
SLIDE 42
  • Inference. Overview

Factor Graph: F1 S2 S1 F4 F5 F6 S3 F2 F3 S4

slide-43
SLIDE 43
  • Inference. Overview

Factor Graph: F1 S2 S1 F4 F5 F6 S3 F2 F3 S4

  • We run Belief

Propagation (BP)

slide-44
SLIDE 44
  • Inference. Overview

Factor Graph: F1 S2 S1 F4 F5 F6 S3 F2 F3 S4

  • We run Belief

Propagation (BP)

  • BP is a message-passing

algorithm, a generalization of forward-backward.

slide-45
SLIDE 45
  • Inference. Overview

Factor Graph: F1 S2 S1 F4 F5 F6 S3 F2 F3 S4

  • We run Belief

Propagation (BP)

  • BP is a message-passing

algorithm, a generalization of forward-backward.

  • BP computes marginals
slide-46
SLIDE 46
  • Inference. Overview

Factor Graph: F1 S2 S1 F4 F5 F6 S3 F2 F3 S4

  • We run Belief

Propagation (BP)

  • BP is a message-passing

algorithm, a generalization of forward-backward.

  • BP computes marginals

In our version of BP, all messages and beliefs are finite-state machines, which is novel.
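One BP update can be sketched with table-valued messages over a small candidate set; in the paper's version the messages are weighted finite-state acceptors, so they cover infinitely many strings rather than an enumerated list. The pairwise factor below is a made-up similarity score, not a trained transducer.

```python
CANDS = ["bracht", "brachen", "brechen"]

def F(x, y):
    """Hypothetical pairwise factor: decays with character mismatches."""
    mismatches = sum(a != b for a, b in zip(x, y)) + abs(len(x) - len(y))
    return 1.0 / (1.0 + mismatches)

def factor_to_variable(msg_in):
    """BP message from a pairwise factor to one endpoint, given the message
    arriving at the other endpoint: mu(y) = sum_x msg_in(x) * F(x, y)
    (the same 'forward' sum as in forward-backward)."""
    return {y: sum(msg_in[x] * F(x, y) for x in CANDS) for y in CANDS}

# Observing S1 = "brechen" pins its outgoing message to a point mass.
point_mass = {s: float(s == "brechen") for s in CANDS}
msg = factor_to_variable(point_mass)  # message over the neighboring string
```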

slide-47
SLIDE 47

S2 F1 S1

  • Inference. Multiple strings

brechen

Example:

slide-48
SLIDE 48

S2 F1 S1

  • Inference. Multiple strings

brechen

Example:

0.20 bracht 0.13 brechtet 0.08 brachtet ...

predict

(whole distribution)

slide-49
SLIDE 49

S2 F1 S1 S3 F2 S1 F4 F5 S1 S4 F3

  • Inference. Multiple strings

brechen brechen brechen

0.20 bracht 0.13 brechtet 0.08 brachtet ...

Example:

(whole distribution)

slide-50
SLIDE 50

S2 F1 S1 S3 F2 S1 F4 F5 S1 S4 F3

  • Inference. Multiple strings

brechen brechen brechen

0.20 bracht 0.13 brechtet 0.08 brachtet ...

0.27 brachen 0.07 brechten ...

Example:

(whole distribution)

slide-51
SLIDE 51

S2 F1 S1 S3 F2 S1 F4 F5 S1 S4 F3

  • Inference. Multiple strings

brechen brechen brechen

0.20 bracht 0.13 brechtet 0.08 brachtet ... 0.09 brach 0.03 brech 0.02 brich ...

0.27 brachen 0.07 brechten ...

Example:

(whole distribution)

slide-52
SLIDE 52

S2 F1 S1 S3 F2 S1 F4 F5 S1 S4 F3

  • Inference. Multiple strings

brechen brechen brechen

0.20 bracht 0.13 brechtet 0.08 brachtet ... 0.23 brachten 0.18 brachen 0.11 brechten ... 0.09 brach 0.03 brech 0.02 brich ...

0.27 brachen 0.07 brechten ...

Example:

(whole distribution)

slide-53
SLIDE 53

S2 F1 S1 S3 F2 S1 F4 F5 S1 S4 F3

  • Inference. Multiple strings

brechen brechen brechen

0.20 bracht 0.13 brechtet 0.08 brachtet ... 0.23 brachten 0.18 brachen 0.11 brechten ... 0.09 brach 0.03 brech 0.02 brich ... 0.12 brachen 0.07 brechen 0.01 brichen ...

0.27 brachen 0.07 brechten ...

Example:

(whole distribution)

slide-54
SLIDE 54

S2 F1 S1 F2 S1 F4 F5 S1 S4 F3

  • Inference. Multiple strings

brechen brechen brechen

0.20 bracht 0.13 brechtet 0.08 brachtet ... 0.23 brachten 0.18 brachen 0.11 brechten ... 0.09 brach 0.03 brech 0.02 brich ... 0.12 brachen 0.07 brechen 0.01 brichen ...

0.27 brachen 0.07 brechten ...

F3

0.23 brachten 0.18 brachen 0.11 brechten ... 0.12 brachen 0.07 brechen 0.01 brichen ...

0.27 brachen 0.07 brechten ...

S3 Example:

slide-56
SLIDE 56

S2 F1 S1 F2 S1 F4 F5 S1 S4 F3

  • Inference. Multiple strings

brechen brechen brechen

0.20 bracht 0.13 brechtet 0.08 brachtet ... 0.23 brachten 0.18 brachen 0.11 brechten ... 0.09 brach 0.03 brech 0.02 brich ... 0.12 brachen 0.07 brechen 0.01 brichen ...

0.27 brachen 0.07 brechten ...

F3

0.23 brachten 0.18 brachen 0.11 brechten ... 0.12 brachen 0.07 brechen 0.01 brichen ...

0.27 brachen 0.07 brechten ...

S3

Decoding output for S3 (consensus): brachen

Example:

slide-57
SLIDE 57

S2 F1 S1 F2 S1 F4 F5 S1 S4 F3

  • Inference. Multiple strings

brechen brechen brechen

0.27 brachen 0.07 brechten ...

F3 S3

  • Each message is a finite-state acceptor!
  • Intersect all incoming messages

Example:
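With table-valued stand-ins for acceptor messages, "intersect all incoming messages" is a pointwise product over the strings every message accepts, and consensus decoding takes the argmax. The weights echo the slide's illustrative numbers (truncated lists):

```python
def intersect(*msgs):
    """Pointwise product over the strings all messages accept
    (acceptor intersection, in the finite-state view)."""
    common = set(msgs[0])
    for m in msgs[1:]:
        common &= set(m)
    return {s: __import__("math").prod(m[s] for m in msgs) for s in common}

# Two incoming messages at S3, echoing the slide's numbers.
m_from_F2 = {"brachen": 0.27, "brechten": 0.07}
m_from_F4 = {"brachten": 0.23, "brachen": 0.18, "brechten": 0.11}

belief = intersect(m_from_F2, m_from_F4)
consensus = max(belief, key=belief.get)
print(consensus)  # brachen, matching the slide's decoded output for S3
```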

slide-58
SLIDE 58
  • Inference. Multiple strings

S3

slide-59
SLIDE 59
  • Inference. Multiple strings

S3

Similar to inference in CRFs (with α/β tables over words and tags), but in CRFs the messages are simple lookup tables, not finite-state machines!

slide-61
SLIDE 61
  • Inference. Multiple strings

S2 F1 S1 S3 F2 S1 F4 F5 S1 S4 F3

0.20 bracht 0.13 brechtet 0.08 brachtet ...

S1 = brechen (observed)

slide-62
SLIDE 62
  • Inference. Multiple strings

S1 = brechen (observed)

send S1 through transducer F1

0.20 bracht 0.13 brechtet 0.08 brachtet ...

weighted finite-state acceptor

slide-63
SLIDE 63
  • Inference. Multiple strings

S1 = brechen (observed)

send S1 through transducer F1

weighted finite-state acceptor

slide-64
SLIDE 64
  • Inference. Multiple strings

S1 = brechen (observed)

Step 1: F1 ∘ S1 (circle means composition)

slide-65
SLIDE 65
  • Inference. Multiple strings

S1 = brechen (observed); Step 2: range(F1 ∘ S1), projecting onto the output tape
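Steps 1 and 2 can be mimicked on a toy table-valued transducer: composing with the observed string fixes the input tape (Step 1), and taking the range projects onto the output tape (Step 2), yielding the outgoing acceptor-like message. A real implementation uses weighted FST composition and projection (e.g. in OpenFst); the pairs and weights below are invented.

```python
F1_table = {  # hypothetical transducer weights F1(input, output)
    ("brechen", "bracht"):   0.20,
    ("brechen", "brechtet"): 0.13,
    ("brachen", "bracht"):   0.05,
}

def compose_with_string(s, table):
    """Step 1: restrict the transducer's input tape to the observed string."""
    return {pair: w for pair, w in table.items() if pair[0] == s}

def range_of(table):
    """Step 2: project onto the output tape, summing weights per output."""
    msg = {}
    for (_, out), w in table.items():
        msg[out] = msg.get(out, 0.0) + w
    return msg

msg_to_S2 = range_of(compose_with_string("brechen", F1_table))
print(msg_to_S2)  # {'bracht': 0.2, 'brechtet': 0.13}
```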

slide-67
SLIDE 67
  • Inference. Multiple strings
  • What happens if the factor graph has loops?
  • Messages not independent of each other anymore
  • Send anyway, iterate!
  • Obtained beliefs are only approximate

S1 = brechen (observed)

slide-70
SLIDE 70
  • Inference. Approximations

S1 = brechen; send S1 through transducer F: [figure: the resulting large acceptor]

slide-71
SLIDE 71
  • Inference. Approximations
  • A message becomes bigger when sent through a transducer!

S1 = brechen; send S1 through transducer F: [figure: the resulting large acceptor]

slide-72
SLIDE 72
  • Inference. Approximations
  • A message becomes bigger when sent through a transducer!
  • And we keep sending from transducer to transducer, so messages keep growing in size!

S1 = brechen; send S1 through transducer F: [figure: the resulting large acceptor]

slide-74
SLIDE 74
  • Determinization does not help
  • Solution: Approximate messages!
  • n-gram approximation
  • k-best-paths approximation
  • mixture model: use both!

Li, Eisner & Khudanpur, 2009

  • Inference. Approximations

In our experiments: k=1000, n=0
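A minimal sketch of the k-best approximation on a table-valued message; a real message is a weighted FSA from which the k best paths are extracted instead, and the weights below are illustrative.

```python
import heapq

def k_best(msg, k):
    """k-best-paths approximation: keep only the k highest-weight strings
    of a message (the paper's experiments use k = 1000, n = 0)."""
    return dict(heapq.nlargest(k, msg.items(), key=lambda kv: kv[1]))

msg = {"brachten": 0.23, "brachen": 0.18, "brechten": 0.11, "brichen": 0.01}
print(k_best(msg, 2))  # {'brachten': 0.23, 'brachen': 0.18}
```

Pruning each outgoing message this way caps its size, so messages no longer grow without bound as they pass from transducer to transducer.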

slide-75
SLIDE 75

Overview

  • Motivation
  • Model
  • Inference & Approximations
  • Experiments
  • Conclusions
slide-76
SLIDE 76

Experiments

[Figure: paradigm grid with missing (?) forms.]

Task: Reconstruct missing word forms in morphological paradigms (German). Missing: forms that occur rarely in free text (i.e., frequency count < 10 in CELEX).

slide-77
SLIDE 77

Experiments

  • Train model parameters from the observed forms in the 9393 paradigms (piecewise training)
  • Task: Exactly reconstruct all missing forms!
  • Use 100 paradigms as dev set for model selection; evaluate on the remaining 9293 paradigms

slide-78
SLIDE 78

Experiments

[Figure: factor-graph variants over a verb paradigm (Pres/Past × Singular/Plural × persons 1, 2, 3): (U), (C1)-(C4), (T1), (T2), (L1)-(L4).]

slide-79
SLIDE 79

Experiments

[Figure: the same factor-graph variants, now annotated with accuracies: 69.0, 72.9, 73.4, 74.8, 65.2.]

slide-80
SLIDE 80

Experiments

[Figure: the same factor-graph variants with accuracies for all variants: 69.0, 72.9, 73.4, 74.8, 65.2, 78.1, 78.7, 62.3, 79.6, 78.9, 82.1.]

slide-81
SLIDE 81

Experiments

[Chart: Moses baselines vs. the unconnected and loopy models.]

slide-82
SLIDE 82

Experiments

[Chart: accuracy over all forms for Moses baselines vs. the unconnected and loopy models, with more observations given.]

slide-85
SLIDE 85

Overview

  • Motivation
  • Model
  • Inference & Approximations
  • Experiments
  • Conclusions
slide-86
SLIDE 86

Conclusions

  • Jointly predict multiple interdependent strings
  • Undirected graphical model over strings (variables: strings; factors: finite-state machines)
  • Belief propagation with finite-state messages
  • Approximations: loopy BP; approximate messages to prevent blowup
  • Showed results in morphology; potentially useful for many other string tasks (transliteration, cognate modeling, ...)

slide-87
SLIDE 87

Conclusions


General idea: Coordinate NLP models (FSTs, PCFGs, ...) by using them as factors in graphical models!