Dual Decomposition for Marginal Inference, Justin Domke

slide-1
SLIDE 1

Introduction Dual Decomposition Experimental Results Conclusions

Dual Decomposition for Marginal Inference

Justin Domke

Rochester Institute of Technology

AAAI 2011

slide-2
SLIDE 2

Outline

  • Introduction: Graphical Models, Motivation, Wait a Second
  • Dual Decomposition: Dual Decomposition in General, Dual Decomposition for Marginal Inference
  • Experimental Results: Experiments
  • Conclusions: Conclusions

slide-4
SLIDE 4

Graphical Models

  • Markov Random Field / Factor Graph:

$$p(x) \propto \prod_c \psi(x_c)$$

slide-5
SLIDE 5

Graphical Models

Example with cliques c1 = {1,2,3}, c2 = {3,4}, c3 = {4,5,6}:

$$p(x) \propto \prod_c \psi(x_c) = \psi(x_1,x_2,x_3)\,\psi(x_3,x_4)\,\psi(x_4,x_5,x_6)$$

slide-6
SLIDE 6

Marginal Inference

  • Want to recover p(Xi = xi).
  • Brute-force sum: define $\hat{p}(x) = \prod_c \psi(x_c)$. Then

$$P(X_i = x_i) = \frac{1}{Z} \sum_{x_1} \cdots \sum_{x_{i-1}} \sum_{x_{i+1}} \cdots \sum_{x_M} \hat{p}(x), \qquad Z = \sum_{x_1} \cdots \sum_{x_M} \hat{p}(x)$$

  • On trees, can do sums quickly by dynamic programming (sum-product algorithm / belief propagation).
  • In general, exact marginal inference is #P-hard.
  • Approximate: Tree-reweighted belief propagation (TRW)
  • This paper: Same approximation as TRW, different algorithm.
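
The brute-force sum can be sketched on the three-clique example (c1 = {1,2,3}, c2 = {3,4}, c3 = {4,5,6}). This is only an illustration: the potential `psi`, the binary state space, and all function names are assumptions, not taken from the slides.

```python
import itertools

# Cliques from the factor-graph example (1-based indices as in the slides).
CLIQUES = [(1, 2, 3), (3, 4), (4, 5, 6)]

def psi(clique, values):
    """Hypothetical potential: favors agreement inside a clique."""
    return 2.0 if len(set(values)) == 1 else 1.0

def p_hat(x):
    """Unnormalized probability: product of the clique potentials."""
    result = 1.0
    for c in CLIQUES:
        result *= psi(c, tuple(x[i - 1] for i in c))
    return result

def brute_force_marginal(i, num_vars=6, states=(0, 1)):
    """P(X_i = s) for every state s, summing p_hat over all configurations."""
    totals = {s: 0.0 for s in states}
    for x in itertools.product(states, repeat=num_vars):
        totals[x[i - 1]] += p_hat(x)
    z = sum(totals.values())  # partition function Z
    return {s: t / z for s, t in totals.items()}

marginal = brute_force_marginal(3)
```

The loop visits all 2^M configurations, which is why this only works for tiny models and why tree-structured dynamic programming or approximations such as TRW are needed.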
slide-11
SLIDE 11

Motivation

  • TRW convergence rates can be very slow.
  • If lucky, TRW = block coordinate ascent on dual.
  • TRW may fail to converge.
  • Damping converges in practice, but more slowly.
  • Recent alternatives guarantee convergence [Hazan & Shashua 2009, Globerson & Jaakkola 2007b].
  • Not claimed faster than TRW. TRW-S [Meltzer et al. 2009] is an exception.
  • This paper: use a quasi-Newton method on dual.
  • Line searches guarantee convergence.
  • Hopefully, faster convergence.
slide-14
SLIDE 14

Ising Model

  • $x_i \in \{-1,+1\}$
  • $p(x) \propto \prod_{ij} \exp\big(\theta(x_i,x_j)\big) \prod_i \exp\big(\theta(x_i)\big)$
  • $\theta(x_i) = \alpha_F x_i$, $\alpha_F \in [-1,+1]$
  • $\theta(x_i,x_j) = \alpha_I x_i x_j$, $\alpha_I \in [0,T]$ for various T
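
A minimal sketch of this parameterization. It assumes the coefficients are drawn uniformly from the stated intervals, independently per node and per edge; the slides give only the ranges. The function names and the edge-list representation are hypothetical.

```python
import random

def sample_ising(edges, num_nodes, T, seed=0):
    """Draw alpha_F in [-1, +1] per node and alpha_I in [0, T] per edge.

    Uniform sampling is an assumption; the slides state only the intervals.
    """
    rng = random.Random(seed)
    alpha_f = [rng.uniform(-1.0, 1.0) for _ in range(num_nodes)]
    alpha_i = {e: rng.uniform(0.0, T) for e in edges}
    return alpha_f, alpha_i

def log_score(x, alpha_f, alpha_i):
    """Unnormalized log p(x): sum_i alpha_F x_i + sum_ij alpha_I x_i x_j."""
    unary = sum(a * xi for a, xi in zip(alpha_f, x))
    pairwise = sum(a * x[i] * x[j] for (i, j), a in alpha_i.items())
    return unary + pairwise

# Tiny chain 0 - 1 - 2 with interaction strengths up to T = 3.
alpha_f, alpha_i = sample_ising([(0, 1), (1, 2)], num_nodes=3, T=3.0)
score = log_score((1, -1, 1), alpha_f, alpha_i)
```

Larger T allows stronger couplings between neighbors.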

slide-15
SLIDE 15

θ(xi,xj) = αI xi xj, αI ∈ [0,1]

[Figure: convergence of TRW and dual decomposition. x-axis: iters (0 to 100); y-axis: |µ−µ*|∞ (log scale, 10^-6 to 10^0).]

slide-18
SLIDE 18

θ(xi,xj) = αI xi xj, αI ∈ [0,3]

[Figure: convergence of TRW and dual decomposition. x-axis: iters (0 to 10000); y-axis: |µ−µ*|∞ (log scale, 10^-6 to 10^0).]

slide-21
SLIDE 21

θ(xi,xj) = αI xi xj, αI ∈ [0,5]

[Figure: convergence of TRW and dual decomposition. x-axis: iters (0 to 10000); y-axis: |µ−µ*|∞ (log scale, 10^-6 to 10^0).]

slide-25
SLIDE 25

Wait a Second

Question: Why should I care about very accurately computing approximate marginals!?

Answer: You might not.

One reason to care:

  • Number of iterations TRW needs for reasonable results is not easy to predict.

slide-27
SLIDE 27

Why I Care

Want to fit a CRF with some loss L(θ) = M(µ(θ)). Algorithm (Domke, 2010):

  • 1. Get µ by running TRW with parameters θ.
  • 2. Compute dM(µ)/dµ.
  • 3. Get µ+ by running TRW with parameters θ + r dM/dµ.
  • 4. dL/dθ ≈ (1/r)(µ+ − µ).
  • Strong convergence needed for difference µ+ − µ to be meaningful.
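
The four steps can be sketched on a tiny model. This is an illustration under stated assumptions: exact brute-force marginals on a 3-node chain stand in for TRW, the squared-error loss `loss_M` is hypothetical, and all names are made up for the example.

```python
import itertools
import math

def marginals(theta):
    """Exact mean parameters mu(theta) of a 3-node chain Ising model.

    theta = [node0, node1, node2, edge01, edge12]. Brute-force enumeration
    stands in for TRW here (an assumption of this sketch).
    """
    feats, scores = [], []
    for x in itertools.product([-1, 1], repeat=3):
        f = [x[0], x[1], x[2], x[0] * x[1], x[1] * x[2]]
        feats.append(f)
        scores.append(sum(t * v for t, v in zip(theta, f)))
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    return [sum(w * f[k] for w, f in zip(weights, feats)) / z
            for k in range(len(theta))]

def loss_M(mu, target):
    """Hypothetical loss: squared error between marginals and a target."""
    return sum((m - t) ** 2 for m, t in zip(mu, target))

def grad_M(mu, target):
    """Gradient of the hypothetical loss with respect to mu."""
    return [2.0 * (m - t) for m, t in zip(mu, target)]

def perturbation_gradient(theta, target, r=1e-5):
    mu = marginals(theta)                                        # step 1
    dM = grad_M(mu, target)                                      # step 2
    mu_plus = marginals([t + r * d for t, d in zip(theta, dM)])  # step 3
    return [(a - b) / r for a, b in zip(mu_plus, mu)]            # step 4

theta = [0.2, -0.1, 0.4, 0.5, -0.3]
target = [0.0, 0.5, -0.5, 0.3, 0.1]
grad = perturbation_gradient(theta, target)
```

Because the difference µ+ − µ is divided by a small r, any inference error in µ or µ+ is amplified by 1/r, which is why loosely converged marginals make the gradient estimate unreliable.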

slide-31
SLIDE 31

Dual Decomposition with Two subproblems

$$\max_x \; f(x) + g(x)$$

  • Can quickly and exactly maximize f(x) + a·x.
  • Can quickly and exactly maximize g(x) + b·x.
slide-32
SLIDE 32

Dual Decomposition with Two subproblems

  • Transform max_x f(x) + g(x) to a constrained problem:

$$\max_{x,y} \; f(x) + g(y) \quad \text{s.t.} \quad x = y$$

  • Leads to dual problem:

$$\min_\lambda h(\lambda), \qquad h(\lambda) = \max_x \big[ f(x) + \lambda \cdot x \big] + \max_y \big[ g(y) - \lambda \cdot y \big]$$
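
A toy instance of the two-subproblem dual, as a sketch. The scalar quadratics f(x) = −(x − 1)² and g(y) = −(y − 3)², the step size, and the function names are assumptions chosen so that both subproblems have closed-form maximizers. Plain gradient descent on h(λ) is used for illustration.

```python
def solve_f(lam, a=1.0):
    """argmax_x f(x) + lam * x for f(x) = -(x - a)^2 (closed form)."""
    return a + lam / 2.0

def solve_g(lam, b=3.0):
    """argmax_y g(y) - lam * y for g(y) = -(y - b)^2 (closed form)."""
    return b - lam / 2.0

def dual_descent(step=0.5, iters=100):
    """Minimize h(lam) by gradient descent on the multiplier."""
    lam = 0.0
    for _ in range(iters):
        x, y = solve_f(lam), solve_g(lam)
        # dh/dlam = x*(lam) - y*(lam): the disagreement of the two subproblems.
        lam -= step * (x - y)
    return lam, solve_f(lam), solve_g(lam)

lam, x, y = dual_descent()
```

On this instance the iteration settles at λ = 2 with x = y = 2, which is also the maximizer of f(x) + g(x): the multiplier is adjusted until the two copies agree.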

slide-34
SLIDE 34

Dual Decomposition with Two subproblems

[Diagram: a master problem min_λ h(λ) coordinating two subproblems, max_x f(x) + λ·x and max_x g(x) − λ·x.]

slide-35
SLIDE 35

Dual Decomposition with N subproblems

$$\max_x \sum_{i=1}^{N} f_i(x)$$

  • Can quickly and exactly maximize f_i(x) + a_i·x, for all i.
slide-36
SLIDE 36

Dual Decomposition with N subproblems

  • Transform max_x ∑_i f_i(x) to a constrained problem:

$$\max_{\{x_i\}} \sum_i f_i(x_i) \quad \text{s.t.} \quad x_i = \frac{1}{N} \sum_j x_j$$

  • Leads to dual problem:

$$\min_\lambda h(\lambda), \qquad h(\lambda) = \sum_i h_i(\lambda), \qquad h_i(\lambda) = \max_{x_i} \; f_i(x_i) + \Big( \lambda_i - \frac{1}{N} \sum_j \lambda_j \Big) \cdot x_i$$
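
The form of h_i can be recovered from the Lagrangian of the constrained problem. A short derivation, assuming one multiplier λ_i per constraint x_i = (1/N) ∑_j x_j:

```latex
\begin{aligned}
L(\{x_i\}, \lambda)
  &= \sum_i f_i(x_i) + \sum_i \lambda_i \cdot \Big( x_i - \frac{1}{N} \sum_j x_j \Big) \\
  &= \sum_i f_i(x_i) + \sum_i \lambda_i \cdot x_i
     - \frac{1}{N} \sum_j \Big( \sum_i \lambda_i \Big) \cdot x_j \\
  &= \sum_i \Big[ f_i(x_i) + \Big( \lambda_i - \frac{1}{N} \sum_j \lambda_j \Big) \cdot x_i \Big]
\end{aligned}
```

The last line separates over i, so maximizing each bracket independently gives h(λ) = ∑_i h_i(λ). For any feasible point the multiplier terms vanish, so h(λ) upper-bounds the original maximum for every λ.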

slide-38
SLIDE 38

Dual Decomposition with N subproblems

[Diagram: a master problem min_λ h(λ) coordinating three subproblems, max_x f′_1(x,λ), max_x f′_2(x,λ), and max_x f′_3(x,λ).]

$$f'_i(x,\lambda) = f_i(x_i) + \Big( \lambda_i - \frac{1}{N} \sum_j \lambda_j \Big) \cdot x_i$$

slide-39
SLIDE 39

Dual Decomposition with N subproblems

  • Has been used extensively for MAP inference, where h(λ) is non-differentiable.
  • For marginal inference, h(λ) is differentiable and convex.
slide-41
SLIDE 41

Variational Inference

Can represent a graphical model in exponential family:

$$p(x;\theta) = \exp\big( f(x) \cdot \theta - A(\theta) \big), \qquad A(\theta) = \log \sum_x \exp\big( f(x) \cdot \theta \big)$$

  • Can compute A as [Wainwright and Jordan]

$$A(\theta) = \max_{\mu \in \mathcal{M}} \; \theta \cdot \mu + H(\mu)$$

  • M is marginal polytope (hard).
  • H is entropy (hard).
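
A small numerical check of the variational form, as a sketch. It enumerates a 3-node binary model, computes A(θ) directly, and confirms that θ·µ + H(µ) equals A(θ) at the true mean parameters µ = E[f(x)]. The feature map, parameter values, and names are assumptions for illustration. The check covers only the maximizer, not the maximization over M.

```python
import itertools
import math

def features(x):
    """Sufficient statistics f(x): node terms, then pairwise terms of a 3-cycle."""
    return [x[0], x[1], x[2], x[0] * x[1], x[1] * x[2], x[0] * x[2]]

def exact_inference(theta):
    """Return A(theta), the mean parameters mu = E[f(x)], and the entropy H."""
    configs = list(itertools.product([-1, 1], repeat=3))
    scores = [sum(t * f for t, f in zip(theta, features(x))) for x in configs]
    m = max(scores)
    A = m + math.log(sum(math.exp(s - m) for s in scores))  # stable log-sum-exp
    probs = [math.exp(s - A) for s in scores]
    mu = [sum(p * features(x)[k] for p, x in zip(probs, configs))
          for k in range(len(theta))]
    H = -sum(p * math.log(p) for p in probs)
    return A, mu, H

theta = [0.3, -0.2, 0.5, 1.0, -0.7, 0.4]
A, mu, H = exact_inference(theta)
objective_at_mu = sum(t * m for t, m in zip(theta, mu)) + H
```

Both A and H require sums over every configuration here, which is the sense in which the slide calls them hard.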
slide-42
SLIDE 42

Variational Inference

Exact inference:

$$A(\theta) = \max_{\mu \in \mathcal{M}} \; \theta \cdot \mu + H(\mu)$$

TRW approximation:

$$B(\theta) = \max_{\mu \in \mathcal{L}} \; \theta \cdot \mu + \sum_T \rho_T H(\mu(T))$$

  • L is the local polytope (easy).
  • H(µ(T)) is the entropy of the marginals projected onto tree T (easy).

Our problem: how to compute B?

slide-46
SLIDE 46

Dual Decomposition for Marginal Inference

TRW approximation:

$$B(\theta) = \max_{\mu \in \mathcal{L}} \; \theta \cdot \mu + \sum_T \rho_T H(\mu(T))$$

Theorem (main result):

$$B(\theta) = \min_{\{\theta^T\}} h(\{\theta^T\}) \quad \text{s.t.} \quad \sum_{T : a \in T} \theta^T_a = \theta_a$$

$$h(\{\theta^T\}) = \sum_T B^T(\theta^T), \qquad B^T(\theta^T) = \max_{\mu^T \in \mathcal{M}^T} \; \theta^T \cdot \mu^T + \rho_T H^T(\mu^T)$$

$B^T(\theta^T)$ is computable by running the regular sum-product algorithm.
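
One way to see why sum-product suffices is to pull ρ_T out of the maximization. A short worked step, assuming ρ_T > 0 and writing A^T for the log-partition function of tree T (notation introduced here, not in the slides):

```latex
\begin{aligned}
B^T(\theta^T)
  &= \max_{\mu^T \in \mathcal{M}^T} \; \theta^T \cdot \mu^T + \rho_T H^T(\mu^T) \\
  &= \rho_T \max_{\mu^T \in \mathcal{M}^T} \Big[ \frac{\theta^T}{\rho_T} \cdot \mu^T + H^T(\mu^T) \Big]
   = \rho_T \, A^T\!\Big( \frac{\theta^T}{\rho_T} \Big)
\end{aligned}
```

So each subproblem is exact inference on a tree with rescaled parameters θ^T/ρ_T. The maximizing µ^T are that tree's marginals, and they are also the gradient of B^T with respect to θ^T, which is what makes h differentiable.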

slide-49
SLIDE 49

Dual Decomposition for Marginal Inference

[Diagram: a master problem min over {θ^T} of h({θ^T}) coordinating one subproblem per tree, each of the form max over µ^T ∈ M^T of f^T(θ^T, µ^T).]

$$f^T(\theta^T, \mu^T) = \theta^T \cdot \mu^T + \rho_T H^T(\mu^T)$$

slide-51
SLIDE 51

Dual Decomposition for Marginal Inference

Inference: Plug $\min_{\{\theta^T\}} \sum_T B^T(\theta^T)$ into L-BFGS.

  • Guarantees convergence. (Line searches)
  • Fast convergence rates.
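
A rough sketch of minimizing the dual objective on a 3-cycle Ising model. Several things here are assumptions and not the talk's implementation: plain gradient descent with a backtracking line search stands in for L-BFGS, brute-force enumeration stands in for sum-product, gradients are taken by finite differences, the three spanning trees get equal weights ρ_T = 1/3, and each tree term is evaluated as ρ_T times the tree's log-partition function at θ^T/ρ_T. The constraint ∑_T θ^T_a = θ_a is enforced by construction through the free variables in `split`.

```python
import itertools
import math

NODES = [0, 1, 2]
EDGES = [(0, 1), (1, 2), (0, 2)]
# The 3-cycle has three spanning trees, each dropping one edge; rho_T = 1/3.
TREES = [[e for e in EDGES if e != dropped] for dropped in EDGES]
RHO = 1.0 / 3.0

def log_partition(node_theta, edge_theta):
    """Brute-force log-partition function (stand-in for sum-product)."""
    scores = []
    for x in itertools.product([-1, 1], repeat=len(NODES)):
        s = sum(node_theta[i] * x[i] for i in NODES)
        s += sum(w * x[i] * x[j] for (i, j), w in edge_theta.items())
        scores.append(s)
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))

def split(t, w, free):
    """Per-tree parameters theta^T that always sum to theta over the trees."""
    per_tree = []
    for T, tree_edges in enumerate(TREES):
        node = []
        for i in NODES:
            offsets = [free[2 * i], free[2 * i + 1],
                       -free[2 * i] - free[2 * i + 1]]
            node.append(t[i] / 3.0 + offsets[T])
        edge = {}
        for k, e in enumerate(EDGES):
            if e in tree_edges:
                first_owner = min(S for S, es in enumerate(TREES) if e in es)
                sign = 1.0 if T == first_owner else -1.0
                edge[e] = w[e] / 2.0 + sign * free[6 + k]
        per_tree.append((node, edge))
    return per_tree

def h(t, w, free):
    """Dual objective: sum over trees of rho_T * A^T(theta^T / rho_T)."""
    total = 0.0
    for node, edge in split(t, w, free):
        total += RHO * log_partition([v / RHO for v in node],
                                     {e: v / RHO for e, v in edge.items()})
    return total

def minimize(t, w, iters=50, eps=1e-6):
    """Gradient descent with a backtracking line search (not L-BFGS)."""
    free = [0.0] * 9
    value = h(t, w, free)
    for _ in range(iters):
        grad = []
        for k in range(len(free)):
            bumped = list(free)
            bumped[k] += eps
            grad.append((h(t, w, bumped) - value) / eps)
        step = 1.0
        while step > 1e-10:
            trial = [f - step * g for f, g in zip(free, grad)]
            trial_value = h(t, w, trial)
            if trial_value < value:  # accept only strict decreases
                free, value = trial, trial_value
                break
            step *= 0.5
        else:
            break  # no decreasing step found: treat as converged
    return free, value

t = [0.3, -0.2, 0.5]
w = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0}
exact = log_partition(t, w)       # exact A(theta) on the full cycle
start = h(t, w, [0.0] * 9)        # dual objective at an even split
free, best = minimize(t, w)
```

Every feasible split gives an upper bound on the exact log-partition function, by convexity of A, so `best` should lie between `exact` and `start`. The line search only accepts steps that decrease h, which mirrors the convergence argument on the slide. A quasi-Newton method would replace the raw gradient direction.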
slide-53
SLIDE 53

Ising Model

  • $x_i \in \{-1,+1\}$
  • $p(x) \propto \prod_{ij} \exp\big(\theta(x_i,x_j)\big) \prod_i \exp\big(\theta(x_i)\big)$
  • $\theta(x_i) = \alpha_F x_i$
  • $\theta(x_i,x_j) = \alpha_I x_i x_j$
slide-54
SLIDE 54

Algorithms

Algorithms compared:

  • Dual Decomposition + L-BFGS
  • TRW
  • TRW with damping of 1/2 in the log-domain.
  • TRW-S [Meltzer et al. 2009]

Max of 10^5 iterations allowed.

slide-55
SLIDE 55

Dual Decomp vs. TRW

[Figure: αI ∈ [0,1]. Axes: # iterations, TRW and # iterations, dual decomposition (log scale, 10 to 10^5). Series: 10^-2, 10^-4, and 10^-6 convergence.]

slide-56
SLIDE 56

Dual Decomp vs. TRW

[Figure: αI ∈ [0,3]. Axes: # iterations, TRW and # iterations, dual decomposition (log scale, 10 to 10^5). Series: 10^-2, 10^-4, and 10^-6 convergence.]

slide-57
SLIDE 57

Dual Decomp vs. TRW

[Figure: αI ∈ [0,9]. Axes: # iterations, TRW and # iterations, dual decomposition (log scale, 10 to 10^5). Series: 10^-2, 10^-4, and 10^-6 convergence.]

slide-58
SLIDE 58

Dual Decomp vs. TRW-damped

[Figure: αI ∈ [0,1]. Axes: # iterations, TRW-damped and # iterations, dual decomposition (log scale, 10 to 10^5). Series: 10^-2, 10^-4, and 10^-6 convergence.]

slide-59
SLIDE 59

Dual Decomp vs. TRW-damped

[Figure: αI ∈ [0,3]. Axes: # iterations, TRW-damped and # iterations, dual decomposition (log scale, 10 to 10^5). Series: 10^-2, 10^-4, and 10^-6 convergence.]

slide-60
SLIDE 60

Dual Decomp vs. TRW-damped

[Figure: αI ∈ [0,9]. Axes: # iterations, TRW-damped and # iterations, dual decomposition (log scale, 10 to 10^5). Series: 10^-2, 10^-4, and 10^-6 convergence.]

slide-61
SLIDE 61

Dual Decomp vs. TRW-S

[Figure: αI ∈ [0,1]. Axes: # iterations, TRW-S and # iterations, dual decomposition (log scale, 10 to 10^5). Series: 10^-2, 10^-4, and 10^-6 convergence.]

slide-62
SLIDE 62

Dual Decomp vs. TRW-S

[Figure: αI ∈ [0,3]. Axes: # iterations, TRW-S and # iterations, dual decomposition (log scale, 10 to 10^5). Series: 10^-2, 10^-4, and 10^-6 convergence.]

slide-63
SLIDE 63

Dual Decomp vs. TRW-S

[Figure: αI ∈ [0,9]. Axes: # iterations, TRW-S and # iterations, dual decomposition (log scale, 10 to 10^5). Series: 10^-2, 10^-4, and 10^-6 convergence.]

slide-64
SLIDE 64

Convergence

[Figure: αI ∈ [0,1]. Axes: convergence level (.01 to 10^-6) and # iterations (1 to 10^5, log scale). Curves: Dual Decomposition, TRW-damped, TRW-S, TRW.]

slide-65
SLIDE 65

Convergence

[Figure: αI ∈ [0,3]. Axes: convergence level (.01 to 10^-6) and # iterations (1 to 10^5, log scale). Curves: Dual Decomposition, TRW-damped, TRW-S, TRW.]

slide-66
SLIDE 66

Convergence

[Figure: αI ∈ [0,9]. Axes: convergence level (.01 to 10^-6) and # iterations (1 to 10^5, log scale). Curves: Dual Decomposition, TRW-damped, TRW-S, TRW.]

slide-68
SLIDE 68

Conclusions

  • Dual Decomposition
      • Faster on “hard” problems or if strong convergence needed.
  • Caveats
      • Not really faster on “easy” problems.
      • Restriction on tree distribution P(T).
slide-70
SLIDE 70

Conclusions

Thank you

Graphical models toolbox: people.rit.edu/jcdicsa/ (Dual decomposition coming soon.)