


Frank-Wolfe Algorithms for Saddle Point Problems

Gauthier Gidel¹, Tony Jebara², Simon Lacoste-Julien³

¹ INRIA Paris, Sierra Team
² Department of CS, Columbia University
³ Department of CS & OR (DIRO), Université de Montréal

10th December 2016



Overview

◮ The Frank-Wolfe algorithm (FW) has gained popularity in the last couple of years.
◮ Main advantage: FW only needs a linear minimization oracle (LMO).
◮ This talk: extend FW and its properties to the saddle point problem.
◮ The extension is straightforward, but the analysis is non-trivial.


Saddle point and link with variational inequalities

Let L : X × Y → R, where X and Y are convex and compact. The saddle point problem is to solve

    min_{x ∈ X} max_{y ∈ Y} L(x, y).

A solution (x*, y*) is called a saddle point.

◮ Necessary stationarity conditions:

    ⟨x − x*, ∇_x L(x*, y*)⟩ ≥ 0   for all x ∈ X,
    ⟨y − y*, −∇_y L(x*, y*)⟩ ≥ 0   for all y ∈ Y.

◮ Variational inequality: with z* := (x*, y*) and g(z) := (∇_x L(z), −∇_y L(z)),

    ⟨z − z*, g(z*)⟩ ≥ 0   for all z ∈ X × Y.

◮ Sufficient condition: a stationary point is a global solution if L is convex-concave, i.e. for all (x, y) ∈ X × Y, x′ ↦ L(x′, y) is convex and y′ ↦ L(x, y′) is concave.
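Not from the slides: a minimal numerical check of the variational inequality on a bilinear toy game (matching pennies, an assumed example), where L(x, y) = x⊤My over two probability simplices and g(z) = (My, −M⊤x). Since ⟨z′ − z, g(z)⟩ is linear in z′, it suffices to minimize over the simplex vertices.

```python
import numpy as np

# Hypothetical example (not from the slides): matching pennies,
# L(x, y) = x^T M y over two probability simplices.
M = np.array([[1., -1.],
              [-1., 1.]])

def vi_residual(x, y):
    """min over vertices z' of <z' - z, g(z)>, with g(z) = (M y, -M^T x).

    Non-negative iff the variational inequality holds at z = (x, y)."""
    gx, gy = M @ y, -M.T @ x
    # Linear and separable, so the min over product vertices splits.
    return (gx.min() - gx @ x) + (gy.min() - gy @ y)

uniform = np.array([0.5, 0.5])
e1 = np.array([1., 0.])
print(vi_residual(uniform, uniform))  # 0.0  -> uniform play is a saddle point
print(vi_residual(e1, e1))            # -2.0 -> (e1, e1) is not a solution
```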

Motivations: games and robust learning

◮ Zero-sum games with two players:

    min_{x ∈ Δ(I)} max_{y ∈ Δ(J)} x⊤My

◮ Generative Adversarial Networks (GANs).
◮ Robust learning:¹ we want to learn

    min_{θ ∈ Θ} (1/n) Σ_{i=1}^{n} ℓ(f_θ(x_i), y_i) + λΩ(θ)

  under uncertainty regarding the data:

    min_{θ ∈ Θ} max_{ω ∈ Δ_n} Σ_{i=1}^{n} ω_i ℓ(f_θ(x_i), y_i) + λΩ(θ).

  Minimizing the worst case gives robustness (see the sketch below).

¹ J. Wen, C. Yu, and R. Greiner. "Robust Learning under Uncertain Test Distributions: Relating Covariate Shift to Model Misspecification." In: ICML. 2014.
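A minimal sketch (my illustration, with made-up losses): the inner maximization is linear in ω over the simplex Δ_n, so its optimum sits at a vertex, and the robust objective reduces to the single worst per-example loss.

```python
import numpy as np

# Assumed toy setup: per-example losses for some fixed model theta.
losses = np.array([0.3, 1.7, 0.9, 0.2])

# Inner problem: max_{w in simplex} w . losses -- linear in w, so the
# maximum is attained at a vertex (all mass on the worst example).
w_star = np.zeros_like(losses)
w_star[np.argmax(losses)] = 1.0

assert np.isclose(w_star @ losses, losses.max())
print(w_star @ losses)  # 1.7: the robust objective is the worst-case loss
```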

Problem with hard projections

The structured SVM:

    min_{ω ∈ R^d} λΩ(ω) + (1/n) Σ_{i=1}^{n} max_{y ∈ Y_i} (L_i(y) − ⟨ω, φ_i(y)⟩),

where the inner maximum is the structured hinge loss.

Regularization: from penalized to constrained form,

    min_{Ω(ω) ≤ β} max_{α ∈ Δ(|Y|)} b⊤α − ω⊤Mα.

Hard to project when:
◮ Ω is a structured sparsity norm (e.g. a group lasso norm).
◮ The output space Y is structured, hence of exponential size.

A minimal illustration of LMO vs. projection follows.
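A sketch of why the LMO can be cheaper than projection, using the probability simplex as a stand-in for the structured case (for a structured Y, the LMO corresponds to a MAP/decoding call). The LMO is a single argmin scan, while Euclidean projection needs a sort-based routine; the function names are mine.

```python
import numpy as np

def lmo_simplex(r):
    """argmin_{s in simplex} <s, r>: all mass on the smallest coordinate, O(n)."""
    s = np.zeros_like(r)
    s[np.argmin(r)] = 1.0
    return s

def project_simplex(v):
    """Euclidean projection onto the simplex (sort-based, O(n log n))."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

r = np.array([0.4, -1.2, 0.3])
print(lmo_simplex(r))      # [0., 1., 0.]: a sparse vertex
print(project_simplex(r))  # [0.55, 0., 0.45]: a dense projected point
```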

Standard approaches in the literature

The simplest algorithm for saddle point problems is projected gradient:

    x^(t+1) = P_X(x^(t) − η ∇_x L(x^(t), y^(t)))
    y^(t+1) = P_Y(y^(t) + η ∇_y L(x^(t), y^(t)))

For non-smooth optimization, the averaged iterates converge:

    (1/T) Σ_{t=1}^{T} (x^(t), y^(t)) → (x*, y*)   as T → ∞.

A faster algorithm: the projected extra-gradient algorithm. The LMO can also be used to compute approximate projections.²

² N. He and Z. Harchaoui. "Semi-proximal Mirror-Prox for Nonsmooth Composite Minimization." In: NIPS. 2015.
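A runnable sketch of the projected gradient updates above on an assumed scalar bilinear toy problem; it illustrates why the averaging matters: the last iterate keeps circling the saddle point while the running average approaches it.

```python
# Toy bilinear saddle problem (assumed example): L(x, y) = (x - 0.5)(y - 0.5)
# on X = Y = [0, 1], whose unique saddle point is (0.5, 0.5).
x, y = 1.0, 0.0
eta, T = 0.05, 20000
avg_x = avg_y = 0.0
for t in range(T):
    gx, gy = y - 0.5, x - 0.5                # grad_x L and grad_y L
    x = min(max(x - eta * gx, 0.0), 1.0)     # projected descent step in x
    y = min(max(y + eta * gy, 0.0), 1.0)     # projected ascent step in y
    avg_x += x
    avg_y += y

print(x, y)                  # last iterate: still circling the saddle point
print(avg_x / T, avg_y / T)  # averaged iterate: close to (0.5, 0.5)
```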

The FW algorithm

Algorithm: Frank-Wolfe algorithm
 1: Let x^(0) ∈ X
 2: for t = 0 ... T do
 3:   Compute r^(t) := ∇f(x^(t))
 4:   Compute s^(t) ∈ argmin_{s ∈ X} ⟨s, r^(t)⟩
 5:   Compute g_t := ⟨x^(t) − s^(t), r^(t)⟩
 6:   if g_t ≤ ε then return x^(t)
 7:   Let γ = 2/(2 + t) (or do line-search)
 8:   Update x^(t+1) := (1 − γ)x^(t) + γs^(t)
 9: end for

[Figure: the objective f over the domain M, its linearization f(α) + ⟨s − α, ∇f(α)⟩ at the current iterate α, and the linear minimizer s at a vertex of M.]
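A minimal implementation of the algorithm above, assuming a quadratic objective over the ℓ1 ball (whose LMO returns a signed basis vector); the setup is mine, not the talk's.

```python
import numpy as np

def lmo_l1(r, radius=1.0):
    """argmin_{||s||_1 <= radius} <s, r>: a signed vertex of the l1 ball."""
    i = np.argmax(np.abs(r))
    s = np.zeros_like(r)
    s[i] = -radius * np.sign(r[i])
    return s

def frank_wolfe(grad, lmo, x0, T=1000, eps=1e-6):
    x = x0.copy()
    for t in range(T):
        r = grad(x)                      # r^(t) = grad f(x^(t))
        s = lmo(r)                       # linear minimization oracle
        g = (x - s) @ r                  # duality-gap certificate g_t
        if g <= eps:
            return x, g
        gamma = 2.0 / (2.0 + t)          # universal step size
        x = (1 - gamma) * x + gamma * s  # convex update keeps x feasible
    return x, g

# Example: f(x) = 0.5 * ||x - b||^2 with b outside the unit l1 ball.
b = np.array([0.8, -0.6, 0.4])
x, gap = frank_wolfe(lambda x: x - b, lmo_l1, np.zeros(3))
print(x, gap)
```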

SP-FW

Algorithm: Saddle point FW algorithm (SP-FW)
 1: Let z^(0) = (x^(0), y^(0)) ∈ X × Y
 2: for t = 0 ... T do
 3:   Compute r^(t) := (∇_x L(x^(t), y^(t)), −∇_y L(x^(t), y^(t)))
 4:   Compute s^(t) ∈ argmin_{z ∈ X × Y} ⟨z, r^(t)⟩
 5:   Compute g_t := ⟨z^(t) − s^(t), r^(t)⟩
 6:   if g_t ≤ ε then return z^(t)
 7:   Let γ = min(1, ν g_t / C)  or  γ = 2/(2 + t)
 8:   Update z^(t+1) := (1 − γ)z^(t) + γs^(t)
 9: end for

◮ One can define an FW extension with away steps.
◮ With γ_t = 1/(1 + t), z^(t) = (1/t) Σ_{i=0}^{t−1} s^(i).
◮ (γ_t = 1/(1 + t)) + a bilinear objective ↔ the fictitious play algorithm (see the sketch below).
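A runnable sketch of SP-FW on a bilinear toy game (my instance, not the paper's experiments). A bilinear objective lies outside the strong convexity assumptions used for the linear rate below, so this is exactly the fictitious play regime from the last bullet: with γ_t = 1/(1+t), the iterate is the running average of best responses.

```python
import numpy as np

def lmo_simplex(r):
    """argmin_{s in simplex} <s, r>: all mass on the smallest coordinate."""
    s = np.zeros_like(r)
    s[np.argmin(r)] = 1.0
    return s

def sp_fw(M, T=20000):
    """SP-FW on L(x, y) = x^T M y over two simplices, gamma_t = 1/(1+t).

    The iterate is the running average of LMO outputs, i.e. fictitious
    play, which converges for two-player zero-sum games."""
    n, m = M.shape
    x, y = np.eye(n)[0], np.eye(m)[0]            # start at a vertex
    for t in range(T):
        rx, ry = M @ y, -(M.T @ x)               # r^(t) = (grad_x L, -grad_y L)
        sx, sy = lmo_simplex(rx), lmo_simplex(ry)  # joint LMO splits over X, Y
        gap = (x - sx) @ rx + (y - sy) @ ry      # certificate g_t, for free
        gamma = 1.0 / (1.0 + t)
        x = (1 - gamma) * x + gamma * sx
        y = (1 - gamma) * y + gamma * sy
    return x, y, gap

M = np.array([[1., -1.], [-1., 1.]])             # matching pennies (assumed example)
x, y, gap = sp_fw(M)
print(x, y, gap)  # approaches uniform play; the gap shrinks slowly (no linear rate)
```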

Advantages of SP-FW

Same main property as FW: only an LMO (linear minimization oracle) is required.

Same other advantages as FW:
◮ A convergence certificate g_t for free.
◮ Affine invariance of the algorithm (checked numerically below).
◮ Sparsity of the iterates.
◮ Universal step size γ_t := 2/(2 + t); adaptive step size γ_t := ν g_t / C.

Main difference in the saddle point setting:
◮ No line-search.

When the constraint set is a "complicated" structured polytope, projections can be hard whereas the LMO may remain tractable.
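A quick numerical check of the affine invariance bullet (my construction, not from the slides): running FW on f over the ℓ1 ball and on the reparametrized f′(x′) = f(A⁻¹x′) over A·(ℓ1 ball) yields iterates that correspond exactly under A, since the universal step size involves no norms.

```python
import numpy as np

def lmo_l1(r):
    """argmin over the unit l1 ball of <s, r>."""
    s = np.zeros_like(r)
    i = np.argmax(np.abs(r))
    s[i] = -np.sign(r[i])
    return s

def fw(grad, lmo, x0, T=200):
    x = x0.copy()
    for t in range(T):
        s = lmo(grad(x))
        gamma = 2.0 / (2.0 + t)        # universal, affine-invariant step size
        x = (1 - gamma) * x + gamma * s
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 3.0 * np.eye(3)  # invertible reparametrization
Ainv = np.linalg.inv(A)
b = np.array([0.8, -0.6, 0.4])

grad  = lambda x: x - b                        # f(x) = 0.5||x - b||^2, domain: l1 ball
gradp = lambda xp: Ainv.T @ grad(Ainv @ xp)    # f'(x') = f(A^{-1} x'), domain: A(l1 ball)
lmop  = lambda r: A @ lmo_l1(A.T @ r)          # LMO of the transformed polytope

x  = fw(grad,  lmo_l1, np.zeros(3))
xp = fw(gradp, lmop,   np.zeros(3))
print(np.allclose(A @ x, xp))                  # True: FW commutes with affine maps
```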

Theoretical contribution

SP extension of FW with away steps (SP-AFW). Convergence: a linear rate with the adaptive step size, a sublinear rate with the universal step size.

◮ Similar hypotheses as AFW for linear convergence:
  1. Strong convexity and smoothness of the function.
  2. X and Y are polytopes.
◮ An additional assumption on the bilinear coupling

    L(x, y) = f(x) + x⊤My − g(y):

  M must be small relative to the strong convexity constant.
◮ The proof uses recent advances on AFW.
◮ Partially answers a 30-year-old conjecture.³

³ J. Hammond. "Solving asymmetric variational inequality problems and systems of equations with generalized nonlinear programming algorithms." PhD thesis. MIT, 1984.

Difficulties for saddle point

The usual descent lemma reads

    h_{t+1} ≤ h_t − γ_t g_t + (γ_t²/2) L ‖d^(t)‖²,   with g_t ≥ 0

(h_t is the suboptimality and d^(t) = s^(t) − x^(t) the FW direction), so with γ_t small enough the sequence decreases.

For the saddle point problem, the Lipschitz gradient property only gives

    L_{t+1} − L* ≤ L_t − L* − γ_t (g_t^(x) − g_t^(y)) + (γ_t²/2) L ‖d^(t)‖²,

where g_t^(x) − g_t^(y) has arbitrary sign.

◮ One cannot control the oscillation of the sequence (see the demo below).
◮ Other quantities must be introduced to establish convergence.
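A small numerical illustration of the arbitrary-sign term (my toy, not from the slides): on matching pennies, L_t − L* = x^(t)⊤My^(t) flips sign along the SP-FW iterates, so it cannot serve as a decreasing potential the way h_t does in standard FW.

```python
import numpy as np

# Matching pennies: the value of the game is L* = 0.
M = np.array([[1., -1.], [-1., 1.]])
x, y = np.array([1., 0.]), np.array([0., 1.])
signs = []
for t in range(20):
    rx, ry = M @ y, -(M.T @ x)
    sx = np.eye(2)[np.argmin(rx)]          # LMO over the x-simplex
    sy = np.eye(2)[np.argmin(ry)]          # LMO over the y-simplex
    gamma = 1.0 / (1.0 + t)
    x = (1 - gamma) * x + gamma * sx
    y = (1 - gamma) * y + gamma * sy
    signs.append(float(np.sign(x @ M @ y)))  # sign of L_t - L*
print(signs)  # mixes positive and negative values: no monotone decrease
```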

Toy experiments

Figure: SP-AFW on a toy example (d = 30), with the theoretical step size γ_t = ν g_t / C and with a heuristic step size γ_t = g_t / (C + 2M²D²/µ), where C = 2LD².

Conclusion

◮ SP-FW is one of the first saddle point solvers that works only with an LMO.
◮ The resurgence of FW has led to new structured problems.
◮ The same hope for SP-FW as for FW: a call for applications!
◮ Many theoretical questions remain open.
◮ With a bilinear objective, the algorithm is highly related to the fictitious play algorithm.
◮ A rich interplay, tapping into the game theory literature.

Thank you!

Slides available on www.di.ens.fr/~gidel.