SLIDE 1

The moment-LP and moment-SOS approaches

Jean B. Lasserre

LAAS-CNRS and Institute of Mathematics, Toulouse, France

NIPS-2014, Optimization workshop, Montreal

Jean B. Lasserre semidefinite characterization

SLIDE 6

Outline:

  • Why polynomial optimization?
  • LP- and SDP-certificates of positivity
  • The moment-LP and moment-SOS approaches
  • An alternative characterization of nonnegativity

SLIDE 10

Why Polynomial Optimization?

After all ... the polynomial optimization problem

    f* = min { f(x) : gj(x) ≥ 0, j = 1, . . . , m }

is just a particular case of Non Linear Programming (NLP)! True! ... if one is interested in a LOCAL optimum only!

SLIDE 12

When searching for a local minimum ...

Optimality conditions and descent algorithms use basic tools from REAL and CONVEX analysis and linear algebra. The focus is on how to improve f by looking at a NEIGHBORHOOD of a nominal point x ∈ K, i.e., LOCALLY AROUND x ∈ K; in general, no GLOBAL property of x ∈ K can be inferred. The fact that f and the gj are POLYNOMIALS does not help much!

SLIDE 15

BUT for GLOBAL Optimization ... the picture is different! Remember that for the GLOBAL minimum f*:

    f* = sup { λ : f(x) − λ ≥ 0  ∀x ∈ K }.

... and so to compute f* one needs TRACTABLE CERTIFICATES of POSITIVITY on K!
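As a sanity check, this identity can be seen numerically on a toy instance of our own (not from the slides): on a fine grid of K, the largest λ with f − λ ≥ 0 everywhere is just the grid minimum of f.

```python
import numpy as np

# Toy problem (illustrative): f(x) = x**2 - x on K = [0, 1],
# whose global minimum is f* = -1/4 at x = 1/2.
x = np.linspace(0.0, 1.0, 10001)   # dense grid over K
f = x**2 - x

# The largest lambda with f(x) - lambda >= 0 for all x in K is min_K f:
lam_star = f.min()
print(lam_star)   # close to -0.25
```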

SLIDE 18

REAL ALGEBRAIC GEOMETRY helps! Indeed, POWERFUL CERTIFICATES OF POSITIVITY EXIST! Moreover, and importantly, such certificates are amenable to PRACTICAL COMPUTATION!

(⋆ Stronger Positivstellensätze exist for analytic functions but are useless from a computational viewpoint.)

SLIDE 21

LP-based certificate

K = { x : gj(x) ≥ 0; (1 − gj(x)) ≥ 0, j = 1, . . . , m }

Theorem (Krivine-Vasilescu-Handelman's Positivstellensatz)
Let K be compact and let the family {gj, (1 − gj)} generate R[x]. If f > 0 on K then:

(⋆)    f(x) = Σ_{α,β} cαβ Π_{j=1}^{m} gj(x)^{αj} (1 − gj(x))^{βj},  ∀x ∈ Rn,

for some NONNEGATIVE scalars (cαβ).

Testing whether (⋆) holds for some NONNEGATIVE (cαβ) with |α + β| ≤ M is SOLVING an LP!

SLIDE 23

Indeed, with d = max{deg g1, . . . , deg gm} and Σi (αi + βi) ≤ M,

    Σ_{α,β} cαβ Π_{j=1}^{m} gj(X)^{αj} (1 − gj(X))^{βj}

is a polynomial of degree at most Md, which can be written

    Σ_{α,β} cαβ Π_{j=1}^{m} gj(X)^{αj} (1 − gj(X))^{βj} = Σγ X^γ θγ(c),   with each θγ(c) linear in c.

And so the identity

    f(X) = Σγ fγ X^γ = Σ_{α,β} cαβ Π_{j=1}^{m} gj(X)^{αj} (1 − gj(X))^{βj},  for all X ∈ Rn,

holds if and only if

    { fγ = θγ(c), ∀γ ∈ Nn with |γ| ≤ Md;  c ≥ 0 }

→ c ∈ a polyhedron!
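The "c ∈ polyhedron" observation can be sketched as a small feasibility LP. The instance below is our own illustration (not from the slides): K = [0, 1] with g1(x) = x (so 1 − g1 = 1 − x), f(x) = 0.05 + x − x², which is > 0 on K, and degree bound M = 2, using scipy's `linprog`.

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative toy instance: search nonnegative c_{ab} with
#   f = sum_{a+b <= M} c_{ab} x^a (1-x)^b,  M = 2  -- a linear program.
M = 2
pairs = [(a, b) for a in range(M + 1) for b in range(M + 1) if a + b <= M]

# Column for (a, b): coefficients of x^a (1-x)^b in the monomial basis 1, x, x^2.
cols = []
for a, b in pairs:
    p = np.array([1.0])                   # start from the constant polynomial 1
    for _ in range(a):
        p = np.polymul(p, [1.0, 0.0])     # multiply by x
    for _ in range(b):
        p = np.polymul(p, [-1.0, 1.0])    # multiply by (1 - x)
    p = p[::-1]                           # ascending-degree coefficients
    cols.append(np.pad(p, (0, M + 1 - len(p))))

A_eq = np.column_stack(cols)
b_eq = np.array([0.05, 1.0, -1.0])        # f = 0.05 + 1*x - 1*x^2

# Feasibility LP: any c >= 0 with A_eq c = b_eq is a positivity certificate.
res = linprog(np.zeros(len(pairs)), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
print(res.status)   # 0 means a certificate was found
```

Any nonnegative solution c returned by the LP is a Krivine-Vasilescu-Handelman certificate that f > 0 on K; infeasibility at a given M only means the degree bound must be raised.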

SLIDE 24

SOS-based certificate

K = { x : gj(x) ≥ 0, j = 1, . . . , m }

Theorem (Putinar's Positivstellensatz)
If K is compact (+ a technical Archimedean assumption) and f > 0 on K then:

(†)    f(x) = σ0(x) + Σ_{j=1}^{m} σj(x) gj(x),  ∀x ∈ Rn,

for some SOS polynomials (σj) ⊂ R[x].

Testing whether (†) holds for some SOS (σj) ⊂ R[x] with a degree bound is SOLVING an SDP!

SLIDE 26

Checking whether a given polynomial is SOS reduces to solving an SDP ... that one may solve efficiently, to arbitrary precision, in time polynomial in the input size!

Indeed, let vd(X) = (X^α) with X^α = X1^{α1} · · · Xn^{αn}, |α| := Σi αi ≤ d, be a basis of R[X]d (polynomials of degree at most d).

Example with n = 2 and d = 3:

    v3(X) = (1, X1, X2, X1², X1X2, X2², X1³, X1²X2, X1X2², X2³)

Let f ∈ R[X]2d be an SOS polynomial, that is,

    f(X) = Σ_{k=1}^{s} qk(X)²,  for some polynomials {qk}_{k=1}^{s} ⊂ R[X]d.

SLIDE 28

Denote also by qk = {qkα} the vector of coefficients of the polynomial qk in the basis vd(X), that is,

    qk(X) = ⟨qk, vd(X)⟩ = Σ_{|α|≤d} qkα X^α,

and define the real symmetric matrix Q := Σ_{k=1}^{s} qk qkᵀ ⪰ 0. Then

    ⟨vd(X), Q vd(X)⟩ = Σ_{k=1}^{s} ⟨qk, vd(X)⟩² = Σ_{k=1}^{s} qk(X)² = f(X).

Conversely, let Q ⪰ 0 be a real s(d) × s(d) positive semidefinite symmetric matrix (s(d) is the dimension of the vector space R[X]d). As Q ⪰ 0, write Q = Σ_{k=1}^{s} qk qkᵀ, so that

    f(X) := ⟨vd(X), Q vd(X)⟩ = Σ_{k=1}^{s} ⟨qk, vd(X)⟩² = Σ_{k=1}^{s} qk(X)²

is SOS.

SLIDE 30

Next, write the matrix vd(X) vd(X)ᵀ as:

    vd(X) vd(X)ᵀ = Σ_{|α|≤2d} Bα X^α,

for some real symmetric matrices (Bα). Checking whether

    f(X) = Σα fα X^α := ⟨vd(X), Q vd(X)⟩ = ⟨Q, vd(X) vd(X)ᵀ⟩ = Σ_{|α|≤2d} ⟨Q, Bα⟩ X^α

for some Q ⪰ 0 reduces to checking whether the LMI

    ⟨Bα, Q⟩ = fα, ∀α ∈ Nn with |α| ≤ 2d;   Q ⪰ 0

has a solution!
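The matrices Bα are easy to materialize in small dimensions. A minimal sketch for n = 1, d = 2 (our choice of toy dimensions): v2(t) = (1, t, t²), so (v2 v2ᵀ)ij = t^{i+j} and Bα has a 1 exactly where i + j = α. The second part evaluates ⟨Q, Bα⟩ on a sample Gram matrix.

```python
import numpy as np

# B_a for n = 1, d = 2: (B_a)_{ij} = 1 iff i + j = a.
d = 2
B = [np.array([[1.0 if i + j == a else 0.0 for j in range(d + 1)]
               for i in range(d + 1)]) for a in range(2 * d + 1)]

# Check the identity v(t) v(t)^T = sum_a B_a t^a at a sample point.
t = 2.0
v = np.array([1.0, t, t**2])
lhs = np.outer(v, v)
rhs = sum(B[a] * t**a for a in range(2 * d + 1))
print(np.allclose(lhs, rhs))   # True

# <Q, B_a> collects the coefficient of t^a in <v, Q v> for a Gram matrix Q:
Q = np.array([[6.0, 2.0, -4.0], [2.0, 17.0, -2.0], [-4.0, -2.0, 6.0]])
coeffs = [float((Q * B[a]).sum()) for a in range(2 * d + 1)]
print(coeffs)   # the coefficients 6, 4, 9, -4, 6
```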

SLIDE 31

Example

Let t → f(t) = 6 + 4t + 9t² − 4t³ + 6t⁴. Is f an SOS? Do we have

    f(t) = (1, t, t²) Q (1, t, t²)ᵀ,   Q = [ a b c ; b d e ; c e f ] ⪰ 0,

for some Q ⪰ 0? We must have:

    a = 6;  2b = 4;  d + 2c = 9;  2e = −4;  f = 6.

And so we must find a scalar c such that

    Q = [ 6  2  c ; 2  9 − 2c  −2 ; c  −2  6 ] ⪰ 0.
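One way to see that such a c exists is a brute-force eigenvalue scan (our sketch; in practice this one-parameter search is exactly what a small SDP solver would do directly):

```python
import numpy as np

# Q(c) from the example; it is PSD iff its smallest eigenvalue is >= 0.
def Q(c):
    return np.array([[6.0, 2.0, c], [2.0, 9.0 - 2.0 * c, -2.0], [c, -2.0, 6.0]])

feasible = [c for c in np.linspace(-10, 10, 2001)
            if np.linalg.eigvalsh(Q(c)).min() >= -1e-9]
print(min(feasible), max(feasible))   # an interval of valid c; c = -4 is inside
```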

SLIDE 33

With c = −4 we have

    Q = [ 6  2  −4 ; 2  17  −2 ; −4  −2  6 ]  ⪰ 0,

and Q = Σk qk qkᵀ with the rank-one decomposition

    Q = 2 (√2/2, 0, √2/2)ᵀ(√2/2, 0, √2/2)
      + 9 (2/3, −1/3, −2/3)ᵀ(2/3, −1/3, −2/3)
      + 18 (1/√18, 4/√18, −1/√18)ᵀ(1/√18, 4/√18, −1/√18)

SLIDE 34

and so f(t) = (1 + t2)2 + (2 − t − 2t2)2 + (1 + 4t − t2)2 which is an SOS polynomial.

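This decomposition is easy to double-check by expanding the three squares, e.g. with numpy's polynomial helpers (ascending-degree coefficient convention):

```python
import numpy as np

# (1 + t^2)^2 + (2 - t - 2t^2)^2 + (1 + 4t - t^2)^2, coefficients in 1, t, t^2.
sq = lambda p: np.polynomial.polynomial.polymul(p, p)
total = sq([1, 0, 1]) + sq([2, -1, -2]) + sq([1, 4, -1])
print(total)   # the coefficients 6, 4, 9, -4, 6 of f
```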

SLIDE 35

SUCH POSITIVITY CERTIFICATES allow one to infer GLOBAL properties of FEASIBILITY and OPTIMALITY ... the analogues of (well-known) earlier ones valid in the CONVEX CASE ONLY:

  • Farkas Lemma → Krivine-Stengle
  • KKT-optimality conditions → Schmüdgen-Putinar

SLIDE 39

In addition, polynomials NONNEGATIVE ON A SET K ⊂ Rn are ubiquitous. They also appear in many important applications (outside optimization), modeled as particular instances of the so-called Generalized Moment Problem, among which: Probability, Optimal and Robust Control, Game theory, Signal processing, multivariate integration, etc.

SLIDE 40

The Generalized Moment Problem (GMP):

    inf_{µi ∈ M(Ki)} { Σ_{i=1}^{s} ∫_{Ki} fi dµi :  Σ_{i=1}^{s} ∫_{Ki} hij dµi = bj,  j ∈ J }

with M(Ki) the space of Borel measures on Ki ⊂ R^{ni}, i = 1, . . . , s.

Global OPTIM →

    inf_{µ ∈ M(K)} { ∫_K f dµ : ∫_K 1 dµ = 1 }

is the simplest instance of the GMP!
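For intuition, this simplest instance can be discretized (our illustration, not a method from the slides): restricting µ to atoms on a grid of K turns it into an LP over probability weights, whose optimum is the grid minimum of f and whose optimal µ concentrates on a minimizer.

```python
import numpy as np
from scipy.optimize import linprog

# Toy objective on K = [0, 1]: f(x) = (x - 0.3)^2 + 0.1, with minimum 0.1.
x = np.linspace(0.0, 1.0, 501)
f = (x - 0.3)**2 + 0.1

# Variables: atom weights w_i >= 0 at the grid points x_i;
# the single moment constraint is the total mass  sum_i w_i = 1.
res = linprog(f, A_eq=np.ones((1, x.size)), b_eq=[1.0], bounds=(0, None))
print(res.fun)   # close to 0.1, the global minimum of f on K
```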

SLIDE 42

For instance, one may also want:

  • To approximate sets defined with QUANTIFIERS, e.g.,

    Rf := { x ∈ B : f(x, y) ≤ 0 for all y such that (x, y) ∈ K }
    Df := { x ∈ B : f(x, y) ≤ 0 for some y such that (x, y) ∈ K }

    where f ∈ R[x, y] and B is a simple set (box, ellipsoid).

  • To compute convex polynomial underestimators p ≤ f of a polynomial f on a box B ⊂ Rn. (Very useful in MINLP.)

SLIDE 44

The moment-LP and moment-SOS approaches consist of using a certain type of positivity certificate (Krivine-Vasilescu-Handelman's or Putinar's certificate) in potentially any application where such a characterization is needed. (Global optimization is only one example.)

In many situations this amounts to solving a HIERARCHY of LINEAR PROGRAMS or SEMIDEFINITE PROGRAMS ... of increasing size!

SLIDE 48

LP- and SDP-hierarchies for optimization

Replace f* = sup { λ : f(x) − λ ≥ 0 ∀x ∈ K } with:

The SDP-hierarchy indexed by d ∈ N:

    f*_d = sup { λ : f − λ = σ0 + Σ_{j=1}^{m} σj gj;  σ0, σj SOS,  deg(σj gj) ≤ 2d }

or the LP-hierarchy indexed by d ∈ N:

    θd = sup { λ : f − λ = Σ_{α,β} cαβ Π_{j=1}^{m} gj^{αj} (1 − gj)^{βj};  cαβ ≥ 0,  |α + β| ≤ 2d }
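A level of the LP-hierarchy is an ordinary linear program. The sketch below computes θ1 for a toy instance of our own (f(x) = x − x² on K = [0, 1] with g1(x) = x, so f* = 0), where the level-1 bound already equals f*:

```python
import numpy as np
from scipy.optimize import linprog

# theta_1 = sup { lambda : f - lambda = sum_{a+b<=2} c_ab x^a (1-x)^b, c_ab >= 0 }.
# Ascending coefficients (1, x, x^2) of x^a (1-x)^b for
# (a,b) = (0,0), (0,1), (0,2), (1,0), (1,1), (2,0):
basis = np.array([[1, 0, 0], [1, -1, 0], [1, -2, 1],
                  [0, 1, 0], [0, 1, -1], [0, 0, 1]], float).T

# Unknowns (lambda, c); matching coefficients of 1, x, x^2 gives
#   lambda * [1, 0, 0]^T + basis @ c = [0, 1, -1]^T   (the coefficients of f).
A_eq = np.hstack([np.array([[1.0], [0.0], [0.0]]), basis])
b_eq = np.array([0.0, 1.0, -1.0])
res = linprog(np.r_[-1.0, np.zeros(6)],            # maximize lambda
              A_eq=A_eq, b_eq=b_eq,
              bounds=[(None, None)] + [(0, None)] * 6)
print(-res.fun)   # theta_1; here the level-1 bound already equals f* = 0
```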

SLIDE 50

Theorem
Both sequences (f*_d) and (θd), d ∈ N, are MONOTONE NONDECREASING, and when K is compact (and satisfies a technical Archimedean assumption) then:

    f* = lim_{d→∞} f*_d = lim_{d→∞} θd.

SLIDE 51
  • What makes this approach exciting is that it is at the crossroads of several disciplines/applications:
    Commutative, Non-commutative, and Non-linear ALGEBRA;
    Real algebraic geometry and Functional Analysis;
    Optimization and Convex Analysis;
    Computational Complexity in Computer Science;
    which BENEFIT from interactions!

  • As mentioned ... potential applications are ENDLESS!

SLIDE 53
  • Has already proved useful and successful in applications of modest problem size, notably in optimization, control, robust control, optimal control, estimation, computer vision, etc. (If sparsity is present, problems of larger size can be addressed.)

  • Has initiated and stimulated new research issues:
    in Convex Algebraic Geometry (e.g., semidefinite representation of convex sets, algebraic degree of semidefinite programming and polynomial optimization);
    in Computational Algebra (e.g., for solving polynomial equations via SDP and border bases);
    in Computational Complexity, where LP- and SDP-HIERARCHIES have become an important tool to analyze Hardness of Approximation for 0/1 combinatorial problems (→ links with quantum computing).

SLIDE 55

Recall that both LP- and SDP- hierarchies are GENERAL PURPOSE METHODS .... NOT TAILORED to solving specific hard problems!!


SLIDE 57

A remarkable property of the SOS hierarchy: I

When solving the optimization problem

    P :  f* = min { f(x) : gj(x) ≥ 0, j = 1, . . . , m }

one does NOT distinguish between CONVEX, CONTINUOUS NON-CONVEX, and 0/1 (and DISCRETE) problems! A boolean variable xi is modelled via the equality constraint "xi² − xi = 0".

In Non Linear Programming (NLP), modeling a 0/1 variable with the polynomial equality constraint "xi² − xi = 0" and applying a standard descent algorithm would be considered "stupid"! Each class of problems has its own ad hoc tailored algorithms.

SLIDE 60

Even though the moment-SOS approach DOES NOT SPECIALIZE to each class of problems:

  • It recognizes the class of (easy) SOS-convex problems, as FINITE CONVERGENCE occurs at the FIRST relaxation in the hierarchy.
  • Finite convergence also occurs for general convex problems and generically for non-convex problems → (NOT true for the LP-hierarchy.)
  • The SOS-hierarchy dominates other lift-and-project hierarchies (i.e., provides the best lower bounds) for hard 0/1 combinatorial optimization problems! → The Theoretical Computer Science community speaks of a META-algorithm ...

SLIDE 66

A remarkable property: II

FINITE CONVERGENCE of the SOS-hierarchy is GENERIC! ... and provides a GLOBAL OPTIMALITY CERTIFICATE, the analogue for the NON-CONVEX CASE of the KKT-optimality conditions in the CONVEX CASE!

SLIDE 67

Theorem (Marshall, Nie)
Let x* ∈ K be a global minimizer of

    P :  f* = min { f(x) : gj(x) ≥ 0, j = 1, . . . , m },

and assume that:
(i) the gradients {∇gj(x*)} are linearly independent;
(ii) strict complementarity holds (λ*j gj(x*) = 0 for all j);
(iii) second-order sufficiency conditions hold at (x*, λ*) ∈ K × Rm+.

Then

    f(x) − f* = σ*0(x) + Σ_{j=1}^{m} σ*j(x) gj(x),  ∀x ∈ Rn,

for some SOS polynomials {σ*j}.

Moreover, the conditions (i)-(ii)-(iii) HOLD GENERICALLY!

SLIDE 69

Certificates of positivity already exist in convex optimization

    f* = f(x*) = min { f(x) : gj(x) ≥ 0, j = 1, . . . , m }

when f and the −gj are CONVEX. Indeed, if Slater's condition holds, there exist nonnegative KKT-multipliers λ* ∈ Rm+ such that:

    ∇f(x*) − Σ_{j=1}^{m} λ*j ∇gj(x*) = 0;   λ*j gj(x*) = 0, j = 1, . . . , m.

... and so ... the Lagrangian

    Lλ*(x) := f(x) − f* − Σ_{j=1}^{m} λ*j gj(x)

satisfies Lλ*(x*) = 0 and Lλ*(x) ≥ 0 for all x. Therefore:

    Lλ*(x) ≥ 0 ⇒ f(x) ≥ f*  ∀x ∈ K!
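This argument can be checked numerically on a small convex instance of our own choosing: minimize f(x) = x² subject to g1(x) = x − 1 ≥ 0, so x* = 1, and stationarity 2x* − λ* = 0 gives λ* = 2. The Lagrangian f − f* − λ*g1 collapses to (x − 1)², which is nonnegative on all of R:

```python
import numpy as np

# Convex toy: f(x) = x**2, g1(x) = x - 1 >= 0, x* = 1, f* = 1, lambda* = 2.
lam = 2.0
x = np.linspace(-10.0, 10.0, 100001)
L = x**2 - 1.0 - lam * (x - 1.0)   # L(x) = f(x) - f* - lambda* g1(x) = (x - 1)^2
print(L.min())                      # ~0, attained at x* = 1
```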

SLIDE 71

In summary:

KKT-OPTIMALITY (when f and the −gj are CONVEX):

    ∇f(x*) − Σ_{j=1}^{m} λ*j ∇gj(x*) = 0,

    f(x) − f* − Σ_{j=1}^{m} λ*j gj(x) ≥ 0  for all x ∈ Rn.

PUTINAR's CERTIFICATE (in the non-CONVEX case):

    ∇f(x*) − Σ_{j=1}^{m} σ*j(x*) ∇gj(x*) = 0,

    f(x) − f* − Σ_{j=1}^{m} σ*j(x) gj(x)  (= σ*0(x))  ≥ 0  for all x ∈ Rn,

for some SOS {σ*j}, with σ*j(x*) = λ*j.

SLIDE 73

So even though both LP- and SDP-relaxations were not designed for solving specific hard problems ... the SDP-relaxations behave reasonably well ("efficiently"?) as they provide the BEST LOWER BOUNDS in very different contexts (in contrast to LP-relaxations).

→ The Theoretical Computer Science (TCS) community even speaks of a META-ALGORITHM ...
→ ... considered as the most promising tool to prove/disprove the Unique Games Conjecture (UGC).

SLIDE 76

A Lagrangian interpretation of LP-relaxations

Consider the optimization problem P: f∗ = min { f(x) : x ∈ K }, where K is the compact basic semi-algebraic set K := {x ∈ R^n : g_j(x) ≥ 0, j = 1, . . . , m }. Assume that:

  • For every j = 1, . . . , m (possibly after scaling), g_j(x) ≤ 1 for all x ∈ K.

  • The family {g_j, 1 − g_j} generates R[x].

Jean B. Lasserre semidefinite characterization

slide-77
SLIDE 77

Lagrangian relaxation

The dual method of multipliers, or Lagrangian relaxation, consists of solving ρ := max_u { G(u) : u ≥ 0 }, with

u ↦ G(u) := min_x { f(x) − ∑_{j=1}^m u_j g_j(x) }.

Equivalently:

ρ = max_{u,λ} { λ : f(x) − ∑_{j=1}^m u_j g_j(x) ≥ λ, ∀x }.

In general there is a DUALITY GAP, i.e., ρ < f∗, except in the CONVEX case where f and −g_j are all convex (and under some qualification conditions).
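A minimal numerical sketch of the duality gap (our own toy instance, not from the talk): take f(x) = −x² on K = {x : g1(x) = x ≥ 0, g2(x) = 1 − x ≥ 0} = [0, 1], so f∗ = −1. All names below (f, g1, g2, G) are our choices.

```python
import numpy as np

# Toy illustration (assumption: this instance is ours, not the talk's):
# f(x) = -x^2 over K = [0, 1], so f* = -1. For every u >= 0 the
# Lagrangian f(x) - u1*g1(x) - u2*g2(x) is unbounded below in x
# (the concave -x^2 term dominates), so G(u) = -inf for all u and
# rho = sup_u G(u) = -inf < f*: an (infinite) duality gap.

f = lambda x: -x**2
g1 = lambda x: x
g2 = lambda x: 1.0 - x
fstar = -1.0  # attained at x = 1

def G(u, half_width=50.0, npts=100001):
    """Grid approximation of G(u) = min_x f(x) - u1*g1(x) - u2*g2(x)."""
    x = np.linspace(-half_width, half_width, npts)
    return (f(x) - u[0] * g1(x) - u[1] * g2(x)).min()

# G stays far below f* for every multiplier choice we try:
gaps = [G(u) for u in [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]]
```

Widening `half_width` only drives the approximate G(u) further down, consistent with G(u) = −∞ on this nonconvex instance.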

Jean B. Lasserre semidefinite characterization

slide-78
SLIDE 78

With d ∈ N fixed, consider the new optimization problem P_d:

f∗_d = min_x { f(x) : ∏_{j=1}^m g_j(x)^{α_j} (1 − g_j(x))^{β_j} ≥ 0, ∀ α, β with |α + β| = ∑_j (α_j + β_j) ≤ 2d }.

  • Of course P and P_d are equivalent, and so f∗_d = f∗.

... because P_d is just P with additional redundant constraints!

Jean B. Lasserre semidefinite characterization

slide-79
SLIDE 79

The Lagrangian relaxation of P_d consists of solving:

ρ_d = max_{u≥0, λ} { λ : f(x) − ∑_{α,β: |α+β| ≤ 2d} u_{αβ} ∏_{j=1}^m g_j(x)^{α_j} (1 − g_j(x))^{β_j} ≥ λ, ∀x }.

Theorem. ρ_d ≤ f∗ for all d ∈ N, and if K is compact and the family of polynomials {g_j, 1 − g_j} generates R[x], then lim_{d→∞} ρ_d = f∗.

Jean B. Lasserre semidefinite characterization


slide-81
SLIDE 81

The previous theorem provides a rationale for the well-known fact that adding redundant constraints to P helps when doing relaxations! On the other hand ... we don’t know HOW TO COMPUTE ρ_d!

Jean B. Lasserre semidefinite characterization


slide-83
SLIDE 83

The LP-hierarchy may be viewed as the BRUTE FORCE SIMPLIFICATION of

ρ_d = max_{u≥0, λ} { λ : f(x) − ∑_{α,β: |α+β| ≤ 2d} u_{αβ} ∏_{j=1}^m g_j(x)^{α_j} (1 − g_j(x))^{β_j} ≥ λ, ∀x }

to ...

θ_d = max_{u≥0, λ} { λ : f(x) − ∑_{α,β: |α+β| ≤ 2d} u_{αβ} ∏_{j=1}^m g_j(x)^{α_j} (1 − g_j(x))^{β_j} − λ = 0, ∀x }.

Jean B. Lasserre semidefinite characterization


slide-85
SLIDE 85

and indeed ... with |α + β| ≤ 2d, the set of (u, λ) such that u ≥ 0 and

f(x) − ∑_{α,β} u_{αβ} ∏_{j=1}^m g_j(x)^{α_j} (1 − g_j(x))^{β_j} − λ = 0, ∀x,

is a CONVEX POLYTOPE! And so computing θ_d amounts to solving a Linear Program! And one has f∗ ≥ ρ_d ≥ θ_d for all d.
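A minimal sketch of this LP on a toy instance of our own choosing: K = [0, 1] described by g(x) = x (so 1 − g = 1 − x), and f(x) = (x − 1/2)², whose minimum f∗ = 0 is attained at the interior point x∗ = 1/2. Matching the polynomial identity f − λ = ∑ c_ab x^a (1 − x)^b coefficient by coefficient, with c_ab ≥ 0, gives a plain linear program in (λ, c).

```python
import numpy as np
from numpy.polynomial import polynomial as P
from scipy.optimize import linprog

def theta(d, fcoef):
    """LP bound theta_d for K = [0,1], g(x) = x (our toy setup)."""
    deg = 2 * d
    fc = np.zeros(deg + 1)
    fc[:len(fcoef)] = fcoef
    cols = []                        # coefficients of each x^a (1-x)^b
    for a in range(deg + 1):
        for b in range(deg + 1 - a):
            p = P.polymul(P.polypow([0.0, 1.0], a), P.polypow([1.0, -1.0], b))
            col = np.zeros(deg + 1)
            col[:len(p)] = p
            cols.append(col)
    # variables [lam, c_ab...]; identity lam*1 + sum c_ab p_ab = f,
    # matched monomial coefficient by monomial coefficient.
    A_eq = np.column_stack([np.eye(deg + 1)[:, 0]] + cols)
    obj = np.zeros(A_eq.shape[1]); obj[0] = -1.0       # maximize lam
    var_bounds = [(None, None)] + [(0.0, None)] * len(cols)
    res = linprog(obj, A_eq=A_eq, b_eq=fc, bounds=var_bounds)
    return -res.fun

f = [0.25, -1.0, 1.0]                # (x - 1/2)^2 in the monomial basis
bounds_d = [theta(d, f) for d in (1, 2, 3)]
```

Here θ_1 = −1/4 even though f∗ = 0: the θ_d increase with d yet stay strictly below f∗, consistent with the obstruction for interior minimizers discussed on the following slides.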

Jean B. Lasserre semidefinite characterization


slide-87
SLIDE 87

However, as already mentioned: for most easy convex problems (except LP) finite convergence is impossible! Other obstructions to exactness occur. Typically, if K is the polytope {x : g_j(x) ≥ 0, j = 1, . . . , m} and f∗ = f(x∗) with g_j(x∗) = 0 for j ∈ J(x∗), then finite convergence is impossible as soon as there exists x ≠ x∗ with J(x) = J(x∗) (x not necessarily in K).

Jean B. Lasserre semidefinite characterization


slide-89
SLIDE 89

A less brutal simplification

With k ≥ 1 FIXED, consider the LESS BRUTAL SIMPLIFICATION of

ρ_d = max_{u≥0, λ} { λ : f(x) − ∑_{α,β: |α+β| ≤ 2d} u_{αβ} ∏_{j=1}^m g_j(x)^{α_j} (1 − g_j(x))^{β_j} ≥ λ, ∀x }

to ...

ρ^k_d = max_{u≥0, λ} { λ : f(x) − ∑_{α,β: |α+β| ≤ 2d} u_{αβ} ∏_{j=1}^m g_j(x)^{α_j} (1 − g_j(x))^{β_j} − λ = σ(x), ∀x; σ SOS of degree at most 2k }.

Jean B. Lasserre semidefinite characterization


slide-91
SLIDE 91

Why such a simplification?

With k fixed, ρ^k_d ↑ f∗ as d → ∞.

Computing ρ^k_d is now solving an SDP (and not an LP any more!). However, the size of the LMI constraint of this SDP is (n+k choose n) (fixed) and does not depend on d!

For convex problems where f and −g_j are SOS-CONVEX polynomials, the first relaxation in the hierarchy is exact, that is, ρ^k_1 = f∗ (never the case for the LP-hierarchy).

  • A polynomial f is SOS-CONVEX if its Hessian ∇²f(x) factors as L(x) L(x)^T for some polynomial matrix L(x). For instance, separable polynomials f(x) = ∑_{i=1}^n f_i(x_i) with convex f_i’s are SOS-CONVEX.
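A quick numerical check of the SOS-convexity factorization for a separable example of our own choosing, f(x) = x1⁴ + x2⁴: here ∇²f(x) = diag(12 x1², 12 x2²) = L(x) L(x)^T with the polynomial matrix L(x) = diag(√12 x1, √12 x2).

```python
import numpy as np

# Our toy instance: f(x) = x1^4 + x2^4 (separable, each f_i convex),
# whose Hessian factors as L(x) L(x)^T with L(x) polynomial.

def hessian(x):
    return np.diag([12.0 * x[0]**2, 12.0 * x[1]**2])

def L(x):
    return np.diag([np.sqrt(12.0) * x[0], np.sqrt(12.0) * x[1]])

rng = np.random.default_rng(1)
pts = rng.standard_normal((100, 2))
# The identity Hessian = L L^T holds at every sample point:
max_err = max(np.abs(hessian(x) - L(x) @ L(x).T).max() for x in pts)
```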

Jean B. Lasserre semidefinite characterization


slide-96
SLIDE 96

An alternative moment-approach

Jean B. Lasserre semidefinite characterization

slide-97
SLIDE 97

So far we have considered LP- and SDP-moment approaches based on CERTIFICATES of POSITIVITY on K. That is, one approximates FROM INSIDE the convex cone C_d(K) of polynomials of degree at most d nonnegative on K: for instance, if K = {x : g_j(x) ≥ 0, j = 1, . . . , m}, by the convex cones:

C^k_d(K) = { σ_0 + ∑_{j=1}^m σ_j g_j : σ_0, σ_j SOS, deg(σ_j g_j) ≤ 2k } ∩ R[x]_d

Γ^k_d(K) = { ∑_{(α,β) ∈ N^{2m}_{2k}} c_{αβ} ∏_{j=1}^m g_j^{α_j} (1 − g_j)^{β_j} : c_{αβ} ≥ 0 } ∩ R[x]_d

Jean B. Lasserre semidefinite characterization


slide-99
SLIDE 99

An alternative is to try to approximate C_d(K) FROM OUTSIDE! Given a sequence y = (y_α), α ∈ N^n:

  • Let L_y : R[x] → R be the Riesz linear functional: g (= ∑_β g_β x^β) ↦ L_y(g) := ∑_β g_β y_β.

  • The localizing matrix M_k(g y) with respect to y and g ∈ R[x] is the real symmetric matrix with rows and columns indexed by α ∈ N^n_k and with entries M_k(g y)[α, β] = L_y(x^{α+β} g), α, β ∈ N^n_k.

⋆ If y comes from a measure µ then L_y(x^{α+β} g) = ∫ x^{α+β} g(x) dµ.
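A univariate sketch of these objects (names and the instance are our choices): take µ = Lebesgue measure on K = [0, 1], so y_k = ∫₀¹ x^k dx = 1/(k + 1), and build the moment (Hankel) matrix M_k(y) and the localizing matrix M_k(g y) for g(x) = x(1 − x).

```python
import numpy as np

# mu = Lebesgue measure on [0, 1]: y_k = 1/(k+1) (our toy instance).
y = lambda k: 1.0 / (k + 1)

def Ly(gcoef, shift=0):
    """Riesz functional L_y(x^shift * g), g in monomial coefficients."""
    return sum(c * y(shift + i) for i, c in enumerate(gcoef))

def localizing_matrix(k, gcoef):
    """M_k(g y)[i, j] = L_y(x^{i+j} g), i, j = 0..k."""
    return np.array([[Ly(gcoef, i + j) for j in range(k + 1)]
                     for i in range(k + 1)])

Mk_y = localizing_matrix(3, [1.0])     # moment (Hankel) matrix: g = 1
g = [0.0, 1.0, -1.0]                   # g(x) = x(1 - x) >= 0 on [0, 1]
Mk_gy = localizing_matrix(3, g)

# Both are PSD since g >= 0 on supp(mu): p^T M_k(g y) p = int p^2 g dmu.
eig_y = np.linalg.eigvalsh(Mk_y)
eig_gy = np.linalg.eigvalsh(Mk_gy)
```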

Jean B. Lasserre semidefinite characterization


slide-102
SLIDE 102

Theorem. Let K ⊂ R^n be compact and let y = (y_α), α ∈ N^n, be the moments of a Borel measure whose support is K. Then a polynomial g is nonnegative on K if and only if M_k(g y) ⪰ 0 for k = 0, 1, . . . So if y is known, checking whether M_k(g y) ⪰ 0 is just computing the smallest eigenvalue of the matrix M_k(g y)! The set ∆_k ⊂ R[x]_d defined by ∆_k := {g ∈ R[x]_d : M_k(g y) ⪰ 0}, k = 0, 1, . . . , is a convex cone described by a LINEAR MATRIX INEQUALITY (LMI) on its coefficients (g_α), α ∈ N^n_d!
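A univariate sketch of this eigenvalue test (our own instance): µ = Lebesgue measure on K = [−1, 1], so y_k = 0 for odd k and 2/(k + 1) for even k; a sign-changing polynomial is detected by a negative smallest eigenvalue of some M_k(g y).

```python
import numpy as np

# Moments of Lebesgue measure on [-1, 1] (our toy instance):
def y(k):
    return 0.0 if k % 2 else 2.0 / (k + 1)

def min_eig_localizing(k, gcoef):
    """Smallest eigenvalue of M_k(g y)[i, j] = L_y(x^{i+j} g)."""
    M = [[sum(c * y(i + j + s) for s, c in enumerate(gcoef))
          for j in range(k + 1)] for i in range(k + 1)]
    return np.linalg.eigvalsh(np.array(M)).min()

# g1(x) = 1 - x^2 is nonnegative on K: every tested M_k(g1 y) is PSD.
psd_flags = [min_eig_localizing(k, [1.0, 0.0, -1.0]) > -1e-10
             for k in range(4)]

# g2(x) = x changes sign on K: already M_1(g2 y) = [[0, 2/3], [2/3, 0]]
# has a negative eigenvalue, certifying that g2 is NOT nonnegative on K.
neg_eig = min_eig_localizing(1, [0.0, 1.0])
```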

Jean B. Lasserre semidefinite characterization


slide-105
SLIDE 105

Of course C_d(K) ⊂ ∆_k ⊂ ∆_{k−1} for all k = 1, 2, . . . , and C_d(K) = ∩_{k=0}^∞ ∆_k, i.e.,

the convex cones ∆_k form a nested sequence of OUTER APPROXIMATIONS of C_d(K). Examples of sets for which the moments of a measure µ can be computed easily include, in the compact case: the hyper-rectangle [a, b]^n, the ellipsoid {x : (x − m)^T Q (x − m) ≤ 1}, the simplex {x ≥ 0 : ∑_i a_i x_i ≤ b}, and the hypercube {−1, 1}^n, with µ uniformly distributed; and in the non-compact case: R^n with dµ = exp(−‖x‖²) dx, and R^n_+ with dµ = exp(−∑_i |x_i|) dx.
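For one of these easy sets, the moments are closed-form. A sketch (our own helper names) for the box [a, b]^n with µ uniform, where moments factor across coordinates as y_α = ∏_i (b^{α_i+1} − a^{α_i+1}) / ((α_i + 1)(b − a)):

```python
import numpy as np

# Uniform measure on the box [a, b]^n (our toy instance: [0, 1]^2).
a, b = 0.0, 1.0

def y(alpha):
    """y_alpha for the uniform measure on [a, b]^n, n = len(alpha)."""
    return float(np.prod([(b**(k + 1) - a**(k + 1)) / ((k + 1) * (b - a))
                          for k in alpha]))

# Moment matrix M_1(y) over the monomials {1, x1, x2}: PSD by construction.
monos = [(0, 0), (1, 0), (0, 1)]
M1 = np.array([[y(tuple(p + q for p, q in zip(al, be))) for be in monos]
               for al in monos])
min_eig = np.linalg.eigvalsh(M1).min()
```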

Jean B. Lasserre semidefinite characterization


slide-107
SLIDE 107

Application to optimization

Let f ∗ = minx{f(x) : x ∈ K} and let y = (yα), α ∈ Nn, be the moments of a measure µ whose support is K. For each d ∈ N consider the optimization problem: ρd = max

λ {λ : Md(f y) λ Md(y) }.

with the single unknown λ. Computing ρd is solving a generalized eigenvalue problem associated with Md(f y) and Md(y). ρd ≥ f ∗ for all d and ρd → f ∗ as d → ∞
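A univariate sketch of these upper bounds (our own instance): f(x) = (x − 1/2)² on K = [0, 1] with µ = Lebesgue measure, so f∗ = 0. Since M_d(y) is positive definite, ρ_d is the smallest generalized eigenvalue of the pencil (M_d(f y), M_d(y)), computed here via a Cholesky reduction with plain numpy.

```python
import numpy as np

y = lambda k: 1.0 / (k + 1)                 # int_0^1 x^k dx

def loc(d, gcoef):
    """Localizing matrix M_d(g y) for Lebesgue moments on [0, 1]."""
    return np.array([[sum(c * y(i + j + s) for s, c in enumerate(gcoef))
                      for j in range(d + 1)] for i in range(d + 1)])

f = [0.25, -1.0, 1.0]                       # (x - 1/2)^2, f* = 0

def rho(d):
    """Smallest generalized eigenvalue of (M_d(f y), M_d(y))."""
    Mf, My = loc(d, f), loc(d, [1.0])
    L = np.linalg.cholesky(My)              # My = L L^T, positive definite
    A = np.linalg.solve(L, np.linalg.solve(L, Mf).T)   # L^{-1} Mf L^{-T}
    return np.linalg.eigvalsh(A).min()

# A nonincreasing sequence of upper bounds on f* = 0:
upper = [rho(d) for d in range(5)]
```

For d = 0 this is just ρ_0 = ∫ f dµ = 1/12; larger d allows more test polynomials and tightens the bound toward f∗.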

Jean B. Lasserre semidefinite characterization


slide-109
SLIDE 109

In other words: the sequence (ρ_d), d ∈ N, provides a converging sequence of upper bounds on f∗! Example: the MAX-CUT problem: f(x) = x^T Q x and K = {−1, 1}^n. Take for µ the measure uniformly distributed on K (each of the 2^n points with weight 1/2^n), and so with moments:

y_α = ∫_{{−1,1}^n} x^α dµ = 0 if α_i is odd for some i, and 1 otherwise.

Then build up the localizing matrix M_d(f y) and solve ρ_d = max_λ { λ : M_d(f y) ⪰ λ M_d(y) }. In fact, this is the same as computing the smallest eigenvalue of M_d(f y), keeping only the rows and columns of M_d(f y) indexed by square-free monomials x^α.
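A small sketch of this computation (our own instance: a random symmetric Q on n = 4 variables stands in for a graph-derived matrix). Under x_i² = 1, the moment matrix over square-free monomials is the identity, so ρ_d is just the smallest eigenvalue of M_d(f y) restricted to that basis; entries reduce combinatorially via symmetric differences of index sets.

```python
import numpy as np
from itertools import combinations, product

n, d = 4, 2
rng = np.random.default_rng(0)
Q = rng.standard_normal((n, n)); Q = (Q + Q.T) / 2   # our random instance

# Square-free monomials x_S of degree <= d, as index subsets S.
subsets = [frozenset(c) for k in range(d + 1)
           for c in combinations(range(n), k)]

def entry(S, T):
    """M_d(f y)[S, T] = E[f(x) x_{S} x_{T}] under x_i^2 = 1."""
    D = S ^ T                                # symmetric difference
    val = np.trace(Q) if S == T else 0.0     # diagonal (i = j) terms
    if len(D) == 2:
        i, j = sorted(D)
        val += 2.0 * Q[i, j]                 # off-diagonal terms, both orders
    return val

M = np.array([[entry(S, T) for T in subsets] for S in subsets])
rho = np.linalg.eigvalsh(M).min()            # upper bound on f*

# Brute-force f* over the 2^n cube vertices for comparison:
fstar = min(np.array(x) @ Q @ np.array(x)
            for x in product([-1.0, 1.0], repeat=n))
```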

Jean B. Lasserre semidefinite characterization


slide-112
SLIDE 112

THANK YOU!!

Jean B. Lasserre semidefinite characterization