

slide-1
SLIDE 1

Stage

◮ Quantile regression by random projections
◮ Forecasting energy prices
◮ Involves statistics, probability theory, LP
◮ Implementation: easy
◮ Taste for theory
◮ Supported by grants from Siebel Energy Institute and RTE
◮ Could lead to CIFRE PhD with RTE
◮ Hurry if interested

1 / 16

slide-2
SLIDE 2

Mixed-Integer Nonlinear Programming

Leo Liberti, CNRS LIX, École Polytechnique
liberti@lix.polytechnique.fr
PMA@MPRO

2 / 16

slide-3
SLIDE 3

Mathematical Programming Formulations

MPRO — PMA – p. 18

slide-4
SLIDE 4

Mathematical Programming

MP: a formal language for expressing optimization problems P

Parameters p = problem input; p is also called an instance of P

Decision variables x: encode the problem output

Objective function: min f(p, x)

Constraints: ∀i ≤ m   gi(p, x) ≤ 0

f, g: explicit mathematical expressions involving the symbols p, x

If an instance p is given (i.e. an assignment of numbers to the symbols in p is known), write f(x), gi(x)

This excludes black-box optimization
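To make the separation between formulation and instance concrete, here is a minimal sketch in Python; the toy objective, constraints and helper `is_feasible` are mine, not from the slides:

```python
# A formulation P: objective f(p, x) and constraints g_i(p, x) <= 0,
# kept separate from the instance p (the numeric problem input).

def f(p, x):
    # toy objective: squared distance from the target point (p[0], p[1])
    return (x[0] - p[0]) ** 2 + (x[1] - p[1]) ** 2

g = [
    lambda p, x: x[0] + x[1] - p[2],  # g_1: x1 + x2 <= budget p[2]
    lambda p, x: -x[0],               # g_2: x1 >= 0
    lambda p, x: -x[1],               # g_3: x2 >= 0
]

def is_feasible(p, x, tol=1e-9):
    """x is feasible iff g_i(p, x) <= 0 for all i <= m."""
    return all(gi(p, x) <= tol for gi in g)

# Fixing an instance p turns f(p, .) into f(.) and g_i(p, .) into g_i(.)
p = (1.0, 1.0, 1.0)   # target (1, 1), budget 1
x = (0.5, 0.5)        # a candidate solution
print(is_feasible(p, x), f(p, x))   # feasible, objective value 0.5
```

Because f and g are explicit expressions of p and x (no black box), a solver can inspect and manipulate them symbolically once p is fixed.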

MPRO — PMA – p. 19

slide-5
SLIDE 5

Main optimization problem classes

[Diagram: map of problem classes along linear/nonlinear and continuous/integer axes: LP, BLP, BQP, MILP, SOCP, SDP, cNLP, NLP, cMINLP, MBQP, MINLP, blackbox; example applications: pooling, gen. pooling, graph drawing.]

MPRO — PMA – p. 20

slide-6
SLIDE 6

Notation

P: MP formulation with decision variables x = (x1, . . . , xn)

Solution: an assignment of values to the decision variables, i.e. a vector v ∈ R^n

F(P) = set of feasible solutions: x ∈ R^n such that ∀i ≤ m (gi(x) ≤ 0)

G(P) = set of globally optimal solutions: x ∈ R^n s.t. x ∈ F(P) and ∀y ∈ F(P) (f(x) ≤ f(y))

MPRO — PMA – p. 21

slide-7
SLIDE 7

Citations

Williams, Model Building in Mathematical Programming, 2002

Liberti, Cafieri, Tarissan, Reformulations in Mathematical Programming: a computational approach, in Abraham et al. (eds.), Foundations of Computational Intelligence, 2009

MPRO — PMA – p. 22

slide-8
SLIDE 8

Haverly’s pooling problem

MPRO — PMA – p. 3

slide-9
SLIDE 9

Description

Given an oil routing network with pools and blenders, unit prices, demands and quality requirements:

[Network: inputs x11 (3% sulphur, $6) and x21 (1% sulphur, $16) feed the pool; input x12 (2% sulphur, $10) goes directly to the blenders; the pool sends y11 to Blend 1 and y12 to Blend 2, the direct input sends y21 to Blend 1 and y22 to Blend 2; Blend 1: ≤ 2.5% sulphur, $9, demand ≤ 100; Blend 2: ≤ 1.5% sulphur, $15, demand ≤ 200.]

Find the input quantities minimizing the costs and satisfying the constraints: mass balance, sulphur balance, quantity and quality demands

MPRO — PMA – p. 4

slide-10
SLIDE 10

Variables and constraints

Variables: input quantities x, routed quantities y, percentage p of sulphur in the pool

Every variable must be ≥ 0 (physical quantities)

Bilinear terms arise to express sulphur quantities in terms of p, y

Sulphur balance constraint: 3x11 + x21 = p(y11 + y12)

Quality demands:
py11 + 2y21 ≤ 2.5(y11 + y21)
py12 + 2y22 ≤ 1.5(y12 + y22)

Continuous bilinear formulation ⇒ nonconvex NLP

MPRO — PMA – p. 5

slide-11
SLIDE 11

Formulation

[Network as on the previous slide: x11 (3% S, $6), x21 (1% S, $16) → pool; x12 (2% S, $10) direct; Blend 1 (≤ 2.5% S, $9, ≤ 100), Blend 2 (≤ 1.5% S, $15, ≤ 200).]

min_{x,y,p}   6x11 + 16x21 + 10x12 − 9(y11 + y21) − 15(y12 + y22)      [cost]
s.t.   x11 + x21 − y11 − y12 = 0                                        [mass balance]
       x12 − y21 − y22 = 0                                              [mass balance]
       y11 + y21 ≤ 100                                                  [demand]
       y12 + y22 ≤ 200                                                  [demand]
       3x11 + x21 − p(y11 + y12) = 0                                    [sulphur balance]
       py11 + 2y21 ≤ 2.5(y11 + y21)                                     [sulphur limit]
       py12 + 2y22 ≤ 1.5(y12 + y22)                                     [sulphur limit]

MPRO — PMA – p. 6
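As a sanity check, the optimal network reported a few slides later (x21 = x12 = 100, y12 = y22 = 100, everything else zero, whence p = 1) can be verified against this formulation in a few lines of Python; the helper names below are mine, not the slides':

```python
# Haverly's pooling problem: objective and constraints as plain functions.
def cost(x11, x21, x12, y11, y12, y21, y22, p):
    return 6*x11 + 16*x21 + 10*x12 - 9*(y11 + y21) - 15*(y12 + y22)

def feasible(x11, x21, x12, y11, y12, y21, y22, p, tol=1e-9):
    eqs = [
        x11 + x21 - y11 - y12,            # pool mass balance
        x12 - y21 - y22,                  # direct-input mass balance
        3*x11 + x21 - p*(y11 + y12),      # sulphur balance
    ]
    ineqs = [
        y11 + y21 - 100,                  # Blend 1 demand
        y12 + y22 - 200,                  # Blend 2 demand
        p*y11 + 2*y21 - 2.5*(y11 + y21),  # Blend 1 sulphur limit
        p*y12 + 2*y22 - 1.5*(y12 + y22),  # Blend 2 sulphur limit
    ]
    return all(abs(e) <= tol for e in eqs) and all(v <= tol for v in ineqs)

sol = dict(x11=0, x21=100, x12=100, y11=0, y12=100, y21=0, y22=100, p=1)
print(feasible(**sol), cost(**sol))   # feasible, cost -400 (i.e. profit 400)
```

Note that the Blend 2 sulphur limit is tight at this point (300 ≤ 300), which is typical of an active quality constraint at the optimum.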

slide-12
SLIDE 12

Network design

Decide whether to install each pipe or not (0/1 decision): associate a binary variable zij with each pipe

min_{x,y,p,z}   6x11 + 16x21 + 10x12 + Σij θij zij − 9(y11 + y21) − 15(y12 + y22)   [cost]
s.t.   x11 + x21 − y11 − y12 = 0                 [mass balance]
       x12 − y21 − y22 = 0                       [mass balance]
       y11 + y21 ≤ 100                           [demand]
       y12 + y22 ≤ 200                           [demand]
       ∀i, j ≤ 2   yij ≤ 200 zij                 [pipe activation: if zij = 0, no flow]
       3x11 + x21 − p(y11 + y12) = 0             [sulphur balance]
       py11 + 2y21 ≤ 2.5(y11 + y21)              [sulphur limit]
       py12 + 2y22 ≤ 1.5(y12 + y22)              [sulphur limit]

MPRO — PMA – p. 7

slide-13
SLIDE 13

The optimal network

Optimal network: x11 = 0, x21 = 100, x12 = 100; active flows y12 = 100, y22 = 100 (y11 = y21 = 0)

z11 = 0, z21 = 0, z12 = 1, z22 = 1

MPRO — PMA – p. 8

slide-14
SLIDE 14

Pooling problem network

4 / 16

slide-15
SLIDE 15

Formulation: sets and parameters

◮ Sets

◮ I: index set of input nodes
◮ P: index set of pool nodes
◮ J: index set of output nodes
◮ K: index set of flow attributes
◮ ∀p ∈ P: N−(p) = index set of inputs → p
◮ ∀p ∈ P: N+(p) = index set of p → outputs

◮ Parameters

◮ ∀i ∈ I: Si = supply at node i
◮ ∀j ∈ J: Dj = max. demand at node j
◮ ∀i ∈ I, k ∈ K: Aik = qty of attribute k in input flow i
◮ ∀j ∈ J, k ∈ K: Ljk = min qty of attr. k at output j
◮ ∀j ∈ J, k ∈ K: Ujk = max qty of attr. k at output j
◮ ∀i ∈ I: cI_i = unit acquisition cost at input i
◮ ∀j ∈ J: cJ_j = unit selling price at output j

5 / 16

slide-16
SLIDE 16

Formulation: decision variables & objective

◮ Decision variables

◮ ∀i ∈ I, p ∈ P: xip = flow in pipe (i, p)
◮ ∀p ∈ P, j ∈ J: ypj = flow in pipe (p, j)
◮ ∀p ∈ P, k ∈ K: qpk = qty of attr. k in pool p
◮ ∀j ∈ J, k ∈ K: Qjk = qty of attr. k in output j

◮ Objective function

min F(x, y) = Σ_{p∈P} Σ_{i∈N−(p)} cI_i xip − Σ_{p∈P} Σ_{j∈N+(p)} cJ_j ypj

6 / 16

slide-17
SLIDE 17

Formulation: constraints

◮ Supply: ∀i ∈ I   Σ_{p∈P : i∈N−(p)} xip ≤ Si

◮ Max demand: ∀j ∈ J   Σ_{p∈P : j∈N+(p)} ypj ≤ Dj

◮ Mass balance for flow across pools: ∀p ∈ P   Σ_{i∈N−(p)} xip = Σ_{j∈N+(p)} ypj

◮ Attr. qty balance input/pools: ∀p ∈ P, k ∈ K   Σ_{i∈N−(p)} Aik xip = Σ_{i∈N−(p)} qpk xip

◮ Attr. qty balance pools/output: ∀j ∈ J, k ∈ K   Σ_{p∈P : j∈N+(p)} qpk ypj = Σ_{p∈P : j∈N+(p)} Qjk ypj

7 / 16

slide-18
SLIDE 18

Generalized pooling problem

◮ Decision variables

◮ ∀i ∈ I, p ∈ P: zin_ip = 1 iff pipe (i, p) installed, 0 othw
◮ ∀p ∈ P, j ∈ J: zout_pj = 1 iff pipe (p, j) installed, 0 othw

◮ Objective function

min F(x, y) + Σ_{p∈P} Σ_{i∈N−(p)} zin_ip + Σ_{p∈P} Σ_{j∈N+(p)} zout_pj

◮ Constraints

◮ Pipe activation:
∀p ∈ P, i ∈ N−(p)   xip ≤ Si zin_ip
∀p ∈ P, j ∈ N+(p)   ypj ≤ Dj zout_pj

8 / 16

slide-19
SLIDE 19

Classification in systematics

◮ Attribute constraints involve qpk xip, qpk ypj, Qjk ypj
◮ Bilinear terms in equations: nonconvex F(P)
◮ ⇒ (nonconvex) NLP
◮ Generalized pooling problem: (nonconvex) MINLP

9 / 16

slide-20
SLIDE 20

Citations

  • 1. C. Haverly, Studies of the behaviour of recursion for the pooling problem, ACM SIGMAP Bulletin, 1978
  • 2. Adhya, Tawarmalani, Sahinidis, A Lagrangian approach to the pooling problem, Ind. Eng. Chem., 1999
  • 3. Audet et al., Pooling Problem: Alternate Formulations and Solution Methods, Manag. Sci., 2004
  • 4. Liberti, Pantelides, An exact reformulation algorithm for large nonconvex NLPs involving bilinear terms, JOGO, 2006
  • 5. Misener, Floudas, Advances for the pooling problem: modeling, global optimization, and computational studies, Appl. Comput. Math., 2009
  • 6. D'Ambrosio, Linderoth, Luedtke, Valid inequalities for the pooling problem with binary variables, LNCS, 2011

MPRO — PMA – p. 9

slide-21
SLIDE 21

Drawing graphs

MPRO — PMA – p. 10

slide-22
SLIDE 22

At a glance

[Five drawings of the same 9-vertex graph, laid out differently.]

Which graph has most symmetries?

MPRO — PMA – p. 11

slide-23
SLIDE 23

Euclidean graphs

Graph G = (V, E), edge weight function d : E → R+

E.g. V = {1, 2, 3}, E = {{1, 2}, {1, 3}, {2, 3}}, d12 = d13 = d23 = 1

Find positions xv = (xv1, xv2) of each v ∈ V in the plane s.t.:
∀{u, v} ∈ E   ‖xu − xv‖2 = duv

Generalization to R^K for K ∈ N: xv = (xv1, . . . , xvK)

[Figure: the triangle on vertices 1, 2, 3 drawn in the plane.]

MPRO — PMA – p. 12

slide-24
SLIDE 24

Application to proteomics

An artificial protein test set: lavor-11 7

[Figure: instance graph on numbered atoms; edges labelled with edge-index / distance pairs.]

MPRO — PMA – p. 13

slide-25
SLIDE 25

Embedding protein data in R3

1aqr: four non-isometric embeddings

MPRO — PMA – p. 14

slide-26
SLIDE 26

Sensor networks in 2D and 3D

MPRO — PMA – p. 15

slide-27
SLIDE 27

Formulation

min_{x,t}   Σ_{{u,v}∈E} t²uv

s.t.   ∀{u, v} ∈ E   Σ_{k≤K} (xuk − xvk)² = d²uv + tuv

MPRO — PMA – p. 16

slide-28
SLIDE 28

Citations

  • 1. Lavor, Liberti, Maculan, Mucherino, Recent advances on the discretizable molecular distance geometry problem, Eur. J. of Op. Res., invited survey
  • 2. Liberti, Lavor, Mucherino, Maculan, Molecular distance geometry methods: from continuous to discrete, Int. Trans. in Op. Res., 18:33-51, 2010
  • 3. Liberti, Lavor, Maculan, Computational experience with the molecular distance geometry problem, in J. Pintér (ed.), Global Optimization: Scientific and Engineering Case Studies, Springer, Berlin, 2006

MPRO — PMA – p. 17

slide-29
SLIDE 29

Reformulations

MPRO — PMA – p. 23

slide-30
SLIDE 30

Exact reformulations

The formulation Q is an exact reformulation of P if ∃ an efficiently computable surjective map φ : F(Q) → F(P) s.t. φ|G(Q) is onto G(P)

Informally: any optimum of Q can be mapped easily to an optimum of P, and for any optimum of P there is a corresponding optimum of Q

[Diagram: φ maps F(Q) onto F(P); φ|G maps G(Q) onto G(P).]

Construct Q so that it is easier to solve than P

MPRO — PMA – p. 24

slide-31
SLIDE 31

xy when x is binary

If ∃ a bilinear term xy where x ∈ {0, 1}, y ∈ [0, 1], we can construct an exact reformulation:

Replace each term xy by an added variable w

Adjoin Fortet's reformulation constraints:
w ≥ 0
w ≥ x + y − 1
w ≤ x
w ≤ y

Get a MILP reformulation

Solve the reformulation using CPLEX: more effective than solving the MINLP directly
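The exactness of Fortet's constraints is easy to check exhaustively: for binary x they pin w down to exactly xy. A small sketch (the helper name is mine):

```python
# For binary x, Fortet's constraints w >= 0, w >= x + y - 1, w <= x, w <= y
# leave a single feasible value: w = x*y.
def fortet_w_interval(x, y):
    lo = max(0.0, x + y - 1)   # the two lower constraints
    hi = min(x, y)             # the two upper constraints
    return lo, hi

for x in (0, 1):
    for i in range(11):
        y = i / 10
        lo, hi = fortet_w_interval(x, y)
        assert abs(lo - x * y) < 1e-12 and abs(hi - x * y) < 1e-12
print("Fortet constraints force w = x*y whenever x is binary")
```

For x strictly between 0 and 1 the interval [lo, hi] is not a single point, which is exactly why this reformulation is only exact when x is binary (for continuous x it is McCormick's relaxation of the next slides).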

MPRO — PMA – p. 25

slide-32
SLIDE 32

“Proof”

MPRO — PMA – p. 26

slide-33
SLIDE 33

Relaxations

The formulation Q (in variables y) is a relaxation of P (in variables x) if min fQ(y) ≤ min fP(x)   (∗)

Relaxations are used to compute worst-case bounds on the optimum value of the original formulation

Construct Q so that it is easy to solve; proving (∗) may not be easy in general

The usual strategy:
Make sure y ⊇ x and F(Q) ⊇ F(P)
Make sure ∀x ∈ F(P) (fQ(x) ≤ fP(x))
Then it follows that Q is a relaxation of P

Example: convex relaxation
F(Q) a convex set containing F(P)
fQ a convex underestimator of fP
Then Q is a cNLP and can be solved efficiently

MPRO — PMA – p. 27

slide-34
SLIDE 34

xy when x, y continuous

Given a bilinear term xy where x ∈ [xL, xU], y ∈ [yL, yU], we can construct a relaxation:

Replace each term xy by an added variable w

Adjoin the following constraints:
w ≥ xL y + yL x − xL yL
w ≥ xU y + yU x − xU yU
w ≤ xU y + yL x − xU yL
w ≤ xL y + yU x − xL yU

These are called McCormick's envelopes

Get an LP relaxation (solvable in polynomial time)
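That these four inequalities really sandwich xy over the box can be checked numerically; the sketch below (helper name and box bounds are mine) samples a grid and verifies under ≤ xy ≤ over, with equality at the corners:

```python
# McCormick envelopes for w = x*y over the box [xL, xU] x [yL, yU].
def mccormick(x, y, xL, xU, yL, yU):
    under = max(xL*y + yL*x - xL*yL, xU*y + yU*x - xU*yU)  # convex underestimator
    over  = min(xU*y + yL*x - xU*yL, xL*y + yU*x - xL*yU)  # concave overestimator
    return under, over

xL, xU, yL, yU = -1.0, 2.0, -1.0, 1.0
for i in range(31):
    for j in range(21):
        x = xL + (xU - xL) * i / 30
        y = yL + (yU - yL) * j / 20
        under, over = mccormick(x, y, xL, xU, yL, yU)
        assert under - 1e-9 <= x*y <= over + 1e-9   # valid envelopes everywhere

# The envelopes are tight at the corners of the box:
u, o = mccormick(xL, yL, xL, xU, yL, yU)
print(u, o, xL*yL)
```

Validity follows algebraically: e.g. xy − (xL y + yL x − xL yL) = (x − xL)(y − yL) ≥ 0 on the box, and similarly for the other three inequalities.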

MPRO — PMA – p. 28

slide-35
SLIDE 35

Software

ROSE (https://projects.coin-or.org/ROSE)

MPRO — PMA – p. 29

slide-36
SLIDE 36

Citations

McCormick, Computability of global solutions to factorable nonconvex programs: Part I — Convex underestimating problems, Math. Prog. 1976

Liberti, Reformulations in Mathematical Programming: definitions and systematics, RAIRO-RO 2009

MPRO — PMA – p. 30

slide-37
SLIDE 37

Global Optimization methods

MPRO — PMA – p. 31

slide-38
SLIDE 38

Deterministic / Stochastic

[Figure: objective function f(x) with convex relaxations g(x), h(x) over two subregions; a = solution of the convex relaxation in the whole space, b = local solution of the objective function in the whole space; subregion 2 discarded as h(c) > f(e).]

Exact = Deterministic

"Exact" in continuous space means ε-approximate: find a solution within a pre-determined ε distance from the optimum in obj. fun. value

For some problems, finite convergence to the optimum (ε = 0)

Heuristic = Stochastic

Find a solution with probability 1 in infinite time

MPRO — PMA – p. 32

slide-39
SLIDE 39

Multistart

The easiest GO method

1: f∗ ← ∞
2: x∗ ← (∞, . . . , ∞)
3: while ¬ termination do
4:    x′ ← (random(), . . . , random())
5:    x ← localSolve(P, x′)
6:    if fP(x) < f∗ then
7:       f∗ ← fP(x)
8:       x∗ ← x
9:    end if
10: end while

Termination condition: e.g. repeat k times
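A runnable sketch of the loop above in Python, using the six-hump camelback function of the next slides as fP; the slides use IPOPT/SNOPT as localSolve, so the plain Armijo gradient descent below is my own stand-in, as is the choice of seed and starting points:

```python
import random

def f(x, y):   # six-hump camelback
    return 4*x**2 - 2.1*x**4 + x**6/3 + x*y - 4*y**2 + 4*y**4

def grad(x, y):
    return (8*x - 8.4*x**3 + 2*x**5 + y, x - 8*y + 16*y**3)

def local_solve(x, y, iters=500):
    """Gradient descent with Armijo backtracking: a crude localSolve()."""
    fx = f(x, y)
    for _ in range(iters):
        gx, gy = grad(x, y)
        if gx*gx + gy*gy < 1e-12:
            break
        t = 1.0
        while t > 1e-12 and f(x - t*gx, y - t*gy) > fx - 1e-4*t*(gx*gx + gy*gy):
            t *= 0.5
        x, y, fx = x - t*gx, y - t*gy, f(x - t*gx, y - t*gy)
    return x, y, fx

rng = random.Random(0)
best = (None, None, float("inf"))                         # f* = infinity
starts = [(rng.uniform(-2, 2), rng.uniform(-1, 1)) for _ in range(20)]
starts.append((0.1, -0.7))                                # one start near a global basin
for x0, y0 in starts:                                     # the multistart loop
    x, y, fx = local_solve(x0, y0)
    if fx < best[2]:
        best = (x, y, fx)
print(best[2])   # at or near the global optimum, about -1.0316
```

With few restarts the loop may only visit poor local minima; adding restarts increases the chance of hitting a global basin, which is exactly what the k = 5, 10, 20, 50 slides illustrate.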

MPRO — PMA – p. 33

slide-40
SLIDE 40

Six-hump camelback function

f(x, y) = 4x² − 2.1x⁴ + x⁶/3 + xy − 4y² + 4y⁴

Global optimum (COUENNE)

MPRO — PMA – p. 34

slide-41
SLIDE 41

Six-hump camelback function

f(x, y) = 4x² − 2.1x⁴ + x⁶/3 + xy − 4y² + 4y⁴

[Plot: local optima found over the domain]

Multistart with IPOPT, k = 5

MPRO — PMA – p. 34

slide-42
SLIDE 42

Six-hump camelback function

f(x, y) = 4x² − 2.1x⁴ + x⁶/3 + xy − 4y² + 4y⁴

[Plot: local optima found over the domain]

Multistart with IPOPT, k = 10

MPRO — PMA – p. 34

slide-43
SLIDE 43

Six-hump camelback function

f(x, y) = 4x² − 2.1x⁴ + x⁶/3 + xy − 4y² + 4y⁴

[Plot: local optima found over the domain]

Multistart with IPOPT, k = 20

MPRO — PMA – p. 34

slide-44
SLIDE 44

Six-hump camelback function

f(x, y) = 4x² − 2.1x⁴ + x⁶/3 + xy − 4y² + 4y⁴

[Plot: local optima found over the domain]

Multistart with IPOPT, k = 50

MPRO — PMA – p. 34

slide-45
SLIDE 45

Six-hump camelback function

f(x, y) = 4x² − 2.1x⁴ + x⁶/3 + xy − 4y² + 4y⁴

[Plot: local optima found over the domain]

Multistart with SNOPT, k = 20

MPRO — PMA – p. 34

slide-46
SLIDE 46

Citations

Schoen, Two-Phase Methods for Global Optimization, in Pardalos et al. (eds.), Handbook of Global Optimization 2, 2002

Liberti, Kucherenko, Comparison of deterministic and stochastic approaches to global optimization, ITOR 2005

MPRO — PMA – p. 35

slide-47
SLIDE 47

Section 2 Variable Neighbourhood Search

10 / 16

slide-48
SLIDE 48

Variable Neighbourhood Search

◮ Applicable to discrete and continuous problems
◮ Uses any local search as a black box
◮ In its basic form, easy to implement
◮ Few configurable parameters
◮ Structure of the problem dealt with by the local search
◮ Few lines of code around the LS black box

11 / 16

slide-49
SLIDE 49

Variable Neighbourhood Search

12 / 16

slide-50
SLIDE 50

Variable Neighbourhood Search

1: Input: max no. kmax of neighbourhoods
2: loop
3:    k ← 1, sample rnd. pt. x̃, x∗ ← LocSearch(x̃)
4:    while k ≤ kmax do
5:       Nk(x∗): neighbourhood of x∗ s.t. Nk(x∗) ⊃ Nk−1(x∗)
6:       sample rnd. pt. x̃ from Nk(x∗)
7:       x′ ← LocSearch(x̃)
8:       if x′ better than x∗ then
9:          x∗ ← x′, k ← 0
10:      end if
11:      k ← k + 1
12:      if termination condition then exit
13:   end while
14: end loop

13 / 16

slide-51
SLIDE 51

Neighbourhood structure (continuous vars)

14 / 16

slide-52
SLIDE 52

Neighbourhood structure (binary vars)

◮ y ∈ {0, 1}^p binary vars
◮ current incumbent y∗ ∈ {0, 1}^p
◮ "neighbourhood" centred at y∗ of radius k ∈ N:

Σ_{i≤p : y∗i=0} yi + Σ_{i≤p : y∗i=1} (1 − yi) ≤ k

◮ Local Branching constraint: allows at most k flips of the p binary vars

15 / 16
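The left-hand side of the local branching constraint is just the Hamming distance from the incumbent, which a short sketch makes explicit (function names are mine):

```python
# Local-branching distance: counts variables flipped w.r.t. the incumbent y_star.
def lb_distance(y_star, y):
    # sum y_i where y*_i = 0, plus (1 - y_i) where y*_i = 1
    return sum(yi if ysi == 0 else 1 - yi for ysi, yi in zip(y_star, y))

def in_neighbourhood(y_star, y, k):
    return lb_distance(y_star, y) <= k

y_star = [1, 0, 1, 1, 0]
y      = [1, 1, 1, 0, 0]        # flips at positions 2 and 4
print(lb_distance(y_star, y))   # 2
```

Since the left-hand side is linear in y, the constraint can be added directly to a MILP, which is what makes local branching practical inside branch-and-bound solvers.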

slide-53
SLIDE 53

Citations

  • 1. L. Liberti, M. Dražić, Variable Neighbourhood Search for the Global Optimization of Constrained NLPs, Proc. of the Global Optimization Workshop, Almería, Spain, 18-22 September 2005
  • 2. L. Liberti, N. Mladenović, G. Nannicini, A recipe for finding good solutions to MINLPs, Mathematical Programming Computation, 3:349-390, 2011

16 / 16

slide-54
SLIDE 54

spatial Branch-and-Bound (sBB)

MPRO — PMA – p. 36

slide-55
SLIDE 55

Generalities

Tree-like search: explores the search space exhaustively but implicitly

Builds a sequence of decreasing upper bounds and increasing lower bounds to the global optimum

Exponential worst case

Only general-purpose "exact" algorithm for MINLP; since continuous vars are involved, one should say "ε-approximate"

Like BB for MILP, but may branch on continuous vars: done whenever one is involved in a nonconvex term

MPRO — PMA – p. 37

slide-56
SLIDE 56

Example

[Plot: nonconvex objective over [xL, xU]]

Original problem P

MPRO — PMA – p. 38

slide-57
SLIDE 57

Example

[Plot: objective over [xL, xU] with starting point x′]

Starting point x′

MPRO — PMA – p. 38

slide-58
SLIDE 58

Example

[Plot: localSolve descends from x′ to x∗]

Local (upper bounding) solution x∗

MPRO — PMA – p. 38

slide-59
SLIDE 59

Example

[Plot: convex relaxation under the objective, with relaxed minimum at x̄]

Convex relaxation (lower) bound f̄ with |f∗ − f̄| > ε

MPRO — PMA – p. 38

slide-60
SLIDE 60

Example

[Plot: domain split at x̄]

Branch at x = x̄ into C1, C2

MPRO — PMA – p. 38

slide-61
SLIDE 61

Example

[Plot: convex relaxation restricted to C1]

Convex relaxation on C1: lower bounding solution x̄

MPRO — PMA – p. 38

slide-62
SLIDE 62

Example

[Plot: local descent from x̄]

localSolve from x̄: new upper bounding solution x∗

MPRO — PMA – p. 38

slide-63
SLIDE 63

Example

[Plot: C1 split into C3, C4]

|f∗ − f̄| > ε: branch at x = x̄

MPRO — PMA – p. 38

slide-64
SLIDE 64

Example

[Plot: relaxation on C3 tight at the incumbent]

Repeat on C3: get x̄ = x∗ and |f∗ − f̄| < ε, no more branching

MPRO — PMA – p. 38

slide-65
SLIDE 65

Example

[Plot: relaxation on C2 above the incumbent]

Repeat on C2: f̄ > f∗ (can't improve x∗ in C2)

MPRO — PMA – p. 38

slide-66
SLIDE 66

Example

[Plot: relaxation on C4 above the incumbent]

Repeat on C4: f̄ > f∗ (can't improve x∗ in C4)

MPRO — PMA – p. 38

slide-67
SLIDE 67

Example

[Plot: final incumbent x∗ with value f∗]

No more subproblems left: return x∗ and terminate

MPRO — PMA – p. 38

slide-68
SLIDE 68

Pruning

  • 1. P was branched into C1, C2
  • 2. C1 was branched into C3, C4
  • 3. C3 was pruned by optimality

(x∗ ∈ G(C3) was found)

  • 4. C2, C4 were pruned by bound

(lower bound for C2 worse than f∗)

  • 5. No more nodes: whole space explored, x∗ ∈ G(P)

The search generates a tree; subproblems are nodes

Nodes can be pruned by optimality, by bound, or by infeasibility (when the subproblem is infeasible)

Otherwise, they are branched

MPRO — PMA – p. 39

slide-69
SLIDE 69

Logical flow

Notation:
C = P[xL, xU] is P restricted to x ∈ [xL, xU]
x∗: best optimum so far (start with x∗ = ∞)

C could be feasible or infeasible

If C is feasible, we might find a glob. opt. x′ of C or not:
· If we find a glob. opt. x′ improving x∗, update x∗ ← x′
· Else, try to show that no point in F(C) improves x∗
· Else, branch C into two subproblems and recurse on each (subproblems have smaller feasible regions ⇒ "easier")

Else C is infeasible: discard it

MPRO — PMA – p. 40

slide-70
SLIDE 70

Correctness

Look at the else cases:
C infeasible ⇒ can discard C
C feasible and no point of F(C) improves x∗ ⇒ can discard C
Branching ⇒ any subproblem that we're NOT sure could improve x∗ is considered again later
⇒ If the process terminates, we'll have explored all those parts of F(P) that can contain an optimum better than x∗
If x∗ = ∞, P is infeasible; otherwise x∗ ∈ G(P)
Might fail to terminate if ε = 0

MPRO — PMA – p. 41

slide-71
SLIDE 71

A recursive version

processSubProblemε(C):

1: if isFeasible(C) then
2:    x′ = globalOpt(C)
3:    if x′ ≠ ∞ then
4:       if fP(x′) < fP(x∗) then
5:          update x∗ ← x′   // improvement
6:       end if
7:    else
8:       if lowerBound(C) < fP(x∗) − ε then
9:          split [xL, xU] into two hyperrectangles [xL, x̃], [x̃, xU]
10:         processSubProblemε(C[xL, x̃])
11:         processSubProblemε(C[x̃, xU])
12:      end if
13:   end if
14: end if

MPRO — PMA – p. 42
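A toy, self-contained variant of this recursion in Python: minimizing f(x) = x⁴ − x² on [−2, 2]. The slides' convex relaxation is replaced here by a Lipschitz-constant lower bound (my simplification), localSolve by a midpoint evaluation, and the recursion by the best-first queue of the later "choice of best next node" slide:

```python
import heapq

def f(x):
    return x**4 - x**2          # nonconvex: global minima at x = ±1/sqrt(2), value -0.25

L = 28.0                        # max |f'(x)| on [-2, 2] is 28: a valid Lipschitz constant
eps = 1e-4

def lower_bound(a, b):
    """Lipschitz lower bound on [a, b]: f(midpoint) - L * width / 2."""
    return f((a + b) / 2) - L * (b - a) / 2

best_x, best_f = 0.0, f(0.0)    # incumbent (upper bound)
queue = [(lower_bound(-2.0, 2.0), -2.0, 2.0)]   # min-priority queue keyed on lower bound
while queue:
    lb, a, b = heapq.heappop(queue)
    if lb >= best_f - eps:      # prune by bound
        continue
    c = (a + b) / 2
    if f(c) < best_f:           # cheap stand-in for localSolve: evaluate the midpoint
        best_x, best_f = c, f(c)
    for lo, hi in ((a, c), (c, b)):     # branch at the midpoint
        heapq.heappush(queue, (lower_bound(lo, hi), lo, hi))

print(best_f)   # within eps of the global minimum -0.25
```

The bound tightens as intervals shrink (the slide on variable ranges makes the same point for McCormick relaxations), so every region is eventually either pruned or refined until the incumbent is ε-optimal.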

slide-72
SLIDE 72

Bad news

  • 1. If globalOpt(C) works on any problem, why not call globalOpt(P) and be done with it?
  • 2. For arbitrary C, isFeasible(C) is undecidable
  • 3. How do we compute lowerBound(C)?

MPRO — PMA – p. 43

slide-73
SLIDE 73

Upper bounds

Upper bounds: x∗ can only decrease

Computing the global optima of each subproblem yields candidates for updating x∗

As long as we only update x∗ when x′ improves it, we don't need x′ to be a global optimum: any "good feasible point" will do

Specifically, use feasible local optima ⇒ replace globalOpt() by localSolve()

MPRO — PMA – p. 44

slide-74
SLIDE 74

Lower bound

Lower bounds: increase over ⊃-chains

Let RP be a relaxation of P such that:

  • 1. RP also involves the decision variables of P (and perhaps some others)
  • 2. for any range I = [xL, xU], RP[I] is a relaxation of P[I]
  • 3. if I, I′ are two ranges, I ⊇ I′ → min RP[I] ≤ min RP[I′]
  • 4. for any subproblem C of P, finding x ∈ G(RC) or showing F(RC) = ∅ is efficient; specifically, x̄ = localSolve(RC) ∈ G(RC)

Define lowerBound(C) = fRC(x̄)

MPRO — PMA – p. 45

slide-75
SLIDE 75

A decidable feasibility test

Processing C when it is infeasible will make sBB slower, but not incorrect ⇒ sBB still works if we simply never discard a potentially feasible C

Use a "partial feasibility test" isEvidentlyInfeasible(C): if it returns true, then C is guaranteed to be infeasible and we can discard it; otherwise we simply don't know, and we process C anyway

Thm: if RC is infeasible then C is infeasible
Proof: ∅ = F(RC) ⊇ F(C), so F(C) = ∅

isEvidentlyInfeasible(C) = true if localSolve(RC) = ∞, false otherwise

MPRO — PMA – p. 46

slide-76
SLIDE 76

Choice of best next node

Instead of following the recursion order, process first the nodes that are more likely to yield a glob. opt.

Advantages:
Glob. opt. of P found early ⇒ easier to prune by bound
If sBB is stopped early, more chance that x∗ ∈ G(P)

Indication of a "good subproblem": lowest lower bound

Store subproblems in a min-priority queue Q, where priority(C) is given by a lower bound for C

MPRO — PMA – p. 47

slide-77
SLIDE 77

Software

COUENNE (open source, AMPL interface) (projects.coin-or.org/Couenne)

GlobSol (open source, interval arithmetic bounds) (http://interval.louisiana.edu/GLOBSOL/)

BARON (commercial, GAMS interface)

LGO (commercial, Lipschitz constant bounds)

LINDOGLOBAL (commercial)

Some research codes (αBB, ooOPS, LaGO, GLOP, Coconut)

MPRO — PMA – p. 48

slide-78
SLIDE 78

Citations

Falk, Soland, An algorithm for separable nonconvex programming problems, Manag. Sci. 1969

Horst, Tuy, Global Optimization, Springer 1990

Adjiman, Floudas et al., A global optimization method, αBB, for general twice-differentiable nonconvex NLPs, Comp. Chem. Eng. 1998

Ryoo, Sahinidis, Global optimization of nonconvex NLPs and MINLPs with applications in process design, Comp. Chem. Eng. 1995

Smith, Pantelides, A symbolic reformulation/spatial branch-and-bound algorithm for the global optimisation of nonconvex MINLPs, Comp. Chem. Eng. 1999

Nowak, Relaxation and decomposition methods for Mixed Integer Nonlinear Programming, Birkhäuser, 2005

Belotti, Liberti et al., Branching and bounds tightening techniques for nonconvex MINLP, Opt. Meth. Softw., 2009

MPRO — PMA – p. 49

slide-79
SLIDE 79

To make an sBB work efficiently, you need further tricks

MPRO — PMA – p. 50

slide-80
SLIDE 80

Expression trees

Representation of the objective f and constraints g

Encode mathematical expressions in trees or DAGs

E.g. x₁² + x₁x₂:

[Tree: + node with children (^ x1 2) and (× x1 x2); DAG: the same, with the two occurrences of x1 shared.]

MPRO — PMA – p. 51

slide-81
SLIDE 81

Standard form

Identify all nonlinear terms xi ⊗ xj and replace them with a linearizing variable wij

Add a defining constraint wij = xi ⊗ xj to the formulation

Standard form:
min c⊤(x, w)
s.t.   A(x, w) ≤ b
       wij = xi ⊗ij xj for suitable i, j
       bounds & integrality constraints

E.g. x₁² + x₁x₂ ⇒ w11 + w12 with defining constraints w11 = x₁², w12 = x₁x₂

[Trees: the expression tree of x₁² + x₁x₂ maps to the linear tree + over w11, w12.]

MPRO — PMA – p. 52

slide-82
SLIDE 82

Convex relaxation

Standard form: all nonlinearities are in the defining constraints

Each defining constraint wij = xi ⊗ xj is replaced by two convex inequalities:
wij ≤ overestimator(xi ⊗ xj)
wij ≥ underestimator(xi ⊗ xj)

E.g. convex/concave over- and under-estimators for products xixj with x ∈ [−1, 1] (McCormick's envelope):

[Figure: surfaces A, B, C, D of the bilinear term and its envelopes over [−1, 1]².]

The convex relaxation is not the tightest possible, but it can be constructed automatically

MPRO — PMA – p. 53

slide-83
SLIDE 83

Summary

ORIGINAL MINLP: min_x f(x) s.t. g(x) ≤ 0, xL ≤ x ≤ xU (some variables may be integral)

STANDARD FORM: min w1 s.t. Aw = b, wi = wj wk ∀(i, j, k) ∈ Tblt, wi = wj/wk ∀(i, j, k) ∈ Tlft, wi = hij(wj) ∀(i, j) ∈ Tuf, wL ≤ w ≤ wU (linearizes nonlinear terms; adds linearizing variables and defining constraints; easier to perform symbolic algorithms)

CONVEX RELAXATION: min w1 s.t. Aw = b, McCormick's relaxation, secant relaxation, wL ≤ w ≤ wU (each defining constraint replaced by convex under- and concave over-estimators)

MPRO — PMA – p. 54

slide-84
SLIDE 84

Eg: conv. rel. of pooling problem

[Network as on the Haverly slides: x11 (3% S, $6), x21 (1% S, $16), x12 (2% S, $10); Blend 1 (≤ 2.5% S, $9, ≤ 100), Blend 2 (≤ 1.5% S, $15, ≤ 200).]

Original:
min_{x,y,p}   6x11 + 16x21 + 10x12 − 9(y11 + y21) − 15(y12 + y22)
s.t.   x11 + x21 − y11 − y12 = 0       [linear]
       x12 − y21 − y22 = 0             [linear]
       y11 + y21 ≤ 100                 [linear]
       y12 + y22 ≤ 200                 [linear]
       3x11 + x21 − p(y11 + y12) = 0
       py11 + 2y21 ≤ 2.5(y11 + y21)
       py12 + 2y22 ≤ 1.5(y12 + y22)

Standard form:
min cost
s.t.   linear constraints
       3x11 + x21 − w1 = 0
       w3 + 2y21 ≤ 2.5(y11 + y21)
       w4 + 2y22 ≤ 1.5(y12 + y22)
       w2 = y11 + y12
       w1 = p w2
       w3 = p y11
       w4 = p y12

Replace each nonconvex constraint w = uv by McCormick's envelopes:
w ≥ max{uL v + vL u − uL vL, uU v + vU u − uU vU}
w ≤ min{uU v + vL u − uU vL, uL v + vU u − uL vU}

MPRO — PMA – p. 55

slide-85
SLIDE 85

Variable ranges

Crucial property for sBB convergence: the convex relaxation tightens as variable range widths decrease

The convex/concave under/over-estimator constraints are (convex) functions of xL, xU

It makes sense to tighten xL, xU:
at the sBB root node (trading off speed for efficiency)
at each other node (trading off efficiency for speed)

MPRO — PMA – p. 56

slide-86
SLIDE 86

OBBT and FBBT

In sBB we need to tighten variable bounds at each node

Two methods: Optimization Based Bounds Tightening (OBBT) and Feasibility Based Bounds Tightening (FBBT)

OBBT: for each variable x in P, compute min and max {x | conv. rel. constraints}, see e.g. [Caprara et al., MP 2009]

FBBT: propagation of intervals up and down constraint expression trees, with tightening at the root node

Example: 5x1 − x2 = 0 with x1, x2 ∈ [0, 1]
Up: × : [5, 5]·[0, 1] = [0, 5];   − : [0, 5] − [0, 1] = [−1, 5]
Root node tightening: [−1, 5] ∩ [0, 0] = [0, 0]
Down: × : [0, 0] + [0, 1] = [0, 1];   x1 : [0, 1]/[5, 5] = [0, 1/5]

[Expression tree: − node over (× 5 x1) and x2, annotated with intervals [5, 5], [0, 1], [0, 1], [0, 0].]

Iterating (up/tighten/down) k times yields [0, 1/5^(2k−1)]

MPRO — PMA – p. 57
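One up/tighten/down FBBT pass on 5x1 − x2 = 0 can be sketched with a few interval helpers (all names below are mine):

```python
# Interval helpers: an interval is a (lo, hi) pair.
def imul(a, b):   # product of two intervals
    ps = [a[0]*b[0], a[0]*b[1], a[1]*b[0], a[1]*b[1]]
    return (min(ps), max(ps))

def isub(a, b):
    return (a[0] - b[1], a[1] - b[0])

def iadd(a, b):
    return (a[0] + b[0], a[1] + b[1])

def isect(a, b):  # intersection
    return (max(a[0], b[0]), min(a[1], b[1]))

x1, x2 = (0.0, 1.0), (0.0, 1.0)
five = (5.0, 5.0)

# Up: evaluate 5*x1, then 5*x1 - x2
t = imul(five, x1)                    # [0, 5]
root = isub(t, x2)                    # [-1, 5]
# Tighten at the root: the constraint says the expression equals 0
root = isect(root, (0.0, 0.0))        # [0, 0]
# Down: t = root + x2, then x1 = t / 5
t = isect(t, iadd(root, x2))          # [0, 1]
x1 = isect(x1, imul((1/5, 1/5), t))   # [0, 1/5]
print(x1)
```

A single pass already tightens x1 from [0, 1] to [0, 1/5]; in a solver the pass is iterated until (near) fixpoint, as the slide describes.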

slide-87
SLIDE 87

Quadratic problems

All nonlinear terms are quadratic monomials

Aim: reduce the gap between the problem and its convex relaxation ⇒ replace quadratic terms with suitable linear constraints (fewer nonlinear terms to relax)

These can be obtained by considering linear relations (called reduced RLT constraints) between the original and the linearizing variables

MPRO — PMA – p. 58

slide-88
SLIDE 88

Reduced RLT Constraints I

For each k ≤ n, let wk = (wk1, . . . , wkn)

Multiply Ax = b by each xk and substitute the linearizing variables wk: get the reduced RLT constraint system (RRCS)
∀k ≤ n (Awk = b xk)

∀i, k ≤ n define zki = wki − xi xk, and let zk = (zk1, . . . , zkn)

Substitute b = Ax in the RRCS: get ∀k ≤ n (A(wk − xk x) = 0), i.e. ∀k ≤ n (Azk = 0)

Let B, N be the sets of basic and nonbasic variables of this system

Setting zki = 0 for each nonbasic variable implies that the RRCS is satisfied
⇒ it suffices to enforce the quadratic constraints wki = xi xk for (i, k) ∈ N (replace those for (i, k) ∈ B with the linear RRCS)

MPRO — PMA – p. 59

slide-89
SLIDE 89

Reduced RLT Constraints II

[Three surface plots: the set F(P) = {(x, y, w) | w = xy ∧ x = 1}; McCormick's relaxation; the rRLT relaxation.]

rRLT constraint: multiply x = 1 by y to get xy = y; replace xy by w to get w = y

F(P) is described linearly

MPRO — PMA – p. 60

slide-90
SLIDE 90

Reduced RLT Constraints III

If |E| = n(n + 1)/2 (all possible quadratic terms), we get |B| fewer quadratic terms in the reformulation

Otherwise, a judicious choice of the multiplier variable set {xk | k ∈ K} and of the multiplied linear equation constraint subsystem must be made

MPRO — PMA – p. 61

slide-91
SLIDE 91

Citations

Sherali, Alameddine, A new reformulation-linearization technique for bilinear programming problems, JOGO, 1991

Smith, Pantelides, A symbolic reformulation/spatial branch-and-bound algorithm for the global optimisation of nonconvex MINLPs, Comp. Chem. Eng. 1999

Liberti, Reduction Constraints for the Global Optimization of NLPs, ITOR, 2004

Liberti, Linearity embedded in nonconvex programs, JOGO, 2005

Liberti, Pantelides, An exact reformulation algorithm for large nonconvex NLPs involving bilinear terms, JOGO, 2006

Belotti, Liberti et al., Branching and bounds tightening techniques for nonconvex MINLP, Opt. Meth. Softw., 2009

Sherali, Dalkiran, Liberti, Reduced RLT representations for nonconvex polynomial programming problems, JOGO (to appear)

MPRO — PMA – p. 62

slide-92
SLIDE 92

The end

MPRO — PMA – p. 64