

SLIDE 1

An Effective Branch-and-Bound Algorithm for Convex Quadratic Integer Programming

Christoph Buchheim (Fakultät für Mathematik, TU Dortmund; DEIS, Università di Bologna)
Alberto Caprara, Andrea Lodi (DEIS, Università di Bologna)

SLIDE 2

Problem

For Q ≻ 0, find

min f(x) = x⊤Qx + L⊤x + c  s.t. x ∈ D ⊆ Zⁿ

NP-hard, even if D = {0, 1}ⁿ. In this talk:
– we restrict ourselves to D = Zⁿ or D = {l, . . . , u}ⁿ
– we aim at fast branch-and-bound algorithms
– we need strong dual bounds that can be computed quickly
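As a toy illustration of the problem shape (the instance below is made up for this writeup, not from the talk), one can evaluate f and minimize it by exhaustive enumeration over D = {0, 1}ⁿ; this exponential enumeration is exactly what branch-and-bound is meant to avoid:

```python
import itertools
import numpy as np

# Hypothetical tiny instance: minimize f(x) = x^T Q x + L^T x + c
# over D = {0, 1}^n by brute force (2^n candidate points).
Q = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # positive definite
L = np.array([-3.0, -3.0])
c = 0.0

def f(x):
    x = np.asarray(x, dtype=float)
    return float(x @ Q @ x + L @ x + c)

best = min(itertools.product((0, 1), repeat=len(L)), key=f)
print(best, f(best))
```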

SLIDE 4

Outline

– An application in electronics
– Straightforward branch-and-bound algorithm
– Improvement of lower bounds
– Improvement of running time per node
– Some experimental results

SLIDE 5

Application

Modern chips contain both digital and analog modules; the analog modules interact with the physical environment. Analog signals need to be generated by digital modules, e.g., to actuate AC motors.

SLIDE 6

Application

Problem: For a given analog target signal, determine a digital signal approximating the former as closely as possible.



SLIDE 10

Application

[Diagram: signal ⇒ FILTER ⇒ filtered signal]

SLIDE 11

Application

Explicit or implicit transformation by a filter:

[Figure: pulse g and filtered pulse g̃]

– explicit filtering: adds desired properties to the signal
– implicit filtering: caused by, e.g., inertia

SLIDE 12

Application

For given h: ℝ → ℝ, determine x₁, . . . , xₙ ∈ {−1, 0, 1} such that

Σ_{k=1}^{n} x_k g(t − kΔt)

approximates the signal h as closely as possible after filtering.

⇒ minimize the (square of the) standard deviation

f(x) = ∫_{−Δt/2}^{(n−1/2)Δt} ( Σ_{k=1}^{n} x_k g̃(t − kΔt) − h(t) )² dt

[the inner sum Σ_{k=1}^{n} x_k g̃(t − kΔt) is the filtered signal]

SLIDE 14

Application

⇒ minimize the quadratic function f(x) = x⊤Qx + L⊤x + c, where

Q_jk = ∫_{−Δt/2}^{(n−1/2)Δt} g̃(t − jΔt) g̃(t − kΔt) dt

L_j = −2 ∫_{−Δt/2}^{(n−1/2)Δt} g̃(t − jΔt) h(t) dt

c = ∫_{−Δt/2}^{(n−1/2)Δt} h(t)² dt

In this application, Q is always positive definite.
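A sketch of how Q, L, and c could be assembled numerically (illustrative only: `build_qp`, the rectangle-rule grid, and the parameter `m` are my own framing, not code from the talk; the −2 factor on L comes from expanding the squared deviation so that f(x) = x⊤Qx + L⊤x + c):

```python
import numpy as np

def build_qp(g_tilde, h, n, dt, m=4000):
    """Assemble Q, L, c of f(x) = x^T Q x + L^T x + c by a simple
    rectangle rule on the interval [-dt/2, (n - 1/2) dt]."""
    t = np.linspace(-dt / 2, (n - 0.5) * dt, m)
    w = t[1] - t[0]                                   # uniform quadrature weight
    G = np.array([g_tilde(t - k * dt) for k in range(1, n + 1)])  # n x m samples
    Q = (G @ G.T) * w                  # Q_jk = integral of g~(t-j dt) g~(t-k dt)
    L = -2.0 * (G @ h(t)) * w          # L_j  = -2 * integral of g~(t-j dt) h(t)
    c = float(np.sum(h(t) ** 2) * w)   # c    = integral of h(t)^2
    return Q, L, c
```

By construction x⊤Qx + L⊤x + c equals the discretized squared deviation on the same grid, and Q is a Gram matrix, hence positive (semi)definite.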

SLIDE 15

Problem

For given Q ≻ 0, L ∈ ℝⁿ, and c ∈ ℝ, determine

min f(x) = x⊤Qx + L⊤x + c  s.t. x ∈ D = {−1, 0, 1}ⁿ

The global minimum of f over ℝⁿ is

f(x̄) = c − (1/4) L⊤Q⁻¹L,  attained at x̄ = −(1/2) Q⁻¹L

⇒ trivial lower bound for f over D
⇒ straightforward branch-and-bound algorithm

In our application, we usually have x̄ ∈ [−1, 1]ⁿ...
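The straightforward algorithm can be sketched as follows (a minimal depth-first illustration using only the trivial bound; variable order and data structures are simplified relative to the actual implementation):

```python
import numpy as np

def continuous_min_value(Q, L, c):
    """Global minimum of f over R^n: f(x_bar) = c - 1/4 L^T Q^{-1} L."""
    return c - 0.25 * float(L @ np.linalg.solve(Q, L))

def bb(Q, L, c, best=np.inf):
    """Depth-first branch-and-bound for min f over {-1, 0, 1}^n,
    pruning each node with the trivial continuous lower bound."""
    if len(L) == 0:
        return min(best, c)                  # every variable is fixed
    if continuous_min_value(Q, L, c) >= best:
        return best                          # prune: bound cannot improve
    # Fix the last free variable (predetermined order) to each value;
    # the reduced quadratic part is the leading block of Q.
    for v in (-1.0, 0.0, 1.0):
        Qr = Q[:-1, :-1]
        Lr = L[:-1] + 2.0 * v * Q[:-1, -1]
        cr = c + v * v * Q[-1, -1] + v * L[-1]
        best = bb(Qr, Lr, cr, best)
    return best
```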

SLIDE 18

Improvement of Lower Bounds

SLIDE 19

Improvement of Lower Bounds

E(Q) := {x ∈ ℝⁿ | x⊤Qx ≤ 1}
μ(Q, x̄) := max{μ | (x̄ + int μE(Q)) ∩ Zⁿ = ∅}
⇒ min{f(x) | x ∈ Zⁿ} = f(x̄) + μ(Q, x̄)²

SLIDE 21

Improvement of Lower Bounds

μ(Q′, x̄) := max{μ | (x̄ + int μE(Q′)) ∩ Zⁿ = ∅}
μ(Q, Q′) := max{μ | μE(Q) ⊆ E(Q′)} = max{μ | Q − μ²Q′ ⪰ 0}
f(x) ≥ f(x̄) + [μ(Q, Q′) μ(Q′, x̄)]²  for all x ∈ Zⁿ
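A sketch of how these quantities can be computed when Q′ is chosen diagonal (an illustrative choice, not necessarily the authors'): μ(Q, Q′) is the square root of the smallest eigenvalue of Q′^(−1/2) Q Q′^(−1/2), and for an axis-aligned ellipsoid the closest integer point to x̄ is its componentwise rounding, which gives μ(Q′, x̄) in closed form.

```python
import numpy as np

def improved_bound(Q, L, c, d):
    """Lower bound f(x_bar) + [mu(Q,Q') mu(Q',x_bar)]^2 with Q' = diag(d), d > 0."""
    x_bar = -0.5 * np.linalg.solve(Q, L)
    f_bar = c - 0.25 * float(L @ np.linalg.solve(Q, L))   # trivial bound f(x_bar)
    s = 1.0 / np.sqrt(d)
    # mu(Q,Q')^2 = max{m : Q - m Q' psd} = smallest eigenvalue of Q'^{-1/2} Q Q'^{-1/2}
    mu_QQp2 = np.linalg.eigvalsh(Q * np.outer(s, s))[0]
    # mu(Q',x_bar)^2 = min over z in Z^n of (z-x_bar)^T Q' (z-x_bar): round componentwise
    mu_Qpx2 = float(np.sum(d * (x_bar - np.round(x_bar)) ** 2))
    return f_bar + mu_QQp2 * mu_Qpx2
```

The returned value is never below the trivial bound and never above the true integer minimum, since f(x) − f(x̄) = (x − x̄)⊤Q(x − x̄) ≥ μ(Q, Q′)² (x − x̄)⊤Q′(x − x̄) ≥ [μ(Q, Q′) μ(Q′, x̄)]² for all x ∈ Zⁿ.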

SLIDE 23

Linear Time per Node

After fixing d variables, the problem reduces to the minimization of

f̄ : Zⁿ⁻ᵈ → ℝ,  x ↦ x⊤Q̄x + L̄⊤x + c̄

Idea: fix variables in a predetermined order
⇒ Q̄ only depends on d [not on the specific fixings]
⇒ only n different matrices Q̄_d [one per depth d]
⇒ move expensive calculations for Q̄_d into the preprocessing:
– computation of Q̄_d⁻¹ for finding continuous minima
– computation of μ(Q̄_d, Q′_d) for improved bounds

After preprocessing, the running time is O((n − d)²) per node!
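The preprocessing idea can be sketched as follows (an illustrative helper, assuming the fixing order "always fix the last free variable", so that Q̄_d is the leading (n − d) × (n − d) block of Q):

```python
import numpy as np

def preprocess_inverses(Q):
    """Precompute the inverse of the reduced matrix for every depth
    d = 0, ..., n-1: with a fixed fixing order, the reduced matrix at
    depth d is the leading (n-d) x (n-d) block of Q, independent of
    the values the fixed variables were set to."""
    n = Q.shape[0]
    return [np.linalg.inv(Q[: n - d, : n - d]) for d in range(n)]
```

At a node of depth d, the continuous minimizer x̄ = −(1/2) Q̄_d⁻¹ L̄ then costs one matrix–vector product, i.e. O((n − d)²), instead of solving a linear system from scratch.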

SLIDE 26

Sublinear Time per Node

– compute all z_d ∈ ℝⁿ⁻ᵈ in the preprocessing phase
– compute x̄ incrementally in O(n − d) time
– compute f(x̄) incrementally in O(n − d) time
– compute all μ(Q′_d, x̄) incrementally (if compatible)
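As one concrete piece of this incremental scheme (a sketch of my own; the z_d-based updates of x̄, f(x̄), and μ(Q′_d, x̄) are not reproduced here), the reduced linear and constant terms after fixing the last free variable to v can be updated in O(n − d) time:

```python
import numpy as np

def fix_last(Q_d, L_d, c_d, v):
    """O(n-d) update when fixing the last free variable to v: the
    reduced quadratic part is the leading block of Q_d (no work at
    all), only the linear and constant terms change."""
    m = len(L_d)
    L_new = L_d[: m - 1] + 2.0 * v * Q_d[: m - 1, m - 1]
    c_new = c_d + v * v * Q_d[m - 1, m - 1] + v * L_d[m - 1]
    return L_new, c_new
```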

SLIDE 28

Experimental Results

For the closest vector problem (CVP):

          B., Caprara, Lodi [2010]                      CPLEX 12.1
  n    #      tt/s   pt/s  nt/µs     nodes      #      tt/s  nt/µs     nodes
 20   10      0.01   0.01   1.63  4.22e+03     10      0.29  89.77  3.25e+03
 25   10      0.03   0.02   1.62  1.57e+04     10      1.06  89.49  1.19e+04
 30   10      0.17   0.03   1.25  1.13e+05     10      9.47 104.73  8.99e+04
 35   10      0.85   0.05   1.40  5.70e+05     10     33.77 120.46  2.81e+05
 40   10      6.13   0.09   1.59  3.80e+06     10    339.52 144.61  2.31e+06
 45   10     66.52   0.14   1.78  3.67e+07     10   2547.37 171.04  1.45e+07
 50   10    483.46   0.22   2.04  2.34e+08      8  10856.23 207.24  5.11e+07
 55   10   1773.14   0.32   2.16  8.24e+08      3  18180.84 242.59  7.47e+07
 60    7  15602.33   0.45   2.46  6.39e+09      —         —      —         —
 65    5  16727.11   0.64   2.54  6.58e+09      —         —      —         —
 70    —         —      —      —         —      —         —      —         —

[# = instances solved; tt = total time, pt = preprocessing time, nt = time per node;
running times on Intel Xeon at 2.33 GHz, time limit 8 h]

⇒ number of nodes 2 : 1, time per node 1 : 100

SLIDE 29

Experimental Results

For the electronics application (2nd order Butterworth filters):

        B., Caprara, Lodi [2010]                  CPLEX 12.1
  n      tt/s   pt/s  nt/µs     nodes         tt/s  nt/µs     nodes
 30      0.04   0.04   0.00  1.85e+03         1.13  94.21  1.20e+04
 40      0.11   0.09   1.18  1.69e+04        61.71 120.12  5.14e+05
 50      0.24   0.21   1.00  3.00e+04     20995.52 174.38  1.20e+08
 60      0.77   0.43   1.29  2.62e+05            —      —         —
 70      2.49   0.82   1.53  1.09e+06            —      —         —
 80      2.38   1.46   1.82  5.05e+05            —      —         —
 90     22.72   2.45   1.85  1.09e+07            —      —         —
100    104.58   3.93   1.96  5.12e+07            —      —         —
110   1039.53   6.05   2.07  4.99e+08            —      —         —
120   7815.37   9.04   2.18  3.58e+09            —      —         —

[tt = total time, pt = preprocessing time, nt = time per node;
running times on Intel Xeon at 2.33 GHz, time limit 8 h]

⇒ number of nodes 1 : 10,000, time per node 1 : 100. With a little help for CPLEX: number of nodes 1 : 1,000...

SLIDES 30–34

[Figure sequence: signal plots, both axes ranging from −1.0 to 1.0]

SLIDE 35

CPLEX 12.1, n = 50, 350 CPU-minutes: Emin = 5.39 · 10⁻³
[Figure: resulting signal, axes from −1.0 to 1.0]

Our algorithm, n = 100, 1.75 CPU-minutes: Emin = 1.40 · 10⁻³
[Figure: resulting signal, axes from −1.0 to 1.0]

SLIDE 36

Summary

The presented branch-and-bound algorithm is
– much faster than CPLEX 12.1 [by several orders of magnitude]
– much faster than other software [by even more orders of magnitude]
– easy to implement [a few hundred lines of code, about 0.5 nights]
– only linear space used [using a depth-first strategy]
– numerically robust [only vector multiplications]
– easily extendible [mixed-integer problems etc.]