

SLIDE 1

First-order representations for integer programming

James Cussens, University of York
Vienna, 2014-07-17

James Cussens, University of York 1st-order logic for IP Vienna, 2014-07-17 1 / 28

SLIDE 2

Outline

◮ A running example
◮ Introduction to (mixed) integer programming
◮ Solving MIPs
◮ Logical methods in integer programming
◮ Clausal cuts and resolution
◮ Prospects for first-order techniques

SLIDE 3

A running example

The MAP problem

| A B |       | A D |       | B C |       | C D |       |
| 0 0 | 0.10  | 0 0 | 0.90  | 0 0 | 0.40  | 0 0 | 0.50  |
| 0 1 | 0.20* | 0 1 | 0.20* | 0 1 | 0.70* | 0 1 | 0.20  |
| 1 0 | 0.30  | 1 0 | 0.70  | 1 0 | 0.30  | 1 0 | 0.40  |
| 1 1 | 0.20  | 1 1 | 0.10  | 1 1 | 0.10  | 1 1 | 0.10  |

◮ This is a four-clique Markov network.
◮ Which joint instantiation of A, B, C and D has maximal probability?

SLIDE 4

A running example

Encoding the problem

| A B |       | A D |       | B C |       | C D |       |
| 0 0 | 0.10  | 0 0 | 0.90  | 0 0 | 0.40  | 0 0 | 0.50  |
| 0 1 | 0.20* | 0 1 | 0.20* | 0 1 | 0.70* | 0 1 | 0.20  |
| 1 0 | 0.30  | 1 0 | 0.70  | 1 0 | 0.30  | 1 0 | 0.40  |
| 1 1 | 0.20  | 1 1 | 0.10  | 1 1 | 0.10  | 1 1 | 0.10  |

◮ Have 16 ('weighted') binary variables, one for each choice of clique-variable instantiation.
◮ Have 4 binary variables, for the 4 instantiations of A, B, C and D.
◮ Variable and clique instantiations must match up.
◮ (A suboptimal encoding, btw!)
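As a sanity check on the running example, the MAP assignment can be found by brute force over the 2^4 joint instantiations. A minimal sketch, assuming the table entries are unnormalised clique potentials combined by product (the slides do not state the combination rule):

```python
from itertools import product

# Clique potentials transcribed from the table above.
AB = {(0, 0): 0.10, (0, 1): 0.20, (1, 0): 0.30, (1, 1): 0.20}
AD = {(0, 0): 0.90, (0, 1): 0.20, (1, 0): 0.70, (1, 1): 0.10}
BC = {(0, 0): 0.40, (0, 1): 0.70, (1, 0): 0.30, (1, 1): 0.10}
CD = {(0, 0): 0.50, (0, 1): 0.20, (1, 0): 0.40, (1, 1): 0.10}

def score(a, b, c, d):
    # Product of the four clique potentials for one joint instantiation.
    return AB[a, b] * AD[a, d] * BC[b, c] * CD[c, d]

# Enumerate all 16 joint instantiations of (A, B, C, D).
best = max(product((0, 1), repeat=4), key=lambda s: score(*s))
```

Under the product assumption this picks A = 1, B = 0, C = 1, D = 0.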

SLIDE 5

Introduction to (mixed) integer programming

A ZIMPL representation of the MAP problem for graphical models

var x[CLIQUE_INSTS] binary;
var y[VAR_INSTS] binary;

maximize prob:
    sum <c,ci> in CLIQUE_INSTS: d[c,ci]*x[c,ci];

subto convex_clique:
    forall <c> in CLIQUES do
        sum <c,ci> in CLIQUE_INSTS: x[c,ci] == 1;

subto convex_vars:
    forall <v> in VARS do
        sum <v,vi> in VAR_INSTS: y[v,vi] == 1;

subto incidence:
    forall <c> in CLIQUES do
        forall <v> in VARS_IN[c] do
            forall <v,vi> in VAR_INSTS do
                sum <c,ci> in CLIQUE_INSTS with <c,ci> in CONSIS[v,vi]:
                    x[c,ci] == y[v,vi];
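To make the grounding step concrete, here is a rough Python sketch of what a grounder does with the `convex_clique` block for the running example. The set names mirror the ZIMPL model, but the instance data (`CLIQUES`, four instantiations per clique) is assumed for illustration:

```python
# Hypothetical instance data for the 4-clique running example.
CLIQUES = ["AB", "AD", "BC", "CD"]
CLIQUE_INSTS = [(c, ci) for c in CLIQUES for ci in range(4)]

def ground_convex_clique():
    # One convexity equation per clique: its four indicator
    # variables must sum to exactly 1 (pick one instantiation).
    rows = []
    for c in CLIQUES:
        terms = [f"x#{c}#{ci}" for (cc, ci) in CLIQUE_INSTS if cc == c]
        rows.append(" + ".join(terms) + " = 1")
    return rows
```

For instance, `ground_convex_clique()[0]` is `"x#AB#0 + x#AB#1 + x#AB#2 + x#AB#3 = 1"`, in the same spirit as the ground representation shown two slides below.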

SLIDE 6

Introduction to (mixed) integer programming

Defining relations extensionally

This is for a bigger MAP problem . . .

set VARS := { 0..439 };
set CLIQUES := { 0..859 };
set CLIQUE_INSTS := { <0,0>, <0,1>, <1,0>, <1,1>, <2,0>, ...
set VARS_IN[CLIQUES] := <0> {420}, <1> {421}, <2> {422}, ...
    ...
    <578> {365,418,405}, <579> {385,419,405}

SLIDE 7

Introduction to (mixed) integer programming

Standard 'ground' MIP representation

Maximize
 prob: +2.2999999544 x#0#1 +2.2999999544 x#1#1 +2.2999999544 x#2#1 ...
       +1.40000000818 x#23#1 ...
Subject to
 convex_clique_1: + x#0#1 + x#0#0 = 1
 ...
 incidence_1071: - y#400#0 + x#478#6 + x#478#4 + x#478#2 + x#478#0 = 0
 ...

SLIDE 8

Solving MIPs

Solving the linear relaxation

◮ x = 4, y = 2 is the optimal integer solution.
◮ x = 2.5, y = 2.8 is the solution to the linear relaxation.
◮ The linear relaxation can be solved quickly, and provides an upper bound.

SLIDE 9

Solving MIPs

Separating the LP solution with a cutting plane


SLIDE 11

Solving MIPs

Facets: Not all cuts are equal



SLIDE 14

Solving MIPs

Branch-and-bound


SLIDE 16

Solving MIPs

Branch and cut

1. Let x* be the LP solution.
2. If x* is worse than the incumbent then exit.
3. If there are valid inequalities not satisfied by x*, add them and go to 1.
   Else if x* is integer-valued then the current problem is solved.
   Else branch on a variable with a non-integer value in x* to create two new sub-problems (propagate if possible).
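The loop above is easiest to see on a small concrete problem. The sketch below runs branch-and-bound on a 0/1 knapsack, using the fractional (LP-relaxation) bound for the pruning in step 2; the cut-generation part of step 3 is omitted, and the knapsack instance in the test is just an illustrative assumption:

```python
def branch_and_bound(values, weights, cap):
    n = len(values)
    # Sort items by value density; greedily filling remaining capacity
    # in this order gives exactly the LP-relaxation bound for knapsack.
    order = sorted(range(n), key=lambda i: values[i] / weights[i], reverse=True)

    def lp_bound(k, cap_left, val):
        for i in order[k:]:
            if weights[i] <= cap_left:
                cap_left -= weights[i]
                val += values[i]
            else:
                return val + values[i] * cap_left / weights[i]
        return val

    best = 0
    stack = [(0, cap, 0)]          # (depth, remaining capacity, value so far)
    while stack:
        k, cap_left, val = stack.pop()
        if lp_bound(k, cap_left, val) <= best:
            continue               # step 2: bound no better than incumbent
        if k == n:
            best = max(best, val)  # integer-feasible leaf: new incumbent
            continue
        i = order[k]
        stack.append((k + 1, cap_left, val))             # branch: skip item i
        if weights[i] <= cap_left:                       # branch: take item i
            stack.append((k + 1, cap_left - weights[i], val + values[i]))
    return best
```

The same skeleton, with an LP solver for the bound and a separator inserted before branching, is branch and cut proper.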

SLIDE 17

Solving MIPs

Column generation

◮ In the simplex algorithm for solving the linear relaxation one repeatedly moves from the current vertex to a neighbouring one with a better objective value.
◮ Algebraically this corresponds to choosing a variable with an improving 'reduced cost' whose value is currently zero and making it positive (it 'enters the basis').
◮ In the column (i.e. variable) generation approach, a variable is not explicitly represented until it enters the basis.
◮ An algorithm (a 'pricer') is used to choose which variable(s) to create and put into the basis.
◮ Such an approach allows one to formulate problems with very many (implicitly defined) variables.
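A toy pricer makes the 'implicitly defined variables' idea concrete. In the sketch below the column family is all nonempty subsets of three rows (a hypothetical set-covering-style family, never stored explicitly), each column with unit cost; given duals y from the restricted LP, the pricer enumerates the family and returns the column with the most negative reduced cost c_j − yᵀa_j:

```python
from itertools import combinations

def price(y, cost=1.0):
    # Implicit column family: every nonempty subset of the rows
    # {0, ..., len(y)-1}; column a_j is the subset's incidence vector.
    rows = range(len(y))
    best_col, best_rc = None, 0.0
    for r in range(1, len(y) + 1):
        for col in combinations(rows, r):
            rc = cost - sum(y[i] for i in col)   # reduced cost c_j - y.a_j
            if rc < best_rc:
                best_col, best_rc = col, rc
    return best_col, best_rc
```

Returning `None` means no column has negative reduced cost, i.e. the restricted LP is already optimal for the full problem. A real pricer would of course search the family cleverly rather than enumerate it.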

SLIDE 18

Logical methods in integer programming

Recall our ZIMPL representation . . .

var x[CLIQUE_INSTS] binary;
var y[VAR_INSTS] binary;

maximize prob:
    sum <c,ci> in CLIQUE_INSTS: d[c,ci]*x[c,ci];

subto convex_clique:
    forall <c> in CLIQUES do
        sum <c,ci> in CLIQUE_INSTS: x[c,ci] == 1;

subto convex_vars:
    forall <v> in VARS do
        sum <v,vi> in VAR_INSTS: y[v,vi] == 1;

subto incidence:
    forall <c> in CLIQUES do
        forall <v> in VARS_IN[c] do
            forall <v,vi> in VAR_INSTS do
                sum <c,ci> in CLIQUE_INSTS with <c,ci> in CONSIS[v,vi]:
                    x[c,ci] == y[v,vi];

SLIDE 19

Logical methods in integer programming

A naïve first-order approach

◮ Suppose the sets CLIQUE_INSTS, CLIQUES, VARS_IN, etc. were too big to 'ground out' before solving.
◮ Suppose we had a way of enumerating these sets, i.e. an algorithm for generating every ground instance of the corresponding predicates and relations.
◮ A naïve column generation algorithm generates variables until it comes up with one with an improving reduced cost.
◮ A naïve cutting plane algorithm generates (ground) linear inequalities until it comes up with one which separates the solution of the current LP relaxation.
◮ Can we exploit techniques from first-order logic to do better?


SLIDE 21

Logical methods in integer programming

Some general points

◮ Often one wants to generate cutting planes ('valid inequalities') not in the original problem definition.
◮ The goal is to get a tight linear relaxation, so facets of the convex hull are ideal.
◮ So just spitting out ground instances of inequalities in the problem definition is typically not good enough.
◮ Column generation and separation are dual problems.
◮ Note also that if the original problem is NP-hard then the separation problem will be also.

SLIDE 22

Clausal cuts and resolution

Clausal constraints

◮ Continuing the 4-variable, 4-clique MAP running example, let a0d0 be the binary variable indicating that in the clique for {A, D} the instantiation A = 0, D = 0 was chosen.
◮ The following 4 clauses are given in the problem definition:

¬a0d0 ∨ a0b0 ∨ a0b1
¬a0b0 ∨ b0c0 ∨ b0c1
¬b0c1 ∨ c1d0 ∨ c1d1
¬a0d0 ∨ ¬c1d1

SLIDE 23

Clausal cuts and resolution

Clauses are linear inequalities

These two representations are equivalent:

¬a0d0 ∨ a0b0 ∨ a0b1      (1 − a0d0) + a0b0 + a0b1 ≥ 1
¬a0b0 ∨ b0c0 ∨ b0c1      (1 − a0b0) + b0c0 + b0c1 ≥ 1
¬b0c1 ∨ c1d0 ∨ c1d1      (1 − b0c1) + c1d0 + c1d1 ≥ 1
¬a0d0 ∨ ¬c1d1            (1 − a0d0) + (1 − c1d1) ≥ 1
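The equivalence is easy to check mechanically: a clause with positive literals P and negated literals N maps to Σ_{p∈P} x_p + Σ_{n∈N} (1 − x_n) ≥ 1, and over 0/1 points the clause is satisfied exactly when the inequality holds. A small sketch using the first clause of the example:

```python
from itertools import product

def clause_sat(pos, neg, x):
    # Clause is true if some positive literal is 1 or some negated one is 0.
    return any(x[p] == 1 for p in pos) or any(x[n] == 0 for n in neg)

def ineq_holds(pos, neg, x):
    # The corresponding linear inequality, evaluated at a 0/1 point.
    return sum(x[p] for p in pos) + sum(1 - x[n] for n in neg) >= 1

# First clause of the example: ¬a0d0 ∨ a0b0 ∨ a0b1.
pos, neg = ["a0b0", "a0b1"], ["a0d0"]
agree = all(
    clause_sat(pos, neg, x) == ineq_holds(pos, neg, x)
    for vals in product((0, 1), repeat=3)
    for x in [dict(zip(["a0d0", "a0b0", "a0b1"], vals))]
)
```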

SLIDE 24

Clausal cuts and resolution

Logic versus arithmetic

◮ It is interesting to compare numerical inequalities to logic formulae . . .

SLIDE 25

Clausal cuts and resolution

Solving the linear relaxation

If we solve the linear relaxation of our MAP problem we get the following values for these 4 clauses:

1    ¬a0d0 ∨ a0b0 ∨ a0b1    (1)
3/2  ¬a0b0 ∨ b0c0 ∨ b0c1    (2)
1    ¬b0c1 ∨ c1d0 ∨ c1d1    (3)
3/2  ¬a0d0 ∨ ¬c1d1          (4)

SLIDE 26

Clausal cuts and resolution

Using resolution to generate a cutting plane

By applying resolution on:

1    ¬a0d0 ∨ a0b0 ∨ a0b1    (5)
3/2  ¬a0b0 ∨ b0c0 ∨ b0c1    (6)
1    ¬b0c1 ∨ c1d0 ∨ c1d1    (7)
3/2  ¬a0d0 ∨ ¬c1d1          (8)

one can derive

1/2  ¬a0d0 ∨ a0b1 ∨ b0c0 ∨ c1d0    (9)

which is a facet (no less!) of the four-clique MAP problem and separates the solution of the linear relaxation.

◮ How might we generate other clausal cuts?
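The derivation of (9) can be replayed mechanically as three resolution steps. A sketch with clauses as sets of signed literals; the slide does not spell out the pivot order, so the chain below (pivoting on a0b0, then b0c1, then c1d1) is one possible derivation:

```python
def resolve(c1, c2, atom):
    # Resolve two clauses on `atom`: c1 must contain it positively,
    # c2 negatively; the resolvent is the union minus the clashing pair.
    assert (1, atom) in c1 and (-1, atom) in c2
    return (c1 | c2) - {(1, atom), (-1, atom)}

c5 = frozenset({(-1, "a0d0"), (1, "a0b0"), (1, "a0b1")})
c6 = frozenset({(-1, "a0b0"), (1, "b0c0"), (1, "b0c1")})
c7 = frozenset({(-1, "b0c1"), (1, "c1d0"), (1, "c1d1")})
c8 = frozenset({(-1, "a0d0"), (-1, "c1d1")})

r = resolve(resolve(resolve(c5, c6, "a0b0"), c7, "b0c1"), c8, "c1d1")
```

Here `r` is the clause ¬a0d0 ∨ a0b1 ∨ b0c0 ∨ c1d0, i.e. cut (9); note how the duplicate ¬a0d0 contributed by (8) is absorbed by the set representation.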

SLIDE 27

Clausal cuts and resolution

Separating resolvents

John Hooker has looked into how to use resolution to find cutting planes:

  An alternative to generating all resolvents, or all input resolvents, is to generate only separating resolvents. There are efficient algorithms for doing so. Generating separating resolvents is usually much faster than generating all resolvents, but one may be obliged to solve several linear relaxations before accumulating enough resolvents to obtain a tight relaxation. [Hooker, 2007]

SLIDE 28

Clausal cuts and resolution

Guiding the search

Hooker provides this useful theorem:

  If x̄ is a solution of linear(S) for clause set S, then a clause C in S can be a parent of a separating resolvent of S only if ∑_j x̄^C_j < 2. Furthermore, a separating resolvent can be obtained from C only by resolving on a variable x_k for which x̄_k is fractional. [Hooker, 2007]
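Applied to the running example, the theorem prunes parent candidates cheaply. The sketch below uses a hypothetical fractional point x̄, chosen here only to reproduce the clause values 1, 3/2, 1, 3/2 reported on the earlier slide (the slides do not give x̄ itself), and keeps a clause as a possible parent only when the left-hand side of its inequality at x̄ is below 2:

```python
# Hypothetical fractional LP point (assumed; consistent with the
# clause values 1, 3/2, 1, 3/2 reported for clauses (1)-(4)).
xbar = {"a0d0": 0.5, "a0b0": 0.5, "a0b1": 0.0, "b0c0": 0.5,
        "b0c1": 0.5, "c1d0": 0.5, "c1d1": 0.0}

# Each clause as (positive literals, negated literals).
clauses = [(["a0b0", "a0b1"], ["a0d0"]),
           (["b0c0", "b0c1"], ["a0b0"]),
           (["c1d0", "c1d1"], ["b0c1"]),
           ([], ["a0d0", "c1d1"])]

def lhs(pos, neg, x):
    # Left-hand side of the clause's inequality, evaluated at x.
    return sum(x[p] for p in pos) + sum(1 - x[n] for n in neg)

# Hooker's filter: possible parents have lhs < 2, and resolution
# may only pivot on variables that are fractional at xbar.
parents = [c for c in clauses if lhs(*c, xbar) < 2]
pivots = {v for v, val in xbar.items() if 0 < val < 1}
```

At this particular point all four clauses survive the lhs < 2 test, and the fractional-pivot rule permits the a0b0, b0c1 pivots used in the derivation of cut (9).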

SLIDE 29

Clausal cuts and resolution

First-order resolution

Just as

¬a0d0 ∨ a0b0 ∨ a0b1
¬a0b0 ∨ b0c0 ∨ b0c1
¬b0c1 ∨ c1d0 ∨ c1d1
¬a0d0 ∨ ¬c1d1

implies

¬a0d0 ∨ a0b1 ∨ b0c0 ∨ c1d0    (10)

SLIDE 30

Clausal cuts and resolution

First-order resolution

¬a_x d_x ∨ a_x b_x ∨ a_x b_f(x)
¬a_x b_x ∨ b_x c_x ∨ b_x c_f(x)
¬b_x c_f(x) ∨ c_f(x) d_x ∨ c_f(x) d_f(x)
¬a_x d_x ∨ ¬c_f(x) d_f(x)

implies

¬a_x d_x ∨ a_x b_f(x) ∨ b_x c_x ∨ c_f(x) d_x    (10)

SLIDE 31

Clausal cuts and resolution

First-order resolution

Or, in normal notation,

¬ad(X, X) ∨ ab(X, X) ∨ ab(X, f(X))
¬ab(X, X) ∨ bc(X, X) ∨ bc(X, f(X))
¬bc(X, f(X)) ∨ cd(f(X), X) ∨ cd(f(X), f(X))
¬ad(X, X) ∨ ¬cd(f(X), f(X))

implies

¬ad(X, X) ∨ ab(X, f(X)) ∨ bc(X, X) ∨ cd(f(X), X)    (10)


SLIDE 33

Clausal cuts and resolution

Two valid inequalities for the price of one

Together with the equalities

f(0) = 1    f(1) = 0

the clause

¬ad(X, X) ∨ ab(X, f(X)) ∨ bc(X, X) ∨ cd(f(X), X)

represents two 'ground' valid inequalities.

◮ By symmetry, either both or neither are facets.
◮ In fact both are facets.
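Grounding the lifted clause is mechanical: substitute X ∈ {0, 1} and apply f. A small sketch, with literals encoded as sign/predicate/argument tuples:

```python
f = {0: 1, 1: 0}   # the two equalities f(0) = 1, f(1) = 0

def ground(X):
    # Instantiate ¬ad(X,X) ∨ ab(X,f(X)) ∨ bc(X,X) ∨ cd(f(X),X).
    return [("-", "ad", X, X), ("+", "ab", X, f[X]),
            ("+", "bc", X, X), ("+", "cd", f[X], X)]

ground_clauses = [ground(0), ground(1)]
```

Here `ground(0)` is ¬ad(0,0) ∨ ab(0,1) ∨ bc(0,0) ∨ cd(1,0) — in the earlier ground notation, exactly the facet ¬a0d0 ∨ a0b1 ∨ b0c0 ∨ c1d0 — and `ground(1)` is its symmetric twin.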

SLIDE 34

Clausal cuts and resolution

Exploiting symmetry further

It's easy to see that this clause

¬ad(X, X) ∨ ab(X, f(X)) ∨ bc(X, X) ∨ cd(f(X), X)    (11)

can be generalised to (something like) this:

B = D ∧ A = C → ¬p(A, D, X, X) ∨ p(A, B, X, f(X)) ∨ p(B, C, X, X) ∨ p(C, D, f(X), X)    (12)

SLIDE 35

Clausal cuts and resolution

First-order linear programs for first-order MDPs

◮ In Practical solution techniques for first-order MDPs, Sanner & Boutilier (SB) represent a value function as a weighted sum of k first-order basis functions [Sanner and Boutilier, 2009].
◮ Example first-order basis function for state s:
    if ∃b : BoxIn(b, paris, s) then value = 1; otherwise value = 0
◮ To get the optimal value function (with the given basis functions) SB use a 'first-order linear program' to get the optimal weights for the k basis functions.
◮ There is a constraint for every state:
    0 ≥ case_j,1(w, s) ⊕ · · · ⊕ case_j,l(j)(w, s), ∀s
◮ First-order resolution is used to search for a 'most violated constraint'.

SLIDE 36

Prospects for first-order techniques

Conclusion

◮ First-order representations and algorithms are already being used.
◮ First-order techniques (e.g. resolution) have to be altered to be effective . . .
◮ . . . and they do not make the tough problems disappear.
◮ But they provide an attractive approach to exploiting symmetry in MIPs.

SLIDE 37

Prospects for first-order techniques

References

Hooker, J. (2007). Integrated Methods for Optimization. Springer.
Sanner, S. and Boutilier, C. (2009). Practical solution techniques for first-order MDPs. Artificial Intelligence, 173:748–788.