Computational Optimization Duality Theory (MW 12.9) Prof. K. - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Computational Optimization

Duality Theory (MW 12.9)

  • Prof. K. Bennett

Bennek@rpi.edu

http://www.rpi.edu/~bennek/compopt/

slide-2
SLIDE 2

RENSSELAER

DRUG TRIVIA

  • In 1999 USA $25B/yr for R&D of pharmaceuticals (33% clinicals)
  • Worth their weight in gold
  • 10-15 years from conception to market for a drug
  • Development cost: $0.5B/drug in 1999, now over $1.5B
  • First-year sales > $1B/drug
  • 1 drug approved/5000 compounds tested
  • 1 out of 100 drugs succeeds to market
  • 19 Alzheimer’s drugs in development
  • 20,000,000 Americans with Alzheimer’s by 2050
slide-3
SLIDE 3

HIV Reverse-Transcriptase Inhibition Modeling

  • We have a few molecules that have been tested.
  • Can we predict whether a new molecule will inhibit HIV?
  • Help solve the world HIV problem.

[Figure: chemical structures of the tested molecules, with variable substituents R, R1, R2, and X]

slide-4
SLIDE 4

What do we know?

  • The bioactivities of a small set of molecules
  • Many descriptors for each molecule: molecular weight, electrostatic potential, ionization potential
  • Can we predict a molecule's bioactivity?

slide-5
SLIDE 5

Drug Discovery Application

GOAL: Predict bioactivities of molecules in order to decrease the need for expensive lab experiments.

Given: molecule descriptors $x_i \in \mathbb{R}^n$ with known bioactivity $y_i = 1$ or $-1$.

Find a predictive model:

$$f(x) = \operatorname{sign}(w'x - b) = \operatorname{sign}\Big(\sum_{j=1}^{n} w_j x_j - b\Big) \approx y$$
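The predictive model is a linear classifier: a weighted sum of the descriptors, shifted by b, with only the sign kept. A minimal sketch follows; the weights, threshold, and descriptor values are made up for illustration, and mapping a score of exactly zero to +1 is an assumption the slide does not address.

```python
def predict(w, b, x):
    """Return sign(w'x - b) as +1 or -1 (a score of exactly 0 is mapped to +1 by assumption)."""
    score = sum(wj * xj for wj, xj in zip(w, x)) - b
    return 1 if score >= 0 else -1

# Hypothetical weights, threshold, and descriptor vectors (not real molecular data).
w = [0.5, -1.0, 2.0]
b = 1.0
molecules = [
    [2.0, 0.5, 1.0],   # score = 1.0 - 0.5 + 2.0 - 1.0 = 1.5
    [1.0, 2.0, 0.0],   # score = 0.5 - 2.0 + 0.0 - 1.0 = -2.5
]

labels = [predict(w, b, x) for x in molecules]
print(labels)
```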

slide-6
SLIDE 6

Best Linear Separator?

slide-7
SLIDE 7

Best Linear Separator?

slide-8
SLIDE 8

Best Linear Separator?

slide-9
SLIDE 9

Best Linear Separator?

slide-10
SLIDE 10

Best Linear Separator?

slide-11
SLIDE 11

Find Closest Points in Convex Hulls

[Figure: closest points c and d, one in each class's convex hull]

slide-12
SLIDE 12

Plane Bisects Closest Points

$$x \cdot w = b, \qquad w = d - c$$

[Figure: the plane bisecting the segment between the closest points c and d]

slide-13
SLIDE 13

Find using quadratic program

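$$\begin{aligned}
\min_{\alpha}\quad & \tfrac{1}{2}\,\lVert c - d \rVert^2 \\
\text{s.t.}\quad & c = \sum_{i \in \text{Class } 1} \alpha_i x_i, \qquad d = \sum_{i \in \text{Class } -1} \alpha_i x_i, \\
& \sum_{i \in \text{Class } 1} \alpha_i = 1, \qquad \sum_{i \in \text{Class } -1} \alpha_i = 1, \\
& \alpha_i \ge 0, \quad i = 1, \dots, n
\end{aligned}$$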

Quadratic objective with linear constraints
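One way to see this quadratic program in action is to solve it on a toy data set. The sketch below uses a Frank-Wolfe iteration with exact line search. That solver choice is an assumption made here for a dependency-free example; the slides do not name a solver, and the data points are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def combo(alpha, points):
    """Convex combination sum_i alpha_i * x_i."""
    dim = len(points[0])
    return [sum(a * p[k] for a, p in zip(alpha, points)) for k in range(dim)]

def closest_points(pos, neg, iters=1000, tol=1e-12):
    """Frank-Wolfe on min 0.5*||c - d||^2 with alpha on the two simplices."""
    a = [1.0 / len(pos)] * len(pos)   # multipliers for Class 1
    g = [1.0 / len(neg)] * len(neg)   # multipliers for Class -1
    for _ in range(iters):
        c, d = combo(a, pos), combo(g, neg)
        w = [ci - di for ci, di in zip(c, d)]
        # Linear minimization step: pick the best vertex in each class.
        p = min(range(len(pos)), key=lambda i: dot(pos[i], w))
        q = max(range(len(neg)), key=lambda i: dot(neg[i], w))
        v = [pi - qi for pi, qi in zip(pos[p], neg[q])]
        diff = [vi - wi for vi, wi in zip(v, w)]
        gap = -dot(w, diff)           # Frank-Wolfe gap, zero at the optimum
        if gap <= tol:
            break
        t = min(1.0, gap / dot(diff, diff))   # exact line search, clipped to [0, 1]
        a = [(1 - t) * ai for ai in a]
        a[p] += t
        g = [(1 - t) * gi for gi in g]
        g[q] += t
    return a, g, combo(a, pos), combo(g, neg)

# Hypothetical, linearly separable toy data.
pos = [(2.0, 0.0), (3.0, 1.0), (3.0, -1.0)]
neg = [(0.0, 0.0), (-1.0, 1.0), (-1.0, -1.0)]

alpha_pos, alpha_neg, c, d = closest_points(pos, neg)
w = [ci - di for ci, di in zip(c, d)]
distance = math.sqrt(dot(w, w))
print(c, d, distance)
```

For this data the closest points are the vertices (2, 0) and (0, 0), so the distance between the hulls is 2.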

slide-14
SLIDE 14

Best Linear Separator: Supporting Plane Method

$$x \cdot w = \delta, \qquad x \cdot w = \beta$$

Maximize the distance between two parallel supporting planes.

Distance = "Margin" = $\dfrac{\delta - \beta}{\lVert w \rVert}$
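To make the margin formula concrete, the sketch below computes the two supporting planes for a given normal vector w and evaluates (δ - β)/||w||. The data points and the candidate normals are hypothetical; for a fixed w, δ is taken as the smallest projection of Class 1 and β as the largest projection of Class -1.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def supporting_planes(w, class_pos, class_neg):
    """Return (delta, beta, margin) for the normal vector w.

    delta is the largest value with x.w >= delta for all Class 1 points,
    beta is the smallest value with x.w <= beta for all Class -1 points.
    """
    delta = min(dot(x, w) for x in class_pos)
    beta = max(dot(x, w) for x in class_neg)
    margin = (delta - beta) / math.sqrt(dot(w, w))
    return delta, beta, margin

# Hypothetical toy data.
class_pos = [(2.0, 0.0), (3.0, 1.0), (3.0, -1.0)]
class_neg = [(0.0, 0.0), (-1.0, 1.0), (-1.0, -1.0)]

delta, beta, margin = supporting_planes((2.0, 0.0), class_pos, class_neg)
print(delta, beta, margin)

# Rescaling w leaves the margin unchanged; a tilted normal gives a smaller one.
_, _, margin_scaled = supporting_planes((1.0, 0.0), class_pos, class_neg)
_, _, margin_tilted = supporting_planes((1.0, 1.0), class_pos, class_neg)
```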

slide-15
SLIDE 15

Maximize margin using quadratic program

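$$\begin{aligned}
\min_{w,\delta,\beta}\quad & \tfrac{1}{2}\lVert w \rVert^2 - \delta + \beta \\
\text{s.t.}\quad & x_i \cdot w \ge \delta, \quad i \in \text{Class } 1, \\
& x_i \cdot w \le \beta, \quad i \in \text{Class } -1
\end{aligned}$$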

slide-16
SLIDE 16

Dual of Closest Points Method is Support Plane Method

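$$\begin{aligned}
\min_{\alpha}\quad & \tfrac{1}{2}\Big\lVert \sum_{i=1}^{n} y_i \alpha_i x_i \Big\rVert^2 \\
\text{s.t.}\quad & \sum_{i \in \text{Class } 1} \alpha_i = 1, \qquad \sum_{i \in \text{Class } -1} \alpha_i = 1, \\
& \alpha_i \ge 0
\end{aligned}
\qquad \Longleftrightarrow \qquad
\begin{aligned}
\min_{w,\delta,\beta}\quad & \tfrac{1}{2}\lVert w \rVert^2 - \delta + \beta \\
\text{s.t.}\quad & x_i \cdot w - \delta \ge 0, \quad i \in \text{Class } 1, \\
& -x_i \cdot w + \beta \ge 0, \quad i \in \text{Class } -1
\end{aligned}$$

  • Solution only depends on support vectors: $\alpha_i > 0$

$$w = \sum_{i=1}^{n} y_i \alpha_i x_i, \qquad y_i = \begin{cases} 1 & i \in \text{Class } 1 \\ -1 & i \in \text{Class } -1 \end{cases}$$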
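The claim that the solution depends only on the support vectors can be checked directly: points with a zero multiplier contribute nothing to w. In the sketch below the data are hypothetical, and the multipliers are stated rather than computed; they are the closest-point solution for this particular toy set, where the closest points are (2, 0) and (0, 0).

```python
# Hypothetical toy data: three points per class.
X = [(2.0, 1.0), (2.0, -1.0), (4.0, 0.0),      # Class 1
     (0.0, 0.0), (-1.0, 2.0), (-1.0, -2.0)]    # Class -1
y = [1, 1, 1, -1, -1, -1]

# Multipliers assumed known for this data (they sum to 1 within each class).
alpha = [0.5, 0.5, 0.0, 1.0, 0.0, 0.0]

def weight_vector(X, y, alpha):
    """w = sum_i y_i * alpha_i * x_i."""
    dim = len(X[0])
    return [sum(yi * ai * xi[k] for xi, yi, ai in zip(X, y, alpha))
            for k in range(dim)]

# Support vectors are the points with alpha_i > 0.
support = [i for i, a in enumerate(alpha) if a > 0]

w_all = weight_vector(X, y, alpha)
w_sv = weight_vector([X[i] for i in support],
                     [y[i] for i in support],
                     [alpha[i] for i in support])
print(support, w_all, w_sv)
```

Dropping the non-support points changes neither w nor, therefore, the separating plane.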

slide-17
SLIDE 17

Two views – one problem

The closest-point formulation yields the same solution as the parallel-planes formulation. This is not a coincidence.

Duality theory is a systematic way to formulate and investigate these problems.

Many different kinds: Lagrangian, Wolfe, Conjugate

slide-18
SLIDE 18

Recall Lagrangian Function

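$$\min_x\ f(x) \quad \text{s.t.} \quad g_i(x) \ge 0, \quad i = 1, \dots, m$$

where $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}^m$ are differentiable.

$$L(x, \lambda) = f(x) - \sum_{j=1}^{m} \lambda_j g_j(x)$$

$$\nabla_x L(x, \lambda) = \nabla_x f(x) - \sum_{i=1}^{m} \lambda_i \nabla_x g_i(x)$$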

slide-19
SLIDE 19

KKT Conditions

Primal Feasibility:

$$g_i(x) \ge 0, \quad i = 1, \dots, m$$

Dual Feasibility:

$$\nabla_x L(x, \lambda) = \nabla f(x) - \sum_{i=1}^{m} \lambda_i \nabla g_i(x) = 0, \qquad \lambda \ge 0$$

Complementarity:

$$\lambda_i\, g_i(x) = 0, \quad i = 1, \dots, m$$
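The KKT conditions can be checked mechanically at a candidate point. The sketch below does so for a small example chosen here for illustration (it is not from the slides): minimize x1² + x2² subject to g(x) = x1 + x2 - 2 ≥ 0, whose solution is x = (1, 1) with multiplier λ = 2.

```python
def f_grad(x):
    """Gradient of f(x) = x1^2 + x2^2."""
    return [2.0 * x[0], 2.0 * x[1]]

def g(x):
    """Single inequality constraint g(x) = x1 + x2 - 2 >= 0."""
    return x[0] + x[1] - 2.0

def g_grad(x):
    return [1.0, 1.0]

def kkt_satisfied(x, lam, tol=1e-9):
    """Check primal feasibility, dual feasibility, and complementarity."""
    stationarity = [fg - lam * gg for fg, gg in zip(f_grad(x), g_grad(x))]
    primal_ok = g(x) >= -tol
    dual_ok = all(abs(s) <= tol for s in stationarity) and lam >= -tol
    complementary = abs(lam * g(x)) <= tol
    return primal_ok and dual_ok and complementary

print(kkt_satisfied([1.0, 1.0], 2.0))   # candidate optimum
print(kkt_satisfied([2.0, 0.0], 2.0))   # feasible, but the gradient condition fails
print(kkt_satisfied([0.0, 0.0], 0.0))   # stationary for f, but infeasible
```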

slide-20
SLIDE 20

Lagrangian Duality

Base Problem:

$$\min_x\ f(x) \quad \text{s.t.} \quad g(x) \ge 0, \quad x \in X$$

Lagrangian function:

$$L(x, \lambda) = f(x) - \sum_{j=1}^{m} \lambda_j g_j(x)$$

slide-21
SLIDE 21

PRIMAL Problem

Primal objective:

$$L^*(x) = \max_{\lambda \ge 0} L(x, \lambda)$$

Primal problem (min max):

$$\min_{x \in X} L^*(x) = \min_{x \in X} \max_{\lambda \ge 0} L(x, \lambda)$$

slide-22
SLIDE 22

Same as original

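Primal objective:

$$L^*(x) = \max_{\lambda \ge 0} L(x, \lambda) = \max_{\lambda \ge 0}\ f(x) - \lambda' g(x) = \begin{cases} f(x) & \text{if } g(x) \ge 0 \\ \infty & \text{if } g_i(x) < 0 \text{ for some } i \end{cases}$$

Primal problem:

$$\min_{x \in X} L^*(x) = \min_{x \in X} \max_{\lambda \ge 0} L(x, \lambda) = \min_{x \in X} f(x) \quad \text{s.t.} \quad g(x) \ge 0$$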
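A small numerical illustration of why the inner maximization recovers the original problem: at a feasible x the best multiplier is λ = 0, so the maximum equals f(x), while at an infeasible x the Lagrangian grows without bound in λ. The functions f(x) = x² and g(x) = x - 1 are assumptions chosen for this sketch.

```python
import math

def f(x):
    return x * x

def g(x):
    return x - 1.0          # feasible when x >= 1

def lagrangian(x, lam):
    return f(x) - lam * g(x)

def primal_objective(x):
    """L*(x) = max over lam >= 0 of L(x, lam), written in closed form."""
    return f(x) if g(x) >= 0 else math.inf

lams = [0.0, 1.0, 10.0, 100.0, 1000.0]

# Feasible point x = 2: values fall as lam grows, so the max is at lam = 0.
feasible_vals = [lagrangian(2.0, lam) for lam in lams]

# Infeasible point x = 0: values rise without bound as lam grows.
infeasible_vals = [lagrangian(0.0, lam) for lam in lams]

print(feasible_vals, primal_objective(2.0))
print(infeasible_vals, primal_objective(0.0))
```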

slide-23
SLIDE 23

Dual Problem

Dual objective:

$$L^*(\lambda) = \min_{x \in X} L(x, \lambda)$$

Dual problem (max min):

$$\max_{\lambda \ge 0} L^*(\lambda) = \max_{\lambda \ge 0} \min_{x \in X} L(x, \lambda)$$

slide-24
SLIDE 24

Dual Problem

Dual objective:

$$L^*(\lambda) = \min_{x \in X} L(x, \lambda)$$

Dual problem (max min):

$$\max_{\lambda \ge 0} L^*(\lambda) = \max_{\lambda \ge 0} \min_{x \in X} L(x, \lambda)$$

The dual objective $L^*(\lambda)$ is always concave!

slide-25
SLIDE 25

Explicit Form of Dual

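Some problems have an explicit form of the dual objective. Exploit differentiability and convexity.

$$\max_x\ \log(x) \quad \text{s.t.} \quad x \le 1$$

Equivalently, minimize $-\log(x)$ subject to $g(x) = -x + 1 \ge 0$, so the dual objective is

$$L^*(\lambda) = \min_x\ -\log(x) - \lambda(-x + 1)$$

$$\nabla_x L(x, \lambda) = -\frac{1}{x} + \lambda = 0 \ \Rightarrow\ x = \frac{1}{\lambda}$$

$$L^*(\lambda) = -\log(1/\lambda) - \lambda(-1/\lambda + 1) = \log(\lambda) - \lambda + 1$$

[Figure: plot of log(x)-x for x from 1 to 10]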
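The explicit dual can be sanity-checked numerically. The sketch below takes the Lagrangian L(x, λ) = -log(x) - λ(1 - x) for the problem of minimizing -log(x) subject to x ≤ 1, minimizes it over x by ternary search (valid because L is convex in x for λ > 0), and compares the result with the closed form log(λ) - λ + 1 obtained by substituting x = 1/λ. The search interval and the sampled values of λ are assumptions of this example.

```python
import math

def lagrangian(x, lam):
    """L(x, lam) = -log(x) - lam * (1 - x) for min -log(x) s.t. 1 - x >= 0."""
    return -math.log(x) - lam * (1.0 - x)

def dual_closed_form(lam):
    """Value of L at the stationary point x = 1/lam."""
    return math.log(lam) - lam + 1.0

def dual_numeric(lam, lo=1e-6, hi=100.0, iters=200):
    """Minimize L(x, lam) over x in [lo, hi] by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if lagrangian(m1, lam) < lagrangian(m2, lam):
            hi = m2
        else:
            lo = m1
    return lagrangian((lo + hi) / 2.0, lam)

for lam in (0.5, 1.0, 2.0, 4.0):
    print(lam, dual_numeric(lam), dual_closed_form(lam))
```

The primal optimum is x = 1 with objective value -log(1) = 0. The dual values never exceed 0 and reach it at λ = 1, so there is no duality gap in this example.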
slide-26
SLIDE 26

Weak Duality

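For any primal feasible $x^*$, with $g(x^*) \ge 0$, and any dual feasible $(x, \lambda)$, with $\lambda \ge 0$ and $x \in \arg\min_{x \in X} f(x) - \lambda' g(x)$, the following holds:

$$f(x^*) \ge L(x, \lambda) = f(x) - \lambda' g(x)$$

This follows because $f(x^*) \ge f(x^*) - \lambda' g(x^*) = L(x^*, \lambda) \ge \min_{x \in X} L(x, \lambda)$.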

slide-27
SLIDE 27

Wolfe Duality

For a differentiable convex program, Lagrangian duality can be simplified:

$$\max_{x, \lambda}\ L(x, \lambda) \quad \text{s.t.} \quad \nabla_x L(x, \lambda) = 0, \quad \lambda \ge 0$$

slide-28
SLIDE 28

Lagrangian Duality

Dual function:

$$L^*(\lambda) = \min_x L(x, \lambda)$$

If the problem is convex:

$$L^*(\lambda) = \min_x L(x, \lambda) \ \Rightarrow\ \nabla_x L(x, \lambda) = \nabla_x f(x) - \sum_{i=1}^{m} \lambda_i \nabla_x g_i(x) = 0$$

slide-29
SLIDE 29

Dual Problem

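Dual problem:

$$\max_{\lambda \ge 0} L^*(\lambda) = \max_{\lambda \ge 0} \min_x L(x, \lambda)$$

If the problem is convex, this simplifies to:

$$\max_{x, \lambda}\ L(x, \lambda) \quad \text{s.t.} \quad \nabla_x L(x, \lambda) = 0, \quad \lambda \ge 0$$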

slide-30
SLIDE 30

Weak Duality

For any primal feasible $x^*$, with $g(x^*) \ge 0$, and any dual feasible $(x, \lambda)$, with $\nabla_x L(x, \lambda) = 0$ and $\lambda \ge 0$, the following holds:

$$f(x^*) \ge L(x, \lambda) = f(x) - \lambda' g(x)$$

If the two values are equal, both points must be optimal!

slide-31
SLIDE 31

General Idea

  • Primal: minimizes the primal function subject to the primal constraints.
  • Dual: maximizes the dual function with respect to the dual variables λ ≥ 0.
  • At optimality the primal and dual functions are equal (requires assumptions: strong duality).

slide-32
SLIDE 32

Primal QP

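$$\begin{aligned}
\min_{w,\delta,\beta}\quad & \tfrac{1}{2}\lVert w \rVert^2 - \delta + \beta \\
\text{s.t.}\quad & x_i \cdot w \ge \delta, \quad i \in \text{Class } 1, \\
& x_i \cdot w \le \beta, \quad i \in \text{Class } -1
\end{aligned}$$

$$L(w, \delta, \beta, \alpha) = \tfrac{1}{2}\lVert w \rVert^2 - \delta + \beta - \sum_{i \in \text{Class } 1} \alpha_i (x_i \cdot w - \delta) - \sum_{i \in \text{Class } -1} \alpha_i (-x_i \cdot w + \beta)$$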

slide-33
SLIDE 33

KKT

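Primal Feasibility:

$$x_i \cdot w \ge \delta, \quad i \in \text{Class } 1; \qquad x_i \cdot w \le \beta, \quad i \in \text{Class } -1$$

Dual Feasibility:

$$\nabla_w L(w, \delta, \beta, \alpha) = w - \sum_{i \in \text{Class } 1} \alpha_i x_i + \sum_{i \in \text{Class } -1} \alpha_i x_i = 0$$

$$\nabla_\delta L(w, \delta, \beta, \alpha) = -1 + \sum_{i \in \text{Class } 1} \alpha_i = 0$$

$$\nabla_\beta L(w, \delta, \beta, \alpha) = 1 - \sum_{i \in \text{Class } -1} \alpha_i = 0$$

$$\alpha \ge 0$$

plus complementarity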

slide-34
SLIDE 34

Dual Problem

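$$\begin{aligned}
\max_{w,\delta,\beta,\alpha}\quad & L(w, \delta, \beta, \alpha) \\
\text{s.t.}\quad & w = \sum_{i \in \text{Class } 1} \alpha_i x_i - \sum_{i \in \text{Class } -1} \alpha_i x_i, \\
& \sum_{i \in \text{Class } 1} \alpha_i = 1, \qquad \sum_{i \in \text{Class } -1} \alpha_i = 1, \\
& \alpha_i \ge 0
\end{aligned}$$

  • Remove w by substitution and simplify.
  • Convert to a min problem.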
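The substitution can be written out as follows. This is a sketch that assumes the Lagrangian of the supporting-plane QP has the form shown in the first line; the slides do not show the intermediate steps. The dual constraints make the δ and β terms vanish, and w equals the weighted sum of the points.

```latex
\begin{aligned}
L(w,\delta,\beta,\alpha)
 &= \tfrac{1}{2}\lVert w\rVert^2 - \delta + \beta
    - \sum_{i \in \text{Class } 1} \alpha_i (x_i \cdot w - \delta)
    - \sum_{i \in \text{Class } -1} \alpha_i (-x_i \cdot w + \beta) \\
 &= \tfrac{1}{2}\lVert w\rVert^2
    - w \cdot \Big( \sum_{i \in \text{Class } 1} \alpha_i x_i - \sum_{i \in \text{Class } -1} \alpha_i x_i \Big)
    + \delta \Big( \sum_{i \in \text{Class } 1} \alpha_i - 1 \Big)
    + \beta \Big( 1 - \sum_{i \in \text{Class } -1} \alpha_i \Big) \\
 &= \tfrac{1}{2}\lVert w\rVert^2 - w \cdot w
  = -\tfrac{1}{2}\lVert w\rVert^2
  = -\tfrac{1}{2}\Big\lVert \sum_{i} y_i \alpha_i x_i \Big\rVert^2
\end{aligned}
```

Maximizing this expression is the same as minimizing $\tfrac{1}{2}\lVert \sum_i y_i \alpha_i x_i \rVert^2$ over the multipliers, which is the closest-points quadratic program.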

slide-35
SLIDE 35

Recall quadratic program

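$$\begin{aligned}
\min_{\alpha}\quad & \tfrac{1}{2}\Big\lVert \sum_{i=1}^{n} y_i \alpha_i x_i \Big\rVert^2,
\qquad c = \sum_{i \in \text{Class } 1} \alpha_i x_i, \quad d = \sum_{i \in \text{Class } -1} \alpha_i x_i \\
\text{s.t.}\quad & \sum_{i \in \text{Class } 1} \alpha_i = 1, \qquad \sum_{i \in \text{Class } -1} \alpha_i = 1, \\
& \alpha_i \ge 0, \quad i = 1, \dots, n
\end{aligned}$$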

Quadratic objective with linear constraints

slide-36
SLIDE 36

Linear Programming Duality = Special Case

Primal:

$$\min_x\ b'x \quad \text{s.t.} \quad Ax \ge c$$

$$L(x, y) = b'x - y'(Ax - c)$$

slide-37
SLIDE 37

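Linear Programming Duality = Special Case

Dual:

$$\max_{x, y}\ b'x - y'(Ax - c) \quad \text{s.t.} \quad \nabla_x L(x, y) = b - A'y = 0, \quad y \ge 0$$

Simplify by

$$x' \nabla_x L(x, y) = x'(b - A'y) = 0$$

so that the objective $b'x - y'(Ax - c) = x'(b - A'y) + c'y$ reduces to $c'y$:

$$\max_y\ c'y \quad \text{s.t.} \quad A'y = b, \quad y \ge 0$$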
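The primal/dual pair for linear programs can be checked on a small instance. In the sketch below the matrix A and the vectors b and c are made-up numbers, and the feasible points are stated by hand rather than found by a solver. Every primal-feasible x should give b'x at least as large as c'y for every dual-feasible y, with equality at the optimal pair.

```python
def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

# Hypothetical LP: min b'x  s.t.  Ax >= c
A = [[1.0, 1.0],
     [1.0, 0.0],
     [0.0, 1.0]]
b = [2.0, 3.0]        # primal cost vector
c = [4.0, 1.0, 0.0]   # constraint right-hand side

def primal_feasible(x, tol=1e-9):
    return all(ax >= ci - tol for ax, ci in zip(matvec(A, x), c))

def dual_feasible(y, tol=1e-9):
    """Dual constraints: A'y = b and y >= 0."""
    aty = matvec(transpose(A), y)
    return (all(abs(v - bi) <= tol for v, bi in zip(aty, b))
            and all(yi >= -tol for yi in y))

primal_points = [[4.0, 0.0], [2.0, 3.0]]          # hand-picked feasible x
dual_points = [[2.0, 0.0, 1.0], [1.0, 1.0, 2.0]]  # hand-picked feasible y

for x in primal_points:
    for y in dual_points:
        assert primal_feasible(x) and dual_feasible(y)
        print(x, y, dot(b, x), ">=", dot(c, y))
```

For x = (4, 0) and y = (2, 0, 1) both objective values equal 8, so by weak duality this pair is optimal.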

slide-38
SLIDE 38

Works for equality/inequality

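Primal:

$$\min_x\ f(x) \quad \text{s.t.} \quad g(x) \ge 0, \quad h(x) = 0$$

Dual:

$$\begin{aligned}
\max_{x, u, v}\quad & f(x) - u'g(x) - v'h(x) \\
\text{s.t.}\quad & \nabla_x f(x) - \sum_{i} u_i \nabla_x g_i(x) - \sum_{j} v_j \nabla_x h_j(x) = 0, \\
& u \ge 0, \quad v \text{ unconstrained}
\end{aligned}$$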

slide-39
SLIDE 39

Why Dual Problem?

  • May have nicer structure, such as easier constraints or a simpler objective function.
  • The dual problem is always the maximization of a concave function. The catch: there may be a "duality gap".
  • The dual provides a lower bound on the primal function; use it to check optimality and to generate cuts/constraints.
  • Exploit in algorithms, e.g. augmented Lagrangian and primal-dual interior point algorithms.