

  1. Convex Optimization — Boyd & Vandenberghe
     5. Duality
     • Lagrange dual problem
     • weak and strong duality
     • geometric interpretation
     • optimality conditions
     • perturbation and sensitivity analysis
     • examples
     • generalized inequalities
     Duality 5–1

  2. Lagrangian
     standard form problem (not necessarily convex)
         minimize    f_0(x)
         subject to  f_i(x) ≤ 0,  i = 1, ..., m
                     h_i(x) = 0,  i = 1, ..., p
     variable x ∈ R^n, domain D, optimal value p⋆
     Lagrangian: L : R^n × R^m × R^p → R, with dom L = D × R^m × R^p,
         L(x, λ, ν) = f_0(x) + ∑_{i=1}^m λ_i f_i(x) + ∑_{i=1}^p ν_i h_i(x)
     • weighted sum of objective and constraint functions
     • λ_i is Lagrange multiplier associated with f_i(x) ≤ 0
     • ν_i is Lagrange multiplier associated with h_i(x) = 0
     Duality 5–2

  3. Lagrange dual function
     Lagrange dual function: g : R^m × R^p → R,
         g(λ, ν) = inf_{x ∈ D} L(x, λ, ν)
                 = inf_{x ∈ D} ( f_0(x) + ∑_{i=1}^m λ_i f_i(x) + ∑_{i=1}^p ν_i h_i(x) )
     g is concave, can be −∞ for some λ, ν
     lower bound property: if λ ⪰ 0, then g(λ, ν) ≤ p⋆
     proof: if x̃ is feasible and λ ⪰ 0, then
         f_0(x̃) ≥ L(x̃, λ, ν) ≥ inf_{x ∈ D} L(x, λ, ν) = g(λ, ν)
     minimizing over all feasible x̃ gives p⋆ ≥ g(λ, ν)
     Duality 5–3
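A minimal numerical sketch of the lower bound property, on a toy one-dimensional problem of my own (minimize x^2 subject to 1 − x ≤ 0, so p⋆ = 1); the dual function works out to g(λ) = λ − λ^2/4 in closed form:

    import numpy as np

    p_star = 1.0                            # optimal value of: minimize x^2 s.t. 1 - x <= 0
    lambdas = np.linspace(0.0, 10.0, 1001)  # every lambda >= 0 is dual feasible
    g = lambdas - lambdas**2 / 4.0          # g(lambda) = inf_x (x^2 + lambda*(1 - x))

    assert np.all(g <= p_star + 1e-12)      # lower bound property: g(lambda) <= p* for all lambda >= 0
    print(g.max())                          # best bound is 1.0, attained at lambda = 2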

  4. Least-norm solution of linear equations
         minimize    x^T x
         subject to  Ax = b
     dual function
     • Lagrangian is L(x, ν) = x^T x + ν^T (Ax − b)
     • to minimize L over x, set gradient equal to zero:
           ∇_x L(x, ν) = 2x + A^T ν = 0   ⟹   x = −(1/2) A^T ν
     • plug into L to obtain g:
           g(ν) = L(−(1/2) A^T ν, ν) = −(1/4) ν^T A A^T ν − b^T ν
       a concave function of ν
     lower bound property: p⋆ ≥ −(1/4) ν^T A A^T ν − b^T ν for all ν
     Duality 5–4
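The closed-form dual above is easy to verify numerically; the sketch below uses randomly generated A and b (my own data, not from the slides) and also checks that the choice ν = −2(AA^T)^{−1}b attains the bound with equality:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 6))              # fat matrix, Ax = b underdetermined
    b = rng.standard_normal(3)

    x_star = A.T @ np.linalg.solve(A @ A.T, b)   # least-norm solution
    p_star = x_star @ x_star

    def g(nu):                                   # dual function from the slide
        return -0.25 * nu @ (A @ A.T) @ nu - b @ nu

    nu_any = rng.standard_normal(3)
    print(g(nu_any) <= p_star + 1e-9)            # True: every nu gives a lower bound
    nu_opt = -2.0 * np.linalg.solve(A @ A.T, b)
    print(np.isclose(g(nu_opt), p_star))         # True: this nu closes the gap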

  5. Standard form LP
         minimize    c^T x
         subject to  Ax = b,  x ⪰ 0
     dual function
     • Lagrangian is
           L(x, λ, ν) = c^T x + ν^T (Ax − b) − λ^T x
                      = −b^T ν + (c + A^T ν − λ)^T x
     • L is affine in x, hence
           g(λ, ν) = inf_x L(x, λ, ν) = { −b^T ν   if A^T ν − λ + c = 0
                                        { −∞       otherwise
     g is linear on affine domain {(λ, ν) | A^T ν − λ + c = 0}, hence concave
     lower bound property: p⋆ ≥ −b^T ν if A^T ν + c ⪰ 0
     Duality 5–5
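As a hedged illustration (my own random data, using scipy rather than anything prescribed by the slides), one can solve this primal LP and its dual numerically and observe that the bound is tight:

    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(1)
    A = rng.standard_normal((3, 6))
    b = A @ rng.uniform(0.5, 1.5, 6)             # b = A x0 with x0 > 0, so the primal is strictly feasible
    c = A.T @ rng.standard_normal(3) + rng.uniform(0.5, 1.5, 6)   # keeps the primal bounded below

    # primal:  minimize c^T x      s.t. Ax = b, x >= 0
    primal = linprog(c, A_eq=A, b_eq=b, bounds=(0, None))
    # dual:    maximize -b^T nu    s.t. A^T nu + c >= 0, written as a minimization
    dual = linprog(b, A_ub=-A.T, b_ub=c, bounds=(None, None))

    print(primal.fun, -dual.fun)                 # p* and d* agree (strong duality for LPs)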

  6. Equality constrained norm minimization
         minimize    ‖x‖
         subject to  Ax = b
     dual function
         g(ν) = inf_x ( ‖x‖ − ν^T Ax + b^T ν ) = { b^T ν   if ‖A^T ν‖_∗ ≤ 1
                                                 { −∞      otherwise
     where ‖v‖_∗ = sup_{‖u‖ ≤ 1} u^T v is dual norm of ‖·‖
     proof: follows from inf_x ( ‖x‖ − y^T x ) = 0 if ‖y‖_∗ ≤ 1, −∞ otherwise
     • if ‖y‖_∗ ≤ 1, then ‖x‖ − y^T x ≥ 0 for all x, with equality if x = 0
     • if ‖y‖_∗ > 1, choose x = t u where ‖u‖ ≤ 1, u^T y = ‖y‖_∗ > 1:
           ‖x‖ − y^T x = t(‖u‖ − ‖y‖_∗) → −∞ as t → ∞
     lower bound property: p⋆ ≥ b^T ν if ‖A^T ν‖_∗ ≤ 1
     Duality 5–6
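For the Euclidean norm the dual norm is again the Euclidean norm and p⋆ = ‖A⁺b‖_2, so the bound is easy to check numerically; the data below is my own:

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((3, 6))
    b = rng.standard_normal(3)

    x_star = np.linalg.pinv(A) @ b               # minimum Euclidean norm solution of Ax = b
    p_star = np.linalg.norm(x_star)

    nu = rng.standard_normal(3)
    nu /= np.linalg.norm(A.T @ nu)               # rescale so that ||A^T nu||_* = 1
    print(b @ nu <= p_star + 1e-9)               # True: b^T nu is a lower bound on p*

    nu_opt = np.linalg.solve(A @ A.T, b) / p_star
    print(np.isclose(b @ nu_opt, p_star))        # True: this nu attains the bound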

  7. Two-way partitioning
         minimize    x^T W x
         subject to  x_i^2 = 1,  i = 1, ..., n
     • a nonconvex problem; feasible set contains 2^n discrete points
     • interpretation: partition {1, ..., n} into two sets; W_ij is cost of
       assigning i, j to the same set; −W_ij is cost of assigning to different sets
     dual function
         g(ν) = inf_x ( x^T W x + ∑_i ν_i (x_i^2 − 1) )
              = inf_x ( x^T (W + diag(ν)) x − 1^T ν )
              = { −1^T ν   if W + diag(ν) ⪰ 0
                { −∞       otherwise
     lower bound property: p⋆ ≥ −1^T ν if W + diag(ν) ⪰ 0
     example: ν = −λ_min(W) 1 gives bound p⋆ ≥ n λ_min(W)
     Duality 5–7
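For a small instance the eigenvalue bound can be compared against brute-force enumeration of all 2^n feasible points; the instance below is my own and is only a sanity check:

    import itertools
    import numpy as np

    rng = np.random.default_rng(3)
    n = 10
    W = rng.standard_normal((n, n))
    W = (W + W.T) / 2                               # symmetric cost matrix

    # exact p* by enumerating all 2^n sign vectors (only practical for small n)
    p_star = min(np.array(s) @ W @ np.array(s)
                 for s in itertools.product((-1.0, 1.0), repeat=n))

    bound = n * np.linalg.eigvalsh(W).min()         # the choice nu = -lambda_min(W) * 1
    print(bound <= p_star + 1e-9, bound, p_star)    # True: a (typically loose) lower bound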

  8. Lagrange dual and conjugate function
         minimize    f_0(x)
         subject to  Ax ⪯ b,  Cx = d
     dual function
         g(λ, ν) = inf_{x ∈ dom f_0} ( f_0(x) + (A^T λ + C^T ν)^T x − b^T λ − d^T ν )
                 = −f_0^∗(−A^T λ − C^T ν) − b^T λ − d^T ν
     • recall definition of conjugate f^∗(y) = sup_{x ∈ dom f} ( y^T x − f(x) )
     • simplifies derivation of dual if conjugate of f_0 is known
     example: entropy maximization
         f_0(x) = ∑_{i=1}^n x_i log x_i,    f_0^∗(y) = ∑_{i=1}^n e^{y_i − 1}
     Duality 5–8
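A quick numerical check of this conjugate pair, assuming scipy is available (the optimizer call and the data are mine, not part of the slides):

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(4)
    y = rng.standard_normal(5)

    def neg_obj(x):                                   # -( y^T x - sum_i x_i log x_i )
        return -(y @ x - np.sum(x * np.log(x)))

    # maximize y^T x - f0(x) over x > 0 and compare with the closed form sum_i exp(y_i - 1)
    res = minimize(neg_obj, x0=np.ones(5), bounds=[(1e-9, None)] * 5)
    print(-res.fun, np.sum(np.exp(y - 1)))            # the two values agree to solver tolerance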

  9. The dual problem
     Lagrange dual problem
         maximize    g(λ, ν)
         subject to  λ ⪰ 0
     • finds best lower bound on p⋆, obtained from Lagrange dual function
     • a convex optimization problem; optimal value denoted d⋆
     • λ, ν are dual feasible if λ ⪰ 0, (λ, ν) ∈ dom g
     • often simplified by making implicit constraint (λ, ν) ∈ dom g explicit
     example: standard form LP and its dual (page 5–5)
         minimize    c^T x               maximize    −b^T ν
         subject to  Ax = b, x ⪰ 0       subject to  A^T ν + c ⪰ 0
     Duality 5–9

  10. Weak and strong duality
     weak duality: d⋆ ≤ p⋆
     • always holds (for convex and nonconvex problems)
     • can be used to find nontrivial lower bounds for difficult problems
       for example, solving the SDP
           maximize    −1^T ν
           subject to  W + diag(ν) ⪰ 0
       gives a lower bound for the two-way partitioning problem on page 5–7
     strong duality: d⋆ = p⋆
     • does not hold in general
     • (usually) holds for convex problems
     • conditions that guarantee strong duality in convex problems are called
       constraint qualifications
     Duality 5–10
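A sketch of solving that SDP with cvxpy (an assumed dependency, not something the slides prescribe), reusing the small W generated in the partitioning example above:

    import cvxpy as cp
    import numpy as np

    rng = np.random.default_rng(3)
    n = 10
    W = rng.standard_normal((n, n))
    W = (W + W.T) / 2                               # same W as in the brute-force check

    nu = cp.Variable(n)
    prob = cp.Problem(cp.Maximize(-cp.sum(nu)), [W + cp.diag(nu) >> 0])
    prob.solve()                                    # requires an SDP-capable solver (e.g. SCS)
    print(prob.value)    # d*: a lower bound on p*, at least as good as n * lambda_min(W)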

  11. Slater's constraint qualification
     strong duality holds for a convex problem
         minimize    f_0(x)
         subject to  f_i(x) ≤ 0,  i = 1, ..., m
                     Ax = b
     if it is strictly feasible, i.e.,
         ∃ x ∈ int D :  f_i(x) < 0,  i = 1, ..., m,   Ax = b
     • also guarantees that the dual optimum is attained (if p⋆ > −∞)
     • can be sharpened: e.g., can replace int D with relint D (interior relative
       to affine hull); linear inequalities do not need to hold with strict
       inequality, ...
     • there exist many other types of constraint qualifications
     Duality 5–11

  12. Inequality form LP
     primal problem
         minimize    c^T x
         subject to  Ax ⪯ b
     dual function
         g(λ) = inf_x ( (c + A^T λ)^T x − b^T λ ) = { −b^T λ   if A^T λ + c = 0
                                                    { −∞       otherwise
     dual problem
         maximize    −b^T λ
         subject to  A^T λ + c = 0,  λ ⪰ 0
     • from Slater's condition: p⋆ = d⋆ if Ax̃ ≺ b for some x̃
     • in fact, p⋆ = d⋆ except when primal and dual are infeasible
     Duality 5–12

  13. Quadratic program
     primal problem (assume P ∈ S^n_++)
         minimize    x^T P x
         subject to  Ax ⪯ b
     dual function
         g(λ) = inf_x ( x^T P x + λ^T (Ax − b) ) = −(1/4) λ^T A P^{−1} A^T λ − b^T λ
     dual problem
         maximize    −(1/4) λ^T A P^{−1} A^T λ − b^T λ
         subject to  λ ⪰ 0
     • from Slater's condition: p⋆ = d⋆ if Ax̃ ≺ b for some x̃
     • in fact, p⋆ = d⋆ always
     Duality 5–13
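A weak-duality spot check for this QP with random data of my own: any feasible x and any λ ⪰ 0 must satisfy g(λ) ≤ x^T P x:

    import numpy as np

    rng = np.random.default_rng(5)
    n, m = 4, 7
    M = rng.standard_normal((n, n))
    P = M @ M.T + np.eye(n)                                    # P positive definite
    A = rng.standard_normal((m, n))
    x_feas = rng.standard_normal(n)
    b = A @ x_feas + rng.uniform(0.1, 1.0, m)                  # guarantees A x_feas < b

    lam = rng.uniform(0.0, 2.0, m)                             # an arbitrary lambda >= 0
    g = -0.25 * lam @ A @ np.linalg.solve(P, A.T @ lam) - b @ lam
    print(g <= x_feas @ P @ x_feas + 1e-9)                     # True, by weak duality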

  14. A nonconvex problem with strong duality
         minimize    x^T A x + 2 b^T x
         subject to  x^T x ≤ 1
     A ⋡ 0, hence nonconvex
     dual function: g(λ) = inf_x ( x^T (A + λI) x + 2 b^T x − λ )
     • unbounded below if A + λI ⋡ 0 or if A + λI ⪰ 0 and b ∉ R(A + λI)
     • minimized by x = −(A + λI)† b otherwise: g(λ) = −b^T (A + λI)† b − λ
     dual problem and equivalent SDP:
         maximize    −b^T (A + λI)† b − λ        maximize    −t − λ
         subject to  A + λI ⪰ 0                  subject to  [ A + λI   b ]
                     b ∈ R(A + λI)                           [ b^T      t ] ⪰ 0
     strong duality although primal problem is not convex (not easy to show)
     Duality 5–14
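A numerical sketch with my own random data: every value of g(λ) on the dual-feasible range lower-bounds the objective at every point of the unit ball, and by the strong duality claimed above the best dual value approaches p⋆:

    import numpy as np

    rng = np.random.default_rng(6)
    n = 5
    M = rng.standard_normal((n, n))
    A = (M + M.T) / 2                                           # symmetric, generally indefinite
    b = rng.standard_normal(n)

    lam_min = np.linalg.eigvalsh(A).min()
    lams = np.linspace(max(0.0, -lam_min) + 1e-6, 10.0, 200)    # A + lam*I > 0 on this range
    g = np.array([-b @ np.linalg.solve(A + l * np.eye(n), b) - l for l in lams])

    X = rng.standard_normal((1000, n))
    X /= np.maximum(1.0, np.linalg.norm(X, axis=1, keepdims=True))   # sample points with ||x|| <= 1
    f = np.einsum('ij,jk,ik->i', X, A, X) + 2 * X @ b                # objective at each sample
    print(g.max() <= f.min() + 1e-9)     # True: weak duality; g.max() also approximates p*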

  15. Geometric interpretation
     for simplicity, consider problem with one constraint f_1(x) ≤ 0
     interpretation of dual function:
         g(λ) = inf_{(u,t) ∈ G} ( t + λu ),   where G = {(f_1(x), f_0(x)) | x ∈ D}
     [figure: two plots of the set G in the (u, t)-plane, marking p⋆, d⋆, the
      line λu + t = g(λ), and its intercept g(λ)]
     • λu + t = g(λ) is (non-vertical) supporting hyperplane to G
     • hyperplane intersects t-axis at t = g(λ)
     Duality 5–15

  16. epigraph variation: same interpretation if G is replaced with
         A = {(u, t) | f_1(x) ≤ u, f_0(x) ≤ t for some x ∈ D}
     [figure: plot of the set A, marking p⋆, the supporting line λu + t = g(λ),
      and its intercept g(λ)]
     strong duality
     • holds if there is a non-vertical supporting hyperplane to A at (0, p⋆)
     • for convex problem, A is convex, hence has supp. hyperplane at (0, p⋆)
     • Slater's condition: if there exist (ũ, t̃) ∈ A with ũ < 0, then supporting
       hyperplanes at (0, p⋆) must be non-vertical
     Duality 5–16

  17. Complementary slackness
     assume strong duality holds, x⋆ is primal optimal, (λ⋆, ν⋆) is dual optimal
         f_0(x⋆) = g(λ⋆, ν⋆) = inf_x ( f_0(x) + ∑_{i=1}^m λ⋆_i f_i(x) + ∑_{i=1}^p ν⋆_i h_i(x) )
                 ≤ f_0(x⋆) + ∑_{i=1}^m λ⋆_i f_i(x⋆) + ∑_{i=1}^p ν⋆_i h_i(x⋆)
                 ≤ f_0(x⋆)
     hence, the two inequalities hold with equality
     • x⋆ minimizes L(x, λ⋆, ν⋆)
     • λ⋆_i f_i(x⋆) = 0 for i = 1, ..., m (known as complementary slackness):
           λ⋆_i > 0 ⟹ f_i(x⋆) = 0,      f_i(x⋆) < 0 ⟹ λ⋆_i = 0
     Duality 5–17
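Complementary slackness can be observed numerically by reading dual variables off a solver; the sketch below uses cvxpy (an assumed dependency) on a small QP with my own data and checks that λ⋆_i f_i(x⋆) vanishes up to solver tolerance:

    import cvxpy as cp
    import numpy as np

    rng = np.random.default_rng(7)
    n, m = 4, 7
    M = rng.standard_normal((n, n))
    P = M @ M.T + np.eye(n)                         # positive definite
    A = rng.standard_normal((m, n))
    b = A @ rng.standard_normal(n) + rng.uniform(0.1, 1.0, m)   # nonempty polyhedron Ax <= b
    q = rng.standard_normal(n)

    x = cp.Variable(n)
    cons = [A @ x <= b]
    prob = cp.Problem(cp.Minimize(cp.quad_form(x, P) + q @ x), cons)
    prob.solve()

    lam = cons[0].dual_value                        # lambda* reported by the solver
    slack = A @ x.value - b                         # f_i(x*) = a_i^T x* - b_i <= 0
    print(np.max(np.abs(lam * slack)))              # ~ 0: lambda_i* f_i(x*) = 0 for all i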

  18. Karush-Kuhn-Tucker (KKT) conditions
     the following four conditions are called KKT conditions (for a problem with
     differentiable f_i, h_i):
     1. primal constraints: f_i(x) ≤ 0, i = 1, ..., m,  h_i(x) = 0, i = 1, ..., p
     2. dual constraints: λ ⪰ 0
     3. complementary slackness: λ_i f_i(x) = 0, i = 1, ..., m
     4. gradient of Lagrangian with respect to x vanishes:
            ∇f_0(x) + ∑_{i=1}^m λ_i ∇f_i(x) + ∑_{i=1}^p ν_i ∇h_i(x) = 0
     from page 5–17: if strong duality holds and x, λ, ν are optimal, then they
     must satisfy the KKT conditions
     Duality 5–18
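As a sketch of using the KKT conditions directly: for the equality-constrained least-norm problem of page 5–4 the KKT conditions reduce to the linear system 2x + A^T ν = 0, Ax = b, which can be solved and compared with the known minimum-norm solution (random data of my own):

    import numpy as np

    rng = np.random.default_rng(8)
    A = rng.standard_normal((3, 6))
    b = rng.standard_normal(3)
    m, n = A.shape

    # KKT system:  [ 2I  A^T ] [ x  ]   [ 0 ]
    #              [ A    0  ] [ nu ] = [ b ]
    KKT = np.block([[2 * np.eye(n), A.T],
                    [A, np.zeros((m, m))]])
    sol = np.linalg.solve(KKT, np.concatenate([np.zeros(n), b]))
    x_kkt, nu_kkt = sol[:n], sol[n:]

    print(np.allclose(x_kkt, np.linalg.pinv(A) @ b))   # True: the KKT solve recovers x*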
