
Convex Optimization 5. Duality Prof. Ying Cui Department of - PowerPoint PPT Presentation

Convex Optimization, 5. Duality. Prof. Ying Cui, Department of Electrical Engineering, Shanghai Jiao Tong University (SJTU), 2018, 46 slides. Outline: Lagrange dual function, Lagrange dual problem, Geometric interpretation, Optimality


  1. Convex Optimization, 5. Duality. Prof. Ying Cui, Department of Electrical Engineering, Shanghai Jiao Tong University, 2018

  2. Outline
     ◮ Lagrange dual function
     ◮ Lagrange dual problem
     ◮ Geometric interpretation
     ◮ Optimality conditions
     ◮ Perturbation and sensitivity analysis
     ◮ Examples
     ◮ Generalized inequalities

  3. Lagrangian
     standard form problem (not necessarily convex):
     $$\min_{x} \; f_0(x) \quad \text{s.t.} \quad f_i(x) \le 0, \; i = 1, \dots, m, \qquad h_i(x) = 0, \; i = 1, \dots, p$$
     domain $\mathcal{D} = \bigcap_{i=0}^{m} \operatorname{dom} f_i \,\cap\, \bigcap_{i=1}^{p} \operatorname{dom} h_i$ and optimal value $p^*$
     ◮ basic idea in Lagrangian duality: take the constraints into account by augmenting the objective function with a weighted sum of the constraint functions
     Lagrangian: $L : \mathbf{R}^n \times \mathbf{R}^m \times \mathbf{R}^p \to \mathbf{R}$, with $\operatorname{dom} L = \mathcal{D} \times \mathbf{R}^m \times \mathbf{R}^p$,
     $$L(x, \lambda, \nu) = f_0(x) + \sum_{i=1}^{m} \lambda_i f_i(x) + \sum_{i=1}^{p} \nu_i h_i(x)$$
     ◮ weighted sum of objective and constraint functions
     ◮ $\lambda_i$ is the Lagrange multiplier associated with $f_i(x) \le 0$
     ◮ $\nu_i$ is the Lagrange multiplier associated with $h_i(x) = 0$

  4. Lagrange dual function
     Lagrange dual function (or dual function): $g : \mathbf{R}^m \times \mathbf{R}^p \to \mathbf{R}$,
     $$g(\lambda, \nu) = \inf_{x \in \mathcal{D}} L(x, \lambda, \nu) = \inf_{x \in \mathcal{D}} \left( f_0(x) + \sum_{i=1}^{m} \lambda_i f_i(x) + \sum_{i=1}^{p} \nu_i h_i(x) \right)$$
     ◮ $g$ is concave even when the problem is not convex, as it is the pointwise infimum of a family of affine functions of $(\lambda, \nu)$
       ◮ the pointwise minimum or infimum of concave functions is concave
     ◮ $g$ can be $-\infty$ when $L$ is unbounded below in $x$

  5. Lower bound property
     The dual function yields lower bounds on the optimal value of the primal problem, i.e., for any $\lambda \succeq 0$ and any $\nu$,
     $$g(\lambda, \nu) \le p^*$$
     ◮ the inequality holds but is vacuous when $g(\lambda, \nu) = -\infty$
     ◮ the dual function gives a nontrivial lower bound only when $\lambda \succeq 0$ and $(\lambda, \nu) \in \operatorname{dom} g$, i.e., $g(\lambda, \nu) > -\infty$
     ◮ refer to $(\lambda, \nu)$ with $\lambda \succeq 0$, $(\lambda, \nu) \in \operatorname{dom} g$ as dual feasible
     proof: Suppose $\tilde{x}$ is feasible, i.e., $f_i(\tilde{x}) \le 0$ and $h_i(\tilde{x}) = 0$, and $\lambda \succeq 0$. Then we have
     $$\sum_{i=1}^{m} \lambda_i f_i(\tilde{x}) + \sum_{i=1}^{p} \nu_i h_i(\tilde{x}) \le 0 \;\Longrightarrow\; L(\tilde{x}, \lambda, \nu) \le f_0(\tilde{x})$$
     Hence,
     $$g(\lambda, \nu) = \inf_{x \in \mathcal{D}} L(x, \lambda, \nu) \le L(\tilde{x}, \lambda, \nu) \le f_0(\tilde{x})$$
     Minimizing over all feasible $\tilde{x}$ gives $p^* \ge g(\lambda, \nu)$.
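The concavity and lower bound properties can be checked numerically on a toy problem. The sketch below is only an illustration: the objective, the single inequality constraint, and the finite grid standing in for the domain $\mathcal{D}$ are all assumptions, not taken from the slides. Because the grid is finite, the infimum becomes a plain minimum of finitely many functions that are affine in $\lambda$.

```python
# Toy nonconvex problem on a finite domain D (all data assumed for illustration):
#   minimize f0(x) = x**4 - 3*x**2 + x   subject to   f1(x) = 1 - x <= 0
D = [k / 10 for k in range(-30, 31)]      # grid standing in for the domain D

def f0(x):
    return x**4 - 3 * x**2 + x

def f1(x):
    return 1 - x

def g(lam):
    """Dual function: pointwise minimum over D of functions affine in lam."""
    return min(f0(x) + lam * f1(x) for x in D)

# Primal optimal value over the feasible part of D.
p_star = min(f0(x) for x in D if f1(x) <= 0)

lams = [k / 4 for k in range(0, 41)]      # multipliers lam >= 0

# Lower bound property: g(lam) <= p* for every lam >= 0.
lower_bound_ok = all(g(lam) <= p_star + 1e-12 for lam in lams)

# Midpoint concavity: g((a + b) / 2) >= (g(a) + g(b)) / 2.
concave_ok = all(
    g((a + b) / 2) >= (g(a) + g(b)) / 2 - 1e-12
    for a in lams for b in lams
)

best_bound = max(g(lam) for lam in lams)  # best lower bound found on the grid
```

Both checks hold even though `f0` is nonconvex, which is the point of the two slides above: neither property relies on convexity of the primal problem.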

  6. Examples: Least-norm solution of linear equations
     $$\min_{x} \; x^T x \quad \text{s.t.} \quad Ax = b$$
     dual function:
     ◮ to minimize $L(x, \nu) = x^T x + \nu^T (Ax - b)$ over $x$ (unconstrained convex problem), set the gradient equal to zero:
     $$\nabla_x L(x, \nu) = 2x + A^T \nu = 0 \;\Longrightarrow\; x = -(1/2) A^T \nu$$
     ◮ plug into $L(x, \nu)$ to obtain $g$:
     $$g(\nu) = L\big((-1/2) A^T \nu, \nu\big) = -(1/4)\, \nu^T A A^T \nu - b^T \nu$$
     which is a concave quadratic function of $\nu$, as $-A A^T \preceq 0$
     lower bound property: $p^* \ge -(1/4)\, \nu^T A A^T \nu - b^T \nu$ for all $\nu$
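As a sanity check of the least-norm dual function, here is a minimal pure-Python sketch. The data `A` and `b` are hypothetical, chosen so that the primal solution is known in closed form ($x = (1, 1)$, so $p^* = 2$); the helper name `dual_function` is likewise an illustrative choice.

```python
# Least-norm problem  min x^T x  s.t.  A x = b  with hypothetical data.
A = [[1.0, 1.0]]   # one equation in two unknowns
b = [2.0]

def dual_function(nu):
    """g(nu) = -(1/4) nu^T A A^T nu - b^T nu."""
    # A^T nu is a vector of length n (number of columns of A).
    At_nu = [sum(A[i][j] * nu[i] for i in range(len(A)))
             for j in range(len(A[0]))]
    quad = sum(v * v for v in At_nu)      # nu^T A A^T nu = ||A^T nu||^2
    return -0.25 * quad - sum(bi * ni for bi, ni in zip(b, nu))

# For this data the least-norm solution is x = (1, 1), so p* = 2.
p_star = 2.0

# Every value of g is a lower bound on p*.
bounds = [dual_function([k / 10]) for k in range(-60, 21)]

# For this data g(nu) = -nu^2 / 2 - 2 nu, which is maximized at nu = -2.
best_nu = [-2.0]
```

For this instance the best bound `dual_function(best_nu)` equals `p_star`, so the lower bound is tight.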

  7. Examples: Standard form LP
     $$\min_{x} \; c^T x \quad \text{s.t.} \quad Ax = b, \; x \succeq 0$$
     dual function:
     ◮ the Lagrangian
     $$L(x, \lambda, \nu) = c^T x + \nu^T (Ax - b) - \lambda^T x = -b^T \nu + (c + A^T \nu - \lambda)^T x$$
     is affine in $x$ (bounded below only when the coefficient of $x$ is zero)
     ◮ dual function
     $$g(\lambda, \nu) = \inf_{x} L(x, \lambda, \nu) = \begin{cases} -b^T \nu, & A^T \nu - \lambda + c = 0 \\ -\infty, & \text{otherwise} \end{cases}$$
     lower bound property: nontrivial only when $\lambda \succeq 0$ and $A^T \nu - \lambda + c = 0$, and hence $p^* \ge -b^T \nu$ if $A^T \nu + c \succeq 0$

  8. Examples: Two-way partitioning problem ($W \in \mathbf{S}^n$)
     $$\min_{x} \; x^T W x \quad \text{s.t.} \quad x_i^2 = 1, \; i = 1, \dots, n$$
     ◮ a nonconvex problem with $2^n$ discrete feasible points
     ◮ find the two-way partition of $\{1, \dots, n\}$ with least total cost
       ◮ $W_{ij}$ is the cost of assigning $i$, $j$ to the same set
       ◮ $-W_{ij}$ is the cost of assigning $i$, $j$ to different sets
     dual function:
     $$g(\nu) = \inf_{x} \Big( x^T W x + \sum_{i} \nu_i (x_i^2 - 1) \Big) = \inf_{x} \; x^T (W + \operatorname{diag}(\nu))\, x - \mathbf{1}^T \nu = \begin{cases} -\mathbf{1}^T \nu, & W + \operatorname{diag}(\nu) \succeq 0 \\ -\infty, & \text{otherwise} \end{cases}$$
     lower bound property: $p^* \ge -\mathbf{1}^T \nu$ if $W + \operatorname{diag}(\nu) \succeq 0$
     example: $\nu = -\lambda_{\min}(W)\, \mathbf{1}$ gives the bound $p^* \ge n \lambda_{\min}(W)$
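The partitioning bound can be compared against brute force on a tiny instance. In this sketch the matrix `W` is hypothetical (the all-ones matrix minus the identity, with eigenvalues 2, -1, -1). Two dual feasible points are tried: the slide's choice $\nu = -\lambda_{\min}(W)\mathbf{1}$, with $\lambda_{\min}$ hard-coded for this particular `W`, and a cruder one built from diagonal dominance (a Gershgorin-style argument that is not in the slides but avoids an eigenvalue computation).

```python
from itertools import product

# Hypothetical 3x3 symmetric cost matrix: W = (all-ones matrix) - I,
# whose eigenvalues are 2, -1, -1.
W = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]
n = len(W)

def cost(x):
    """x^T W x for a vector x with entries +1 or -1."""
    return sum(W[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

# Brute force over all 2^n feasible points.
p_star = min(cost(x) for x in product((-1, 1), repeat=n))

# Dual feasible nu from diagonal dominance (an assumption of this sketch):
# nu_i = sum_{j != i} |W_ij| - W_ii makes W + diag(nu) diagonally dominant
# with a nonnegative diagonal, hence positive semidefinite.
nu = [sum(abs(W[i][j]) for j in range(n) if j != i) - W[i][i]
      for i in range(n)]
gershgorin_bound = -sum(nu)               # -1^T nu

# Bound from the slide, nu = -lambda_min(W) * 1, with lambda_min(W) = -1 here.
lambda_min = -1
eigen_bound = n * lambda_min
```

For this instance the brute-force optimum is -2, the eigenvalue bound is -3, and the diagonal-dominance bound is -6, so both are valid lower bounds and the eigenvalue choice is the tighter of the two.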

  9. Lagrange dual function and conjugate function
     ◮ conjugate $f^*$ of a function $f : \mathbf{R}^n \to \mathbf{R}$:
     $$f^*(y) = \sup_{x \in \operatorname{dom} f} \big( y^T x - f(x) \big)$$
     ◮ dual function of
     $$\min_{x} \; f_0(x) \quad \text{s.t.} \quad x = 0$$
     is
     $$g(\nu) = \inf_{x} \big( f_0(x) + \nu^T x \big) = -\sup_{x} \big( (-\nu)^T x - f_0(x) \big)$$
     ◮ relationship: $g(\nu) = -f_0^*(-\nu)$
       ◮ the conjugate of any function is convex
       ◮ the dual function of any problem is concave
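A short worked instance may help make the relationship concrete. The choice of objective, $\tfrac{1}{2} x^T x$, is an assumption made purely for illustration; both the conjugate and the dual function then follow from setting a gradient to zero.

```latex
\begin{aligned}
f_0(x) &= \tfrac{1}{2} x^T x, \\
f_0^*(y) &= \sup_{x} \left( y^T x - \tfrac{1}{2} x^T x \right) = \tfrac{1}{2} y^T y
  && \text{(supremum attained at } x = y\text{)}, \\
g(\nu) &= \inf_{x} \left( \tfrac{1}{2} x^T x + \nu^T x \right) = -\tfrac{1}{2} \nu^T \nu
  && \text{(infimum attained at } x = -\nu\text{)}, \\
-f_0^*(-\nu) &= -\tfrac{1}{2} (-\nu)^T (-\nu) = -\tfrac{1}{2} \nu^T \nu = g(\nu).
\end{aligned}
```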

  10. Lagrange dual function and conjugate function
     more generally (and more usefully), consider an optimization problem with linear inequality and equality constraints
     $$\min_{x} \; f_0(x) \quad \text{s.t.} \quad Ax \preceq b, \; Cx = d$$
     dual function:
     $$\begin{aligned} g(\lambda, \nu) &= \inf_{x \in \operatorname{dom} f_0} \big( f_0(x) + \lambda^T (Ax - b) + \nu^T (Cx - d) \big) \\ &= \inf_{x \in \operatorname{dom} f_0} \big( f_0(x) + (A^T \lambda + C^T \nu)^T x \big) - b^T \lambda - d^T \nu \\ &= -f_0^*(-A^T \lambda - C^T \nu) - b^T \lambda - d^T \nu \end{aligned}$$
     the domain of $g$ follows from the domain of $f_0^*$:
     $$\operatorname{dom} g = \{ (\lambda, \nu) \mid -A^T \lambda - C^T \nu \in \operatorname{dom} f_0^* \}$$
     ◮ simplifies the derivation of the dual function if the conjugate of $f_0$ is known

  11. Examples: Equality constrained norm minimization
     $$\min_{x} \; \|x\| \quad \text{s.t.} \quad Ax = b$$
     dual function:
     $$g(\nu) = -b^T \nu - f_0^*(-A^T \nu) = \begin{cases} -b^T \nu, & \|A^T \nu\|_* \le 1 \\ -\infty, & \text{otherwise} \end{cases}$$
     ◮ conjugate of $f_0 = \|\cdot\|$:
     $$f_0^*(y) = \begin{cases} 0, & \|y\|_* \le 1 \\ \infty, & \text{otherwise} \end{cases}$$
     i.e., the indicator function of the dual norm unit ball, where $\|y\|_* = \sup_{\|u\| \le 1} u^T y$ is the dual norm of $\|\cdot\|$
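To see the dual norm constraint at work, consider a small instance. The data ($A = [\,1 \;\; 1\,]$, $b = 2$) and the choice of the Euclidean norm, which is its own dual norm, are assumptions made for illustration.

```latex
\begin{aligned}
\text{primal:}\quad & \min_{x} \; \|x\|_2 \quad \text{s.t.} \quad x_1 + x_2 = 2
  && \Longrightarrow\; x^\star = (1, 1), \quad p^* = \sqrt{2}, \\
\text{dual function:}\quad & g(\nu) = -2\nu \quad \text{if } \|A^T \nu\|_2 = \sqrt{2}\,|\nu| \le 1,
  \qquad g(\nu) = -\infty \text{ otherwise}, \\
\text{best bound:}\quad & \max_{|\nu| \le 1/\sqrt{2}} \; (-2\nu) = \sqrt{2}
  && \text{at } \nu = -1/\sqrt{2}.
\end{aligned}
```

For this instance the best dual bound coincides with $p^*$.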

  12. Lagrange dual problem
     $$\max_{\lambda, \nu} \; g(\lambda, \nu) \quad \text{s.t.} \quad \lambda \succeq 0$$
     ◮ find the best lower bound on $p^*$ obtained from the Lagrange dual function
     ◮ always a convex optimization problem (maximize a concave function over a convex set), regardless of convexity of the primal problem; optimal value denoted by $d^*$
     ◮ $\lambda, \nu$ are dual feasible if $\lambda \succeq 0$ and $g(\lambda, \nu) > -\infty$ (i.e., $(\lambda, \nu) \in \operatorname{dom} g = \{ (\lambda, \nu) \mid g(\lambda, \nu) > -\infty \}$)
     ◮ can often be simplified by making the implicit constraint $(\lambda, \nu) \in \operatorname{dom} g$ explicit, e.g.,
       ◮ standard form LP and its dual:
     $$\text{(primal)} \quad \min_{x} \; c^T x \quad \text{s.t.} \quad Ax = b, \; x \succeq 0 \qquad\qquad \text{(dual)} \quad \max_{\nu} \; -b^T \nu \quad \text{s.t.} \quad A^T \nu + c \succeq 0$$
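A two-variable standard form LP shows how the explicit dual is used. The numbers ($c = (1, 2)$, $A = [\,1 \;\; 1\,]$, $b = 1$) are hypothetical and chosen so that both problems can be solved by inspection.

```latex
\begin{aligned}
\text{primal:}\quad & \min_{x} \; x_1 + 2 x_2 \quad \text{s.t.} \quad x_1 + x_2 = 1, \; x \succeq 0
  && \Longrightarrow\; x^\star = (1, 0), \quad p^* = 1, \\
\text{dual:}\quad & \max_{\nu} \; -\nu \quad \text{s.t.} \quad \nu + 1 \ge 0, \; \nu + 2 \ge 0
  && \Longrightarrow\; \nu^\star = -1, \quad d^* = 1.
\end{aligned}
```

The constraint $A^T \nu + c \succeq 0$ reduces here to $\nu \ge -1$, and the optimal values agree.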

  13. Weak duality and strong duality
     weak duality: $d^* \le p^*$
     ◮ always holds (for convex and nonconvex problems)
     ◮ can be used to find nontrivial lower bounds for difficult problems, e.g.,
       ◮ solving the SDP
     $$\max_{\nu} \; -\mathbf{1}^T \nu \quad \text{s.t.} \quad W + \operatorname{diag}(\nu) \succeq 0$$
       gives a lower bound for the two-way partitioning problem
     strong duality: $d^* = p^*$
     ◮ does not hold in general
     ◮ (usually) holds for convex problems
     ◮ conditions that guarantee strong duality in convex problems are called constraint qualifications
     ◮ there exist many types of constraint qualifications

  14. Slater’s constraint qualification
     One simple constraint qualification is Slater’s condition (Slater’s constraint qualification): the convex problem is strictly feasible, i.e., there exists an $x \in \operatorname{int} \mathcal{D}$ such that
     $$f_i(x) < 0, \; i = 1, \dots, m, \qquad Ax = b$$
     ◮ can be refined, e.g.,
       ◮ can replace $\operatorname{int} \mathcal{D}$ with $\operatorname{relint} \mathcal{D}$ (interior relative to affine hull)
       ◮ affine inequalities do not need to hold with strict inequality
       ◮ reduces to feasibility when the constraints are all affine equalities and inequalities
     ◮ implies strong duality for convex problems
     ◮ implies that the dual value is attained when $d^* > -\infty$, i.e., there exists a dual feasible $(\lambda^*, \nu^*)$ with $g(\lambda^*, \nu^*) = d^* = p^*$

  15. Examples: Inequality form LP
     primal problem:
     $$\min_{x} \; c^T x \quad \text{s.t.} \quad Ax \preceq b$$
     dual function:
     $$g(\lambda) = \inf_{x} \big( (c + A^T \lambda)^T x - b^T \lambda \big) = \begin{cases} -b^T \lambda, & A^T \lambda + c = 0 \\ -\infty, & \text{otherwise} \end{cases}$$
     dual problem:
     $$\max_{\lambda} \; -b^T \lambda \quad \text{s.t.} \quad A^T \lambda + c = 0, \; \lambda \succeq 0$$
     ◮ from the weaker form of Slater’s condition: strong duality holds for any LP provided the primal problem is feasible; by the same argument applied to the dual, strong duality also holds for LPs if the dual is feasible
     ◮ in fact, $p^* = d^*$ except when the primal and dual are both infeasible

  16. Examples: Quadratic program ($P \in \mathbf{S}^n_{++}$)
     $$\min_{x} \; x^T P x \quad \text{s.t.} \quad Ax \preceq b$$
     dual function:
     $$g(\lambda) = \inf_{x} \big( x^T P x + \lambda^T (Ax - b) \big) = -\tfrac{1}{4} \lambda^T A P^{-1} A^T \lambda - b^T \lambda$$
     dual problem:
     $$\max_{\lambda} \; -(1/4)\, \lambda^T A P^{-1} A^T \lambda - b^T \lambda \quad \text{s.t.} \quad \lambda \succeq 0$$
     ◮ from the weaker form of Slater’s condition: strong duality holds provided the primal problem is feasible
     ◮ in fact, $p^* = d^*$ always holds
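The QP dual can be checked on a scalar instance. This sketch assumes hypothetical data $P = 1$, $A = 1$, $b = -1$, so the primal is to minimize $x^2$ subject to $x \le -1$, with $p^* = 1$ attained at $x = -1$. The dual is maximized by a simple grid search, which is a stand-in for a real solver.

```python
# Scalar QP instance (hypothetical data): P = 1, A = 1, b = -1, i.e.
#   minimize x**2  subject to  x <= -1,  with p* = 1 attained at x = -1.
P, A, b = 1.0, 1.0, -1.0
p_star = 1.0

def g(lam):
    """Dual function g(lam) = -(1/4) lam A P^{-1} A^T lam - b^T lam (scalar case)."""
    return -0.25 * lam * A * (1.0 / P) * A * lam - b * lam

# Grid search over dual feasible multipliers lam >= 0.
lams = [k / 100 for k in range(0, 501)]
d_star = max(g(lam) for lam in lams)
lam_star = max(lams, key=g)
```

Here $g(\lambda) = -\lambda^2/4 + \lambda$, so the grid maximum is attained at `lam_star = 2.0` with `d_star` equal to `p_star`, consistent with strong duality for this feasible QP.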

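  17. Examples: A nonconvex problem with strong duality ($A \not\succeq 0$)
     $$\min_{x} \; x^T A x + 2 b^T x \quad \text{s.t.} \quad x^T x \le 1$$
     dual function:
     $$g(\lambda) = \inf_{x} \big( x^T (A + \lambda I) x + 2 b^T x - \lambda \big) = \begin{cases} -b^T (A + \lambda I)^{\dagger} b - \lambda, & A + \lambda I \succeq 0, \; b \in \mathcal{R}(A + \lambda I) \\ -\infty, & \text{otherwise} \end{cases}$$
     dual problem and equivalent SDP:
     $$\max_{\lambda} \; -b^T (A + \lambda I)^{\dagger} b - \lambda \quad \text{s.t.} \quad A + \lambda I \succeq 0, \; b \in \mathcal{R}(A + \lambda I)$$
     $$\max_{\lambda, t} \; -t - \lambda \quad \text{s.t.} \quad \begin{bmatrix} A + \lambda I & b \\ b^T & t \end{bmatrix} \succeq 0$$
     ◮ strong duality holds although the primal problem is nonconvex (difficult to show)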
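A scalar instance makes the zero duality gap visible without the general proof. The data $A = -1$, $b = 1$ are hypothetical; the objective $-x^2 + 2x$ is concave, so the primal minimum over $[-1, 1]$ sits at an endpoint.

```latex
\begin{aligned}
\text{primal:}\quad & \min_{x} \; -x^2 + 2x \quad \text{s.t.} \quad x^2 \le 1
  && \Longrightarrow\; x^\star = -1, \quad p^* = -3, \\
\text{dual function:}\quad & g(\lambda) = -\frac{1}{\lambda - 1} - \lambda \quad \text{for } \lambda > 1,
  \qquad g(\lambda) = -\infty \text{ for } \lambda \le 1, \\
\text{dual optimum:}\quad & g'(\lambda) = \frac{1}{(\lambda - 1)^2} - 1 = 0
  && \Longrightarrow\; \lambda^\star = 2, \quad d^* = -1 - 2 = -3 = p^*.
\end{aligned}
```

At $\lambda = 1$ the matrix $A + \lambda I$ is zero and $b = 1$ is not in its range, which is why that point is excluded from $\operatorname{dom} g$.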
