
New primal-dual subgradient methods for Convex Problems with Functional Constraints
Yurii Nesterov, CORE/INMA (UCL)
January 12, 2015 (Les Houches)

Outline
1. Constrained


Lagrange multipliers: interpretation

Let $I \subseteq \{1, \dots, m\}$ be an arbitrary set of indices. Denote
$f_I(x) = f_0(x) + \sum_{i \in I} \lambda_*^{(i)} f_i(x)$.
Consider the problem
$P_I: \ \min_{x \in Q} \{ f_I(x) : f_i(x) \le 0, \ i \notin I \}$.
Observation: in any case, $x_*$ is the optimal solution of problem $P_I$.
Interpretation: $\lambda_*^{(i)}$ are the shadow prices for resources (Kantorovich, 1939).
Application examples:
- Traffic congestion: car flows on roads ⇔ size of queues.
- Electrical networks: currents in the wires ⇔ voltage potentials, etc.
Main question: how to compute $(x_*, \lambda_*)$?

Algebraic interpretation

Consider the Lagrangian $L(x, \lambda) = f_0(x) + \sum_{i=1}^m \lambda^{(i)} f_i(x)$.
Condition KKT(1): $\langle \nabla f_0(x_*) + \sum_{i=1}^m \lambda_*^{(i)} \nabla f_i(x_*), \ x - x_* \rangle \ge 0$ for all $x \in Q$, implies $x_* \in \mathrm{Arg\,min}_{x \in Q} L(x, \lambda_*)$.
Define the dual function $\phi(\lambda) = \min_{x \in Q} L(x, \lambda)$, $\lambda \ge 0$. It is concave!
By Danskin's theorem, $\nabla \phi(\lambda) = (f_1(x(\lambda)), \dots, f_m(x(\lambda)))$, with $x(\lambda) \in \mathrm{Arg\,min}_{x \in Q} L(x, \lambda)$.
Conditions KKT(2,3): $f_i(x_*) \le 0$, $\lambda_*^{(i)} f_i(x_*) = 0$, $i = 1, \dots, m$, imply ($x_* = x(\lambda_*)$) $\lambda_* \in \mathrm{Arg\,max}_{\lambda \ge 0} \phi(\lambda)$.
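The dual function and Danskin's formula above can be checked on a tiny problem of my own construction (not from the slides): minimize $f_0(x) = x^2$ over $Q = \mathbb{R}$ with one constraint $f_1(x) = 1 - x \le 0$, where everything is available in closed form.

```python
# Toy example (my own, for illustration): L(x, lam) = x^2 + lam*(1 - x),
# minimized at x(lam) = lam/2, so phi(lam) = lam - lam^2/4 (concave),
# and by Danskin's theorem phi'(lam) = f_1(x(lam)) = 1 - lam/2.

def x_of_lam(lam):
    # argmin_x L(x, lam): first-order condition 2x - lam = 0
    return lam / 2.0

def phi(lam):
    x = x_of_lam(lam)
    return x**2 + lam * (1.0 - x)

def danskin_grad(lam):
    # gradient of phi via Danskin: constraint value at the minimizer
    return 1.0 - x_of_lam(lam)

# Check Danskin's formula against a central finite difference of phi.
lam, eps = 1.3, 1e-6
fd = (phi(lam + eps) - phi(lam - eps)) / (2 * eps)
assert abs(fd - danskin_grad(lam)) < 1e-5

# The dual maximizer lam* = 2 recovers the primal solution x* = 1,
# where KKT(2,3) hold: f_1(x*) = 0 and lam* * f_1(x*) = 0.
assert x_of_lam(2.0) == 1.0
```

The finite-difference check is exact up to rounding here because $\phi$ is quadratic; for general $Q$ the same formula holds with any minimizer $x(\lambda)$.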

Algorithmic aspects

Main idea: solve the dual problem $\max_{\lambda \ge 0} \phi(\lambda)$ by the subgradient method:
1. Compute $x(\lambda_k)$ and define $\nabla \phi(\lambda_k) = (f_1(x(\lambda_k)), \dots, f_m(x(\lambda_k)))$.
2. Update $\lambda_{k+1} = \mathrm{Proj}_{\mathbb{R}^m_+}(\lambda_k + h_k \nabla \phi(\lambda_k))$.
Stepsizes $h_k > 0$ are defined in the usual way.
Main difficulties:
- Each iteration is time consuming.
- Unclear termination criterion.
- Low rate of convergence ($O(\frac{1}{\epsilon^2})$ upper-level iterations).
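The two-step dual scheme above can be sketched on the same kind of toy problem (data and names are my own, not from the slides): minimize $f_0(x) = x^2$ subject to $f_1(x) = 1 - x \le 0$, where the inner minimizer $x(\lambda) = \lambda/2$ is closed-form. In general that inner problem is exactly the expensive part of each iteration.

```python
import numpy as np

def x_of_lam(lam):
    # closed-form argmin_x L(x, lam) for this toy problem
    return lam / 2.0

lam = np.zeros(1)                      # start from lam_0 = 0
for k in range(1, 2001):
    x = x_of_lam(lam[0])
    g = np.array([1.0 - x])            # grad phi(lam_k) = f(x(lam_k))
    h = 1.0 / np.sqrt(k)               # a standard diminishing stepsize
    lam = np.maximum(lam + h * g, 0.0) # projection onto R^m_+

# Slowly approaches the saddle point lam* = 2, x* = 1.
```

The projection onto $\mathbb{R}^m_+$ is just a componentwise clip at zero, which is why this method is cheap per dual update; the $O(1/\epsilon^2)$ rate shows up as the large iteration count needed for modest accuracy.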

Augmented Lagrangian (1970s) [Hestenes, Powell, Rockafellar, Polyak, Bertsekas, ...]

Define the Augmented Lagrangian
$L_K(x, \lambda) = f_0(x) + \frac{1}{2K} \sum_{i=1}^m \big(\lambda^{(i)} + K f_i(x)\big)_+^2 - \frac{1}{2K} \|\lambda\|_2^2$, $\lambda \in \mathbb{R}^m$,
where $K > 0$ is a penalty parameter. Consider the dual function $\hat\phi(\lambda) = \min_{x \in Q} L_K(x, \lambda)$.
Main properties:
- Function $\hat\phi$ is concave.
- Its gradient is Lipschitz continuous with constant $\frac{1}{K}$.
- Its unconstrained maximum is attained at the optimal dual solution.
- The corresponding point $\hat x(\lambda_*)$ is the optimal primal solution.
Hint: check that the equation $\big(\lambda^{(i)} + K f_i(x)\big)_+ = \lambda^{(i)}$ is equivalent to KKT(2,3).
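The hint can be verified numerically (a sanity check of my own, not from the slides): for $\lambda \ge 0$ and $K > 0$, the fixed-point equation $(\lambda + K f)_+ = \lambda$ holds exactly when $f \le 0$ and $\lambda f = 0$, i.e. conditions KKT(2,3) for a single constraint.

```python
# Enumerate the sign cases of (lam, f) and compare the fixed-point
# equation with KKT(2,3) directly.

def fixed_point(lam, f, K=2.0):
    # (lam + K*f)_+ == lam ?
    return max(lam + K * f, 0.0) == lam

def kkt23(lam, f):
    # feasibility and complementary slackness
    return f <= 0 and lam * f == 0

cases = [(0.0, -1.0), (0.0, 0.0), (0.0, 1.0),
         (3.0, -1.0), (3.0, 0.0), (3.0, 1.0)]
for lam, f in cases:
    assert fixed_point(lam, f) == kkt23(lam, f)
```

The case analysis behind this: if $f > 0$ the left side exceeds $\lambda$; if $f < 0$ the equation forces $\lambda = 0$; if $f = 0$ it holds trivially, exactly matching KKT(2,3).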

Method of Augmented Lagrangians

Note that $\nabla \hat\phi(\lambda) = \frac{1}{K} \big(\lambda + K f(\hat x(\lambda))\big)_+ - \frac{1}{K} \lambda$.
Therefore, the usual gradient method $\lambda_{k+1} = \lambda_k + K \nabla \hat\phi(\lambda_k)$ is exactly as follows:
Method: $\lambda_{k+1} = \big(\lambda_k + K f(\hat x(\lambda_k))\big)_+$.
Advantage: fast convergence of the dual process.
Disadvantages:
- Difficult iteration.
- Unclear termination.
- No global complexity analysis.
Do we have an alternative?
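A sketch of the multiplier update on the same toy problem as before (my own example: $f_0(x) = x^2$, $f_1(x) = 1 - x \le 0$). The "difficult iteration" is visible here: each dual step requires minimizing the augmented Lagrangian in $x$, done below by a crude ternary search since $L_K(\cdot, \lambda)$ is convex.

```python
def x_hat(lam, K):
    # Minimize L_K(x, lam) = x^2 + ((lam + K*(1-x))_+^2 - lam^2)/(2K)
    # over x by ternary search on a bracket (L_K is convex in x).
    def LK(x):
        t = max(lam + K * (1.0 - x), 0.0)
        return x * x + (t * t - lam * lam) / (2.0 * K)
    lo, hi = -10.0, 10.0
    for _ in range(100):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if LK(m1) < LK(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0

K, lam = 5.0, 0.0
for _ in range(50):
    x = x_hat(lam, K)
    lam = max(lam + K * (1.0 - x), 0.0)   # multiplier update

# Fast (linear-rate) convergence of the dual process to lam* = 2, x* = 1.
```

On this problem the update contracts with factor $\frac{2}{2+K}$, illustrating the fast dual convergence claimed above; larger $K$ contracts faster but makes the inner minimization worse conditioned.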

Problem formulation

Problem: $f^* = \inf_{x \in Q} \{ f_0(x) : f_i(x) \le 0, \ i = 1, \dots, m \}$,
where $f_i(x)$, $i = 0, \dots, m$, are closed convex functions on $Q$ endowed with first-order black-box oracles, and $Q \subset E$ is a bounded simple closed convex set. (We can solve some auxiliary optimization problems over $Q$.)
Defining the Lagrangian
$L(x, \lambda) = f_0(x) + \sum_{i=1}^m \lambda^{(i)} f_i(x)$, $x \in Q$, $\lambda \in \mathbb{R}^m_+$,
we can introduce the Lagrangian dual problem $f_* \stackrel{\mathrm{def}}{=} \sup_{\lambda \in \mathbb{R}^m_+} \phi(\lambda)$, where $\phi(\lambda) \stackrel{\mathrm{def}}{=} \inf_{x \in Q} L(x, \lambda)$.
Clearly, $f^* \ge f_*$. Later, we will show $f^* = f_*$ algorithmically.

Bregman distances

Prox-function: $d(\cdot)$ is strongly convex on $Q$ with parameter one:
$d(y) \ge d(x) + \langle \nabla d(x), y - x \rangle + \frac{1}{2} \|y - x\|^2$, $x, y \in Q$.
Denote by $x_0$ the prox-center of the set $Q$: $x_0 = \arg\min_{x \in Q} d(x)$. Assume $d(x_0) = 0$.
Bregman distance: $\beta(x, y) = d(y) - d(x) - \langle \nabla d(x), y - x \rangle$, $x, y \in Q$.
Clearly, $\beta(x, y) \ge \frac{1}{2} \|x - y\|^2$ for all $x, y \in Q$.
Bregman mapping: for $x \in Q$, $g \in E^*$ and $h > 0$ define
$B_h(x, g) = \arg\min_{y \in Q} \{ h \langle g, y - x \rangle + \beta(x, y) \}$.
Examples: Euclidean distance, entropy distance, etc.
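For the entropy example, the Bregman mapping has a classical closed form (the standard mirror-descent / multiplicative-weights update, a well-known instance rather than anything specific to these slides): with $d(x) = \sum_i x^{(i)} \ln x^{(i)} + \ln n$ on the simplex, $\beta(x, y)$ is the KL divergence and $B_h(x, g)_i \propto x^{(i)} e^{-h g^{(i)}}$.

```python
import numpy as np

def bregman_map_entropy(x, g, h):
    # B_h(x, g) = argmin_{y in simplex} { h*<g, y - x> + KL(y || x) },
    # solved in closed form by an exponential reweighting.
    w = x * np.exp(-h * g)
    return w / w.sum()

x = np.full(3, 1.0 / 3.0)          # prox-center of the simplex
g = np.array([1.0, 0.0, -1.0])     # some subgradient
y = bregman_map_entropy(x, g, 0.5)

assert abs(y.sum() - 1.0) < 1e-12  # stays in the simplex
assert y[2] > y[1] > y[0]          # mass moves against the gradient
```

With the Euclidean prox-function $d(x) = \frac{1}{2}\|x - x_0\|_2^2$ the same mapping reduces to an ordinary projected gradient step, so the Bregman mapping unifies both examples mentioned above.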

Switching subgradient methods: Primal Method

Input parameter: the step size $h > 0$.
Initialization: compute the prox-center $x_0$.
Iteration $k \ge 0$:
