Operator splitting techniques and their application to embedded optimization problems


  1. Operator splitting techniques and their application to embedded optimization problems
     Puya Latafat (joint work with Panagiotis Patrinos)
     IMT School for Advanced Studies Lucca, puya.latafat@imtlucca.it
     Department of Electrical Engineering (ESAT-STADIUS), KU Leuven, panos.patrinos@esat.kuleuven.be
     September 1, 2016

  2. Outline
     ◮ structured optimization problems
     ◮ monotone operators and the splitting principle
     ◮ a primal-dual algorithm
     ◮ application: distributed optimization
     Based on:
     1. Latafat and Patrinos, "Asymmetric Forward-Backward-Adjoint Splitting for Solving Monotone Inclusions Involving Three Operators," arXiv preprint arXiv:1602.08729 (2016).
     2. Latafat, Stella, and Patrinos, "New Primal-Dual Proximal Algorithms for Distributed Optimization," accepted for the 55th IEEE Conference on Decision and Control (2016).

  3. Structured Optimization Problem
     \[
     \underset{x \in \mathbb{R}^n}{\text{minimize}} \quad \underbrace{f(x)}_{\text{nonsmooth}} + \underbrace{g(Lx)}_{\text{nonsmooth}} + \underbrace{h(x)}_{\text{smooth}}
     \]
     ◮ $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ and $g : \mathbb{R}^m \to \overline{\mathbb{R}}$ are proper closed convex functions with easy-to-compute proximal maps
     ◮ $L$ is a linear operator from $\mathbb{R}^n$ to $\mathbb{R}^m$
     ◮ $h$ is a differentiable function and $\nabla h$ is $\beta$-Lipschitz
     example 1: MPC formulations, with $h$ the quadratic cost, $L$ encoding the dynamics, and $f$, $g$ indicator functions for the constraints on states and inputs
     example 2: distributed optimization over graphs (this talk)
     more examples: machine learning and signal processing
     ◮ Goal: find a solution as a fixed point of an operator

  4. Example: Generalized Lasso with Box Constraint
     \[
     \underset{x \in \mathbb{R}^n}{\text{minimize}} \quad \tfrac{1}{2}\|Ax - b\|^2 + \lambda\|Lx\|_1
     \quad \text{subject to} \quad l \le x \le u,
     \]
     or equivalently
     \[
     \underset{x \in \mathbb{R}^n}{\text{minimize}} \quad \underbrace{\tfrac{1}{2}\|Ax - b\|^2}_{h(x)} + \underbrace{\lambda\|Lx\|_1}_{g(Lx)} + \underbrace{\delta_{l \le x \le u}(x)}_{f(x)}
     \]
     ◮ we want algorithms that involve only $L$, $L^\top$, $\mathrm{prox}_f$, $\mathrm{prox}_g$, $\nabla h$
     ◮ $\mathrm{prox}_{g \circ L}$ is not trivial (unless $L^\top L = \alpha\,\mathrm{Id}$)
     ◮ no inner loops or linear systems to solve
     ◮ no need to introduce dummy variables
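As a concrete illustration of the building blocks this example calls for, here is a minimal numpy sketch (not from the slides) of $\nabla h$, $\mathrm{prox}_f$, and $\mathrm{prox}_g$; the data A, b, L, lam, l, u are made up purely for illustration:

```python
import numpy as np

# hypothetical problem data, for illustration only
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 10)), rng.standard_normal(20)
L = np.diff(np.eye(10), axis=0)             # e.g. a first-difference operator
lam, l, u = 0.1, -np.ones(10), np.ones(10)

def grad_h(x):
    # gradient of h(x) = 0.5 * ||Ax - b||^2
    return A.T @ (A @ x - b)

def prox_f(x, gamma):
    # prox of the box indicator delta_{l <= x <= u}: projection onto the box
    return np.clip(x, l, u)

def prox_g(z, gamma):
    # prox of lam * ||.||_1: componentwise soft-thresholding
    return np.sign(z) * np.maximum(np.abs(z) - gamma * lam, 0.0)
```

Note that each piece is cheap on its own; it is the composition $g \circ L$ whose prox has no closed form, which is exactly why a primal-dual splitting is needed.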

  5. Subgradients and Monotone Operators
     ◮ the subdifferential of $f$ is the set-valued operator
     \[
     \partial f : x \mapsto \{ u \in \mathbb{R}^n \mid (\forall y \in \mathrm{dom}\, f)\;\; \langle y - x, u \rangle + f(x) \le f(y) \}
     \]
     example: $0 \in \partial f(x^\star) \;\Rightarrow\; f(x^\star) \le f(y)$ for all $y \in \mathrm{dom}\, f$
     example: for differentiable $f$, $\partial f = \{\nabla f\}$
     [Figure: a convex $f$ with two affine minorants $f(x_1) + \langle u_1, x - x_1 \rangle$ and $f(x_1) + \langle u_2, x - x_1 \rangle$ supporting it at $x_1$.]

  6-7. ◮ a set-valued mapping $A$ is monotone if
     \[
     \langle x - y, u - v \rangle \ge 0 \qquad \forall x, y, \;\; u \in Ax, \; v \in Ay
     \]
     example: $\partial f$ for a proper convex function $f$
     ◮ $A$ is maximally monotone if its graph is not properly contained in the graph of another monotone mapping
     example: $\partial f$ for a proper closed convex function $f$
     [Figure: the graph of a monotone operator in the $(x, y)$-plane.]

  8. ◮ the proximal mapping of a proper closed convex function $f$:
     \[
     \mathrm{prox}_{\gamma f}(x) = \underset{z}{\mathrm{argmin}} \left\{ f(z) + \tfrac{1}{2\gamma}\|x - z\|^2 \right\}
     \]
     ◮ the minimizer is unique
     ◮ closed-form solutions exist for many functions, such as the $\ell_1$ and $\ell_2$ norms, quadratics, the log barrier, ...
     example: $f = \delta_C \;\Rightarrow\; \mathrm{prox}_{\gamma f} = P_C$
     example: $f(x) = \tfrac{1}{2}x^\top Q x + q^\top x \;\Rightarrow\; \mathrm{prox}_{\gamma f}(x) = (I + \gamma Q)^{-1}(x - \gamma q)$
     ◮ equivalently $0 \in z - x + \gamma \partial f(z)$, i.e. the resolvent of $\partial f$:
     \[
     J_{\gamma \partial f} = (\mathrm{Id} + \gamma \partial f)^{-1} = \mathrm{prox}_{\gamma f}
     \]
     ◮ not every monotone operator can be written as the subdifferential of a function
     example: a linear skew-symmetric operator is monotone but is not the subdifferential of any function
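The quadratic-prox formula above is easy to sanity-check numerically: the following sketch (with made-up $Q$, $q$, $x$) verifies that $(I + \gamma Q)^{-1}(x - \gamma q)$ satisfies the resolvent relation $0 = z - x + \gamma \nabla f(z)$:

```python
import numpy as np

# numerical check of the quadratic-prox formula, with made-up Q and q
rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
Q = M @ M.T + np.eye(5)                    # symmetric positive definite
q = rng.standard_normal(5)
gamma, x = 0.7, rng.standard_normal(5)

# prox_{gamma f}(x) = (I + gamma*Q)^{-1} (x - gamma*q)
z = np.linalg.solve(np.eye(5) + gamma * Q, x - gamma * q)

# optimality condition 0 = z - x + gamma * grad f(z), i.e. z = (Id + gamma ∂f)^{-1} x
assert np.allclose(z - x + gamma * (Q @ z + q), 0.0)
```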

  9. Operator Splitting Framework
     \[
     \underset{x \in \mathbb{R}^n}{\text{minimize}} \quad \underbrace{f(x)}_{\text{nonsmooth}} + \underbrace{g(Lx)}_{\text{nonsmooth}} + \underbrace{h(x)}_{\text{smooth}}
     \]
     ◮ unconstrained minimization
     ◮ optimality condition: $0 \in \partial f(x) + L^* \partial g(Lx) + \nabla h(x)$
     ◮ monotone inclusion form: $0 \in Ax + L^* B L x + Cx$
     ◮ $A = \partial f$ and $B = \partial g$ are set-valued; $C = \nabla h$ is single-valued

  10. Initial Value Problem and Euler's Methods
     the path-following problem
     \[
     \frac{dx(t)}{dt} = -\nabla h(x(t)), \qquad x(0) = x_0,
     \]
     with $x(t) \to x^\star$ such that $x^\star$ minimizes $h(\cdot)$
     ◮ Euler's forward method (explicit):
     \[
     \frac{x(t + \Delta t) - x(t)}{\Delta t} = -\nabla h(x(t)) \;\Rightarrow\; x^{k+1} = x^k - \gamma \nabla h(x^k)
     \]
     ◮ Euler's backward method (implicit):
     \[
     \frac{x(t + \Delta t) - x(t)}{\Delta t} = -\nabla h(x(t + \Delta t)) \;\Rightarrow\; x^{k+1} = x^k - \gamma \nabla h(x^{k+1})
     \]
     ◮ the implicit method is known to have better stability properties
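The stability gap is already visible for a quadratic $h(x) = \tfrac{1}{2}x^\top Q x$, where the backward step has the closed form $x^{k+1} = (I + \gamma Q)^{-1} x^k$. A small sketch with a made-up ill-conditioned $Q$:

```python
import numpy as np

# forward vs. backward Euler on h(x) = 0.5 * x^T Q x (illustrative, made-up Q)
Q = np.diag([1.0, 100.0])                  # ill-conditioned quadratic
gamma = 0.1                                # gamma > 2/100: forward step is unstable
x_fw, x_bw = np.ones(2), np.ones(2)

for _ in range(50):
    x_fw = x_fw - gamma * (Q @ x_fw)                       # explicit step
    x_bw = np.linalg.solve(np.eye(2) + gamma * Q, x_bw)    # implicit step

print(np.linalg.norm(x_fw), np.linalg.norm(x_bw))  # forward blows up, backward -> 0
```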

  11. ◮ the big idea is to generalize this to the inclusion problem
     \[
     0 \in \frac{dx(t)}{dt} + Tx(t)
     \]
     ◮ the forward step (explicit): $x^{k+1} \in x^k - \gamma T x^k$
     ◮ the backward step (implicit, less sensitive to ill conditioning):
     \[
     x^{k+1} = \underbrace{(\mathrm{Id} + \gamma T)^{-1}}_{\text{resolvent}} x^k
     \]
     ◮ the splitting principle: to find $x$ such that $0 \in Tx$, combine these basic operations (an idea also borrowed from finite differences)
     ◮ the backward step $J_{\gamma T} = (\mathrm{Id} + \gamma T)^{-1}$ might not be easy to compute
     ◮ split $T = A + B + \cdots$ with one or more terms having an easy-to-compute resolvent
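For instance, with $T = \partial f$ and $f = \|\cdot\|_1$ (an example chosen here for illustration), the resolvent is the cheap soft-thresholding map, and the backward iteration converges to the minimizer $x = 0$ for any stepsize $\gamma > 0$, with no tuning:

```python
import numpy as np

# backward steps x_{k+1} = (Id + gamma*∂f)^{-1} x_k for f = ||.||_1;
# the resolvent is soft-thresholding, well defined for any gamma > 0
gamma, x = 10.0, np.array([3.0, -25.0, 0.5])

for k in range(5):
    x = np.sign(x) * np.maximum(np.abs(x) - gamma, 0.0)   # prox_{gamma*||.||_1}
    print(k, x)   # reaches the minimizer x = 0 after a few steps
```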

  12. Operator Splittings
     ◮ two-term splittings:
       ◮ forward-backward splitting (Lions and Mercier, 1979)
       ◮ Douglas-Rachford splitting (Lions and Mercier, 1979)
       ◮ Tseng's forward-backward-forward splitting (Tseng, 2000)
     ◮ three-term splittings:
       ◮ three-operator splitting (Davis and Yin, 2015)
       ◮ Vũ-Condat's primal-dual algorithm, equivalent to forward-backward splitting in a certain space (Vũ and Condat, 2013)
       ◮ forward-Douglas-Rachford splitting, only when the third operator is a normal cone operator (Briceño-Arias, 2013)
     our proposed method: Asymmetric Forward-Backward-Adjoint splitting (AFBA)

  13. Forward-Backward Splitting
     ◮ monotone inclusion: $0 \in Ax + Bx$
     ◮ $A$ maximally monotone, $B$ single-valued and cocoercive
     ◮ minimization problem:
     \[
     \underset{x \in \mathbb{R}^n}{\text{minimize}} \quad \underbrace{f(x)}_{\text{nonsmooth}} + \underbrace{h(x)}_{\text{smooth}}
     \]
     ◮ forward-backward iteration: $x^{n+1} = (\mathrm{Id} + \gamma A)^{-1}(\mathrm{Id} - \gamma B)x^n$
     ◮ proximal gradient method: $x^{n+1} = \mathrm{prox}_{\gamma f}(x^n - \gamma \nabla h(x^n))$
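A minimal proximal-gradient sketch, here for $f = \lambda\|\cdot\|_1$ and $h = \tfrac{1}{2}\|Ax - b\|^2$ with made-up data; the stepsize uses the Lipschitz constant $\beta = \|A\|^2$ of $\nabla h$:

```python
import numpy as np

# proximal gradient: x_{n+1} = prox_{gamma f}(x_n - gamma * grad h(x_n))
rng = np.random.default_rng(2)
A, b, lam = rng.standard_normal((30, 15)), rng.standard_normal(30), 0.1
gamma = 1.0 / np.linalg.norm(A, 2) ** 2    # gamma in (0, 2/beta), beta = ||A||^2

x = np.zeros(15)
for _ in range(500):
    z = x - gamma * A.T @ (A @ x - b)                          # forward (gradient) step
    x = np.sign(z) * np.maximum(np.abs(z) - gamma * lam, 0.0)  # backward (prox) step
```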

  14. A Primal-Dual Algorithm
     Algorithm 1
     Inputs: $x^0 \in \mathbb{R}^n$, $y^0 \in \mathbb{R}^m$
     for $n = 0, 1, \ldots$ do
     \[
     \begin{aligned}
     \bar{x}^n &= \mathrm{prox}_{\gamma_1 f}\big(x^n - \gamma_1 L^\top y^n - \gamma_1 \nabla h(x^n)\big) \\
     y^{n+1} &= \mathrm{prox}_{\gamma_2 g^*}\big(y^n + \gamma_2 L \bar{x}^n\big) \\
     x^{n+1} &= \bar{x}^n - \gamma_1 L^\top (y^{n+1} - y^n)
     \end{aligned}
     \]
     ◮ Arrow-Hurwicz updates
     ◮ it converges if $\beta_h \gamma_1 < 2 - 2\gamma_1\gamma_2\|L\|^2$, i.e. $\gamma_1(\beta_h/2 + \gamma_2\|L\|^2) < 1$
     ◮ 2 matrix-vector products per iteration
     ◮ a new algorithm
     ◮ it generalizes the method of Drori, Sabach, and Teboulle (2015) to include a nonsmooth term $f$
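A direct numpy transcription of Algorithm 1 as a sketch; the prox maps, gradient, matrix L, and stepsizes are assumed to be supplied by the caller:

```python
import numpy as np

def algorithm1(x, y, L, prox_f, prox_gstar, grad_h, gamma1, gamma2, iters=1000):
    """Sketch of Algorithm 1; prox_f/prox_gstar take (point, stepsize)."""
    for _ in range(iters):
        # primal forward-backward step
        x_bar = prox_f(x - gamma1 * (L.T @ y) - gamma1 * grad_h(x), gamma1)
        # dual backward step on the conjugate g*
        y_next = prox_gstar(y + gamma2 * (L @ x_bar), gamma2)
        # adjoint correction step
        x = x_bar - gamma1 * (L.T @ (y_next - y))
        y = y_next
    return x, y
```

Only $\mathrm{prox}_g$ is really needed for the dual step, since by the Moreau decomposition $\mathrm{prox}_{\gamma g^*}(v) = v - \gamma\, \mathrm{prox}_{\gamma^{-1} g}(v/\gamma)$.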

  15. Example: Generalized Lasso with Box Constraint
     \[
     \underset{x \in \mathbb{R}^n}{\text{minimize}} \quad \underbrace{\tfrac{1}{2}\|Ax - b\|^2}_{h(x)} + \underbrace{\lambda\|Lx\|_1}_{g(Lx)} + \underbrace{\delta_{l \le x \le u}(x)}_{f(x)}
     \]
     The first two steps become
     \[
     \begin{aligned}
     \bar{x}^n &= P_{l \le x \le u}\big(x^n - \gamma_1 L^\top y^n - \gamma_1 A^\top (Ax^n - b)\big) \\
     y^{n+1} &= v^n - \mathrm{sign}(v^n) \max\{|v^n| - \lambda, 0\}, \qquad v^n = y^n + \gamma_2 L \bar{x}^n \\
     x^{n+1} &= \bar{x}^n - \gamma_1 L^\top (y^{n+1} - y^n)
     \end{aligned}
     \]
     (the $y$-update, applied componentwise, is the projection of $v^n$ onto $[-\lambda, \lambda]^m$, i.e. $\mathrm{prox}_{\gamma_2 g^*}$ for $g = \lambda\|\cdot\|_1$)
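Putting the pieces together, a sketch instantiating the `algorithm1` routine from the previous slide for this example, with illustrative data and deliberately conservative stepsizes chosen to satisfy the stated condition:

```python
import numpy as np

# box-constrained generalized lasso via Algorithm 1 (illustrative data;
# algorithm1 as sketched above)
rng = np.random.default_rng(3)
A, b = rng.standard_normal((30, 15)), rng.standard_normal(30)
L = np.diff(np.eye(15), axis=0)
lam, l, u = 0.1, -np.ones(15), np.ones(15)

prox_f = lambda v, g: np.clip(v, l, u)           # projection onto the box
prox_gstar = lambda v, g: np.clip(v, -lam, lam)  # g* = indicator of ||.||_inf <= lam
grad_h = lambda v: A.T @ (A @ v - b)

beta = np.linalg.norm(A, 2) ** 2                 # Lipschitz constant of grad h
gamma1 = 0.5 / beta
gamma2 = 0.5 / (gamma1 * np.linalg.norm(L, 2) ** 2)   # gamma1*gamma2*||L||^2 = 0.5

x, y = algorithm1(np.zeros(15), np.zeros(14), L,
                  prox_f, prox_gstar, grad_h, gamma1, gamma2)
```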

  16. Distributed Optimization
     ◮ large networks, each node having its own data and processing unit
     ◮ each agent can communicate with its neighbors and is not aware of the other agents in the network
     ◮ plug and play, or distributed reconfiguration: when agents are added or removed, only their neighbors are affected
     ◮ possibility of asynchronous algorithms, including transmission delays

  17. Application: AFBA and Distributed Optimization
     \[
     \underset{x \in \mathbb{R}^n}{\text{minimize}} \quad \sum_{i=1}^{N} \underbrace{f_i(x)}_{\text{nonsmooth}} + \underbrace{g_i(L_i x)}_{\text{nonsmooth}} + \underbrace{h_i(x)}_{\text{smooth}}
     \]
     ◮ $N$ agents, each only with private $f_i$, $g_i$, $L_i$, $h_i$
     ◮ undirected connected graph $G = (V, E)$
     ◮ each agent $i$ can communicate with its neighbors $j \in N_i = \{ j \in V \mid (i, j) \in E \}$
     [Figure: a 6-node example graph; agent 1 holds $f_1, g_1, L_1, h_1$ and has neighbors $N_1 = \{2, 4, 6\}$.]
     ◮ goal: minimize the aggregate of the private cost functions over a connected graph
     \[
     \begin{aligned}
     \underset{x_i \in \mathbb{R}^n}{\text{minimize}} \quad & \sum_{i=1}^{N} f_i(x_i) + g_i(L_i x_i) + h_i(x_i) \\
     \text{subject to} \quad & x_i = x_j \qquad (i, j) \in E
     \end{aligned}
     \]
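The consensus constraint $x_i = x_j$ for $(i,j) \in E$ can be encoded linearly with the graph incidence matrix. A small sketch with a hypothetical 6-node edge list (scalar $x_i$ for simplicity; for $x_i \in \mathbb{R}^n$ one would use the Kronecker product with the identity):

```python
import numpy as np

# consensus constraints x_i = x_j, (i,j) in E, written as B x = 0 with the
# graph incidence matrix; hypothetical edge list for a connected 6-node graph
edges = [(0, 1), (0, 3), (0, 5), (1, 2), (2, 3), (3, 4), (4, 5)]
num_nodes = 6

B = np.zeros((len(edges), num_nodes))      # row for edge (i, j): e_i - e_j
for e, (i, j) in enumerate(edges):
    B[e, i], B[e, j] = 1.0, -1.0

x = np.full(num_nodes, 3.14)               # any consensus vector ...
assert np.allclose(B @ x, 0.0)             # ... lies in the nullspace of B
```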
