SLIDE 1

Recent Developments of Alternating Direction Method of Multipliers with Multi-Block Variables

Shiqian Ma

Department of Systems Engineering and Engineering Management The Chinese University of Hong Kong

2014 Workshop on Optimization for Modern Computation, BICMR, Beijing, China, September 2, 2014

SLIDE 2

Outline

• ADMM for N = 2
• Existing work on ADMM for N ≥ 3
• Convergence rates of ADMM for N ≥ 3
• BSUM-M

SLIDE 3

Alternating Direction Method of Multipliers (ADMM)

Convex optimization problem:

$$\min\ f_1(x_1) + f_2(x_2) + \cdots + f_N(x_N)\quad \text{s.t.}\ A_1x_1 + A_2x_2 + \cdots + A_Nx_N = b,\quad x_j \in \mathcal{X}_j,\ j = 1, 2, \dots, N,$$

where each $f_j$ is a closed convex function and each $\mathcal{X}_j$ is a closed convex set.

Augmented Lagrangian function:

$$\mathcal{L}_\gamma(x_1, \dots, x_N; \lambda) := \sum_{j=1}^N f_j(x_j) - \Big\langle \lambda,\ \sum_{j=1}^N A_jx_j - b \Big\rangle + \frac{\gamma}{2}\Big\|\sum_{j=1}^N A_jx_j - b\Big\|_2^2.$$

SLIDE 4

Multi-Block ADMM

Augmented Lagrangian function:

$$\mathcal{L}_\gamma(x_1, \dots, x_N; \lambda) := \sum_{j=1}^N f_j(x_j) - \Big\langle \lambda,\ \sum_{j=1}^N A_jx_j - b \Big\rangle + \frac{\gamma}{2}\Big\|\sum_{j=1}^N A_jx_j - b\Big\|_2^2.$$

Multi-block ADMM:

$$\begin{aligned}
x_1^{k+1} &:= \operatorname*{argmin}_{x_1 \in \mathcal{X}_1}\ \mathcal{L}_\gamma(x_1, x_2^k, \dots, x_N^k; \lambda^k)\\
x_2^{k+1} &:= \operatorname*{argmin}_{x_2 \in \mathcal{X}_2}\ \mathcal{L}_\gamma(x_1^{k+1}, x_2, x_3^k, \dots, x_N^k; \lambda^k)\\
&\ \ \vdots\\
x_N^{k+1} &:= \operatorname*{argmin}_{x_N \in \mathcal{X}_N}\ \mathcal{L}_\gamma(x_1^{k+1}, x_2^{k+1}, \dots, x_{N-1}^{k+1}, x_N; \lambda^k)\\
\lambda^{k+1} &:= \lambda^k - \gamma\Big(\sum_{j=1}^N A_jx_j^{k+1} - b\Big).
\end{aligned}$$

Update the primal variables in a Gauss-Seidel manner.
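To make the sweep concrete, here is a minimal Python sketch (not from the talk) under the simplifying assumption that each $f_j(x_j) = \frac{\sigma_j}{2}\|x_j\|^2$ and $\mathcal{X}_j = \mathbb{R}^{n_j}$, so every block minimization has a closed form; all names are illustrative.

```python
import numpy as np

def multi_block_admm(A_list, b, sigma, gamma=1.0, iters=1000):
    """Gauss-Seidel multi-block ADMM sketch for
        min sum_j (sigma_j/2)*||x_j||^2   s.t.   sum_j A_j x_j = b,
    a toy case where each block update is a small linear solve."""
    x = [np.zeros(A.shape[1]) for A in A_list]
    lam = np.zeros(len(b))
    for _ in range(iters):
        for j, Aj in enumerate(A_list):
            # Residual contributed by the other blocks at their latest values
            r = sum(A_list[i] @ x[i] for i in range(len(A_list)) if i != j) - b
            # argmin over x_j: (sigma_j*I + gamma*Aj'Aj) x_j = Aj'(lam - gamma*r)
            H = sigma[j] * np.eye(Aj.shape[1]) + gamma * Aj.T @ Aj
            x[j] = np.linalg.solve(H, Aj.T @ (lam - gamma * r))
        # Dual ascent step with the same gamma
        lam = lam - gamma * (sum(A @ xi for A, xi in zip(A_list, x)) - b)
    return x, lam
```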

SLIDE 5

ADMM for N = 2

ADMM for N = 2:

$$\begin{aligned}
x_1^{k+1} &:= \operatorname*{argmin}_{x_1 \in \mathcal{X}_1}\ \mathcal{L}_\gamma(x_1, x_2^k; \lambda^k)\\
x_2^{k+1} &:= \operatorname*{argmin}_{x_2 \in \mathcal{X}_2}\ \mathcal{L}_\gamma(x_1^{k+1}, x_2; \lambda^k)\\
\lambda^{k+1} &:= \lambda^k - \gamma\big(A_1x_1^{k+1} + A_2x_2^{k+1} - b\big).
\end{aligned}$$

The method has a long history, going back to variational methods for PDEs in the 1950s. It is related to the Douglas-Rachford and Peaceman-Rachford operator splitting methods for finding a zero of a sum of monotone operators: find x such that 0 ∈ A(x) + B(x). It has been revisited recently for sparse optimization [Wang-Yang-Yin-Zhang-2008; Goldstein-Osher-2009; Boyd-etal-2011].

SLIDE 6

Global Convergence of ADMM for N = 2

ADMM for N = 2:

$$\begin{aligned}
x_1^{k+1} &:= \operatorname*{argmin}_{x_1 \in \mathcal{X}_1}\ \mathcal{L}_\gamma(x_1, x_2^k; \lambda^k)\\
x_2^{k+1} &:= \operatorname*{argmin}_{x_2 \in \mathcal{X}_2}\ \mathcal{L}_\gamma(x_1^{k+1}, x_2; \lambda^k)\\
\lambda^{k+1} &:= \lambda^k - \gamma\big(A_1x_1^{k+1} + A_2x_2^{k+1} - b\big).
\end{aligned}$$

Global convergence for any γ > 0 (Fortin-Glowinski-1983; Gabay-1983; Glowinski-Le Tallec-1989; Eckstein-Bertsekas-1992).

ADMM for N = 2 with fixed dual step size:

$$\begin{aligned}
x_1^{k+1} &:= \operatorname*{argmin}_{x_1 \in \mathcal{X}_1}\ \mathcal{L}_\gamma(x_1, x_2^k; \lambda^k)\\
x_2^{k+1} &:= \operatorname*{argmin}_{x_2 \in \mathcal{X}_2}\ \mathcal{L}_\gamma(x_1^{k+1}, x_2; \lambda^k)\\
\lambda^{k+1} &:= \lambda^k - \alpha\gamma\big(A_1x_1^{k+1} + A_2x_2^{k+1} - b\big),
\end{aligned}$$

where α > 0 is a fixed dual step size. Global convergence for any γ > 0 and α ∈ (0, (1 + √5)/2).
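As a hedged illustration, the α-scaled scheme can be run on random toy data (all problem data below are made up; quadratic $f_j$ as in the earlier sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
A1, A2 = rng.standard_normal((5, 3)), rng.standard_normal((5, 4))
b = rng.standard_normal(5)
gamma, alpha = 1.0, 1.5          # alpha below (1 + sqrt(5))/2 ~ 1.618

x1, x2, lam = np.zeros(3), np.zeros(4), np.zeros(5)
for _ in range(2000):
    # Block updates for f_j = 0.5*||x_j||^2 (closed forms as before)
    x1 = np.linalg.solve(np.eye(3) + gamma * A1.T @ A1,
                         A1.T @ (lam - gamma * (A2 @ x2 - b)))
    x2 = np.linalg.solve(np.eye(4) + gamma * A2.T @ A2,
                         A2.T @ (lam - gamma * (A1 @ x1 - b)))
    # Dual update scaled by the fixed step size alpha
    lam = lam - alpha * gamma * (A1 @ x1 + A2 @ x2 - b)
print(np.linalg.norm(A1 @ x1 + A2 @ x2 - b))   # primal residual, tends to 0
```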

SLIDE 7

Sublinear Convergence of ADMM for N = 2

• Ergodic O(1/k) convergence (He-Yuan-2012)
• Non-ergodic O(1/k) convergence (He-Yuan-2012)
• Ergodic O(1/k) convergence (Monteiro-Svaiter-2013)

SLIDE 8

Linear Convergence Rate of ADMM for N = 2

• The Douglas-Rachford splitting method converges linearly if B is coercive and Lipschitz (Lions-Mercier-1979)
• Linear convergence for solving linear programs (Eckstein-Bertsekas-1990)
• Linear convergence for quadratic programs (Han-Yuan-2013; Boley-2013)

SLIDE 9

Generalized ADMM

Generalized ADMM for N = 2 (Deng-Yin-2012):

$$\begin{aligned}
x_1^{k+1} &:= \operatorname*{argmin}_{x_1 \in \mathcal{X}_1}\ \mathcal{L}_\gamma(x_1, x_2^k; \lambda^k) + \tfrac{1}{2}\|x_1 - x_1^k\|_P^2\\
x_2^{k+1} &:= \operatorname*{argmin}_{x_2 \in \mathcal{X}_2}\ \mathcal{L}_\gamma(x_1^{k+1}, x_2; \lambda^k) + \tfrac{1}{2}\|x_2 - x_2^k\|_Q^2\\
\lambda^{k+1} &:= \lambda^k - \alpha\gamma\big(A_1x_1^{k+1} + A_2x_2^{k+1} - b\big).
\end{aligned}$$

One sufficient condition guaranteeing global linear convergence: P = Q = 0, α = 1, f_2 strongly convex, ∇f_2 Lipschitz continuous, and A_2 full row rank.

SLIDE 10

ADMM for N ≥ 3: a counterexample

A negative result (Chen-He-Ye-Yuan-2013): the direct extension of multi-block ADMM is not necessarily convergent.

A counterexample: $f_1 = f_2 = f_3 = 0$ with the constraint $A_1x_1 + A_2x_2 + A_3x_3 = 0$, where

$$A = (A_1, A_2, A_3) = \begin{pmatrix} 1 & 1 & 1\\ 1 & 1 & 2\\ 1 & 2 & 2 \end{pmatrix}.$$

The update of multi-block ADMM with γ = 1 is

$$\begin{pmatrix}
3 & 0 & 0 & 0 & 0 & 0\\
4 & 6 & 0 & 0 & 0 & 0\\
5 & 7 & 9 & 0 & 0 & 0\\
1 & 1 & 1 & 1 & 0 & 0\\
1 & 1 & 2 & 0 & 1 & 0\\
1 & 2 & 2 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} x_1^{k+1}\\ x_2^{k+1}\\ x_3^{k+1}\\ \lambda^{k+1} \end{pmatrix}
=
\begin{pmatrix}
0 & -4 & -5 & 1 & 1 & 1\\
0 & 0 & -7 & 1 & 1 & 2\\
0 & 0 & 0 & 1 & 2 & 2\\
0 & 0 & 0 & 1 & 0 & 0\\
0 & 0 & 0 & 0 & 1 & 0\\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} x_1^{k}\\ x_2^{k}\\ x_3^{k}\\ \lambda^{k} \end{pmatrix}.$$

SLIDE 11

ADMM for N ≥ 3: a counterexample

Equivalently,

$$\begin{pmatrix} x_2^{k+1}\\ x_3^{k+1}\\ \lambda^{k+1} \end{pmatrix} = M \begin{pmatrix} x_2^{k}\\ x_3^{k}\\ \lambda^{k} \end{pmatrix}, \quad \text{where } M = \frac{1}{162}\begin{pmatrix}
144 & -9 & -9 & -9 & 18\\
8 & 157 & -5 & 13 & -8\\
64 & 122 & 122 & -58 & -64\\
56 & -35 & -35 & 91 & 56\\
-88 & -26 & -26 & -62 & 88
\end{pmatrix}.$$

Note that ρ(M) > 1.

Theorem (Chen-He-Ye-Yuan-2013): There exists an example for which, for any choice of γ > 0, the direct extension of three-block ADMM started from a suitable real initial point fails to converge.
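The divergence claim is easy to check numerically; a small verification (assuming numpy) using the matrix M above:

```python
import numpy as np

# Iteration matrix M from the slide, mapping (x2, x3, lambda) to the next iterate
M = np.array([[144,  -9,  -9,  -9,  18],
              [  8, 157,  -5,  13,  -8],
              [ 64, 122, 122, -58, -64],
              [ 56, -35, -35,  91,  56],
              [-88, -26, -26, -62,  88]]) / 162.0

rho = max(abs(np.linalg.eigvals(M)))
print(f"rho(M) = {rho:.4f}")   # spectral radius > 1 => the linear iteration diverges
```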

SLIDE 12

ADMM for N ≥ 3: Strong convexity?

$$\min\ 0.05x_1^2 + 0.05x_2^2 + 0.05x_3^2 \quad \text{s.t.}\ \begin{pmatrix} 1 & 1 & 1\\ 1 & 1 & 2\\ 1 & 2 & 2 \end{pmatrix}\begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} = 0.$$

• For γ = 1, the corresponding iteration matrix has ρ(M) = 1.0087 > 1
• One can find an initial point from which ADMM diverges
• Even for strongly convex programs, the directly extended ADMM is not necessarily convergent for a given γ > 0.
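Since each ADMM step is affine here, the iteration matrix can be extracted numerically by applying one sweep to the standard basis vectors; a sketch (the function name is illustrative):

```python
import numpy as np

A = np.array([[1., 1., 1.], [1., 1., 2.], [1., 2., 2.]])
gamma, sig = 1.0, 0.1   # 0.05*x^2 has second derivative 0.1

def admm_sweep(z):
    """One ADMM sweep for min 0.05*sum_i x_i^2 s.t. Ax = 0, acting on
    z = (x2, x3, lambda); x1 is recomputed first within the sweep."""
    x, lam = np.array([0.0, z[0], z[1]]), z[2:].copy()
    for j in range(3):
        a = A[:, j]
        r = A @ x - a * x[j]                       # other blocks' contribution
        x[j] = a @ (lam - gamma * r) / (sig + gamma * a @ a)
    lam = lam - gamma * (A @ x)
    return np.concatenate(([x[1], x[2]], lam))

M = np.column_stack([admm_sweep(e) for e in np.eye(5)])  # the map is linear
print(max(abs(np.linalg.eigvals(M))))   # the slide reports 1.0087 for gamma = 1
```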

SLIDE 13

ADMM for N ≥ 3: Strong convexity works!

Global convergence. Theorem (Han-Yuan-2012): If $f_i$, i = 1, ..., N, are strongly convex with parameters $\sigma_i$, and

$$0 < \gamma < \min_{i=1,\dots,N} \frac{2\sigma_i}{3(N-1)\lambda_{\max}(A_i^\top A_i)},$$

then multi-block ADMM converges globally.

Convergence rate?
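The bound is straightforward to evaluate for given data; a helper sketch (name hypothetical):

```python
import numpy as np

def han_yuan_gamma_bound(A_list, sigma):
    """Upper bound on gamma from the Han-Yuan (2012) condition:
    gamma < min_i 2*sigma_i / (3*(N-1)*lambda_max(A_i' A_i));
    any gamma strictly below the returned value satisfies the hypothesis."""
    N = len(A_list)
    return min(2.0 * s / (3.0 * (N - 1) * np.linalg.eigvalsh(A.T @ A).max())
               for A, s in zip(A_list, sigma))
```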

SLIDE 14

ADMM for N ≥ 3: weaker condition and convergence rate

Let u := (x_1, ..., x_N) and define the ergodic averages

$$\bar{x}_i^t = \frac{1}{t+1}\sum_{k=0}^{t} x_i^{k+1},\ 1 \le i \le N, \qquad \bar{\lambda}^t = \frac{1}{t+1}\sum_{k=0}^{t} \lambda^{k+1}.$$

Theorem (Lin-Ma-Zhang-2014a): If $f_2, \dots, f_N$ are strongly convex, $f_1$ is convex, and

$$\gamma \le \min\left\{\min_{2\le i\le N-1}\frac{2\sigma_i}{(2N-i)(i-1)\lambda_{\max}(A_i^\top A_i)},\ \frac{2\sigma_N}{(N-2)(N+1)\lambda_{\max}(A_N^\top A_N)}\right\},$$

then $|f(\bar{u}^t) - f(u^*)| = O(1/t)$ and $\big\|\sum_{i=1}^N A_i\bar{x}_i^t - b\big\| = O(1/t)$.

• Weaker condition
• Ergodic O(1/t) convergence rate in terms of objective value and primal feasibility

SLIDE 15

ADMM for N ≥ 3: non-ergodic convergence rate

Optimality measure: if

$$A_2x_2^{k+1} - A_2x_2^k = 0,\qquad A_3x_3^{k+1} - A_3x_3^k = 0,\qquad A_1x_1^{k+1} + A_2x_2^{k+1} + A_3x_3^{k+1} - b = 0,$$

then $(x_1^{k+1}, x_2^{k+1}, x_3^{k+1}, \lambda^{k+1})$ is optimal. Define

$$R^{k+1} := \|A_1x_1^{k+1} + A_2x_2^{k+1} + A_3x_3^{k+1} - b\|^2 + 2\|A_2x_2^{k+1} - A_2x_2^k\|^2 + 3\|A_3x_3^{k+1} - A_3x_3^k\|^2.$$

We can prove: $R^k = o(1/k)$.
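Since $R^{k+1}$ only involves the iterates, it can double as a stopping criterion; a sketch for N = 3 (names illustrative):

```python
import numpy as np

def optimality_residual(A_list, x_new, x_old, b):
    """R^{k+1} for N = 3: squared primal residual plus the weighted squared
    successive differences of A_i x_i for i = 2, 3, as defined on the slide."""
    feas = sum(A @ x for A, x in zip(A_list, x_new)) - b
    R = feas @ feas
    for w, A, xn, xo in zip((2.0, 3.0), A_list[1:], x_new[1:], x_old[1:]):
        d = A @ (xn - xo)
        R += w * (d @ d)
    return R
```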

SLIDE 16

ADMM for N ≥ 3: non-ergodic convergence rate

Theorem (Lin-Ma-Zhang-2014a): If $f_2$ and $f_3$ are strongly convex, and

$$\gamma \le \min\left\{\frac{\sigma_2}{2\lambda_{\max}(A_2^\top A_2)},\ \frac{\sigma_3}{2\lambda_{\max}(A_3^\top A_3)}\right\},$$

then $\sum_{k=1}^{\infty} R^k < +\infty$ and $R^k = o(1/k)$.

SLIDE 17

ADMM for N ≥ 3: non-ergodic convergence rate

Theorem (Lin-Ma-Zhang-2014a): If $f_2, \dots, f_N$ are strongly convex, and

$$\gamma \le \min\left\{\min_{2\le i\le N-1}\frac{2\sigma_i}{(2N-i)(i-1)\lambda_{\max}(A_i^\top A_i)},\ \frac{2\sigma_N}{(N-2)(N+1)\lambda_{\max}(A_N^\top A_N)}\right\},$$

then $\sum_{k=1}^{\infty} R^k < +\infty$ and $R^k = o(1/k)$, where

$$R^{k+1} := \Big\|\sum_{i=1}^N A_ix_i^{k+1} - b\Big\|^2 + \sum_{i=2}^N \frac{(2N-i)(i-1)}{2}\,\|A_ix_i^k - A_ix_i^{k+1}\|^2.$$

SLIDE 18

ADMM for N ≥ 3: global linear convergence

Global linear convergence of ADMM for N ≥ 3 (Lin-Ma-Zhang-2014b):

Scenario   strongly convex   Lipschitz gradient    full row rank   full column rank
1          f_2, ..., f_N     ∇f_N                  A_N             —
2          f_1, ..., f_N     ∇f_1, ..., ∇f_N       —               —
3          f_2, ..., f_N     ∇f_1, ..., ∇f_N       —               A_1

Table: Three scenarios leading to global linear convergence.

These conditions reduce to those in (Deng-Yin-2012) when N = 2.

SLIDE 19

Variants: Modified Multi-Block ADMM

Proximal Jacobian ADMM (Deng-Lai-Peng-Yin-2014):

$$\begin{aligned}
x_1^{k+1} &:= \operatorname*{argmin}_{x_1 \in \mathcal{X}_1}\ \mathcal{L}_\gamma(x_1, x_2^k, \dots, x_N^k; \lambda^k) + \tfrac{1}{2}\|x_1 - x_1^k\|_{P_1}^2\\
x_2^{k+1} &:= \operatorname*{argmin}_{x_2 \in \mathcal{X}_2}\ \mathcal{L}_\gamma(x_1^k, x_2, x_3^k, \dots, x_N^k; \lambda^k) + \tfrac{1}{2}\|x_2 - x_2^k\|_{P_2}^2\\
&\ \ \vdots\\
x_N^{k+1} &:= \operatorname*{argmin}_{x_N \in \mathcal{X}_N}\ \mathcal{L}_\gamma(x_1^k, x_2^k, \dots, x_{N-1}^k, x_N; \lambda^k) + \tfrac{1}{2}\|x_N - x_N^k\|_{P_N}^2\\
\lambda^{k+1} &:= \lambda^k - \alpha\gamma\Big(\sum_{j=1}^N A_jx_j^{k+1} - b\Big).
\end{aligned}$$

Conditions for convergence:
• $P_i \succ \gamma(1/\epsilon_i - 1)A_i^\top A_i$, i = 1, 2, ..., N
• $\sum_{i=1}^N \epsilon_i < 2 - \alpha$
• o(1/k) convergence rate in a non-ergodic sense (see the sketch below)
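A hedged sketch of the Jacobian variant, again assuming the toy quadratic $f_j$ from earlier so block updates stay closed-form; here $\epsilon_i$ is set uniformly to $(2-\alpha)/(2N)$, which keeps $\sum_i \epsilon_i < 2 - \alpha$, and $P_i = \tau_i I$ with $\tau_i$ slightly above $\gamma(1/\epsilon_i - 1)\lambda_{\max}(A_i^\top A_i)$:

```python
import numpy as np

def prox_jacobian_admm(A_list, b, sigma, gamma=1.0, alpha=1.0, iters=1000):
    """Proximal Jacobian ADMM sketch with diagonal proximal terms P_i = tau_i*I,
    for min sum_j (sigma_j/2)*||x_j||^2  s.t.  sum_j A_j x_j = b."""
    N = len(A_list)
    eps = (2.0 - alpha) / (2.0 * N)          # uniform eps_i, sum < 2 - alpha
    tau = [1.01 * gamma * (1.0 / eps - 1.0) * np.linalg.eigvalsh(A.T @ A).max()
           for A in A_list]                  # tau_i*I > gamma*(1/eps_i - 1)*Ai'Ai
    x = [np.zeros(A.shape[1]) for A in A_list]
    lam = np.zeros(len(b))
    for _ in range(iters):
        x_old = [xi.copy() for xi in x]
        for j, Aj in enumerate(A_list):
            # Jacobi sweep: other blocks enter at the OLD iterate x^k
            r = sum(A_list[i] @ x_old[i] for i in range(N) if i != j) - b
            H = (sigma[j] + tau[j]) * np.eye(Aj.shape[1]) + gamma * Aj.T @ Aj
            x[j] = np.linalg.solve(H, Aj.T @ (lam - gamma * r) + tau[j] * x_old[j])
        lam = lam - alpha * gamma * (sum(A @ xi for A, xi in zip(A_list, x)) - b)
    return x, lam
```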

SLIDE 20

Variants: Modified Multi-Block ADMM

Proximal Gauss-Seidel ADMM

$$\begin{aligned}
x_1^{k+1} &:= \operatorname*{argmin}_{x_1 \in \mathcal{X}_1}\ \mathcal{L}_\gamma(x_1, x_2^k, x_3^k; \lambda^k) + \tfrac{1}{2}\|x_1 - x_1^k\|_{P_1}^2\\
x_2^{k+1} &:= \operatorname*{argmin}_{x_2 \in \mathcal{X}_2}\ \mathcal{L}_\gamma(x_1^{k+1}, x_2, x_3^k; \lambda^k) + \tfrac{1}{2}\|x_2 - x_2^k\|_{P_2}^2\\
x_3^{k+1} &:= \operatorname*{argmin}_{x_3 \in \mathcal{X}_3}\ \mathcal{L}_\gamma(x_1^{k+1}, x_2^{k+1}, x_3; \lambda^k) + \tfrac{1}{2}\|x_3 - x_3^k\|_{P_3}^2\\
\lambda^{k+1} &:= \lambda^k - \alpha\gamma\Big(\sum_{j=1}^3 A_jx_j^{k+1} - b\Big).
\end{aligned}$$

SLIDE 21

Proximal Gauss-Seidel ADMM

Theorem (Lin-Ma-Zhang-2014c): Ergodic O(1/k) convergence rate in terms of both objective value and primal feasibility, under the following conditions: $f_3$ is strongly convex, and

$$P_1 \succ \gamma\Big(3 + \frac{5}{\epsilon_1}\Big)A_1^\top A_1,\qquad P_2 \succ \gamma\Big(1 + \frac{3}{\epsilon_2}\Big)A_2^\top A_2,\qquad P_3 \succ \gamma\Big(\frac{1}{\epsilon_3} - 1\Big)A_3^\top A_3,$$

$$3(\epsilon_1 + \epsilon_2 + \epsilon_3) < 1 - \alpha,\qquad \gamma < \frac{\sigma_3\epsilon_3}{2(\epsilon_3 + 1)\lambda_{\max}(A_3^\top A_3)}.$$

Ongoing work; more coming soon.

SLIDE 22

The General Problem Formulation

We consider the following convex optimization problem:

$$\min\ f(x) := g(x_1, \cdots, x_K) + \sum_{k=1}^K h_k(x_k)\quad \text{s.t.}\ E_1x_1 + E_2x_2 + \cdots + E_Kx_K = q,\quad x_k \in X_k,\ k = 1, 2, \dots, K. \tag{P}$$

• g(·): a smooth convex function; $h_k$: a nonsmooth convex function
• $x := (x_1^T, \dots, x_K^T)^T \in \Re^n$: block variables
• $X := \prod_{k=1}^K X_k$: feasible set
• $E := (E_1, \cdots, E_K)$ and $h(x) := \sum_{k=1}^K h_k(x_k)$

Augmented Lagrangian function:

$$L(x; y) = g(x) + h(x) + \langle y, q - Ex\rangle + \frac{\rho}{2}\|q - Ex\|^2.$$

SLIDE 23

Small dual step size and error bound condition

ADMM for N ≥ 3 without a strong convexity assumption (Hong-Luo-2012):

$$\begin{aligned}
x_1^{k+1} &:= \operatorname*{argmin}_{x_1 \in X_1}\ L(x_1, x_2^k, \dots, x_N^k; y^k)\\
x_2^{k+1} &:= \operatorname*{argmin}_{x_2 \in X_2}\ L(x_1^{k+1}, x_2, x_3^k, \dots, x_N^k; y^k)\\
&\ \ \vdots\\
x_N^{k+1} &:= \operatorname*{argmin}_{x_N \in X_N}\ L(x_1^{k+1}, x_2^{k+1}, \dots, x_{N-1}^{k+1}, x_N; y^k)\\
y^{k+1} &:= y^k - \alpha\Big(\sum_{j=1}^N A_jx_j^{k+1} - b\Big).
\end{aligned}$$

Strong convexity is not assumed, but other conditions are needed.

SLIDE 24

Small dual step size and error bound condition

Error bound condition: there exist positive scalars τ and δ such that

$$\mathrm{dist}\big(x, X(y)\big) \le \tau\,\big\|\tilde{\nabla}_x L(x; y)\big\| \quad \text{for all } (x, y) \text{ with } \big\|\tilde{\nabla}_x L(x; y)\big\| \le \delta.$$

[Hong-Luo-2012]: provided α is small enough (upper bounded by a constant depending on τ and δ), the small-step-size variant of ADMM converges linearly. In practice, however, the required α is too small and not computable.

SLIDE 25

The BSUM-M Algorithm: Main Ideas

A Block Successive Upper-bound Minimization Method of Multipliers.

Introduce the augmented Lagrangian function for problem (P):

$$L(x; y) = g(x) + h(x) + \langle y, q - Ex\rangle + \frac{\rho}{2}\|q - Ex\|^2,$$

where y is the dual variable and ρ ≥ 0 is the primal stepsize.

Main idea, primal update:
1. Update the primal variables successively (Gauss-Seidel)
2. Optimize an approximate version of L(x; y)

Main idea, dual update:
1. Inexact dual ascent with proper step size control

SLIDE 26

The BSUM-M Algorithm: Details

At iteration r + 1, block variable $x_k$ is updated by solving

$$\min_{x_k \in X_k}\ u_k\big(x_k;\ x_1^{r+1}, \cdots, x_{k-1}^{r+1}, x_k^r, \cdots, x_K^r\big) + \langle y^{r+1}, q - E_kx_k\rangle + h_k(x_k),$$

where $u_k(\,\cdot\,;\ x_1^{r+1}, \cdots, x_{k-1}^{r+1}, x_k^r, \cdots, x_K^r)$ is an upper bound of $g(x) + \frac{\rho}{2}\|q - Ex\|^2$ at the current iterate $(x_1^{r+1}, \cdots, x_{k-1}^{r+1}, x_k^r, \cdots, x_K^r)$.

SLIDE 27

The BSUM-M Algorithm: Details (cont.)

The BSUM-M Algorithm. At each iteration r ≥ 1:

$$\begin{aligned}
y^{r+1} &= y^r + \alpha^r(q - Ex^r) = y^r + \alpha^r\Big(q - \sum_{k=1}^K E_kx_k^r\Big),\\
x_k^{r+1} &= \operatorname*{argmin}_{x_k \in X_k}\ u_k(x_k; w_k^{r+1}) - \langle y^{r+1}, E_kx_k\rangle + h_k(x_k),\quad \forall\, k,
\end{aligned}$$

where $\alpha^r > 0$ is the dual stepsize. To simplify notation, we have defined

$$w_k^{r+1} := (x_1^{r+1}, \cdots, x_{k-1}^{r+1}, x_k^r, x_{k+1}^r, \cdots, x_K^r).$$
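A hedged sketch on a hypothetical instance of (P): $g(x) = \frac{1}{2}\sum_k\|x_k - c_k\|^2$, $h_k = \lambda\|x_k\|_1$, $X_k = \mathbb{R}^{n_k}$, with $u_k$ chosen as the usual proximal-linearized quadratic upper bound, so each block step reduces to a soft-threshold; every name below is illustrative.

```python
import numpy as np

def soft(v, t):
    # Soft-thresholding: the proximal operator of t*||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def bsum_m(E_list, q, c_list, l1, rho=1.0, iters=3000):
    """BSUM-M sketch: g(x) = 0.5*sum_k ||x_k - c_k||^2, h_k = l1*||x_k||_1.
    u_k linearizes g + (rho/2)||Ex - q||^2 at the current point and adds
    (L_k/2)*||v_k - x_k||^2, with L_k bounding the block Hessian."""
    K = len(E_list)
    x = [np.zeros(E.shape[1]) for E in E_list]
    y = np.zeros(len(q))
    L = [1.0 + rho * np.linalg.eigvalsh(E.T @ E).max() for E in E_list]
    for r in range(iters):
        # Dual update first, with diminishing stepsize: sum diverges, alpha_r -> 0
        Ex = sum(E @ xk for E, xk in zip(E_list, x))
        y = y + (1.0 / (r + 1)) * (q - Ex)
        for k, Ek in enumerate(E_list):
            Ex = sum(E @ xk for E, xk in zip(E_list, x))  # Gauss-Seidel: latest blocks
            grad = (x[k] - c_list[k]) + rho * Ek.T @ (Ex - q)
            x[k] = soft(x[k] - (grad - Ek.T @ y) / L[k], l1 / L[k])
    return x, y
```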

SLIDE 28

The BSUM-M Algorithm: Randomized Version

• Select probabilities $\{p_k > 0\}_{k=0}^K$ such that $\sum_{k=0}^K p_k = 1$
• Each iteration t updates only a single, randomly selected primal or dual variable

At iteration t ≥ 1, pick k ∈ {0, ..., K} with probability $p_k$.

If k = 0:

$$y^{t+1} = y^t + \alpha^t(q - Ex^t),\qquad x_k^{t+1} = x_k^t,\ k = 1, \cdots, K.$$

Else, if k ∈ {1, ..., K}:

$$x_k^{t+1} = \operatorname*{argmin}_{x_k \in X_k}\ u_k(x_k; x^t) - \langle y^t, E_kx_k\rangle + h_k(x_k),\qquad x_j^{t+1} = x_j^t\ \ \forall\, j \ne k,\qquad y^{t+1} = y^t.$$
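The randomized variant only changes the coordinate-selection logic; a sketch for the same hypothetical instance as above, with uniform probabilities $p_k = 1/(K+1)$:

```python
import numpy as np

def rbsum_m(E_list, q, c_list, l1, rho=1.0, iters=20000, seed=0):
    """Randomized BSUM-M sketch: each iteration updates one randomly chosen
    variable; k = 0 selects the dual variable, k >= 1 a primal block."""
    rng = np.random.default_rng(seed)
    soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)
    K = len(E_list)
    x = [np.zeros(E.shape[1]) for E in E_list]
    y = np.zeros(len(q))
    L = [1.0 + rho * np.linalg.eigvalsh(E.T @ E).max() for E in E_list]
    for t in range(1, iters + 1):
        k = rng.integers(0, K + 1)               # uniform p_k = 1/(K+1)
        Ex = sum(E @ xk for E, xk in zip(E_list, x))
        if k == 0:
            y = y + (1.0 / t) * (q - Ex)         # diminishing dual stepsize
        else:
            j = k - 1
            grad = (x[j] - c_list[j]) + rho * E_list[j].T @ (Ex - q)
            x[j] = soft(x[j] - (grad - E_list[j].T @ y) / L[j], l1 / L[j])
    return x, y
```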

SLIDE 29

Convergence Analysis: Assumptions

Assumption A (on the problem):
(a) Problem (P) is a convex problem.
(b) $g(x) = \ell(Ax) + \langle x, b\rangle$, with $\ell(\cdot)$ smooth and strictly convex; A need not have full column rank.
(c) The nonsmooth function $h_k$ has the form
$$h_k(x_k) = \lambda_k\|x_k\|_1 + \sum_J w_J\|x_{k,J}\|_2,$$
where $x_k = (\cdots, x_{k,J}, \cdots)$ is a partition of $x_k$, and $\lambda_k \ge 0$, $w_J \ge 0$ are constants.
(d) The feasible sets $\{X_k\}$ are compact polyhedral sets, given by $X_k := \{x_k \mid C_kx_k \le c_k\}$.

SLIDE 30

Convergence Analysis: Assumptions

Assumption B (on $u_k$):
(a) $u_k(v_k; x) \ge g(v_k, x_{-k}) + \frac{\rho}{2}\|E_kv_k + E_{-k}x_{-k} - q\|^2$, $\forall\, v_k \in X_k$, $\forall\, x, k$ (upper bound)
(b) $u_k(x_k; x) = g(x) + \frac{\rho}{2}\|Ex - q\|^2$, $\forall\, x, k$ (locally tight)
(c) $\nabla u_k(x_k; x) = \nabla_k\big[g(x) + \frac{\rho}{2}\|Ex - q\|^2\big]$, $\forall\, x, k$ (gradient consistency)
(d) For any given x, $u_k(v_k; x)$ is strongly convex in $v_k$
(e) For any given x, $u_k(v_k; x)$ has a Lipschitz continuous gradient

[Figure: illustration of the upper bound.]

SLIDE 31

The Convergence Result

Theorem (Hong, Chang, Wang, Razaviyayn, Ma and Luo 2014): Suppose Assumptions A-B hold, and the dual stepsize $\alpha^r$ satisfies

$$\sum_{r=1}^{\infty} \alpha^r = \infty,\qquad \lim_{r\to\infty} \alpha^r = 0.$$

Then we have the following:
1. For BSUM-M, $\lim_{r\to\infty}\|Ex^r - q\| = 0$, and every limit point of $\{x^r, y^r\}$ is a primal and dual optimal solution.
2. For RBSUM-M, $\lim_{t\to\infty}\|Ex^t - q\| = 0$ w.p.1. Further, every limit point of $\{x^t, y^t\}$ is a primal and dual optimal solution w.p.1.

SLIDE 32

Counterexample for multi-block ADMM

• Recently, [Chen-He-Ye-Yuan 13] showed through an example that applying ADMM to a multi-block problem can diverge.
• We show that applying (R)BSUM-M to the same problem converges (with a diminishing dual step size).
• Main message: dual stepsize control is crucial.

Consider the following linear system of equations (unique solution $x_1 = x_2 = x_3 = 0$):

$$E_1x_1 + E_2x_2 + E_3x_3 = 0,\quad \text{with } [E_1\ E_2\ E_3] = \begin{pmatrix} 1 & 1 & 1\\ 1 & 1 & 2\\ 1 & 2 & 2 \end{pmatrix}.$$

SLIDE 33

Counterexample for multi-block ADMM (cont.)

[Figure: iterates x_1, x_2, x_3 and ‖x_1 + x_2 + x_3‖ versus iteration r, generated by BSUM-M; each curve is averaged over 1000 runs with random starting points.]

[Figure: iterates x_1, x_2, x_3 and ‖x_1 + x_2 + x_3‖ versus iteration t, generated by the RBSUM-M algorithm; each curve is averaged over 1000 runs with random starting points.]

SLIDE 34

Thank you for your attention!
