

SLIDE 1

CPCA: A Chebyshev Proxy and Consensus based Algorithm for General Distributed Optimization

(Accepted by 2020 American Control Conference) Zhiyu He, Jianping He*, Cailian Chen and Xinping Guan

Shanghai Jiao Tong University March 2020

*Corresponding author: Jianping He, Email: jphe@sjtu.edu.cn

SLIDE 2

Introduction · Background

Distributed Optimization

◮ What is Distributed Optimization?
Distributed optimization enables multiple agents in a network to collaboratively solve the problem of optimizing the average of local objective functions.

◮ Why not Centralized Optimization?
possible lack of a central authority; efficiency, privacy-preserving and robustness issues¹

Source: http://php.scripts.psu.edu/muz16/image/slide/dis_opt_slide.png

Figure 1: Illustration of Distributed Optimization

¹ A. Nedić et al., "Distributed optimization for control," Annual Review of Control, Robotics, and Autonomous Systems, vol. 1, pp. 77–103, 2018.

SLIDE 3

Introduction · Background

Distributed Optimization: Application Scenarios

  • Distributed Optimization empowers networked multi-agent systems

(a) Distributed Machine Learning²
(b) Distributed Localization in Wireless Sensor Networks³
(c) Distributed Coordination in Smart Grid⁴
(d) Distributed Management of Multi-robot Formations⁵

Figure 2: Application Ranges of Distributed Optimization

² S. Boyd et al., Found. Trends Mach. Learn., 2011. ³ Y. Zhang et al., IEEE Trans. Wireless Commun., 2015. ⁴ C. Zhao et al., IEEE Trans. Smart Grid, 2016. ⁵ W. Ren et al., Robot. Auton. Syst., 2008.

SLIDE 4

Introduction · Existing Works

Distributed Optimization: Existing Works

◮ We classify existing distributed optimization algorithms into three categories:
  Primal Methods: DGD⁶, EXTRA⁷, Acc-DNGD⁸, ...
  Dual Methods: MSDA⁹, Distributed FGM¹⁰, ...
  Primal-Dual Methods: DCS¹¹, MSPD¹², ...

◮ Are there any deficiencies?

1. Convexity assumptions on the objectives are prerequisites.
2. Computational costs can be large, as certain local computations are constantly performed by every agent at every iteration.

⁶ A. Nedić et al., IEEE Trans. Autom. Control, 2009. ⁷ W. Shi et al., SIAM J. Optim., 2015. ⁸ G. Qu et al., IEEE Trans. Autom. Control, 2019. ⁹ K. Scaman et al., in Proc. Int. Conf. Mach. Learn., 2017. ¹⁰ C. A. Uribe et al., arXiv e-prints, 2018. ¹¹ G. Lan et al., Math. Program., 2017. ¹² K. Scaman et al., in Adv. Neural Inf. Process. Syst., 2018.

SLIDE 5

Introduction · Motivation and Contributions

Motivation and Contributions

Motivation
To develop distributed optimization algorithms that
◮ have low computational costs
◮ handle problems with non-convex objectives

Our Contributions
◮ be the first to obtain ε globally optimal solutions of constrained distributed optimization problems without convex-objective assumptions
◮ propose a novel algorithm, CPCA, based on Chebyshev polynomial approximation and consensus
◮ provide comprehensive analysis of the accuracy and complexities of CPCA

SLIDE 6

Our Algorithm: CPCA · Problem Formulation

Problem Formulation

The constrained distributed optimization problem we consider is

$$\min_{x} \; f(x) = \frac{1}{N}\sum_{i=1}^{N} f_i(x), \quad \text{s.t.} \;\; x \in X = \bigcap_{i=1}^{N} X_i, \;\; X_i \subset \mathbb{R}. \tag{1}$$

Assumptions
1. G is a static, connected and undirected graph.
2. Every f_i(x) is Lipschitz continuous on X_i.
3. All X_i are the same closed interval [a, b].

Note
◮ Convexity assumptions on the objectives are dropped.
◮ The problem is a constrained one.

SLIDE 7

Our Algorithm: CPCA · Overview of CPCA

Key Ideas

Inspirations
Researchers use Chebyshev polynomial approximation to substitute for a target function defined on an interval, so as to make the study of its properties much easier:

$$f(x) \approx p(x) = \sum_{i=0}^{m} c_i T_i(x), \quad x \in [-1, 1]$$

Source: T. A. Driscoll et al., Chebfun guide, 2014

Insights
◮ turn to optimizing the approximation (i.e. the proxy) of the global objective, to obtain ε-optimal solutions for any given error tolerance ε (see the sketch below)
◮ use average consensus to enable every agent to obtain such a global proxy
◮ compute the optimal value of the global proxy based on its stationary points
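To make the proxy idea concrete, here is a minimal Python sketch (illustrative only, not the paper's code): it builds a Chebyshev interpolant of a sample smooth function with numpy's chebyshev module and checks the uniform error on a dense grid. The test function and degree are assumptions for the demo.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Sample objective on [-1, 1]; purely illustrative, not from the paper.
f = lambda x: np.exp(0.3 * x) + np.cos(2 * x)

# Degree-16 Chebyshev interpolant at Chebyshev points of the first kind.
c = C.chebinterpolate(f, 16)          # coefficients c_0, ..., c_16

# Check the uniform approximation error |f(x) - p(x)| on a dense grid.
x = np.linspace(-1.0, 1.0, 2001)
err = np.max(np.abs(f(x) - C.chebval(x, c)))
print(f"max |f - p| on [-1, 1]: {err:.2e}")   # tiny for smooth f
```

For smooth functions the Chebyshev coefficients decay rapidly, which is why a modest-degree proxy can stand in for the objective to high accuracy.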

SLIDE 8

Our Algorithm: CPCA · Overview of CPCA

Overview of CPCA

Figure 3 sketches the architecture of CPCA:
◮ Algorithm 1 (Initialization): each agent applies Adaptive Chebyshev Interpolation (ACI) to its local objective function, producing a Chebyshev polynomial approximation (proxy) stored as a local coefficient vector q_j.
◮ Algorithm 2 (Consensus Iteration): average consensus drives the local vectors to converge to their average q̄ = (1/N) Σ_j q_j, which encodes the proxy for the global objective.
◮ Algorithm 3 (Finding Minima): a stationary-points-based method recovers the optimal value f* and the set of optimal points Y_{f*} from the global proxy.

Figure 3: The Architecture of CPCA

SLIDE 9

Our Algorithm: CPCA · Algorithm Development

Initialization: Construction of Approximations

Goal
Construct the Chebyshev polynomial approximation p_i(x) for f_i(x), such that |f_i(x) − p_i(x)| ≤ ε₁ for all x ∈ [a, b]. Then, with additional computations, obtain the initial local vector p_i^0 storing the information of the Chebyshev coefficients.

Details
1. Use Adaptive Chebyshev Interpolation¹³ to get p_i(x).
2. Through a certain recurrence formula, compute p_i^0 storing the coefficients of the derivative of p_i(x). (This guarantees that closeness between vectors translates to closeness between functions.) A code sketch of both steps follows below.

¹³ J. P. Boyd, Solving Transcendental Equations: The Chebyshev Polynomial Proxy and Other Numerical Rootfinders, Perturbation Series, and Oracles. SIAM, 2014, vol. 139.
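The slides do not spell out the ACI routine; below is one standard way the two steps might look, a sketch under assumptions: the interpolation degree is doubled until the trailing Chebyshev coefficients fall below the tolerance (the stopping heuristic used by Chebfun and in Boyd's book, not necessarily the paper's exact criterion), and the derivative coefficients follow the classical Chebyshev recurrence c'_{k-1} = c'_{k+1} + 2k·c_k.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def adaptive_cheb_interpolate(f, a, b, eps1, max_deg=2**14):
    """Chebyshev coefficients of a proxy for f on [a, b], degree chosen adaptively.

    f must accept numpy arrays. Stops once the trailing coefficients fall
    below eps1, a standard heuristic for |f - p| <= O(eps1).
    """
    g = lambda t: f(0.5 * (b - a) * t + 0.5 * (a + b))  # map [-1, 1] -> [a, b]
    deg = 8
    while deg <= max_deg:
        c = C.chebinterpolate(g, deg)
        if np.max(np.abs(c[-2:])) < eps1:        # trailing-coefficient test
            return C.chebtrim(c, tol=eps1 / 10)  # drop negligible tail terms
        deg *= 2
    raise RuntimeError("tolerance not reached; f may be insufficiently smooth")

def cheb_derivative_coeffs(c, a, b):
    """Coefficients of p'(x) from those of p(x), via the classical recurrence."""
    m = len(c) - 1
    d = np.zeros(m + 2)
    for k in range(m, 0, -1):                    # c'_{k-1} = c'_{k+1} + 2k c_k
        d[k - 1] = d[k + 1] + 2 * k * c[k]
    d[0] /= 2.0
    return d[:m] * (2.0 / (b - a))               # chain rule for the [a, b] map
```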

SLIDE 10

Our Algorithm: CPCA · Algorithm Development

Initialization: Construction of Approximations

Examples

◮ Setup: precision requirement ε₁ = 10⁻⁶, constraint set X = [−3, 3]

◮ Case I

$$f_1(x) = \frac{1}{2}e^{0.1x} + \frac{1}{2}e^{-0.1x} \;\xrightarrow{\text{Adaptive Interpolation}}\; p_1(x) = \sum_{j=0}^{4} c_j T_j\!\left(\frac{x}{3}\right) \;\xrightarrow{\text{recurrence formula}}\; p_1^0 = [1.0226,\ 0,\ 0.0303,\ 0,\ 1.1301 \times 10^{-4}]'$$

(In fact, |f₁(x) − p₁(x)| ≤ 4.8893 × 10⁻⁸, x ∈ X.)

◮ Case II

$$f_2(x) = \frac{1}{4}x^4 + \frac{2}{3}x^3 - \frac{1}{2}x^2 - 2x \;\xrightarrow{\text{Adaptive Interpolation}}\; p_2(x) = \sum_{j=0}^{4} c_j T_j\!\left(\frac{x}{3}\right) \;\xrightarrow{\text{recurrence formula}}\; p_2^0 = [5.3437,\ 7,\ 17.25,\ 9,\ 6.75]'$$

(In fact, |f₂(x) − p₂(x)| ≤ 1.7036 × 10⁻¹⁴, x ∈ X.)
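A quick numerical check of Case I, assuming numpy's chebinterpolate as a stand-in for ACI: a degree-4 proxy of f₁ on [−3, 3] indeed meets ε₁ = 10⁻⁶, consistent with the reported bound.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

f1 = lambda x: 0.5 * np.exp(0.1 * x) + 0.5 * np.exp(-0.1 * x)  # Case I objective

a, b = -3.0, 3.0
c = C.chebinterpolate(lambda t: f1(0.5 * (b - a) * t + 0.5 * (a + b)), 4)

x = np.linspace(a, b, 4001)
err = np.max(np.abs(f1(x) - C.chebval((2 * x - (a + b)) / (b - a), c)))
print(f"degree-4 proxy error: {err:.4e}")  # on the order of 1e-8, below eps1
```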

SLIDE 11

Our Algorithm: CPCA · Algorithm Development

Iteration: Consensus-based Update of Local Vectors

Goal
Make the local vectors p_i^K converge to the average p̄ of all the initial values p_i^0, i.e.,

$$\max_{i \in V} \|p_i^K - \bar{p}\|_\infty \le \delta, \qquad \delta = \frac{\epsilon_2}{1 + \frac{b-a}{2} \cdot \frac{\ln m + 3}{2}},$$

where δ is proportional to the given precision ε₂, with m = max_{i∈V} m_i.

Strategies
Run linear-time average consensus¹⁴ for a certain number of rounds.

¹⁴ A. Olshevsky, "Linear time average consensus and distributed optimization on fixed graphs," SIAM J. Control Optim., vol. 55, no. 6, pp. 3990–4014, 2017.

SLIDE 12

Our Algorithm: CPCA · Algorithm Development

Iteration: Consensus-based Update of Local Vectors

Further Assumption: every agent in the network knows an upper bound U on N.

Iteration Rules

$$p_i^k = q_i^{k-1} + \frac{1}{2} \sum_{j \in N_i} \frac{q_j^{k-1} - q_i^{k-1}}{\max(d_i, d_j)}, \qquad q_i^k = p_i^k + \left(1 - \frac{2}{9U + 1}\right)\left(p_i^k - p_i^{k-1}\right).$$

The number of iterations K is

$$K \leftarrow \max\left( \left\lceil \frac{\ln\!\big(\delta \,/\, 2\sqrt{2U}\, \|r_i^U - s_i^U\|_\infty\big)}{\ln \rho} \right\rceil, \; U \right),$$

where r_i^k, s_i^k are two variables updated based on max/min consensus, so that ‖r_i^U − s_i^U‖_∞ equals max_{i,j∈V} ‖p_i^k − p_j^k‖_∞.
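A minimal numpy sketch of these two update rules, assuming the graph is given as an adjacency matrix and a fixed number of rounds K is used (the max/min-consensus stopping test that produces K above is omitted here):

```python
import numpy as np

def accelerated_consensus(Q0, adj, U, K):
    """Run K rounds of the accelerated average-consensus update.

    Q0:  (N, n+1) array; row i is agent i's initial coefficient vector q_i^0.
    adj: (N, N) 0/1 symmetric adjacency matrix of the connected graph G.
    U:   known upper bound on the number of agents N.
    """
    deg = adj.sum(axis=1)                        # degrees d_i
    beta = 1.0 - 2.0 / (9.0 * U + 1.0)           # momentum coefficient
    Q, P_prev = Q0.copy(), Q0.copy()             # q_i^0 and p_i^0
    for _ in range(K):
        P = Q.copy()
        for i in range(len(Q)):
            for j in np.nonzero(adj[i])[0]:      # neighbors N_i
                P[i] += 0.5 * (Q[j] - Q[i]) / max(deg[i], deg[j])
        Q = P + beta * (P - P_prev)              # q_i^k with momentum
        P_prev = P
    return P                                     # p_i^K for every agent
```

The max-degree weights make the mixing matrix symmetric and doubly stochastic, so every round (including the momentum step) preserves the network average, which is what the local vectors converge to.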

SLIDE 13

Our Algorithm: CPCA · Algorithm Development

Iteration: Consensus-based Update of Local Vectors

Results
With K ∼ O(N log((N log m)/ε₂)) iterations, we have

$$\max_{i \in V} \|p_i^K - \bar{p}\|_\infty \le \delta.$$

This translates to |p_i^K(x) − p̄(x)| ≤ ε₂, where p_i^K(x) and p̄(x) are the Chebyshev polynomials recovered from p_i^K and p̄, respectively.
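As a rough sanity check of why vector closeness gives function closeness (this is not the paper's sharper argument: the ln m factor in δ comes from the derivative-storage trick; the elementary bound below only uses |T_k(x)| ≤ 1 on [−1, 1]):

$$|p_i^K(x) - \bar{p}(x)| = \Big|\sum_{k=0}^{m} \big(p_{i,k}^K - \bar{p}_k\big)\, T_k(x)\Big| \le \sum_{k=0}^{m} \big|p_{i,k}^K - \bar{p}_k\big| \le (m+1)\, \|p_i^K - \bar{p}\|_\infty.$$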

SLIDE 14

Our Algorithm: CPCA · Algorithm Development

Finding Minima: Taking a Straightforward Approach

Goal
Based on p_i^K, compute X_e^*, the set containing ε-optimal solutions of (1).

Intuitions
◮ After the initialization, we have |p̄(x) − f(x)| ≤ ε₁, x ∈ X. After the iteration, we have |p_i^K(x) − p̄(x)| ≤ ε₂, x ∈ X.
◮ If we set ε₁ = ε₂ = ε/4, it follows that |p_i^K(x) − f(x)| ≤ ε/2, x ∈ X.
◮ The values of f(x) at the optimal points of p_i^K(x) are then within ε of optimal (see the chain spelled out below).
◮ This means that the points in the optimal set X_e^* of p_i^K(x) are ε-optimal solutions of (1).
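Spelling out the chain behind the last two bullets (with x* a minimizer of f over X and x_e a minimizer of p_i^K; this is the standard proxy-error argument implied by the slide):

$$f(x_e) - f^{*} = \underbrace{f(x_e) - p_i^K(x_e)}_{\le\, \epsilon/2} + \underbrace{p_i^K(x_e) - p_i^K(x^{*})}_{\le\, 0 \ \ (x_e \text{ minimizes } p_i^K)} + \underbrace{p_i^K(x^{*}) - f(x^{*})}_{\le\, \epsilon/2} \le \epsilon.$$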

SLIDE 15

Our Algorithm: CPCA · Algorithm Development

Finding Minima: Taking a Straightforward Approach

Procedures
1. Recover the polynomial proxy p_i^K(x) from p_i^K.
2. Construct the colleague matrix M_C from p_i^K, and compute its real eigenvalues. (These are the stationary points of p_i^K(x).)

$$M_C = \begin{pmatrix} 0 & 1 & & & \\ \frac{1}{2} & 0 & \frac{1}{2} & & \\ & \ddots & \ddots & \ddots & \\ & & \frac{1}{2} & 0 & \frac{1}{2} \\ -\frac{c_0}{2c_m} & -\frac{c_1}{2c_m} & \cdots & \frac{1}{2} - \frac{c_{m-2}}{2c_m} & -\frac{c_{m-1}}{2c_m} \end{pmatrix}_{m \times m}$$

3. Compute and compare the critical values of p_i^K(x), and take the optimal points to form X_e^* (see the numpy sketch below).
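Steps 1-3 in a short numpy sketch (an illustration, not the paper's implementation): np.polynomial.chebyshev.chebroots obtains roots as eigenvalues of exactly this kind of colleague/companion matrix, so it stands in for the explicit eigenvalue computation. For simplicity the sketch starts from the proxy's own coefficients and differentiates them with chebder, sidestepping the recovery in step 1.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def proxy_minima(c, a, b, tol=1e-9):
    """Minimum of the proxy p(x) = sum_k c[k] T_k(t), t = (2x - a - b)/(b - a).

    Returns (minimum value, array of minimizing points in [a, b]).
    """
    dc = C.chebder(c) * (2.0 / (b - a))            # coefficients of p'(x)
    r = C.chebroots(dc)                            # colleague-matrix eigenvalues
    r = r[np.isreal(r)].real                       # keep real stationary points
    t = np.concatenate(([-1.0, 1.0], r[(r >= -1) & (r <= 1)]))  # add endpoints
    vals = C.chebval(t, c)                         # critical values of the proxy
    fmin = vals.min()
    x_opt = 0.5 * (b - a) * t[vals <= fmin + tol] + 0.5 * (a + b)
    return fmin, np.unique(np.round(x_opt, 12))
```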

SLIDE 16

Our Algorithm: CPCA · Analysis of CPCA

Accuracy of CPCA

◮ CPCA ensures that every agent gets ε-optimal solutions for any given error tolerance ε.

Theorem 1
With CPCA, every agent gets ε-optimal solutions for (1), i.e., for any x_e in X_e^*,

$$|f(x_e) - f^*| \le \epsilon,$$

where f* is the optimal value of (1).

◮ ε is used to set ε₁ and ε₂ (both equal to ε/4) to regulate the initialization and iteration stages, so as to guarantee that the precision requirement is met.

SLIDE 17

Our Algorithm: CPCA · Analysis of CPCA

Complexities of CPCA

Table 1: Complexities of CPCA

Stages            Elementary Operations      Zero-order Oracle Queries    Inter-communications
initialization    O(m² log m)                O(m)                         -
iteration         O(N log(N log m / ε))      -                            O(N log(N log m / ε))
solve             O(m³)                      -                            -
whole             O(N log(N log m / ε))      O(m)                         O(N log(N log m / ε))

N: the size of the network
m: the largest order of the polynomial approximations

Note:
• The oracle complexities are independent of N.
• m is related to the smoothness of the local objectives, and is generally not very large (e.g., 10 ∼ 10²).

SLIDE 18

Numerical Experiments

Algorithms to Compare
◮ CPCA
◮ Distributed Projected sub-Gradient Descent (DPGD)¹⁵, with step size $\eta_t = \frac{5}{4} \cdot \frac{N}{t}$

Network Models
The network has N = 36 agents, and G varies over:
◮ cycle graph
◮ 6 × 6 grid graph
◮ Erdős-Rényi random graph with connectivity probability 0.4

¹⁵ A. Nedić et al., "Constrained consensus and optimization in multi-agent networks," IEEE Trans. Autom. Control, vol. 55, no. 4, pp. 922–938, 2010.

SLIDE 19

Numerical Experiments

Case I: the objective functions are f_i(x) = a_i e^{b_i x} + c_i e^{-d_i x}, x ∈ X_i = [−3, 3], where a_i, c_i ∼ U(0, 1) and b_i, d_i ∼ U(0, 0.2).

Case II: the objective functions are f_i(x) = a_i x⁴ + b_i x³ + c_i x² + d_i x + e_i, x ∈ X_i = [−3, 3], where a_i through e_i follow normal distributions with means μ of 1/4, 2/3, −1/2, −2 and 0, respectively, and standard deviation σ of 0.1 each.

Note:
◮ Case I: convex objectives
◮ Case II: non-convex objectives
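A small sketch of how such random test objectives could be generated (the seed and variable names are illustrative assumptions, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)   # illustrative seed
N = 36

# Case I: f_i(x) = a_i e^{b_i x} + c_i e^{-d_i x}  (convex on [-3, 3])
case1 = [(rng.uniform(0, 1), rng.uniform(0, 0.2),
          rng.uniform(0, 1), rng.uniform(0, 0.2)) for _ in range(N)]
f1_local = [lambda x, p=p: p[0] * np.exp(p[1] * x) + p[2] * np.exp(-p[3] * x)
            for p in case1]

# Case II: random quartics centered on x^4/4 + 2x^3/3 - x^2/2 - 2x  (non-convex)
mu = np.array([0.25, 2.0 / 3.0, -0.5, -2.0, 0.0])
case2 = [mu + 0.1 * rng.standard_normal(5) for _ in range(N)]
f2_local = [lambda x, p=p: p[0]*x**4 + p[1]*x**3 + p[2]*x**2 + p[3]*x + p[4]
            for p in case2]

# Global objective: the average of the local ones, as in problem (1).
f_global = lambda x: sum(f(x) for f in f1_local) / N
```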

SLIDE 20

Numerical Experiments

◮ horizontal axis: iterations K
◮ vertical axis: objective error ε (ε = |f(x_e) − f*|)

[Plots omitted. Panel (a), Simulation Results for Case I: objective-error curves for DPGD and our algorithm on the cycle, grid, and random graphs. Panel (b), Simulation Results for Case II: objective-error curves for our algorithm on the same three graphs. The axes span 100 to 600 iterations and objective errors from 10⁻⁷ to 10⁻¹.]

Figure 4: Simulation Results

Note:
◦ linear vs. sub-linear convergence
◦ applicable to the cases with non-convex objectives

SLIDE 21

Conclusion · Summary

Summary

We propose a Chebyshev Proxy and Consensus-based Algorithm (CPCA) for distributed optimization problems with identical local constraint sets and Lipschitz continuous univariate objective functions.

Features of CPCA
◮ able to address problems with non-convex objectives and obtain ε globally optimal solutions
  (originates from the idea of solving the easier problem of optimizing the polynomial approximation, i.e. the proxy, instead)
◮ free from gradient or projection computations, and has low computational costs
  (results from the scheme of employing simple average consensus to update coefficient vectors, rather than estimates of the optimizers)

SLIDE 22

Conclusion · Future Works

Future Works

Future works include:
◮ Design more efficient termination rules for the average-consensus iterations within the algorithm, to reduce communication complexity.
◮ Develop similar algorithms for problems with multivariate objectives, based on the idea of approximation.
