
Vistas in Control | ETH Zurich | September 10–11, 2018
Multi-agent distributed optimization over networks and its application to energy systems
Maria Prandini
Introduction: robotic networks, social networks, transportation networks (image taken from AJGpr.com)


  1. Proposed distributed algorithm
     Local problem of agent i at iteration k+1:
       z_i(k) = Σ_j a_ij(k) x_j(k)
       x_i(k+1) = argmin_{x_i ∈ X_i} f_i(x_i) + (1/(2c(k))) ‖x_i − z_i(k)‖²
     • Information vector z_i(k) = Σ_j a_ij(k) x_j(k)
     • a_ij(k): how agent i weights the information of agent j
     • Proxy term (1/(2c(k))) ‖x_i − z_i(k)‖²: penalizes deviation from the (weighted) average
     • c(k): trade-off between optimality and agents' disagreement

  2. Proposed distributed algorithm
     Local problem of agent i at iteration k+1:
       z_i(k) = Σ_j a_ij(k) x_j(k)
       x_i(k+1) = argmin_{x_i ∈ X_i} f_i(x_i) + (1/(2c(k))) ‖x_i − z_i(k)‖²
     • Does this algorithm converge?
     • If yes, does it provide the same solution as the centralized problem (had we been able to solve it)?
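The proximal update above can be sketched numerically. The quadratic objectives, interval constraints, uniform mixing weights, and harmonic c(k) below are my own illustrative choices, not the talk's building-district data; for a 1-D quadratic subproblem the constrained minimizer is simply the clipped unconstrained one, so no solver is needed.

```python
import numpy as np

# Toy instance: 3 agents, scalar decision, f_i(x) = (x - t_i)^2, X_i = [-5, 5].
targets = np.array([0.0, 2.0, 4.0])
lo, hi = -5.0, 5.0
m = len(targets)

# Doubly stochastic weights a_ij(k): here uniform averaging over a complete graph.
A = np.full((m, m), 1.0 / m)

x = targets.copy()                 # initial local estimates x_i(0)
for k in range(2000):
    c = 1.0 / (k + 1)              # harmonic step: sum c(k) = inf, sum c(k)^2 < inf
    z = A @ x                      # z_i(k) = sum_j a_ij(k) x_j(k)
    # argmin over X_i of f_i(x) + ||x - z_i||^2 / (2c): closed form for a 1-D
    # quadratic, and clipping is the exact projection onto the interval X_i.
    x = np.clip((2 * c * targets + z) / (2 * c + 1), lo, hi)

print(x)  # all entries close to the centralized minimizer x* = 2
```

All agents agree and land on the minimizer of Σ_i (x − t_i)² over the intersection of the X_i, here x* = 2.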

  3. Algorithm analysis
     1. Convexity and compactness
     • f_i(·): convex for all i
     • X_i: compact, convex, with non-empty interior for all i
     ⇒ f_i(·) is Lipschitz continuous on X_i
     2. Choice of the proxy term
     • {c(k)}_k: non-increasing, but it should not decrease too fast:
       Σ_k c(k) = ∞ and Σ_k c(k)² < ∞
     • E.g., c(k) = 1/k (the harmonic series)
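The two summability conditions can be checked numerically for the harmonic choice c(k) = 1/k; the truncation at K = 200000 terms is my own quick illustration:

```python
import numpy as np

# Step-size conditions for c(k) = 1/k, checked on a finite truncation.
k = np.arange(1, 200001, dtype=float)
c = 1.0 / k

print(np.all(c[1:] <= c[:-1]))   # non-increasing: True
print(c.sum())                   # partial sums of sum c(k) keep growing (~ ln K)
print((c**2).sum())              # sum c(k)^2 converges (to pi^2/6 ~ 1.645)
```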

  5. Algorithm analysis
     3. Information mix
     • Weights a_ij(k): non-zero lower bound whenever the link i–j is present
       ⇒ information mixes at a non-diminishing rate
     • Weights a_ij(k): form a doubly stochastic matrix
       ⇒ agents influence each other equally in the long run
     4. Network connectivity – all information flows (eventually)
     • Any pair of agents communicates infinitely often
     • Bounded intercommunication time
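One standard recipe producing weights with both properties (non-zero on existing links, doubly stochastic) is the Metropolis rule; the 4-agent ring below is an illustrative graph of my own, not one from the talk:

```python
import numpy as np

# Metropolis weights for an undirected graph: symmetric, hence doubly stochastic,
# with strictly positive weights on every existing link and on the diagonal.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]    # a 4-agent ring (illustrative)
n = 4
deg = np.zeros(n, dtype=int)
for i, j in edges:
    deg[i] += 1
    deg[j] += 1

A = np.zeros((n, n))
for i, j in edges:
    w = 1.0 / (1 + max(deg[i], deg[j]))     # a_ij = 1 / (1 + max degree)
    A[i, j] = A[j, i] = w
np.fill_diagonal(A, 1.0 - A.sum(axis=1))    # self-weights close each row to 1

print(A.sum(axis=0), A.sum(axis=1))         # both all ones: doubly stochastic
```

Because A is symmetric with rows summing to one, its columns sum to one as well, which is exactly the double stochasticity the analysis requires.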

  10. Algorithm analysis
     Main result: Under the structural + network assumptions, the proposed proximal algorithm converges to some minimizer x* of the centralized problem, i.e.,
       lim_{k→∞} ‖x_i(k) − x*‖ = 0, for all i
     • Asymptotic agreement and optimality
     • Rate no faster than c(k) – "slow enough" to trade agreement against optimality

  11. Comparison with other methods
     • Proximal algorithms vs. gradient/subgradient methods

  12. Comparison with other methods
     • Proximal algorithms:
       x_i(k+1) = argmin_{x_i ∈ X_i} f_i(x_i) + (1/(2c(k))) ‖x_i − z_i(k)‖²
     • Gradient algorithms:
       x_i(k+1) = P_{X_i}[ z_i(k) − c(k) ∇f_i(z_i(k)) ]
     • Proximal algorithms allow for:
       • No gradient/subgradient calculation – the user can feed the problem data to any solver
       • Heterogeneous constraint sets
       • No differentiability assumptions
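For contrast, here is the projected-gradient counterpart on the same kind of toy quadratic data used earlier (all problem data are my own illustrative choices; note this variant needs each f_i differentiable):

```python
import numpy as np

# Projected-gradient update x_i(k+1) = P_{X_i}[ z_i(k) - c(k) grad f_i(z_i(k)) ]
# on f_i(x) = (x - t_i)^2 over X_i = [-5, 5] (toy data, not from the slides).
targets = np.array([0.0, 2.0, 4.0])
A = np.full((3, 3), 1.0 / 3.0)     # uniform doubly stochastic mixing weights
x = targets.copy()
for k in range(2000):
    c = 1.0 / (k + 1)
    z = A @ x                      # mix neighbors' iterates
    grad = 2 * (z - targets)       # gradient of f_i evaluated at z_i(k)
    x = np.clip(z - c * grad, -5.0, 5.0)   # clip = projection onto [-5, 5]
print(x)  # again approaches the centralized minimizer x* = 2
```

Same limit as the proximal scheme on this smooth instance; the difference shows up when f_i is non-differentiable or when one prefers to hand each subproblem to an off-the-shelf solver.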

  13. Comparison with subgradient
     Optimal power allocation in cellular networks (non-differentiable objective): proposed solution vs. (sub)gradient-based approach.
     [Figure: cost function vs. iteration (0–100), comparing the proximal and (sub)gradient methods]

  14. Building district problem revisited – Simulation results
     Set-up
     • 3 buildings, 3 zones each (different chiller per building)
     • Pair-wise communication (gossip)
     Implementation
     • Simulation in MATLAB
     • Optimization solver SeDuMi via the MATLAB interface YALMIP

  19. Simulation results – Temperature set-points
     Optimal zone temperature profiles of building 1 (consensus solution).
     [Figure: temperatures (19–26 °C) of zones 1–3 and the comfort constraints over 0–24 h]
     The temperature of zone 2 (the middle one) is always the lowest: it acts as a passive thermal storage, draining heat from the other zones through the floor/ceiling.

  20. Simulation results – Storage usage
     [Figure: stored energy E_s (MJ) and energy exchanges e_s1, e_s2, e_s3 (MJ) over 0–24 h]
     Solution computed at iteration k = 1 by the middle-chiller building ("blue"): the middle-chiller building uses the storage charged by the others.

  21. Simulation results – Storage usage
     [Figure: stored energy E_s (MJ) and energy exchanges e_s1, e_s2, e_s3 (MJ) over 0–24 h]
     At consensus, the small-chiller building ("orange") uses the storage charged by the others.

  22. Simulation results – Chillers usage
     [Figure: COP of the medium, small, and large chillers over 0–24 h]
     COP of the chillers when each building uses a fixed fraction of the storage.

  23. Simulation results – Chillers usage
     [Figure: COP of the medium, small, and large chillers over 0–24 h]
     COP of the chillers in the optimally shared storage case.

  24. Simulation results
     Solution computed based on nominal disturbance profiles...
     [Figure: nominal profiles over 0–24 h: LW radiation (W/m²), SW radiation (W/m²), outside temperature (°C), occupancy (#)]
     Courtesy of Istituto di Scienze dell'Atmosfera e del Clima (ISAC) – CNR

  26. Problem set-up
     Decision-coupled problem:
       minimize Σ_i f_i(x)
       subject to x ∈ ∩_i X_i

  27. Problem set-up
     Decision-coupled problem with uncertainty:
       minimize Σ_i f_i(x)
       subject to x ∈ ∩_i X_i(δ), for all δ ∈ Δ
     • Stochastic set-up
     • δ: uncertain parameter, δ ∼ P
     • Δ: (possibly) continuous set
     • Semi-infinite optimization program

  29. Problem set-up
     Decision-coupled problem with uncertainty, the constraint written as an intersection:
       minimize Σ_i f_i(x)
       subject to x ∈ ∩_i ∩_{δ∈Δ} X_i(δ)

  30. Data-based approach
     Decision-coupled problem with uncertainty:
       minimize Σ_i f_i(x)
       subject to x ∈ ∩_i ∩_{δ∈S} X_i(δ)
     • Replace Δ with a finite data set S

  31. Data-based approach
       minimize Σ_i f_i(x)
       subject to x ∈ ∩_i ∩_{δ∈S} X_i(δ)
     Two cases:
     1. Agents have the same data set S
     2. Agents have different data sets {S_i}_i

  33. Data-based approach
     Case of different data sets:
       minimize Σ_i f_i(x)
       subject to x ∈ ∩_i ∩_{δ∈S_i} X_i(δ)

  34. Data-based approach
     Common data set – distributed implementation:
       minimize Σ_i f_i(x)
       subject to x ∈ ∩_i ∩_{δ∈S} X_i(δ)
     • Apply the proximal algorithm with ∩_{δ∈S} X_i(δ) in place of X_i
     • Let x*_S denote the converged solution

  37. Probabilistic feasibility – Common data set
     Data-based program P_S:
       minimize Σ_i f_i(x) subject to x ∈ ∩_i ∩_{δ∈S} X_i(δ)   →  x*_S
     Robust program P_Δ:
       minimize Σ_i f_i(x) subject to x ∈ ∩_i ∩_{δ∈Δ} X_i(δ)
     • Is x*_S feasible for P_Δ?
     • Is this true for any S?

  39. Probabilistic feasibility – Common data set
     Feasibility link [Calafiore & Campi, TAC 2006]: Fix β ∈ (0,1) and S. With confidence ≥ 1 − β, x*_S is feasible for P_Δ with probability ≥ 1 − ε(d, |S|, β), i.e.,
       P( δ ∈ Δ : x*_S ∉ ∩_i X_i(δ) ) ≤ ε(d, |S|, β), with probability ≥ 1 − β

  40. Probabilistic feasibility – Common data set
     Feasibility link: Fix β ∈ (0,1) and S. With confidence ≥ 1 − β, x*_S is feasible for P_Δ with probability ≥ 1 − ε(d, |S|, β), i.e.,
       P( δ ∈ Δ : x*_S ∉ ∩_i X_i(δ) ) ≤ ε(d, |S|, β), with probability ≥ 1 − β
     • On which parameters does ε depend?
       ε = (2 / |S|) ( d + ln(1/β) )
     • Logarithmic in 1/β: β can be set close to 0
     • Linear in 1/|S|: the more data, the better the result
     • Linear in d: the number of decision variables
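The bound above is easy to evaluate; the values of d, |S|, and β below are illustrative choices of mine:

```python
from math import log

# Violation level eps(d, |S|, beta) = (2/|S|) * (d + ln(1/beta)) from the slide.
def eps(d, n, beta):
    return 2.0 / n * (d + log(1.0 / beta))

d, beta = 10, 1e-6
for n in (100, 1000, 10000):
    print(n, eps(d, n, beta))  # shrinks linearly as the data set grows
```

With d = 10 and β = 10⁻⁶, a thousand samples already certify a violation probability below 5%, and the dependence on β is so mild that the confidence can be pushed extremely close to 1 at little cost in samples.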

  41. Data-based approach
     Different data sets – distributed implementation:
       minimize Σ_i f_i(x)
       subject to x ∈ ∩_i ∩_{δ∈S_i} X_i(δ)
     • Apply the proximal algorithm with ∩_{δ∈S_i} X_i(δ) in place of X_i
     • Let x*_S denote the converged solution, S = {S_i}_i

  43. Probabilistic feasibility – Different data sets
     Single-agent, a posteriori: Fix β_i ∈ (0,1) and S_i. With confidence ≥ 1 − β_i,
       P( δ ∈ Δ : x*_S ∉ X_i(δ) ) ≤ ε_i(d_i^{S_i}, |S_i|, β_i)
     A posteriori result:
     • d_i^{S_i}: empirical count of "support" samples (wait and see); changing S_i, the result changes
     • Complexity of ε_i(d_i^{S_i}, |S_i|, β_i) as in the previous case
     • Result thanks to [Campi, Garatti & Ramponi, CDC 2015]

  44. Probabilistic feasibility – Different data sets
     Single-agent, a posteriori: Fix β_i ∈ (0,1) and S_i. With confidence ≥ 1 − β_i,
       P( δ ∈ Δ : x*_S ∉ X_i(δ) ) ≤ ε_i(d_i^{S_i}, |S_i|, β_i)
     • Two-agent example with d = 2: either d_1^{S_1} = 1 and d_2^{S_2} = 1, or d_1^{S_1} = 0 and d_2^{S_2} = 2

  45. Probabilistic feasibility – Different data sets
     Multi-agent, a posteriori: Fix β ∈ (0,1) and {S_i}_i. With confidence ≥ 1 − β,
       P( δ ∈ Δ : x*_S ∉ ∩_i X_i(δ) ) ≤ Σ_i ε_i(d_i^{S_i})
     A posteriori result:
     • Can we turn it into an a priori statement?
     • What is the worst-case value of Σ_i ε_i(d_i^{S_i}) that we can "observe"?
     • Conservative bound: d_i^{S_i} ≤ d for all i
     • Sharper bound: Σ_i d_i^{S_i} ≤ d (the number of decision variables)

  49. Probabilistic feasibility – Different data sets
     Multi-agent, a priori: Fix β ∈ (0,1) and {S_i}_i. With confidence ≥ 1 − β,
       P( δ ∈ Δ : x*_S ∉ ∩_i X_i(δ) ) ≤ ε̄
     where
       ε̄ = maximize Σ_i ε_i(d_i)  subject to  Σ_i d_i ≤ d
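The a priori bound ε̄ can be computed by enumerating integer allocations on a small instance. The functional form ε_i(d_i) = (2/|S_i|)(d_i + ln(1/β_i)) below is an assumed stand-in mirroring the common-data-set slide, and the sample sizes and confidence split are my own toy values:

```python
from itertools import product
from math import log

# Worst-case a priori bound: maximize sum_i eps_i(d_i) s.t. sum_i d_i <= d.
sizes = [200, 500, 1000]        # |S_i| for each agent (illustrative)
betas = [1e-6 / 3] * 3          # split the overall confidence across agents
d = 4                           # total number of decision variables

def eps_i(i, di):
    # Assumed shape, analogous to the common-data-set bound.
    return 2.0 / sizes[i] * (di + log(1.0 / betas[i]))

best = max(sum(eps_i(i, di) for i, di in enumerate(alloc))
           for alloc in product(range(d + 1), repeat=3)
           if sum(alloc) <= d)
print(best)
```

Since each ε_i is increasing in d_i with slope 2/|S_i|, the maximizer piles all d support samples onto the agent with the smallest data set — which is exactly why this a priori bound is conservative compared with what is typically observed a posteriori.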

  50. Common vs. different data sets
     [Figure: probability of violation vs. number of agents m (1–15), comparing the common data set bound, the different data sets bound, and the worst-case bound]
     Approach using different data sets:
     • Close to the case of common data sets
     • Less conservative than the worst-case bound

  51. Literature comparison
     Closest approach¹:
     • almost sure convergence results (need to sample constraints infinitely many times)
     Proposed solution:
     • weaker guarantees, but with a finite number of samples
     ¹ S. Lee and A. Nedic, "Distributed random projection algorithm for convex optimization", IEEE Journal of Selected Topics in Signal Processing, 2013.

  53. Addressed problems
     Decision-coupled problem:
       min_x Σ_{i=1}^m f_i(x)
       s.t. x ∈ ∩_{i=1}^m X_i
     • local objectives f_i
     • coupled decision x
     • local constraints X_i
     Constraint-coupled problem:
       min_{x_1,...,x_m} Σ_{i=1}^m f_i(x_i)
       s.t. Σ_{i=1}^m g_i(x_i) ≤ 0, x_i ∈ X_i for all i
     • local objectives f_i
     • local decisions x_i
     • coupling constraint Σ_{i=1}^m g_i(x_i) ≤ 0

  59. Proposed solution for constraint-coupled problems
       min_{x_1,...,x_m} Σ_{i=1}^m f_i(x_i)
       s.t. Σ_{i=1}^m g_i(x_i) ≤ 0, x_i ∈ X_i for all i
     • local objectives f_i, local decisions x_i, coupling constraint Σ_{i=1}^m g_i(x_i) ≤ 0
     At each iteration k, agent i:
       ℓ_i(k) ← Σ_{j∈N_i} a_ij(k) λ_j(k)
       x_i(k+1) ← argmin_{x_i ∈ X_i} f̃_i(x_i)
       λ_i(k+1) ← argmax_{λ_i ≥ 0} φ̃_i(λ_i)
     where
       f̃_i(x_i) = f_i(x_i) + ℓ_i(k)^⊤ g_i(x_i)
       φ̃_i(λ_i) = λ_i^⊤ g_i(x_i(k+1)) − (1/(2c(k))) ‖λ_i − ℓ_i(k)‖²
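The three steps above can be sketched on a toy constraint-coupled instance; all problem data below (quadratic objectives, intervals, affine g_i, uniform weights) are my own illustrative choices, and both inner optimizations admit closed forms here, so no solver is needed:

```python
import numpy as np

# 2 agents, f_i(x_i) = (x_i - t_i)^2, X_i = [0, 10], g_i(x_i) = x_i - 5,
# so the coupling constraint sum_i g_i(x_i) <= 0 reads x_1 + x_2 <= 10.
t = np.array([6.0, 8.0])
A = np.full((2, 2), 0.5)          # doubly stochastic mixing weights a_ij
lam = np.zeros(2)                 # local dual estimates lambda_i

for k in range(10000):
    c = 1.0 / (k + 1)
    ell = A @ lam                                # l_i(k): mix neighbors' duals
    x = np.clip(t - ell / 2.0, 0.0, 10.0)        # argmin of f_i + l_i * g_i over X_i
    lam = np.maximum(0.0, ell + c * (x - 5.0))   # proximal dual step, closed form

print(x, lam)  # x approaches (4, 6) and each lambda_i approaches 4
```

The limit matches the centralized KKT point: x* = (4, 6) sits on the coupling constraint x₁ + x₂ = 10 with common multiplier λ* = 4.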

  64. Algorithm analysis
     Main result (convergence & optimality): Under the structural + network assumptions, the proposed algorithm combining dual decomposition and proximal minimization converges to the set of minimizers of the centralized problem.
     Probabilistic feasibility results for the stochastic case have also been developed.

  66. Problem set-up – discrete decision variables
       P:  min_{x_1,...,x_m} Σ_{i=1}^m c_i^⊤ x_i
           subject to: Σ_{i=1}^m A_i x_i ≤ b, x_i ∈ X_i for all i = 1, ..., m
     Features
     • local decision vectors x_i
     • local linear objectives c_i^⊤ x_i
     • p coupling linear constraints Σ_{i=1}^m A_i x_i ≤ b
     • local mixed-integer polyhedral constraint sets X_i
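A tiny instance of P makes the structure concrete; the data below (m = 2 agents with binary decisions and one coupling row) are invented for illustration, and the instance is small enough to solve by enumeration rather than a MILP solver:

```python
import numpy as np
from itertools import product

# Tiny constraint-coupled MILP: x_i in {0,1}^2, one coupling constraint
# A_1 x_1 + A_2 x_2 <= b, local linear costs c_i (all data illustrative).
c1, c2 = np.array([-3.0, -1.0]), np.array([-2.0, -2.0])   # local costs c_i
A1, A2 = np.array([[2.0, 1.0]]), np.array([[1.0, 2.0]])   # coupling rows A_i
b = np.array([3.0])

best, best_x = np.inf, None
for x1 in product((0, 1), repeat=2):          # enumerate X_1 = {0,1}^2
    for x2 in product((0, 1), repeat=2):      # enumerate X_2 = {0,1}^2
        x1a, x2a = np.array(x1), np.array(x2)
        if np.all(A1 @ x1a + A2 @ x2a <= b):  # coupling feasibility
            val = c1 @ x1a + c2 @ x2a
            if val < best:
                best, best_x = val, (x1, x2)
print(best, best_x)  # -5.0 at x_1 = (1, 0), x_2 = (1, 0)
```

Even here the coupling is active (2 + 1 = 3 = b), which is what makes the discrete, constraint-coupled case hard for the duality-based scheme above and motivates dedicated treatment.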
