SLIDE 1

Distributed nonsmooth composite optimization via the proximal augmented Lagrangian

Neil K. Dhingra

neilkdh.com joint work with Sei Zhen Khong Mihailo Jovanović

LCCC Focus Period on Large-Scale and Distributed Optimization June 9, 2017

1 / 35

SLIDE 2

Applications

◮ satellite formations
◮ combination drug therapy
◮ power networks
◮ control of buildings

2 / 35

SLIDE 3

Structure via composite optimization

minimize f(x) + g(Tx)        f: performance, g(Tx): structure

◮ f – possibly nonconvex; continuously differentiable
◮ g – convex; often non-differentiable
◮ Tx – promotes structure in alternate coordinates
◮ g(x) admits an easily computable proximal operator; g(Tx) does not

3 / 35

SLIDE 4

Outline

I Proximal augmented Lagrangian

  • centralized approach – method of multipliers

II Primal-dual method

  • distributable
  • convergence for convex problems
  • linear convergence for strongly convex problems

4 / 35

SLIDE 5

Proximal gradient method

minimize f(x) + g(x)

Generalizes gradient descent:

x^{k+1} = prox_{αk g}( x^k − αk ∇f(x^k) )

◮ cannot be used for g(Tx) in general

Nesterov '07; Beck & Teboulle '09

5 / 35

SLIDE 6

Proximal operator and Moreau envelope

◮ Proximal operator

prox_{µg}(v) := argmin_z { g(z) + (1/2µ) ‖z − v‖² }

◮ Moreau envelope

M_{µg}(v) := inf_z { g(z) + (1/2µ) ‖z − v‖² }

  • continuously differentiable even when g is not

∇M_{µg}(v) = (1/µ) ( v − prox_{µg}(v) )

Parikh & Boyd, FnT in Optimization '14

6 / 35

SLIDE 7

Example

◮ Soft-thresholding – proximal operator for the ℓ₁ norm

minimize_z Σᵢ γ|zᵢ| + (1/2µ) (zᵢ − vᵢ)²

  • separability ⇒ element-wise analytical solution

[Figure: the prox operator is soft-thresholding; the Moreau envelope is the Huber function with corner at a = µγ; ∇M is a saturation nonlinearity]
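The element-wise solution and the saturation form of ∇M can be checked numerically. A minimal numpy sketch (the test point v and the parameters γ, µ are arbitrary illustrative choices, not from the talk):

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t*||.||_1: element-wise soft-thresholding
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def moreau_envelope(v, gamma, mu):
    # M_{mu g}(v) for g = gamma*||.||_1, evaluated at its prox minimizer
    z = soft_threshold(v, mu * gamma)
    return gamma * np.abs(z).sum() + np.sum((z - v) ** 2) / (2.0 * mu)

gamma, mu = 0.7, 0.5
v = np.array([1.3, -0.2, 0.05, -2.0])

# Gradient formula from the slide: grad M_{mu g}(v) = (v - prox_{mu g}(v)) / mu
grad = (v - soft_threshold(v, mu * gamma)) / mu

# The gradient is a saturation: it clips v/mu at +/- gamma (Huber derivative)
assert np.allclose(grad, np.clip(v / mu, -gamma, gamma))

# Finite-difference check that the envelope really is differentiable
eps = 1e-6
fd = np.array([(moreau_envelope(v + eps * e, gamma, mu)
                - moreau_envelope(v - eps * e, gamma, mu)) / (2 * eps)
               for e in np.eye(v.size)])
assert np.max(np.abs(grad - fd)) < 1e-5
```

The finite-difference check confirms that M_{µg} is continuously differentiable even though γ‖·‖₁ is not.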

7 / 35

SLIDE 8

Auxiliary variable

minimize_{x,z} f(x) + g(z) subject to Tx − z = 0

◮ Decouples f and g
◮ Can use methods for constrained optimization

Augmented Lagrangian:

Lµ(x, z; y) = f(x) + g(z) + ⟨y, Tx − z⟩ + (1/2µ) ‖Tx − z‖²

8 / 35

SLIDE 9

Method of multipliers

(x^{k+1}, z^{k+1}) = argmin_{x,z} Lµ(x, z; y^k)
y^{k+1} = y^k + (1/µ) (Tx^{k+1} − z^{k+1})

◮ Gradient ascent on a strengthened dual problem
◮ Requires joint minimization over x and z
◮ Well-studied: convergence to local minimum, adaptive µ updates, inexact subproblems, etc.

9 / 35

SLIDE 10

MM cartoon

Lµ(x, z; y0)

10 / 35


SLIDE 12

MM cartoon

Lµ(x, z; y1)

10 / 35


SLIDE 14

MM cartoon

Lµ(x, z; y⋆)

10 / 35


SLIDE 16

Alternating direction method of multipliers

x^{k+1} = argmin_x Lµ(x, z^k; y^k)        (differentiable problem)
z^{k+1} = argmin_z Lµ(x^{k+1}, z; y^k)    (= prox_{µg}(·))
y^{k+1} = y^k + (1/µ) (Tx^{k+1} − z^{k+1})

◮ Convenient for distributed implementation
◮ Convergence speed influenced by µ
◮ Challenge: convergence for nonconvex f

Hong, Luo, Razaviyayn, SIAM J. Optimiz. '16
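These three steps can be sketched for one concrete instance: f(x) = ½‖Ax − b‖² and g = γ‖·‖₁ composed with a first-difference matrix T, so that the prox of g(Tx) itself has no easy closed form. All problem data and parameters below are arbitrary illustrative choices; the x-step is a linear solve and the z-step is soft-thresholding.

```python
import numpy as np

rng = np.random.default_rng(0)

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

m, n = 20, 8
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
T = np.diff(np.eye(n), axis=0)     # difference matrix: T couples entries of x
gamma, mu = 0.2, 1.0

x = np.zeros(n)
z = np.zeros(T.shape[0])
y = np.zeros(T.shape[0])
H = A.T @ A + T.T @ T / mu         # x-step normal-equations matrix (fixed)

for _ in range(5000):
    # x-step: minimize the (differentiable) augmented Lagrangian over x
    x = np.linalg.solve(H, A.T @ b - T.T @ y + T.T @ z / mu)
    # z-step: prox of mu*g evaluated at Tx + mu*y
    z = soft_threshold(T @ x + mu * y, mu * gamma)
    # dual ascent step
    y = y + (T @ x - z) / mu

# Convergence checks: small primal residual and stationarity grad f + T^T y = 0
assert np.linalg.norm(T @ x - z) < 1e-6
assert np.linalg.norm(A.T @ (A @ x - b) + T.T @ y) < 1e-5
```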

11 / 35

SLIDE 17

ADMM cartoon

Lµ(x, z; y0)

12 / 35



SLIDE 20

ADMM cartoon

Lµ(x, z; y1)

12 / 35



SLIDE 23

ADMM cartoon

Lµ(x, z; y2)

12 / 35




SLIDE 27

Proximal augmented Lagrangian

Complete the square in the augmented Lagrangian:

Lµ(x, z; y) = f(x) + g(z) + (1/2µ) ‖z − (Tx + µy)‖² − (µ/2) ‖y‖²

Minimize over z:

z⋆µ(x, y) = prox_{µg}(Tx + µy)

Evaluate Lµ(x, z; y) at z⋆:

Lµ(x; y) := Lµ(x, z⋆µ(x, y); y)
          = f(x) + M_{µg}(Tx + µy) − (µ/2) ‖y‖²

  • continuously differentiable in x and y

Dhingra, Khong, Jovanović, arXiv:1610.04514
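The identity Lµ(x; y) = min_z Lµ(x, z; y) can be verified numerically for g = γ‖·‖₁. A small sketch with arbitrary data (the quadratic f, the matrix T, and the parameters γ, µ are my choices, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(1)

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

n, p = 6, 4
T = rng.standard_normal((p, n))
c = rng.standard_normal(n)
gamma, mu = 0.3, 0.8

f = lambda x: 0.5 * np.sum((x - c) ** 2)   # placeholder smooth term
g = lambda z: gamma * np.abs(z).sum()      # nonsmooth term

def aug_lagrangian(x, z, y):
    r = T @ x - z
    return f(x) + g(z) + y @ r + np.sum(r ** 2) / (2 * mu)

x = rng.standard_normal(n)
y = rng.standard_normal(p)

# Explicit minimizer over z: z* = prox_{mu g}(Tx + mu*y)
v = T @ x + mu * y
z_star = soft_threshold(v, mu * gamma)

# Proximal augmented Lagrangian: f(x) + M_{mu g}(Tx + mu*y) - (mu/2)||y||^2
pal = f(x) + g(z_star) + np.sum((z_star - v) ** 2) / (2 * mu) \
      - 0.5 * mu * np.sum(y ** 2)

# Completed-square form agrees with the original augmented Lagrangian at z*
assert abs(aug_lagrangian(x, z_star, y) - pal) < 1e-10

# ... and z* really is a minimizer: random perturbations never do better
for _ in range(100):
    z = z_star + rng.standard_normal(p)
    assert aug_lagrangian(x, z, y) >= aug_lagrangian(x, z_star, y) - 1e-12
```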

14 / 35

SLIDE 28

Proximal augmented Lagrangian MM

x^{k+1} = argmin_x Lµ(x; y^k)
y^{k+1} = y^k + (1/µ) ( Tx^{k+1} − prox_{µg}(Tx^{k+1} + µy^k) )

◮ Nonconvex f: convergence to local minimum
◮ x-minimization step: differentiable problem

Dhingra, Khong, Jovanović, arXiv:1610.04514
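Since the x-minimization is a smooth problem, it can be handed to an off-the-shelf solver. A sketch of this MM loop using scipy's L-BFGS-B for the x-step; the quadratic f, the difference matrix T, and all parameters are illustrative assumptions of mine:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

m, n = 20, 8
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
T = np.diff(np.eye(n), axis=0)      # g(Tx) couples entries: no easy prox
gamma, mu = 0.2, 1.0

def grad_moreau(v):
    # grad M_{mu g}(v) = (v - prox_{mu g}(v)) / mu
    return (v - soft_threshold(v, mu * gamma)) / mu

def pal(x, y):
    # proximal augmented Lagrangian L_mu(x; y)
    v = T @ x + mu * y
    z = soft_threshold(v, mu * gamma)
    m_env = gamma * np.abs(z).sum() + np.sum((z - v) ** 2) / (2 * mu)
    return 0.5 * np.sum((A @ x - b) ** 2) + m_env - 0.5 * mu * np.sum(y ** 2)

def pal_grad(x, y):
    return A.T @ (A @ x - b) + T.T @ grad_moreau(T @ x + mu * y)

x = np.zeros(n)
y = np.zeros(T.shape[0])
for _ in range(300):
    # x-step: smooth minimization of L_mu(x; y)
    x = minimize(pal, x, args=(y,), jac=pal_grad, method="L-BFGS-B",
                 options={"gtol": 1e-10, "ftol": 1e-14}).x
    # multiplier update from the slide
    y = y + (T @ x - soft_threshold(T @ x + mu * y, mu * gamma)) / mu

# Fixed point: Tx = prox_{mu g}(Tx + mu*y) and grad f(x) + T^T y = 0
assert np.linalg.norm(T @ x - soft_threshold(T @ x + mu * y, mu * gamma)) < 1e-5
assert np.linalg.norm(A.T @ (A @ x - b) + T.T @ y) < 1e-4
```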

15 / 35

SLIDE 29

Proximal augmented Lagrangian MM cartoon

Lµ(x, z; y0), Lµ(x; y0)

16 / 35


SLIDE 31

Proximal augmented Lagrangian MM cartoon

Lµ(x, z; y1), Lµ(x; y1)

16 / 35

SLIDE 32

Proximal augmented Lagrangian MM cartoon

Lµ(x, z; y⋆), Lµ(x; y⋆)

16 / 35

SLIDE 33

Edge addition in directed consensus networks

[Figure: seven-node directed network x1, …, x7]

z are edge weights; the columns of T are a basis for the space of balanced graphs

Identify edges:

x(γ) = argmin_x f₂(x) + γ ‖Tx‖₁

Design edge weights:

x⋆(γ) = argmin_x f₂(x) subject to sp(Tx) ∈ sp(Tx(γ))

17 / 35

SLIDE 34

Edge addition in directed consensus networks

[Figure: percent performance loss vs. number of added edges]

18 / 35

SLIDE 35

Comparison with ADMM

[Table: outer iterations k and computation time (s) per outer iteration vs. problem size m, for the proximal augmented Lagrangian MM and for ADMM]

  • guaranteed convergence to local minimum
  • computational savings from reduced outer iterations

Dhingra, Khong, Jovanović, arXiv:1610.04514

19 / 35

SLIDE 36

Outline

I Proximal augmented Lagrangian

  • centralized approach – method of multipliers

II Primal-dual method

  • distributable
  • convergence for convex problems
  • linear convergence for strongly convex problems

20 / 35

SLIDE 37

Primal-descent dual-ascent

Arrow–Hurwicz–Uzawa type gradient flow:

[ ẋ ; ẏ ] = [ −∇x L ; ∇y L ]

◮ Existing methods use subgradients or projection
◮ Convenient for distributed implementation

Arrow, Hurwicz, Uzawa '59; Nedic & Ozdaglar, TAC '09; Wang & Elia, CDC '11; Feijer & Paganini, AUT '10; Cherukuri, Gharesifard, Cortés, SCL '15

21 / 35

SLIDE 38

First-order primal-dual method

[ ẋ ; ẏ ] = [ −∇x Lµ(x; y) ; ∇y Lµ(x; y) ]

◮ Continuous right-hand side – even for non-differentiable g(Tx)
  • algorithmic implementation via forward Euler discretization
◮ Convex f – asymptotic convergence
  • Lyapunov function & LaSalle's invariance principle
◮ Strongly convex f, Lipschitz continuous gradient – linear convergence
  • integral quadratic constraints
  • extends to discrete time

Dhingra, Khong, Jovanović, arXiv:1610.04514
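The forward Euler discretization of these dynamics can be sketched on a lasso-type instance. Taking T = I so that optimality is easy to check; the data A, b, the parameters, and the step size below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

m, n = 20, 8
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
gamma, mu = 0.2, 1.0          # T = I: f(x) = 0.5||Ax-b||^2, g = gamma*||.||_1

def grad_moreau(v):
    return (v - soft_threshold(v, mu * gamma)) / mu

x = np.zeros(n)
y = np.zeros(n)
alpha = 0.01                  # forward-Euler step size
for _ in range(50000):
    gM = grad_moreau(x + mu * y)
    x_dot = -(A.T @ (A @ x - b)) - gM     # -grad_x L_mu(x; y)  (T = I)
    y_dot = mu * gM - mu * y              # +grad_y L_mu(x; y)
    x = x + alpha * x_dot
    y = y + alpha * y_dot

# Equilibrium: x = prox_{mu g}(x + mu*y) and grad f(x) + y = 0,
# i.e. -grad f(x) is a subgradient of gamma*||.||_1 at x (lasso optimality)
assert np.linalg.norm(x - soft_threshold(x + mu * y, mu * gamma)) < 1e-6
assert np.linalg.norm(A.T @ (A @ x - b) + y) < 1e-6
```

Note the right-hand side is continuous everywhere even though g is nonsmooth, which is what makes the plain Euler discretization viable.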

22 / 35

SLIDE 39

Method of multipliers cartoon II

Lµ(x; y), min_x Lµ(x; y)

23 / 35


SLIDE 41

Method of multipliers cartoon II

x^1 = argmin_x Lµ(x; y^0), min_x Lµ(x; y)

23 / 35

SLIDE 42

Method of multipliers cartoon II

y^1 = y^0 + (1/µ) ∇y Lµ(x^1; y^0), min_x Lµ(x; y)

23 / 35

SLIDE 43

Method of multipliers cartoon II

x^2 = argmin_x Lµ(x; y^1), min_x Lµ(x; y)

23 / 35

SLIDE 44

Method of multipliers cartoon II

y⋆ = y^1 + (1/µ) ∇y Lµ(x^2; y^1), min_x Lµ(x; y)

23 / 35

SLIDE 45

Method of multipliers cartoon II

x⋆ = argmin_x Lµ(x; y⋆), min_x Lµ(x; y)

23 / 35

SLIDE 46

Primal-dual cartoon

(x^1, y^1) = (x^0, y^0) − α (∇x Lµ(x^0; y^0), −∇y Lµ(x^0; y^0)), min_x Lµ(x; y)

24 / 35

SLIDE 47

Primal-dual cartoon

(x^2, y^2) = (x^1, y^1) − α (∇x Lµ(x^1; y^1), −∇y Lµ(x^1; y^1)), min_x Lµ(x; y)

24 / 35

SLIDE 48

Primal-dual cartoon

(x⋆, y⋆) = (x^2, y^2) − α (∇x Lµ(x^2; y^2), −∇y Lµ(x^2; y^2)), min_x Lµ(x; y)

24 / 35


SLIDE 50

Distributed updates

ẋ = −∇f(x) − Tᵀ ∇M_{µg}(Tx + µy)
ẏ = µ ∇M_{µg}(Tx + µy) − µy

◮ Recall ∇M_{µg}(v) = (1/µ) (v − prox_{µg}(v))
◮ Distributed implementation if g is separable and
  • ∇f: Rⁿ → Rⁿ is a sparse mapping
  • TᵀT is sparse
◮ Each node xᵢ
  • communicates according to ∇f and TᵀT
  • stores yᵢ according to Tᵀ

25 / 35

SLIDE 51

Overlapping group LASSO example

minimize (1/2) ‖Ax − b‖₂² + Σᵢ ‖(Tx)ᵢ‖₂

Gradient mapping: ∇f(x) = Aᵀ(Ax − b)

  • communicate states xᵢ according to ∇f and TᵀT
  • store yᵢ corresponding to red edges

[Figure: four-node graph x1, …, x4 with the sparsity patterns of A and T]

26 / 35


SLIDE 53

Reformulation of distributed optimization

minimize_x Σᵢ fᵢ(x)  ≡  minimize_{x₁,x₂,...} Σᵢ fᵢ(xᵢ) subject to Tx = 0

◮ TᵀT is the Laplacian (T the incidence matrix) of a connected network

≡ minimize_{x₁,x₂,...} Σᵢ fᵢ(xᵢ) + I₀(Tx)

Indicator function: I₀(z) := { 0 if z = 0; ∞ if z ≠ 0 }

◮ Let ȳ := Tᵀy and L := TᵀT:

ẋ = −∇f(x) − (1/µ) Lx − ȳ
ȳ̇ = Lx

◮ Each agent stores xᵢ and ȳᵢ, communicates across L

27 / 35


SLIDE 55

Reformulation of distributed optimization

◮ Discrete-time primal-dual

x^{k+1} = x^k − α ( ∇f(x^k) + (1/µ) Lx^k + ȳ^k )
ȳ^{k+1} = ȳ^k + α Lx^k

◮ EXTRA by Shi, Ling, Wu, Yin '15

x^{k+1} = W x^k − α ∇f(x^k) + Σ_{t=0}^{k−1} (W − W̃) x^t

Equivalent! With W = I − (α/µ) L, W̃ = (1/2)(I + W), and dual stepsize αy = α/(2µ):

x^{k+1} = x^k − α ( ∇f(x^k) + (1/µ) Lx^k + (1/(2µ)) Σ_{t=0}^{k−1} Lx^t )

where the running sum plays the role of ȳ^k

28 / 35
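The claimed equivalence can be checked numerically by running the two recursions side by side. A minimal sketch with a toy setup of my own (quadratic local objectives fᵢ(xᵢ) = (xᵢ − bᵢ)²/2 on a 4-node path graph; b, α, µ arbitrary):

```python
import numpy as np

# Path-graph Laplacian and quadratic local objectives f_i(x_i) = (x_i - b_i)^2/2
L = np.array([[ 1, -1,  0,  0],
              [-1,  2, -1,  0],
              [ 0, -1,  2, -1],
              [ 0,  0, -1,  1]], dtype=float)
b = np.array([0.0, 1.0, 2.0, 3.0])
n = 4
alpha, mu = 0.2, 1.0

I = np.eye(n)
W = I - (alpha / mu) * L       # mixing matrix
Wt = 0.5 * (I + W)             # W-tilde

# (1) EXTRA, summed form: x^{k+1} = W x^k - alpha*grad f(x^k) + sum_t (W - Wt) x^t
x1 = np.zeros(n)
S = np.zeros(n)                # running sum of (W - Wt) x^t
# (2) primal-dual form with running dual ybar^k = (1/(2 mu)) sum_t L x^t
x2 = np.zeros(n)
ybar = np.zeros(n)

for _ in range(2000):
    x1_new = W @ x1 - alpha * (x1 - b) + S
    S = S + (W - Wt) @ x1
    x1 = x1_new

    x2_new = x2 - alpha * ((x2 - b) + L @ x2 / mu + ybar)
    ybar = ybar + (1.0 / (2.0 * mu)) * (L @ x2)
    x2 = x2_new

    assert np.allclose(x1, x2, atol=1e-9)   # identical trajectories

# Both converge to consensus on the average of b
assert np.allclose(x1, np.full(n, b.mean()), atol=1e-6)
```

Since W − W̃ = −(α/(2µ))L, the accumulated sum in EXTRA is exactly the (rescaled) running dual variable, which is the algebra behind the "Equivalent!" on the slide.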
SLIDE 56

Sketch of asymptotic convergence proof

◮ Introduce a Lyapunov function with x̃ := x − x⋆, ỹ := y − y⋆:

V(x̃, ỹ) = (1/2) ‖x̃‖² + (1/2) ‖ỹ‖²

◮ Show V̇ ≤ 0; then, by LaSalle's invariance principle, (x(t), y(t)) converges to the largest invariant set on which V̇(x̃, ỹ) = 0 and (x̃˙, ỹ˙) = 0, which is (x⋆, y⋆)

◮ Convex f → asymptotic convergence

Dhingra, Khong, Jovanović, arXiv:1610.04514

29 / 35


SLIDE 60

Feedback representation

[Block diagram: linear system G in feedback with the nonlinearities ∇f − mf I and µ∇M_{µg}; outputs ξ1 = x, ξ2 = Tx + µy, inputs u1, u2]

◮ Write the dynamics

ẋ = −(∇f(x) − mf x) − mf x − Tᵀ ∇M_{µg}(Tx + µy)
ẏ = µ ∇M_{µg}(Tx + µy) − µy

as a linear system G: ẇ = Aw + Bu, ξ = Cw, with w := [xᵀ yᵀ]ᵀ and

A = [ −mf I  0 ;  0  −µI ],  B = [ −I  −(1/µ)Tᵀ ;  0  I ],  C = [ I  0 ;  T  µI ]

u1(ξ1) = ∇f(ξ1) − mf ξ1,  u2(ξ2) = ξ2 − prox_{µg}(ξ2)

◮ 'Borrow' mf strong convexity from ∇f so that G is stable

30 / 35

SLIDE 61

Integral Quadratic Constraints

[Block diagram: nonlinearities ∇f − mf I and µ∇M_{µg} with inputs ξ1, ξ2 and outputs u1, u2]

◮ f − (mf/2) ‖x‖² is convex because f is mf-strongly convex
◮ IQC for the gradient of a convex function with Lf-Lipschitz continuous gradient:

[ ξ − ξ0 ; u − u0 ]ᵀ Π_{Lf} [ ξ − ξ0 ; u − u0 ] ≥ 0,   Π_{Lf} = [ 0  Lf I ;  Lf I  −2I ]

31 / 35

SLIDE 62

Linear convergence

◮ Linear convergence

‖w(t)‖ ≤ τ e^{−ρt} ‖w(0)‖,   w := [xᵀ yᵀ]ᵀ

holds if (after applying the KYP lemma)

[ Gρ(jω) ; I ]* Π [ Gρ(jω) ; I ] ≺ 0   for all ω ∈ R

  • transfer function Gρ(jω) = C (jωI − (A + ρI))⁻¹ B
  • Π describes the IQCs for u1 and u2

Lessard, Recht, Packard '16; Hu and Seiler '16

32 / 35



SLIDE 65

Sketch of linear convergence proof

1. Set µ = Lf − mf and evaluate, with m̂ := mf − ρ and µ̂ := µ − ρ:

[ ((µm̂ + m̂² + ω²)/(m̂² + ω²)) I                  (m̂/(m̂² + ω²)) Tᵀ
            *                 ((m̂/µ)/(m̂² + ω²)) TTᵀ + ((ω² − ρµ̂)/(µ̂² + ω²)) I ]  ≻ 0

2. Take the Schur complement and diagonalize
  • concave scalar function, quadratic in ω²
  • show absence of roots at ω² ≥ 0 for ρ = 0

If
  • f is mf-strongly convex
  • ∇f is Lf-Lipschitz continuous
  • TTᵀ is full rank

→ linear convergence when µ ≥ Lf − mf (conservative!)

Dhingra, Khong, Jovanović, arXiv:1610.04514

33 / 35

SLIDE 66

Optimal placement

◮ Monitor targets and stay near neighbors

minimize_x Σᵢ (1/2)(xᵢ − bᵢ)² + I_{[−1,1]}(Tx)

[Animation: simulation with a sampling rate of 1 kHz and a step-size of 1 × 10⁻³]

34 / 35

SLIDE 67

Conclusions

Proximal augmented Lagrangian

  • continuously differentiable
  • enables MM

Distributed implementation

  • primal-dual method
  • connections with existing distributed optimization techniques

Ongoing work

  • remove rank constraint for linear convergence
  • second order methods

35 / 35

SLIDE 68

Extra slides

36 / 35


SLIDE 70

Asymptotic convergence for convex problems

At any (x, y) there is a matrix 0 ⪯ D ⪯ I such that

D (Tx̃ + µỹ) = prox_{µg}(Tx + µy) − prox_{µg}(Tx⋆ + µy⋆)

The derivative of V is negative semidefinite:

V̇(x̃, ỹ) = −⟨x̃, ∇f(x) − ∇f(x⋆)⟩ − (1/µ) ⟨Tx̃, (I − D) Tx̃⟩ − µ ⟨ỹ, Dỹ⟩

If V̇ = 0, then ∇f(x) = ∇f(x⋆), ỹ ∈ ker{D}, Tx̃ ∈ ker{I − D}, and thus

(x̃˙, ỹ˙) = (−Tᵀỹ, 0)

If additionally ỹ ∈ ker{Tᵀ}, then (x, y) is optimal

37 / 35

SLIDE 71

Linear convergence for strongly convex problems

Schur complement:

((m̂/µ)/(µm̂ + m̂² + ω²)) TTᵀ + ((ω² − ρµ̂)/(µ̂² + ω²)) I ≻ 0

Diagonalize, where λᵢ are the eigenvalues of TTᵀ:

ω⁴ + ( m̂λᵢ/µ + m̂² + µm̂ − ρµ̂ ) ω² + ( m̂µ̂²λᵢ/µ − ρµ̂(µm̂ + m̂²) ) > 0

Set ρ = 0:

ω⁴ + ( mf λᵢ/µ + mf² + µmf ) ω² + µmf λᵢ > 0

positive coefficients ⇒ roots negative or complex

38 / 35

SLIDE 72

Optimal placement II

minimize (1/2) ‖Ax − b‖₂² + I_{[−c,c]}(Tx)

[Figures: (a) optimal configuration I, (b) optimal configuration II, (c) agent trajectories, (d) distance from optimal]

39 / 35

SLIDE 73

Directed consensus networks

◮ Distributed information exchange over edges zᵢⱼ:

ψ̇ᵢ = Σⱼ zᵢⱼ (ψⱼ − ψᵢ)

◮ Want nodes to compute the average, ψᵢ(t) → (1/n) Σⱼ ψⱼ(0)
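For a balanced, strongly connected digraph these dynamics drive every node to the average of the initial values while 1ᵀψ stays constant (the column sums of the Laplacian are zero). A small forward-Euler sketch using a unit-weight directed 3-cycle; the initial condition and step size are arbitrary choices:

```python
import numpy as np

# In-degree Laplacian of a directed 3-cycle with unit weights.
# Balanced: both row sums and column sums are zero.
L = np.array([[ 1, -1,  0],
              [ 0,  1, -1],
              [-1,  0,  1]], dtype=float)

psi = np.array([3.0, -1.0, 2.5])
avg = psi.mean()

alpha = 0.1
for _ in range(1000):
    psi = psi - alpha * (L @ psi)   # forward-Euler step of psi_dot = -L psi

# Every node reaches the average of the initial values
assert np.allclose(psi, avg, atol=1e-8)
```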

40 / 35

SLIDE 74

Consensus networks

Aggregate dynamics: ψ̇ = −Lp ψ + d

◮ If Lp is balanced, nodes approach the average

Penalize deviation from average: ζ = ( I − (1/n) 11ᵀ ) ψ

41 / 35

SLIDE 75

Consensus networks

Aggregate dynamics: ψ̇ = −(Lp + Lc) ψ + d

◮ If Lp + Lc is balanced, nodes approach the average

Penalize deviation from average and control effort:

ζ = [ I − (1/n) 11ᵀ ;  −R^{1/2} Lc ] ψ

Add edges to the network:

◮ F(z) = Lc is the graph Laplacian of the added edges z

41 / 35

SLIDE 76

Balanced network

[Figure: three-node directed network ψ1, ψ2, ψ3 with candidate edges z12, z23, z31, z32]

◮ For each node ψᵢ, in-degree equals out-degree: Σⱼ zᵢⱼ = Σⱼ zⱼᵢ

ψ1:  z12 − z31 = 0
ψ2: −z12 + z23 − z32 = 0
ψ3: −z23 + z32 + z31 = 0

◮ Linear constraint on added edges: Ez = 0
◮ z = Tx parametrizes balanced graphs: Ez = E(Tx) = 0

linear constraint in z if Lp is balanced, affine if not
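For the three-node example, the constraint matrix E and a basis T for its null space can be computed directly; this mirrors the parametrization z = Tx of balanced graphs (the use of scipy's `null_space` here is my choice of tool, not the talk's):

```python
import numpy as np
from scipy.linalg import null_space

# Candidate edge weights z = (z12, z23, z31, z32); each row is one node's
# balance equation (out-weights minus in-weights = 0)
E = np.array([[ 1.0,  0.0, -1.0,  0.0],    # psi_1:  z12 - z31 = 0
              [-1.0,  1.0,  0.0, -1.0],    # psi_2: -z12 + z23 - z32 = 0
              [ 0.0, -1.0,  1.0,  1.0]])   # psi_3: -z23 + z31 + z32 = 0

T = null_space(E)   # columns: basis for the subspace of balanced edge weights

assert T.shape == (4, 2)               # E has rank 2, so the null space is 2-D
assert np.allclose(E @ T, 0.0, atol=1e-12)

# Any coordinate vector x yields balanced edge weights z = T x
z = T @ np.array([1.0, -0.5])
assert np.allclose(E @ z, 0.0, atol=1e-12)
```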

42 / 35

SLIDE 77

Balanced vs. unbalanced directed consensus networks

[Figure: two directed 3-cycles on ψ1, ψ2, ψ3; edge weights 1, 1, 1 (balanced) and 1, 2, 1 (unbalanced)]

L1 = [ 1 −1 0 ; 0 1 −1 ; −1 0 1 ],   L2 = [ 1 −1 0 ; 0 1 −1 ; −2 0 2 ]

v1ᵀ L1 = 0,  v1ᵀ = (1/√3) [1 1 1]
v2ᵀ L2 = 0,  v2ᵀ = (1/√5) [2 2 1]

◮ Nodes approach a weighted average, ψᵢ(t) → vᵀψ(0)
◮ The weighted average doesn't 'move', i.e., d(vᵀψ)/dt = −(vᵀL)ψ = 0

43 / 35

SLIDE 78

Edge addition in consensus networks

minimize_x f₂(x) + γ ‖Tx‖₁

Performance:
◮ H₂ norm of deviations from average and control effort
◮ Nonconvex

Structure:
◮ balanced Lc
◮ minimize number of edges

Cannot use proximal gradient because T is not diagonal

44 / 35