Gossip algorithms for solving Laplacian systems Anastasios Zouzias - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Anastasios Zouzias University of Toronto

Gossip algorithms for solving Laplacian systems

joint work with Nikolaos Freris (EPFL)

Based on:
1. Fast Distributed Smoothing for Clock Synchronization (CDC '12)
2. Randomized Extended Kaczmarz for Solving Least-Squares (arXiv, May '12)

slide-2
SLIDE 2
Outline
  • I. Problems: Solving Laplacian & edge-vertex systems
  • II. Motivation: Clock Synchronization over WSNs
  • III. Randomized Gossip Model & Averaging Problem
  • IV. Gossip Solvers via Randomized (Extended) Kaczmarz

slide-3
SLIDE 3
Outline
  • I. Problems: Solving Laplacian & edge-vertex systems
  • II. Motivation: Clock Synchronization over WSNs
  • III. Randomized Gossip Model & Averaging Problem
  • IV. Gossip Solvers via Randomized (Extended) Kaczmarz

slide-4
SLIDE 4
Distributed solver: Laplacian system

G=(V,E): n nodes, m edges

Model of computation
  • Each node is aware of its neighbors; exchanges packets with them only
  • Static network; no communication errors; ignore numerical issues
  • Synchronous, asynchronous & Gossip

Laplacian System: $Lx = b$

[Figure: 13-node graph; node i holds input $b_i$ and unknown $x_i$]

Problem I
Input: Each node i gets $b_i$
Goal: Each node i computes the ith coordinate of $x_{LS} := L^{\dagger} b$

slide-5
SLIDE 5

Distributed solver: edge-vertex system

G=(V,E): n nodes, m edges

Edge-vertex System: $Bx = y$, where $B$ is $m \times n$

[Figure: 13-node graph with unknowns $x_1, \dots, x_{13}$ on the nodes and a numeric measurement on each edge]

Problem II
Input: Each edge (i,j) gets $y_{(i,j)}$
Goal: Each node i computes the ith coordinate of $x_{LS} = B^{\dagger} y$

For an edge e = (i,j), the row of B is
$B_{e,k} := -1$ if $k = i$; $1$ if $k = j$; $0$ otherwise.

Normal equation of Bx=y is Laplacian system ($L = B^{\top} B$)
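The relation $L = B^{\top}B$ can be checked numerically on a toy graph. The sketch below is only illustrative: the 4-node graph and the helper names `incidence_matrix`, `gram` and `laplacian` are assumptions made for this example, not part of the talk.

```python
# Sketch: build the edge-vertex matrix B of a small graph and check L = B^T B.
# The graph is a made-up example, not the one drawn on the slide.

def incidence_matrix(n, edges):
    """Row for edge e = (i, j): -1 in column i, +1 in column j, 0 otherwise."""
    B = [[0] * n for _ in edges]
    for row, (i, j) in zip(B, edges):
        row[i] = -1
        row[j] = 1
    return B

def gram(B, n):
    """Return B^T B as an n x n list of lists."""
    return [[sum(row[a] * row[b] for row in B) for b in range(n)] for a in range(n)]

def laplacian(n, edges):
    """Return D - A: degrees on the diagonal, -1 for every adjacent pair."""
    L = [[0] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1
        L[j][j] += 1
        L[i][j] -= 1
        L[j][i] -= 1
    return L

n = 4
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]   # a triangle with one pendant node
B = incidence_matrix(n, edges)
L = laplacian(n, edges)
print(gram(B, n) == L)
```

The orientation chosen for each edge changes the signs inside B but not the product, which is why the Laplacian does not depend on it.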

slide-6
SLIDE 6
Outline
  • I. Problems: Solving Laplacian & edge-vertex systems
  • II. Motivation: Clock Synchronization over WSNs
  • III. Randomized Gossip Model & Averaging Problem
  • IV. Gossip Solvers via Randomized (Extended) Kaczmarz

slide-7
SLIDE 7

Case Study: Clock Synchronization over WSNs

G=(V,E): n nodes, m edges

Assumptions
  • Each node v has a clock; same speed: $v(t) = t + o_v$, $o_v \in \mathbb{R}$
  • Node v does not know $o_v$
  • Nodes can approx. relative offsets $o_{uv} = o_v - o_u$ for every $u \in \mathrm{Neigh}(v)$: $\hat{o}_{uv} = o_{uv} + N(0, 1)$

Clock Synchronization Problem
Input: Estimates $\hat{o}_{uv}$ for all $(u, v) \in E$
Goal: Compute offsets $(\tilde{o}_u)_{u \in V}$ that min. $\max_{u,v} \mathbb{E}\,|\tilde{o}_{uv} - o_{uv}|^2$ over all pairs of nodes

slide-8
SLIDE 8

Tree-based Approach

Idea: Build a spanning tree

Every edge: normal error

Path P from u to v, of length up to diam(G): $\hat{o}_{uv} = \sum_{(u',v') \in P} \hat{o}_{u'v'}$

Sync error between u & v grows like $\approx O(\sqrt{\mathrm{diam}(G)})$

In general, no hope for better accuracy ...but wireless networks are "well-connected"
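The square-root growth can be seen by adding up the per-edge errors along the tree path. A short derivation, assuming (as in the problem statement) independent N(0,1) errors on every edge and a path P whose length is on the order of diam(G):

```latex
\hat{o}_{uv} - o_{uv}
  = \sum_{(u',v') \in P} \bigl(\hat{o}_{u'v'} - o_{u'v'}\bigr)
  \sim N\bigl(0, |P|\bigr),
\qquad
\sqrt{\mathbb{E}\,|\hat{o}_{uv} - o_{uv}|^2} = \sqrt{|P|} = O\bigl(\sqrt{\mathrm{diam}(G)}\bigr).
```

Variances of independent errors add, so the standard deviation grows with the square root of the path length.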

slide-9
SLIDE 9

Modeling Wireless Networks ...as Random Geometric Graphs

  • Square: unit area
  • n nodes uniform over square
  • Connectivity [GK00]: $r = O\bigl(\sqrt{\log n / n}\bigr)$
  • Diameter: $O\bigl(\sqrt{n / \log n}\bigr)$
  • Tree-based approach: error $O(n^{1/4})$

Q: Can we do better on Random Geometric Graphs?

Yes! Spatial Smoothing [KEES03,GK06]

slide-10
SLIDE 10

Spatial Smoothing

Observation: Every loop in G: sum of relative offsets equals zero

Idea: Incorporate the loop constraints

How? Encode constraints in linear system: $Bx = o$, where $B$ is $m \times n$ and $o_{(i,j)}$ is the relative offset of (i, j)
slide-11
SLIDE 11

Properties of Least-Squares

Gaussian error: compute LS solution of $Bx = \hat{o}$

Thm[KEES03] Replace each edge by unit resistor. Then error variance between any pair of nodes u and v is: $\mathbb{E}\,|\tilde{o}_{uv} - o_{uv}|^2 \sim R_{\mathrm{eff}}(u, v)$

Effective resistances of RGG bounded by O(1) [GK06]

Tree-based vs Smoothing: $O(n^{1/4})$ vs $O(1)$

Q: How to compute the LS solution?

slide-12
SLIDE 12

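The Model Matters...

Use coordinate descent: $\frac{\partial}{\partial x_u} \|Bx - o\|^2 = 0$

Synchronous Jacobi
Start from $\hat{o}_v = 0$, $\forall v \in V$
For k = 1, 2, . . .
  $\tilde{o}^{(k+1)}_v \leftarrow \frac{1}{d_v} \sum_{u \in \mathrm{Neigh}(v)} \bigl(\tilde{o}^{(k)}_u + \hat{o}_{uv}\bigr)$

Thm[GK06]: After $k \ge \frac{4m^2}{\beta^2} \ln(\|x^*\| / \varepsilon)$ rounds, it holds that $\|x^{(k)} - x^*\|_2 \le \varepsilon$, where $\beta$ is the min-cut value

Asynchronous Jacobi
Each node $v \in V$ regularly:
  • estimates relative offsets with nghbrs
  • broadcasts its current offset $\hat{o}_v$
  • updates its estimate: $\tilde{o}_v \leftarrow \frac{1}{d_v} \sum_{u \in \mathrm{Neigh}(v)} \bigl(\hat{o}_u + \hat{o}_{uv}\bigr)$

It converges [BT89]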
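The synchronous Jacobi round on this slide can be sketched in a few lines. This is a minimal illustration under assumptions that are not in the talk: a triangle graph, noise-free measurements, the convention `o_hat[(u, v)] = o_v - o_u`, and the helper name `jacobi_round`.

```python
# Sketch of the synchronous Jacobi update for clock offsets.
# Assumptions: triangle graph, exact measurements o_hat[(u, v)] = o_v - o_u.

def jacobi_round(est, neigh, o_hat):
    """One synchronous round: node v averages est[u] + o_hat[(u, v)] over its neighbors u."""
    return {
        v: sum(est[u] + o_hat[(u, v)] for u in neigh[v]) / len(neigh[v])
        for v in neigh
    }

true_offsets = {0: 0.3, 1: -1.2, 2: 2.0}        # hypothetical ground truth
neigh = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
o_hat = {(u, v): true_offsets[v] - true_offsets[u] for v in neigh for u in neigh[v]}

est = {v: 0.0 for v in neigh}                   # all estimates start at zero
for _ in range(60):
    est = jacobi_round(est, neigh, o_hat)

# Offsets are only determined up to a common shift, so compare differences.
worst = max(abs((est[v] - est[u]) - o_uv) for (u, v), o_uv in o_hat.items())
print(worst)
```

Every node needs the values of all its neighbors from the same round, which is exactly the coordination the gossip model tries to avoid.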

slide-13
SLIDE 13

The Model Matters...

Synchronous Model (all)
Start from $\hat{o}_v = 0$, $\forall v \in V$
For k = 1, 2, . . .
  $\tilde{o}^{(k+1)}_v \leftarrow \frac{1}{d_v} \sum_{u \in \mathrm{Neigh}(v)} \bigl(\tilde{o}^{(k)}_u + \hat{o}_{uv}\bigr)$

Thm[GK06]: After $k \ge \frac{4m^2}{\beta^2} \ln(\|x^*\| / \varepsilon)$ rounds, it holds that $\|x^{(k)} - x^*\|_2 \le \varepsilon$, where $\beta$ is the min-cut value

Asynchronous Model (nothing)
Each node $v \in V$ regularly:
  • estimates relative offsets with nghbrs
  • broadcasts its current offset $\hat{o}_v$
  • updates its estimate: $\tilde{o}_v \leftarrow \frac{1}{d_v} \sum_{u \in \mathrm{Neigh}(v)} \bigl(\hat{o}_u + \hat{o}_{uv}\bigr)$

It converges [BT89]

Randomized Gossip Model (a.k.a. asynchronous time model) [BTA86,BGPS06]

Each node u (randomly) activates itself w.p. $p_u$ & performs local computation

slide-14
SLIDE 14
Outline
  • I. Problems: Solving Laplacian & edge-vertex systems
  • II. Motivation: Clock Synchronization over WSNs
  • III. Randomized Gossip Model & Averaging Problem
  • IV. Gossip Solvers via Randomized (Extended) Kaczmarz

slide-15
SLIDE 15

Distributed Averaging

Input: Every node u gets $w_u$
Goal: Every node wants access to global average

How many rounds required to approx. within $\varepsilon$?

  • Basic primitive for other functions
  • Averaging can solve Problems I and II [BDFSV10,XBL05,XBL06]

[Figure: example graph with node values 7, 1, 3, 2, 5, 4, 4; one gossip step replaces two neighboring values by their average (2.5, 2.5)]

Gossip averaging algorithm
1. Every node u activates uniformly at random
2. Picks random neighbor v and averages their current values $w_u$, $w_v$

[BGPS06] proved that $O\bigl(\frac{n}{\lambda_2(G)} \log(n/\varepsilon)\bigr)$ rounds are sufficient whp
Special cases, complete graph [KSSV00,KDG03,KDN+06]
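The two steps of the gossip averaging algorithm can be simulated directly. The sketch below is illustrative only: it reuses the node values from the figure but places them on a hypothetical 7-node ring (the slide's actual topology is not recoverable), and `gossip_average` is a name chosen for this example.

```python
import random

def gossip_average(values, neigh, rounds, seed=0):
    """Each round a uniformly random node averages with a uniformly random neighbor."""
    rng = random.Random(seed)
    w = list(values)
    nodes = list(range(len(w)))
    for _ in range(rounds):
        u = rng.choice(nodes)            # step 1: node u activates
        v = rng.choice(neigh[u])         # step 2: picks a random neighbor v
        w[u] = w[v] = (w[u] + w[v]) / 2  # both keep the pairwise average
    return w

values = [7, 1, 3, 2, 5, 4, 4]
n = len(values)
neigh = {u: [(u - 1) % n, (u + 1) % n] for u in range(n)}   # assumed ring topology
w = gossip_average(values, neigh, rounds=2000)
avg = sum(values) / n
print(max(abs(wi - avg) for wi in w))
```

Each pairwise step preserves the sum of all values, so the only fixed point reachable on a connected graph is the global average.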

slide-16
SLIDE 16

Gossip Model

Assumptions
  • Each node u has independent Poisson time process: rate $\gamma_u$
  • Each node activates when its arrival occurs
  • Equivalently*: single global Poisson process: rate $\sum_{u \in V} \gamma_u$
  • Arrivals correspond to rounds

*minimum of ind. Poisson is equivalent to single Poisson with sum of their rates

Claim: Non-uniform sampling of nodes is feasible with zero communication under gossip model (given $\gamma_u$'s)

Goal: Design and analyze gossip algorithms for Problems I and II
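The claim about non-uniform sampling rests on a standard fact about independent exponential waiting times. A short worked statement, where $T_u$ (the time until node u's next arrival) is notation introduced here for illustration:

```latex
T_u \sim \mathrm{Exp}(\gamma_u) \ \text{independent}
\;\Longrightarrow\;
\min_{u \in V} T_u \sim \mathrm{Exp}\Bigl(\sum_{u \in V} \gamma_u\Bigr),
\qquad
\Pr\bigl[\arg\min_{v \in V} T_v = u\bigr] = \frac{\gamma_u}{\sum_{v \in V} \gamma_v}.
```

So if each node sets its own rate $\gamma_u$ proportional to the desired sampling probability, the node that activates in a given round is drawn from that distribution without any coordination.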

slide-17
SLIDE 17
Outline
  • I. Problems: Solving Laplacian & edge-vertex systems
  • II. Motivation: Clock Synchronization over WSNs
  • III. Randomized Gossip Model & Averaging Problem
  • IV. Gossip Solvers via Randomized (Extended) Kaczmarz

slide-18
SLIDE 18

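Kaczmarz Method

Kaczmarz Method (K) (Assumption: Ax=y has solution)
Initialize: $x^{(0)} = 0$
Repeat:
  Set $i_k = k \bmod m + 1$
  $x^{(k+1)} = x^{(k)} + \frac{y_{i_k} - \langle A^{(i_k)}, x^{(k)} \rangle}{\|A^{(i_k)}\|^2} A^{(i_k)}$
  k = k + 1

Here $A^{(i)}$ denotes the ith row of A.

It converges [K37]
Huge literature; many extensions; rediscovered many times

[Figure: iterates $x^{(0)}, x^{(1)}, x^{(2)}, \dots, x^{(m)}$ obtained by projecting successively onto the hyperplanes $H_1, H_2, \dots, H_m$ of $Ax = y$, where $H_1 = \{x \mid \langle A^{(1)}, x \rangle = y_1\}$]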
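The cyclic projection loop of the Kaczmarz method can be written out as follows. This is a minimal sketch with 0-based indexing; the 3×2 system and the function name `kaczmarz` are assumptions chosen so the example is consistent and easy to check.

```python
def kaczmarz(A, y, sweeps):
    """Cyclic Kaczmarz: project the iterate onto each row's hyperplane in turn."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for k in range(sweeps * m):
        i = k % m                                   # 0-based form of i_k = k mod m + 1
        row = A[i]
        residual = y[i] - sum(a * xi for a, xi in zip(row, x))
        scale = residual / sum(a * a for a in row)  # divide by the squared row norm
        x = [xi + scale * a for xi, a in zip(x, row)]
    return x

# Hypothetical consistent system: y is generated from a known solution.
A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
x_true = [1.0, -2.0]
y = [sum(a * t for a, t in zip(row, x_true)) for row in A]
x = kaczmarz(A, y, sweeps=200)
print(x)
```

Each step touches a single row of A, which is what later makes the method attractive when a row corresponds to one node or one edge of a graph.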
slide-19
SLIDE 19

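Randomized Kaczmarz Method

Pick rows randomly

Randomized Kaczmarz (RK) [SV06] (Assumption: Ax=y has solution)
Initialize: $x^{(0)} = 0$
Repeat:
  Pick $i_k \in [m]$ w.p. $p_i \propto \|A^{(i)}\|^2$
  $x^{(k+1)} = x^{(k)} + \frac{y_{i_k} - \langle A^{(i_k)}, x^{(k)} \rangle}{\|A^{(i_k)}\|^2} A^{(i_k)}$
  k = k + 1

Exponential convergence:
$\mathbb{E}\,\|x^{(k)} - x^*\|^2 \le \Bigl(1 - \frac{1}{\kappa_F^2(A)}\Bigr)^{k} \|x^*\|^2$, where $\kappa_F^2(A) := \frac{\|A\|_F^2}{\sigma_{\min}^2(A)}$

[Figure: iterates $x^{(0)}, x^{(1)}, x^{(2)}, \dots, x^{(m)}$ projected onto randomly chosen hyperplanes $H_1, H_2, \dots, H_m$ of $Ax = y$, where $H_1 = \{x \mid \langle A^{(1)}, x \rangle = y_1\}$]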
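Replacing the cyclic row order by sampling proportional to squared row norms gives the randomized variant. A small sketch, assuming a hypothetical consistent 3×2 system; the name `randomized_kaczmarz` and the fixed seed are choices made for this example.

```python
import random

def randomized_kaczmarz(A, y, iters, seed=0):
    """RK sketch: sample row i with probability ||A^(i)||^2 / ||A||_F^2, then project."""
    rng = random.Random(seed)
    sq_norms = [sum(a * a for a in row) for row in A]
    x = [0.0] * len(A[0])
    for _ in range(iters):
        i = rng.choices(range(len(A)), weights=sq_norms)[0]
        row = A[i]
        scale = (y[i] - sum(a * xi for a, xi in zip(row, x))) / sq_norms[i]
        x = [xi + scale * a for xi, a in zip(x, row)]
    return x

# Hypothetical consistent system built from a known solution.
A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
x_true = [1.0, -2.0]
y = [sum(a * t for a, t in zip(row, x_true)) for row in A]
x = randomized_kaczmarz(A, y, iters=500)
print(x)
```

The only change from the cyclic method is the sampling line, which is what the expected-error bound on the slide is stated for.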

slide-20
SLIDE 20

Let’s apply RK on Problems I and II

slide-21
SLIDE 21

RK Laplacian Solver

Laplacian System: $Lx = b$ (Assumption: Lx=b has solution)

[Figure: 13-node graph; node i holds input $b_i$ and unknown $x_i$]

Randomized Kaczmarz (RK) [SV06]
Initialize: $x^{(0)} = 0$
Repeat:
  Pick $i_k \in [n]$ w.p. $p_i \propto \|L^{(i)}\|^2$
  $x^{(k+1)} = x^{(k)} + \frac{b_{i_k} - \langle L^{(i_k)}, x^{(k)} \rangle}{\|L^{(i_k)}\|^2} L^{(i_k)}$
  k = k + 1

Using the sparsity of L:

Gossip Laplacian Solver
Each node u: sets $x_u = 0$
Repeat: Node u activates w.p. $\propto d_u^2 + d_u$
  • broadcasts $\theta = x_u - \frac{1}{d_u} \sum_{\ell \in N_u} x_\ell$
  • $x_u \leftarrow x_u + \frac{b_u - d_u \theta}{1 + d_u}$
  • Every $v \in \mathrm{Neigh}(u)$: $x_v \leftarrow x_v + \frac{\theta - b_u/d_u}{1 + d_u}$

RK analysis & diag. preconditioning: $O(n/\lambda_2^2(G))$ rounds whp
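The gossip Laplacian solver is one RK step on row u of L, written with local quantities only: row u has $d_u$ on the diagonal and $-1$ at each neighbor, so its squared norm is $d_u^2 + d_u$ and $\langle L^{(u)}, x \rangle = d_u \theta$. A simulation sketch under assumptions not in the talk (a 4-node graph, a right-hand side built from a known solution, the name `gossip_laplacian`):

```python
import random

def gossip_laplacian(neigh, b, rounds, seed=0):
    """Each activation of node u applies one RK step on row u of L using local data."""
    rng = random.Random(seed)
    nodes = list(neigh)
    deg = {u: len(neigh[u]) for u in nodes}
    weights = [deg[u] ** 2 + deg[u] for u in nodes]    # squared norm of row u of L
    x = {u: 0.0 for u in nodes}
    for _ in range(rounds):
        u = rng.choices(nodes, weights=weights)[0]
        theta = x[u] - sum(x[l] for l in neigh[u]) / deg[u]
        for v in neigh[u]:                              # neighbors react to the broadcast
            x[v] += (theta - b[u] / deg[u]) / (1 + deg[u])
        x[u] += (b[u] - deg[u] * theta) / (1 + deg[u])
    return x

# Hypothetical graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
neigh = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
x_ref = {0: 1.0, 1: 0.0, 2: -1.0, 3: 2.0}
b = {u: len(neigh[u]) * x_ref[u] - sum(x_ref[l] for l in neigh[u]) for u in neigh}
x = gossip_laplacian(neigh, b, rounds=3000)
residual = max(abs(len(neigh[u]) * x[u] - sum(x[l] for l in neigh[u]) - b[u]) for u in neigh)
print(residual)
```

Because L is singular, the solution is only determined up to a constant shift; starting from zero, the iterates stay orthogonal to the all-ones vector, so the sketch returns the zero-mean solution.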

slide-22
SLIDE 22

RK Edge-vertex Solver

Edge-vertex System: $Bx = y$, where $B$ is $m \times n$ (Assumption: Bx=y has solution)

[Figure: 13-node graph with unknowns $x_1, \dots, x_{13}$ on the nodes and a numeric measurement on each edge]

Randomized Kaczmarz (RK) [SV06]
Initialize: $x^{(0)} = 0$
Repeat:
  Pick $e = (i, j) \in E$ uniformly
  $x^{(k+1)} = x^{(k)} + \frac{y_e - \langle B^{(e)}, x^{(k)} \rangle}{2} B^{(e)}$
  k = k + 1

Using the sparsity of B:

Gossip Edge-Vertex Solver
Every node u: $x_u = 0$
Repeat: Node u activates w.p. $\propto d_u$ & selects random neighbor v
  • sends $x_u$ & receives $x_v$
  • Performs: $x_u \leftarrow (x_u + x_v + y_{(u,v)})/2$
  Similarly for node v

RK analysis & diag. preconditioning: $O(n/\lambda_2(G))$ rounds whp

Consistency assumption (limitation of RK)
How to handle the general case?
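The gossip edge-vertex step is short enough to simulate directly. The sketch below assumes a sign convention that is not spelled out in the transcript: `y[(u, v)]` is the measurement as seen from u, equal to `x_u - x_v`, with `y[(v, u)] = -y[(u, v)]`, so that both endpoints apply the same rule. The graph, the data and the name `gossip_edge_vertex` are also illustrative.

```python
import random

def gossip_edge_vertex(neigh, y, rounds, seed=0):
    """Node u wakes up w.p. proportional to d_u, picks a random neighbor v, and both update."""
    rng = random.Random(seed)
    nodes = list(neigh)
    weights = [len(neigh[u]) for u in nodes]
    x = {u: 0.0 for u in nodes}
    for _ in range(rounds):
        u = rng.choices(nodes, weights=weights)[0]
        v = rng.choice(neigh[u])          # degree-weighted node + uniform neighbor = uniform edge
        xu, xv = x[u], x[v]               # values exchanged before either side updates
        x[u] = (xu + xv + y[(u, v)]) / 2
        x[v] = (xv + xu + y[(v, u)]) / 2
    return x

# Hypothetical consistent data on a triangle 0-1-2 with a pendant node 3.
neigh = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
x_ref = {0: 1.0, 1: 0.0, 2: -1.0, 3: 2.0}
y = {(u, v): x_ref[u] - x_ref[v] for u in neigh for v in neigh[u]}
x = gossip_edge_vertex(neigh, y, rounds=1000)
worst = max(abs((x[u] - x[v]) - y[(u, v)]) for (u, v) in y)
print(worst)
```

After one exchange the pair (u, v) satisfies its own equation exactly; later exchanges on other edges disturb it again, and the randomized analysis bounds how fast all equations are satisfied together.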

slide-23
SLIDE 23

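RK for Noisy Systems

Noisy system: $Ax = y + w$

Randomized Kaczmarz (RK) [SV06] (Assumption: Ax=y has solution)
Initialize: $x^{(0)} = 0$
Repeat:
  Pick $i_k \in [m]$ w.p. $p_i \propto \|A^{(i)}\|^2$
  $x^{(k+1)} = x^{(k)} + \frac{y_{i_k} + w_{i_k} - \langle A^{(i_k)}, x^{(k)} \rangle}{\|A^{(i_k)}\|^2} A^{(i_k)}$
  k = k + 1

RK is robust to noise [Needell09]: converges to ball centered at LS solution
$\mathbb{E}\,\|x^{(k)} - x_{LS}\|^2 \le \Bigl(1 - \frac{1}{\kappa_F^2(A)}\Bigr)^{k} \|x_{LS}\|^2 + \frac{\|w\|^2}{\sigma_{\min}^2(A)}$

[Figure: perturbed hyperplanes $H_1, H_2, \dots, H_m$ around $x_{LS}$, where $H_1 = \{x \mid \langle A^{(1)}, x \rangle = y_1 + w_1\}$]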

slide-24
SLIDE 24

Handling the general setting

Robustness of RK: convergence into fixed ball

Idea: 'Modify' original system Ax=y s.t.
1) $x_{LS}$ is preserved ✓
2) Modified system has LS error ≈ 0 ✓

$Ax = y$ becomes $Ax = y_{R(A)}$

How?

Problem
Given subspace as colspan(A), vector y and $\varepsilon > 0$
Goal: Find z s.t. $\|z - y_{R(A)}\| \le \varepsilon$

[Figure: hyperplanes $H_1, H_2, H_3$ around $x_{LS}$]

slide-25
SLIDE 25

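Randomized Orthogonal Projection

Problem
Given subspace as colspan(A), vector y and $\varepsilon > 0$
Goal: Find z s.t. $\|z - y_{R(A)}\| \le \varepsilon$

randOP
Initialize: $z^{(0)} = y$
Repeat:
  Pick $j_k \in [n]$ w.p. $p_j \propto \|A_{(j)}\|^2$
  $z^{(k+1)} = \Bigl(I - \frac{A_{(j_k)} A_{(j_k)}^{\top}}{\|A_{(j_k)}\|^2}\Bigr) z^{(k)}$
  k = k + 1

Here $A_{(j)}$ denotes the jth column of A.

Exponential convergence [Z.Freris12]:
$\mathbb{E}\,\|z^{(k)} - y_{R(A)^{\perp}}\|^2 \le \Bigl(1 - \frac{1}{\kappa_F^2(A)}\Bigr)^{k} \|y\|^2$

Orthogonality gives $y_{R(A)}$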
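The randOP iteration removes, at each step, the component of z along one randomly chosen column of A. A small sketch, assuming a hypothetical 3×2 matrix whose orthogonal complement is easy to compute by hand; the name `rand_op` is chosen for this example.

```python
import random

def rand_op(A, y, iters, seed=0):
    """Repeatedly subtract from z its component along a randomly chosen column of A."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    sq_norms = [sum(c * c for c in col) for col in cols]
    z = list(y)                                   # z^(0) = y
    for _ in range(iters):
        j = rng.choices(range(n), weights=sq_norms)[0]
        col = cols[j]
        coef = sum(c * zi for c, zi in zip(col, z)) / sq_norms[j]
        z = [zi - coef * c for zi, c in zip(z, col)]
    return z

# Hypothetical data: range(A) is orthogonal to (1, 1, -1), and y = (0, 1, 1) + (1, 1, -1).
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 0.0]
z = rand_op(A, y, iters=500)
y_range = [yi - zi for yi, zi in zip(y, z)]       # approximates the projection of y onto range(A)
print(z, y_range)
```

The iterate z approaches the part of y orthogonal to the column space, so y - z recovers the part inside it, which is the "orthogonality" step mentioned on the slide.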

slide-26
SLIDE 26

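Randomized Extended Kaczmarz for LS

RK + randOP = Randomized Extended Kaczmarz

Randomized Extended Kaczmarz*
Initialize: $x^{(0)} = 0$; $z^{(0)} = y$
Repeat:
  Pick $i_k \in [m]$ w.p. $p_i \propto \|A^{(i)}\|^2$
  Pick $j_k \in [n]$ w.p. $p_j \propto \|A_{(j)}\|^2$
  $z^{(k+1)} = (I - P_{j_k}) z^{(k)}$
  $x^{(k+1)} = x^{(k)} + \frac{y_{i_k} - z^{(k)}_{i_k} - \langle A^{(i_k)}, x^{(k)} \rangle}{\|A^{(i_k)}\|^2} A^{(i_k)}$
  k = k + 1

Here $P_{j} = A_{(j)} A_{(j)}^{\top} / \|A_{(j)}\|^2$ is the projector onto the jth column of A.

$y - z^{(k)} \to y_{R(A)}$: RK applied on $Ax = y_{R(A)}$

  • Exponential convergence [Z. Freris12]:
$\mathbb{E}\,\|x^{(k)} - x_{LS}\|^2 \le \Bigl(1 - \frac{1}{\kappa_F^2(A)}\Bigr)^{\lfloor k/2 \rfloor} \Bigl(\|x_{LS}\|^2 + \frac{2\|y\|^2}{\sigma_{\min}^2(A)}\Bigr)$

*Inspired by Extended Kaczmarz method [Pop99]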
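Combining the two loops gives a compact simulation of Randomized Extended Kaczmarz on an inconsistent system. This is a sketch under stated assumptions: the 3×2 matrix and right-hand side are hypothetical, chosen so that the least-squares solution is (0, 1) by hand calculation, and `rek` is a name introduced here.

```python
import random

def rek(A, y, iters, seed=0):
    """REK sketch: a randOP step on z and an RK step on x against the corrected y - z."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    row_sq = [sum(a * a for a in row) for row in A]
    col_sq = [sum(c * c for c in col) for col in cols]
    x, z = [0.0] * n, list(y)
    for _ in range(iters):
        i = rng.choices(range(m), weights=row_sq)[0]
        j = rng.choices(range(n), weights=col_sq)[0]
        # RK step on row i, using the current z^(k) to correct the right-hand side.
        row = A[i]
        scale = (y[i] - z[i] - sum(a * xi for a, xi in zip(row, x))) / row_sq[i]
        x = [xi + scale * a for xi, a in zip(x, row)]
        # randOP step: z^(k+1) = (I - P_j) z^(k).
        col = cols[j]
        coef = sum(c * zi for c, zi in zip(col, z)) / col_sq[j]
        z = [zi - coef * c for zi, c in zip(z, col)]
    return x, z

# Hypothetical inconsistent system: y = A @ (0, 1) + (1, 1, -1), with (1, 1, -1) orthogonal to range(A).
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 0.0]
x, z = rek(A, y, iters=2000)
print(x)
```

Plain RK on this system would only reach a neighborhood of the least-squares solution; subtracting z removes the inconsistent part of y so the RK step sees an (asymptotically) consistent system.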

slide-27
SLIDE 27

Problem II Grand Finale

Edge-vertex System: $Bx = y$, where $B$ is $m \times n$

[Figure: 13-node graph with unknowns $x_1, \dots, x_{13}$ on the nodes and a numeric measurement on each edge]

Randomized Extended Kaczmarz
Initialize: $x^{(0)} = 0$; $z^{(0)} = y$
Repeat:
  Pick $e = (i, j) \in E$ uniformly
  Pick node $j_k$ w.p. $\propto d_{j_k}$
  $z^{(k+1)} = (I - P_{j_k}) z^{(k)}$
  $x^{(k+1)} = x^{(k)} + \frac{y_e - z^{(k)}_e - \langle B^{(e)}, x^{(k)} \rangle}{2} B^{(e)}$
  k = k + 1

Using the sparsity of B:

Gossip Edge-Vertex Solver
Every node u: $x_u = 0$, $z_{(u,v)} = y_{(u,v)}$ for v in $N_u$
Repeat: Node u activates w.p. $\propto d_u$ & selects random neighbor v
  • Sends $x_u$ & receives $x_v$
  • Performs: $x_u \leftarrow (x_u + x_v + y_{(u,v)} - z_{(u,v)})/2$
  Similarly node v
  • Update weights on edges adjacent to u; broadcast to nghbrs

Same rate of convergence

slide-28
SLIDE 28

Problem I Grand Finale

Laplacian System: $Lx = b$

[Figure: 13-node graph; node i holds input $b_i$ and unknown $x_i$]

Two solutions:
1. Use REK as before
2. Use RK & gossip averaging to project b onto $\mathbf{1}^{\perp}$: $b' = b - b_{avg}\mathbf{1}$

slide-29
SLIDE 29

Summary

  • Gossip model of computation
  • Randomized Iterative Solvers: RK & REK
  • Interplay between randomized solvers & gossip algorithms

Topics not covered:
  • Randomized coordinate descent [LL08,Nest10]
  • Termination, numerical issues; communication errors, etc

slide-30
SLIDE 30

Thank you