SLIDE 1

Consensus over Stochastically Switching Directed Topologies

  • S. Vanka, V. Gupta and M. Haenggi

Department of Electrical Engineering, University of Notre Dame


IEEE IT School 2009 1 / 22

SLIDE 2

Outline

1 Introduction
2 Problem Formulation and Main Results
3 Key Ingredients of the Proof
  - Expected deviation from average consensus point is zero
  - Constructing a martingale
  - Bounding martingale differences
  - Using the Azuma-Hoeffding Inequality
4 Applications
  - Consensus over Fading Channels
5 Conclusions
6 Future Work

SLIDE 3

Introduction

Reaching Consensus

A decentralized algorithm for a system of n agents 1, 2, . . . , n to reach agreement on the value of their state.

Initial state x_i(0) for the ith node. Iterative exchange of information is specified by a time-varying communication digraph G = (V, E), where

  - Vertex set V: the set of all participating agents 1, 2, . . . , n
  - An edge e_ji = (j, i) ∈ E iff the message from node j is used by node i

Average Consensus Algorithm: x_t = W_{t−1} x_{t−1}, where x_t = [x_t^(1) x_t^(2) . . . x_t^(n)]^T is the state vector.

Average Consensus Point: x*_av = n^{−1} 1^* x_o.

W_t = I − h L_t, where L_t is the Laplacian of G_t. For all t, G_t is balanced ⇒ W_t is doubly stochastic ⇒ x_t can reach average consensus.
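As a concrete illustration, the update above can be simulated in a few lines. This is a minimal sketch, not from the slides: the 4-node directed cycle, step size h, and iteration count are illustrative assumptions.

```python
import numpy as np

def consensus_step(x, A, h):
    """One consensus iteration x <- (I - h L) x.

    A[i, j] = 1 means node i uses node j's message, i.e. edge (j, i) exists.
    """
    L = np.diag(A.sum(axis=1)) - A   # graph Laplacian of the digraph
    W = np.eye(len(x)) - h * L       # update matrix W_t = I - h L_t
    return W @ x

# Balanced example: a directed 4-cycle (every in-degree = out-degree = 1),
# so W is doubly stochastic and the states converge to the initial average.
A = np.roll(np.eye(4), 1, axis=1)
x = np.array([1.0, 2.0, 3.0, 4.0])   # initial average is 2.5
for _ in range(200):
    x = consensus_step(x, A, h=0.5)
print(x)  # all entries close to 2.5
```

Because the cycle is balanced, the fixed point is exactly the average of the initial states, matching the slide's claim.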

SLIDE 4

Introduction

Prior Work

Deterministically varying communication topologies

Roots going back to Tsitsiklis (1984). Problems studied include criteria for convergence, optimizing the convergence rate, etc. (Jadbabaie, Lin and Morse (2003), Xiao and Boyd (2004), Olfati-Saber and Murray (2004), Ren and Beard (2006), . . .)

Randomly varying communication topologies

Problems studied include randomized gossip, convergence on random graphs, and consensus in networks with noise and packet losses (Boyd et al. (2006), Hatano and Mesbahi (2005), Porfiri and Stilwell (2007), Salehi and Jadbabaie (2008), Fagnani and Zampieri (2008), Huang (2007a&b), Hovareshti et al. (2008), . . .)

SLIDE 5

Introduction

Average Consensus over Random Networks

State after t + 1 iterations: x_{t+1} = W_t W_{t−1} · · · W_0 x_o. ⇒ Time evolution is determined by the matrix product W_t W_{t−1} · · · W_0 of random stochastic matrices. Suppose that

  - {W_t : t = 0, 1, . . .} is a matrix-valued i.i.d. sequence
  - All update matrices W_t have positive diagonal elements
  - The linear system x_{t+1} = E[W] x_t asymptotically reaches consensus

It is known that

  - The state x_t almost surely reaches consensus (Salehi-Jadbabaie 2008)
  - The consensus point is a random variable
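A small experiment makes the two known facts tangible. This is an illustrative sketch, not from the slides: the network size, step size, edge probability, and run count are assumptions chosen so that each random W_t is stochastic with positive diagonal.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_W(n, h=0.2, p=0.7):
    # Each directed edge (j, i) is present independently with probability p.
    # With h * (n - 1) < 1, W = I - h L is stochastic with positive diagonal.
    A = (rng.random((n, n)) < p).astype(float)
    np.fill_diagonal(A, 0.0)
    L = np.diag(A.sum(axis=1)) - A
    return np.eye(n) - h * L

x0 = np.array([0.0, 1.0, 2.0, 3.0])
finals = []
for _ in range(5):                        # a few independent runs
    x = x0.copy()
    for _ in range(300):
        x = random_W(len(x)) @ x          # x_{t+1} = W_t x_t
    finals.append(x)
# Within each run the nodes agree (consensus is reached almost surely);
# across runs the agreed-upon value varies, i.e. the consensus point is random.
```

Printing `finals` shows near-identical entries within each run but slightly different values between runs.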

SLIDE 8

Problem Formulation and Main Results

Problem Formulation

Linear dynamics x_{t+1} = W_t x_t, fixed initial state x_0. Ideally: reach the average consensus point 1(n^{−1} 1^* x_0). But what if not all W_t are balanced? Matrix sequence W_t:

  - An i.i.d. sequence of stochastic matrices
  - All W_t have positive diagonal entries
  - The system x_{t+1} = E[W] x_t reaches average consensus

⇒ x_t almost surely reaches consensus. Can we quantify the deviation of x_t from the average consensus point?

SLIDE 12

Problem Formulation and Main Results

Geometric Interpretation

Instantaneous average: x_av(t) ≜ n^{−1} 1^* x_t.

For balanced communication graphs, x_av(t) = x_av(0) = the average consensus point. We are interested in characterizing the deviation x_t − 1 x_av(0) of the instantaneous state from the average consensus point. (Illustrated on the slide for n = 2.)

SLIDE 13

Problem Formulation and Main Results

Geometric Interpretation

Now x_t can be written as the sum of e_t and r_t, where

  - e_t ≜ 1(x_av(t) − x_av(0)): deviation of the instantaneous average
  - r_t ≜ x_t − 1 x_av(t): disagreement

Define δ_t ≜ x_av(t) − x_av(0), so that e_t = 1 δ_t. (Illustrated on the slide for n = 2.)

SLIDE 14

Problem Formulation and Main Results

Our Results

For a fixed initial state x_0, let W̄ ≜ E[W] and P ≜ I − n^{−1} 1 1^*. Let 1 = λ_1 ≥ λ_2 ≥ · · · ≥ λ_n > −1 be the eigenvalues of W̄, and µ ≜ max(|λ_2|, |λ_n|). Then for all ε > 0 and all t:

  - Deviation δ_t of the instantaneous average from the average consensus point:
    P(|δ_t| ≥ ε) ≤ min(1, 2 exp(−ε² β_t))
  - Distance r_t from the consensus subspace:
    P(‖r_t − P W̄^t x_o‖_∞ ≥ ε) ≤ min(1, 2 exp(−ε² β_t))

where β_t ≜ (1 − µ²) / (2 C² ‖x_o‖²_∞ (1 − µ^{2t})).
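The bound is easy to evaluate numerically. A small helper, assuming β_t = (1 − µ²) / (2 C² ‖x_o‖²_∞ (1 − µ^{2t})) as in the main result; the sample values of µ, C, and ‖x_o‖_∞ below are illustrative:

```python
import math

def beta_t(t, mu, C, x_inf):
    # beta_t = (1 - mu^2) / (2 C^2 ||x_o||_inf^2 (1 - mu^{2t}))
    return (1 - mu**2) / (2 * C**2 * x_inf**2 * (1 - mu**(2 * t)))

def deviation_bound(eps, t, mu, C, x_inf):
    # P(|delta_t| >= eps) <= min(1, 2 exp(-eps^2 beta_t))
    return min(1.0, 2 * math.exp(-eps**2 * beta_t(t, mu, C, x_inf)))

# As t grows, 1 - mu^{2t} -> 1, so beta_t shrinks toward its limit and the
# tail bound loosens toward its asymptotic value.
print(deviation_bound(3.0, 1, 0.9, 1.0, 1.0))
print(deviation_bound(3.0, 100, 0.9, 1.0, 1.0))
```

Note the bound is monotone in t for fixed ε, which matches the asymptotic result given later (β_∞ = (1 − µ²)(2C²‖x_o‖²_∞)^{−1}).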

SLIDE 15

Key Ingredients of the Proof

We will highlight the key results used to derive the concentration bound on the deviation δ_t from the average consensus point. The proof has four main ingredients:

1. Expected deviation from the average consensus point is zero
2. Constructing a martingale
3. Bounding the differences between successive terms of this sequence by leveraging the connectivity of W̄
4. Using the Azuma-Hoeffding inequality to bound the deviation δ_t from its mean

SLIDE 16

Key Ingredients of the Proof Expected deviation from average consensus point is zero

Expected Deviation in the Consensus Subspace

Can show that E[δ_t] = 0, exploiting: (1) the matrix sequence {W_t} is i.i.d., and (2) W̄ is doubly stochastic:

E[δ_t] = n^{−1} 1^* (E[W_{t−1} · · · W_0] x_o − x_o)
       = n^{−1} 1^* (W̄^t − I) x_o            (by independence of the W_k)
       = n^{−1} (1^* W̄^t − 1^*) x_o = 0      (since 1^* W̄ = 1^*)
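An empirical sanity check of E[δ_t] = 0. The two-matrix distribution below is an assumption chosen for illustration: each W_t is stochastic with positive diagonal but unbalanced, while W̄ = E[W] is doubly stochastic.

```python
import numpy as np

rng = np.random.default_rng(1)
h = 0.5
# Two unbalanced stochastic matrices whose average is doubly stochastic:
W_left  = np.array([[1.0, 0.0], [h, 1.0 - h]])   # node 1 listens to node 0
W_right = np.array([[1.0 - h, h], [0.0, 1.0]])   # node 0 listens to node 1

x0 = np.array([0.0, 1.0])
t, runs = 10, 20000
deltas = []
for _ in range(runs):
    x = x0.copy()
    for _ in range(t):
        W = W_left if rng.random() < 0.5 else W_right
        x = W @ x
    deltas.append(x.mean() - x0.mean())           # delta_t = x_av(t) - x_av(0)
print(np.mean(deltas))  # close to 0
```

Individual realizations of δ_t are far from zero, but their empirical mean vanishes, as the derivation predicts.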
SLIDE 19

Key Ingredients of the Proof Constructing a martingale

Definition of a martingale

Definition (Chap. 12, Mitzenmacher and Upfal, 2005). A sequence Y_1, Y_2, . . . , Y_t is a martingale with respect to a sequence X_1, X_2, . . . , X_t if:

1. E[|Y_k|] < ∞ for k = 1, 2, . . . , t
2. E[Y_{k+1} | X_1, . . . , X_k] = Y_k for k = 1, . . . , t − 1

The sequence {X_k} is called a filtration. For the current problem, define a sequence {Y_k(t)} for k = 1, 2, . . . , t:

X_k = W_{t−k},    Y_k(t) ≜ E[δ_t | W_{t−1}, W_{t−2}, . . . , W_{t−k}]
SLIDE 21

Key Ingredients of the Proof Constructing a martingale

Verifying that {Yk(t)} is a martingale w.r.t. {Wt−k}

Property 1: E[|Y_k(t)|] < ∞. Follows from the fact that |Y_k(t)| is upper-bounded by 2‖x_o‖_∞.

Property 2: E[Y_{k+1}(t) | W_{t−1}, W_{t−2}, · · · , W_{t−k}] = Y_k(t). Follows from the definition and the tower property of conditional expectation:

E[Y_{k+1}(t) | W_{t−1}, · · · , W_{t−k}]
  = E[ E[δ_t | W_{t−1}, · · · , W_{t−k−1}] | W_{t−1}, · · · , W_{t−k} ]
  = E[δ_t | W_{t−1}, · · · , W_{t−k}] = Y_k(t).

SLIDE 22

Key Ingredients of the Proof Bounding martingale differences

Bounding martingale differences

‖Y_{k+1}(t) − Y_k(t)‖_∞
  = ‖n^{−1} 1^* W_{t−1} · · · W_{t−k+1} (W̄ − W_{t−k}) W̄^{t−k} x_o‖_∞
  ≤ ‖n^{−1} 1^*‖_∞ (∏_{l=1}^{k−1} ‖W_{t−l}‖_∞) ‖(W̄ − W_{t−k}) W̄^{t−k}‖_∞ ‖x_o‖_∞
  ≤ ‖(W̄ − W_{t−k}) W̄^{t−k}‖_∞ ‖x_o‖_∞
  ≤ C µ^{t−k} ‖x_o‖_∞,

where the last step writes W̄^{t−k} = n^{−1} 1 1^* + O(µ^{t−k}) and uses (W̄ − W_{t−k})(n^{−1} 1 1^*) = 0, since both matrices are stochastic.

Key Ideas
1. The ℓ∞-norm is sub-multiplicative
2. W̄ and W_{t−k} are stochastic
3. The spectral gap of W̄ is positive

SLIDE 26

Key Ingredients of the Proof Using the Azuma-Hoeffding Inequality

Using the Azuma-Hoeffding Inequality

Theorem (Chap. 12, Mitzenmacher and Upfal ’05). Suppose {Y_k : k = 0, 1, . . .} is a martingale and |Y_{k+1} − Y_k| < c_k almost surely for all k. Then, for all positive integers t and ε > 0,

P(|Y_t − Y_0| ≥ ε) ≤ 2 exp( −ε² / (2 Σ_{k=1}^{t} c_k²) ).

For our martingale: Y_t(t) = δ_t and Y_0(t) = E[δ_t] = 0, with

‖Y_{k+1}(t) − Y_k(t)‖_∞ ≤ C µ^{t−k} ‖x_o‖_∞ ≜ c_k(t).

Observing that 2 Σ_k c_k²(t) = β_t^{−1} and using the theorem:

P(|δ_t| ≥ ε) ≤ 2 exp(−ε² β_t)
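The identity 2 Σ c_k² = β_t^{−1} can be checked numerically. A sketch, assuming c_k(t) = C µ^{t−k} ‖x_o‖_∞ summed over k = 1, . . . , t (which matches the geometric-series closed form of β_t); the parameter values are illustrative:

```python
import math

mu, C, x_inf, t = 0.8, 1.5, 2.0, 25

# Martingale difference bounds c_k(t) = C mu^{t-k} ||x_o||_inf, k = 1, ..., t
cs = [C * mu**(t - k) * x_inf for k in range(1, t + 1)]

azuma_denominator = 2 * sum(c * c for c in cs)           # 2 sum c_k^2
beta_t = (1 - mu**2) / (2 * C**2 * x_inf**2 * (1 - mu**(2 * t)))

# The two quantities are reciprocal: 2 sum c_k^2 = 1 / beta_t
print(azuma_denominator * beta_t)  # = 1.0 up to rounding
```

This is just the geometric series Σ_{k=1}^{t} µ^{2(t−k)} = (1 − µ^{2t})/(1 − µ²) in disguise.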

SLIDE 27

Key Ingredients of the Proof Using the Azuma-Hoeffding Inequality

Using Concentration Inequalities to Bound Higher Moments of δt

Theorem (Chap. 13, Folland 1999). If X is a positive random variable, its rth moment is given by

E[X^r] = ∫ |X|^r dµ ≡ r ∫_0^∞ u^{r−1} γ(u) du,  where γ(u) ≜ P(X ≥ u).

If γ(u) ≤ λ(u) for all u, then E[X^r] ≤ r ∫_0^∞ u^{r−1} λ(u) du.
⇒ A tail bound therefore bounds every moment of the random variable.
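For instance, plugging a Gaussian-type tail λ(u) = 2 exp(−β u²) into this formula and integrating numerically recovers the closed-form second-moment bound 2/β. This is an illustrative sketch: the value of β, the truncation point, and the midpoint-rule discretization are assumptions.

```python
import math

def moment_bound(r, lam, hi=20.0, n=100000):
    # E[X^r] <= r * integral_0^inf u^{r-1} lam(u) du,
    # midpoint rule, truncated at `hi`
    h = hi / n
    total = sum(((i - 0.5) * h) ** (r - 1) * lam((i - 0.5) * h)
                for i in range(1, n + 1))
    return r * total * h

beta = 2.0
b2 = moment_bound(2, lambda u: 2 * math.exp(-beta * u * u))
print(b2)  # close to 2 / beta = 1.0
```

For r = 2 the integral 2 ∫_0^∞ u · 2e^{−βu²} du evaluates to 2/β exactly, so the numerical value serves as a consistency check.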

SLIDE 28

Key Ingredients of the Proof Using the Azuma-Hoeffding Inequality

Asymptotic Distribution of δt

The results are valid for all t ≥ 0. Since lim_{t→∞} x(t) = 1α almost surely,

lim_{t→∞} P(|δ_t| ≥ ε) ≤ lim_{t→∞} 2 exp(−ε² β_t)

holds almost surely. But lim_{t→∞} δ_t = α − x_av(0) almost surely, and β_∞ ≜ lim_{t→∞} β_t = (1 − µ²)(2 C² ‖x_o‖²_∞)^{−1}. Therefore

P(|α − x_av(0)| ≥ ε) ≤ 2 exp(−ε² β_∞)

SLIDE 29

Applications Consensus over Fading Channels

System Model

n consensus-seeking nodes, labelled V = {1, 2, . . . , n}, located at {r_1, r_2, . . . , r_n}. Node i initially holds a value x_i, i = 1, 2, . . . , n. Edges in the interconnection topology are established by wireless links, with node i transmitting with power P_i. Synchronous state update after all nodes transmit once.

SLIDE 30

Applications Consensus over Fading Channels

Communication Model

Block-fading Rayleigh channels, independent across iterations. Medium access using a Time Division Multiple Access (TDMA) protocol. Link Failure Model:

  - Node i can successfully receive a message from node j iff the Signal-to-Noise Ratio (SNR_ji) exceeds a known threshold Θ
  - A failed link is said to be in outage

SLIDE 34

Applications Consensus over Fading Channels

Randomness in information exchange

Fading causes random link outages. Information exchange is specified by a random digraph over n nodes, with each edge (j, i) present with probability

p_ji = exp( −Θ N_0 ‖r_j − r_i‖^α / P_j ),

where N_0 is the noise variance and α ≥ 2 is the path loss exponent. Each L_ij is a Bernoulli random variable with parameter p_ji. Therefore W̄ = I − h L̄, where

L̄_ij = −p_ji for i ≠ j,   L̄_ii = Σ_{j≠i} p_ji.
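A sketch of how W̄ could be assembled from node positions under this link model. The positions, transmit powers, and parameter values below are illustrative assumptions, not from the slides.

```python
import numpy as np

def mean_update_matrix(pos, power, theta, n0, alpha=3.0, h=0.05):
    """W_bar = I - h L_bar for Rayleigh-fading links,
    with p_ji = exp(-theta * n0 * d^alpha / P_j)."""
    n = len(pos)
    P = np.zeros((n, n))                 # P[i, j] = p_ji, success prob of link j -> i
    for i in range(n):
        for j in range(n):
            if i != j:
                d = np.linalg.norm(pos[i] - pos[j])
                P[i, j] = np.exp(-theta * n0 * d**alpha / power[j])
    L_bar = np.diag(P.sum(axis=1)) - P   # L_bar_ii = sum_{j != i} p_ji, L_bar_ij = -p_ji
    return np.eye(n) - h * L_bar

pos = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])   # example node locations
W_bar = mean_update_matrix(pos, power=[1.0, 1.0, 1.0], theta=0.5, n0=0.1)
# Rows of W_bar sum to 1 by construction (stochastic), diagonal stays positive
# for small enough h.
```

The eigenvalues of this W̄ determine µ, and hence the concentration bound, for the fading-channel application.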

SLIDE 38

Applications Consensus over Fading Channels

Example

n = 2 nodes, Link failures p01 = 0.2, p10 = 0.8, xo = 0.4

[Figure: asymptotic distribution of α about the average consensus point — bound vs. simulation results; x-axis: distance ε from the average consensus point, y-axis: probability.]

SLIDE 39

Conclusions

Conclusions

A framework was developed to study the probabilistic state evolution of the average consensus algorithm. Some applications:

  - Concentration bounds for the state in randomized average consensus algorithms
  - Distribution of products of random stochastic matrices

SLIDE 40

Future Work

Future Work

  - Obtaining sharper concentration bounds on the instantaneous state average for consensus in wireless networks
  - Extending the convergence results to non-i.i.d. update matrices
  - Applying the techniques developed to the performance analysis of distributed randomized algorithms
