Distributed Fusion in Sensor Networks - PowerPoint PPT Presentation



SLIDE 1

Distributed Fusion in Sensor Networks

Jie Gao

Computer Science Department Stony Brook University

SLIDE 2

Papers

  • [Xiao04] Lin Xiao, Stephen Boyd, Fast Linear Iterations for Distributed Averaging, Systems and Control Letters, 2004.

  • [Xiao05] Lin Xiao, Stephen Boyd and Sanjay Lall, A Scheme for Robust Distributed Sensor Fusion Based on Average Consensus, IPSN'05, 2005.

  • [Boyd05] S. Boyd, A. Ghosh, B. Prabhakar, D. Shah, Gossip Algorithms: Design, Analysis and Applications, INFOCOM'05.

  • Acknowledgement: many slides/figures are borrowed from Lin Xiao.

SLIDE 3

How to diffuse information?

  • One node has a piece of information that it wants to send to everyone.

– Flood, multi-cast.

  • Every node has a piece of information that it wants to send to everyone.

– Multi-round flooding.

  • How do we diffuse information in real life?

Gossip.

SLIDE 4

Uniform gossip

  • Each node x randomly picks another node y and sends to y all the information x has.

  • After O(log n) rounds, every node has all the information with high probability.

  • Totally distributed.
  • Isotropic protocol.
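The O(log n) spread can be checked with a small simulation (a sketch added here, not from the slides; `uniform_gossip` and its bookkeeping are made up for illustration):

```python
import random

def uniform_gossip(n, seed=0):
    """Simulate uniform gossip: every round, each node x picks a uniformly
    random node y and sends y everything x currently knows.  Returns the
    number of rounds until every node knows all n pieces of information."""
    random.seed(seed)
    # knows[i] is the set of original items node i has heard so far
    knows = [{i} for i in range(n)]
    rounds = 0
    while any(len(k) < n for k in knows):
        new = [set(k) for k in knows]
        for x in range(n):
            y = random.randrange(n)   # x picks a random target y
            new[y] |= knows[x]        # x pushes all its information to y
        knows = new
        rounds += 1
    return rounds

r = uniform_gossip(64)
# completion takes on the order of log(64) rounds, not 64
```

Each item can at most double its holder count per round, so at least log2(n) rounds are needed; the simulation finishes within a small constant factor of that.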

SLIDE 5

Other applications

  • Load balancing:

– N machines with different work load.
– Goal: balance the load.

  • Diffusion-based load balancing

– each machine randomly picks another machine y and shifts part of its extra load, if any, to y.

  • Good for the case when the work load of a job is unknown until it starts.
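A minimal sketch of the diffusion scheme just described, under the simplifying assumption that a machine shifts half of its excess to the chosen peer (`diffusion_balance` is a hypothetical name):

```python
import random

def diffusion_balance(loads, rounds=200, seed=0):
    """Diffusion load balancing sketch: each machine picks a random peer
    and, if it carries more load, shifts half of the excess to that peer.
    Half the excess means the pair ends up with equal load."""
    random.seed(seed)
    loads = list(loads)
    n = len(loads)
    for _ in range(rounds):
        for x in range(n):
            y = random.randrange(n)
            if loads[x] > loads[y]:
                excess = (loads[x] - loads[y]) / 2.0
                loads[x] -= excess      # total load is conserved
                loads[y] += excess
    return loads

balanced = diffusion_balance([100.0, 0.0, 0.0, 0.0])
# values approach the mean (25 each) while the total stays 100
```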

SLIDE 6

Use distributed diffusion for computing

SLIDE 7

Parameter estimation

  • We want to fit a model to the sensor data.
  • E.g., linear fitting.

SLIDE 8

Maximum likelihood estimation

SLIDE 9

Example: target localization

SLIDE 10

How to estimate θ?

  • Gather all the information and run the centralized maximum likelihood estimate.

  • Or,
  • Use a distributed fusion algorithm:

– Each sensor exchanges data with its neighbors and carries out local computation, e.g., a least-square estimate.
– Eventually each sensor obtains a good estimate.

  • Advantages:

– Completely distributed.
– Robust to link dynamics; only requires a mild assumption on the network connectivity.
– No assumption on routing protocol or any global info.

SLIDE 11

Distributed average consensus

  • Let’s start with a simple task.
  • Goal: compute the average of the sensor readings by a distributed iterative algorithm.

  • Assume sensors are synchronized. x_i(t) is the value of sensor i at time t.

SLIDE 12

Algorithm

SLIDE 13

Analysis

  • Write the algorithm in a matrix form.
  • W: the weighted adjacency matrix, of size n by n. The value at position (i, j) is W_ij.

  • x(t): the sensor values at time t, a vector of size n.
  • We know: x(t+1) = W x(t).
  • Inductively, x(t) = W^t x(0).
  • We hope the iterative algorithm converges to the correct average.
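The iteration x(t+1) = W x(t) can be sketched as follows. The Metropolis weight rule used here to build a doubly stochastic W is one standard choice (weight selection is discussed later in the slides); the helper names and the 5-node path graph are made up for illustration:

```python
import numpy as np

def metropolis_weights(adj):
    """Build a doubly stochastic weight matrix from an adjacency matrix
    using Metropolis weights: W[i,j] = 1/(1+max(d_i,d_j)) for neighbors,
    with the self-weight W[i,i] chosen so each row sums to 1."""
    n = len(adj)
    deg = adj.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if adj[i, j]:
                W[i, j] = 1.0 / (1 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()   # rows sum to 1; symmetry gives columns too
    return W

# A 5-node path graph: 0-1-2-3-4
adj = np.zeros((5, 5), dtype=int)
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1

W = metropolis_weights(adj)
x = np.array([10.0, 0.0, 0.0, 0.0, 0.0])   # initial sensor readings
for _ in range(200):
    x = W @ x                              # x(t+1) = W x(t)
# every entry converges to the average, 2.0
```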

SLIDE 14

Performance

  • Questions:

– Does this algorithm converge?
– How fast does it converge?
– How to choose the weights so that the algorithm converges quickly?

SLIDE 15

Convergence condition: intuition

  • The vector (1, 1, …, 1) is a fixed point.
  • Each row of W sums up to 1.

SLIDE 16

Convergence condition: intuition

  • Think of the values as money. The total money in the system should stay the same.
  • Mass conservation.
  • Each column of W sums up to 1.

SLIDE 17

Doubly stochastic matrix

  • W must be a doubly stochastic matrix: all the rows sum up to 1, and all the columns sum up to 1.

SLIDE 18

Convergence condition: intuition

  • The algorithm should converge to the average.
  • Write the average in matrix form.
  • Average vector: (11^T/n) x(0).
  • We want W^t x(0) → (11^T/n) x(0), i.e., W^t → 11^T/n, as t → ∞.

Here 11^T/n is the n by n matrix with every entry equal to 1/n.
SLIDE 19

Convergence condition

  • Theorem: W^t → 11^T/n as t → ∞ if and only if W is a doubly stochastic matrix and the spectral radius of (W − 11^T/n) is less than 1.

SLIDE 20

A detour on matrix theory

SLIDE 21

Matrix, eigenvalues, eigenvectors

  • An n by n matrix A.
  • Eigenvalues: λ1, λ2, …, λn (real numbers when A is symmetric).
  • Corresponding eigenvectors: v1, v2, …, vn (non-zero vectors of size n).
  • A vi = λi vi.
  • A^2 vi = A(A vi) = A(λi vi) = λi (A vi) = λi^2 vi.
  • Inductively, A^k vi = λi^k vi.

SLIDE 22

Spectral radius

  • Spectral radius of A: ρ(A) = max_i |λi|.
  • Theorem: A^k → 0 as k → ∞ if and only if ρ(A) < 1.
  • Proof: (⇒) Suppose A^k → 0, and let λ be an eigenvalue with |λ| = ρ(A) and eigenvector v.
  • 0 = (lim A^k) v = lim A^k v = lim λ^k v = (lim λ^k) v.
  • Since v is non-zero, lim λ^k = 0. This shows ρ(A) < 1.
  • (⇐) This direction uses the Jordan normal form.
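A quick numerical check of this theorem on the kind of matrix the slides care about, A = W − 11^T/n. The specific W below is an arbitrary doubly stochastic example chosen for the sketch:

```python
import numpy as np

# A simple symmetric doubly stochastic W: 0.5 on the diagonal,
# the remaining 0.5 of each row spread evenly over the other entries.
n = 4
W = np.full((n, n), 0.5 / (n - 1))
np.fill_diagonal(W, 0.5)

A = W - np.ones((n, n)) / n                # A = W - 11^T/n
rho = max(abs(np.linalg.eigvals(A)))       # spectral radius rho(A)
Ak = np.linalg.matrix_power(A, 50)         # A^k for large k
# rho(A) < 1, and A^50 is numerically zero, as the theorem predicts
```

For this W the eigenvalues of A are 0 and 1/3, so ρ(A) = 1/3 and A^k shrinks geometrically.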

SLIDE 23

Back to distributed diffusion

SLIDE 24

Convergence condition

  • Theorem: W^t → 11^T/n as t → ∞ if and only if W is a doubly stochastic matrix and the spectral radius of (W − 11^T/n) is less than 1.

SLIDE 25

Proof of the convergence condition

  • Sufficiency: if W is a doubly stochastic matrix and ρ(W − 11^T/n) < 1, then W^t → 11^T/n.
  • Proof:

1. W is doubly stochastic, thus W(11^T/n) = (11^T/n)W = 11^T/n.
2. Now we have W^t − 11^T/n = (W − 11^T/n)^t.
3. Since ρ(W − 11^T/n) < 1, (W − 11^T/n)^t → 0, hence W^t → 11^T/n.

SLIDE 26

Convergence rate

The spectral radius ρ(W − 11^T/n) governs the convergence rate: the smaller, the better.

SLIDE 27

Fastest iterative algorithm?

  • Given a graph, find the weight function such that the iterative algorithm converges fastest.

  • Theorem (Xiao & Boyd 04): When the matrix W is symmetric, the above optimization problem can be formulated as a semi-definite program and solved efficiently.
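The full SDP needs a dedicated solver. As a hedged sketch, the simpler best-constant-edge-weight subproblem from [Xiao04] has a closed form: W = I − αL with α* = 2/(λ2(L) + λn(L)), where L is the graph Laplacian. The example graph below is my own choice:

```python
import numpy as np

# Example graph: K4 minus one edge (nodes 0 and 3 not adjacent).
adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 1],
                [1, 1, 0, 1],
                [0, 1, 1, 0]], dtype=float)

L = np.diag(adj.sum(axis=1)) - adj          # graph Laplacian
lam = np.sort(np.linalg.eigvalsh(L))        # 0 = lam[0] <= lam[1] <= ...
alpha = 2.0 / (lam[1] + lam[-1])            # best constant edge weight
W = np.eye(4) - alpha * L                   # symmetric, doubly stochastic

rho = max(abs(np.linalg.eigvals(W - np.ones((4, 4)) / 4)))
# rho < 1, so x(t+1) = W x(t) converges to the average
```

For this graph the Laplacian eigenvalues are {0, 2, 4, 4}, so α* = 1/3 and ρ(W − 11^T/n) = 1/3.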

SLIDE 28

Choosing the weight

SLIDE 29

Example: weight selection

SLIDE 30

Extension to changing topologies

SLIDE 31

Changing topologies

  • The sensor network topology changes over time.

– Link failure.
– Mobility.
– Power constraints.
– Channel fading.

  • However, the distributed fusion algorithm only assumes a mild condition on network connectivity: the network is “connected in the long run”.

SLIDE 32

Changing topologies

  • The communication graph G(t) is time-varying.
  • For n nodes, there are only finitely many communication graphs, and finitely many weight functions.
  • There is a subset of graphs that appear infinitely many times.
  • If the collection of graphs that appear infinitely many times is jointly connected, then the algorithm converges.
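A sketch of this robustness: alternate between two graphs that are each disconnected but jointly connected; averaging still converges. The Metropolis weights and the 3-node example are my own choices, not from the slides:

```python
import numpy as np

def metropolis(adj):
    """Doubly stochastic Metropolis weights for one communication graph."""
    n = len(adj)
    deg = adj.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if adj[i, j]:
                W[i, j] = 1.0 / (1 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()
    return W

# Two graphs on 3 nodes, each disconnected, but jointly connected:
# G1 has only edge 0-1, G2 has only edge 1-2.
G1 = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]])
G2 = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0]])

x = np.array([9.0, 0.0, 0.0])
for t in range(400):
    W = metropolis(G1 if t % 2 == 0 else G2)  # topology changes each step
    x = W @ x
# all entries approach the average, 3.0, even though no single graph
# in the sequence is connected
```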

SLIDE 33

Changing topologies

  • We emphasize that this is a very mild condition on connectivity.

  • Many links can fail permanently.
  • We only require that a connected graph “survives” in the sequence of (possibly disconnected) graphs.

SLIDE 34

Choice of weights

SLIDE 35

Robust convergence

  • Intuition: the weight matrix W (for both max-degree and Metropolis weights) is paracontracting.
  • It preserves the fixed-point subspace and contracts all other vectors. Thus if we apply the matrix infinitely many times, the limit has to be a fixed point.

SLIDE 36

Extension to parameter estimation

SLIDE 37

Maximum likelihood estimation

SLIDE 38

Distributed parameter estimation

  • A sensor node i knows
  • Goal: we want to evaluate in a distributed fashion
  • Idea: use the average consensus algorithm.

SLIDE 39

Distributed parameter estimation

SLIDE 40

Distributed parameter estimation

SLIDE 41

Intermediate estimates

SLIDE 42

Properties

SLIDE 43

Simulation

SLIDE 44

A demo

SLIDE 45

A larger example

SLIDE 46

Random gossip model

SLIDE 47

Random gossip

  • Completely asynchronous. No synchronized clock is needed.

  • At each time, a node can only talk to one other node.

  • Distributed average consensus: each node picks one node according to some probability distribution, and the pair computes the average.

  • Natural averaging algorithm: each node picks a neighbor uniformly at random and the pair computes the average.

  • Again, one can find the optimal averaging distribution by convex programming such that the algorithm converges fastest.
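A minimal sketch of pairwise random gossip, here on a complete graph so any pair may talk (the function name and parameters are made up for illustration):

```python
import random

def pairwise_gossip(values, iters=3000, seed=0):
    """Asynchronous gossip sketch: at each tick one random node wakes up,
    picks another node uniformly at random, and the pair replaces both
    values by their average.  Pairwise averaging conserves the sum."""
    random.seed(seed)
    v = list(values)
    n = len(v)
    for _ in range(iters):
        i = random.randrange(n)
        j = random.randrange(n)
        if i != j:
            avg = (v[i] + v[j]) / 2.0
            v[i] = v[j] = avg
    return v

v = pairwise_gossip([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
# the sum (21) is conserved and every value approaches the mean, 3.5
```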

SLIDE 48

Random geometric graphs

  • G_d(n, r): place n nodes uniformly at random in a d-dimensional cube and connect two nodes if they are within distance r.

  • Bad news: the natural averaging algorithm converges at about the same order as the optimal one; both are slow.
  • Good news: no need to optimize. The natural averaging algorithm is a local and distributed algorithm with optimal performance.

SLIDE 49

Internet

  • Preferential attachment model: a newcomer connects an edge to the existing nodes with probability proportional to their degree.

  • “Rich get richer”.
  • The graph obtained is an expander:

– spectral gap is a constant;
– the second largest eigenvalue is small enough;
– random walk mixes fast.

  • The optimal averaging algorithm has an averaging time O(log 1/ε), independent of the graph size.

  • Averaging on P2P network is extremely fast.
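A sketch of the preferential-attachment growth rule just described; the endpoint-list trick makes degree-proportional sampling easy (names are made up for illustration):

```python
import random

def preferential_attachment(n, seed=0):
    """Grow a graph where each newcomer attaches one edge to an existing
    node chosen with probability proportional to its current degree."""
    random.seed(seed)
    # Each node appears in `endpoints` once per incident edge, so a
    # uniform draw from it is degree-proportional ("rich get richer").
    endpoints = [0, 1]          # start from a single edge 0-1
    edges = [(0, 1)]
    for new in range(2, n):
        target = random.choice(endpoints)
        edges.append((new, target))
        endpoints += [new, target]
    return edges

edges = preferential_attachment(100)
# n-1 edges for n nodes: the graph is a tree, and early nodes tend to
# accumulate high degree
```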

SLIDE 50

Summary

  • One of the few examples of an algorithm this robust to topological changes.

  • Many applications to similar problems.
  • Distributed optimization.