Data Collection Infrastructure for Location-Unaware Sensor Networks
Distributed coding protocols for data storage
Silvija Kokalj-Filipovic, Predrag Spasojevic, Roy Yates
Talk Outline
- Data Collection from a Location-Unaware Wireless Sensor Network
– Network nodes self-organize into a web-like infrastructure of routes
– Network data is encoded and stored along circular infrastructure routes using a distributed coding protocol
– A Mobile Data Collector arrives at a random point of the network perimeter
– It connects to the closest node of each circular route and collects encoded data from the nodes within its immediate neighborhood
– Up-front collection from the neighborhood, combined with selectively polling distant nodes for symbols that unlock the decoding process, is an energy-efficient solution that allows for full decoding
– The data collection is completed when the collector decodes all network data
Sensor Network Example
location-unaware sensor nodes randomly scattered in a plane
Isotropic wireless propagation
Data Dissemination
advertising along source spokes
increases the likelihood of information discovery
avoiding flooding-based data publishing
no redundant transmissions (broadcast storm)
[Figure: a source node with its advertising spokes]
Simulated Dissemination Scenario
[Figure: simulated dissemination over the sensor field, with radii $R_1$ and $R_2$ marked]
Infrastructure
building modeling
coding for distributed data storage
decoding strategies for data collection
Isometric Routes Data Collector
- Light Isometric Networks
- Heavy Isometric Networks
- infrastructure developed as a side effect of search for specific data items
– use for storage of network data, through network-coding based methods
- inspired by the current work on network coding for storage in WSN
Isometric Networks
[Figure: isometric networks, with radii $R_1$ and $R_2$ marked]
network partitioned into sub-networks that are customized to handle network storage task according to the number of associated sources
$E[k_i] = \pi \lambda_s r^2 (2i - 1)$
$R_i = i r$
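One way to read the expression for $E[k_i]$ is as a ring-area count. The sketch below assumes, beyond what the slide states, that sources are spread uniformly with density $\lambda_s$ and that the i-th isometric network occupies the ring between radii $R_{i-1}$ and $R_i$:

```latex
\begin{align*}
A_i &= \pi R_i^2 - \pi R_{i-1}^2
     = \pi r^2 \left( i^2 - (i-1)^2 \right)
     = \pi r^2 (2i - 1), \\
E[k_i] &= \lambda_s A_i = \pi \lambda_s r^2 (2i - 1).
\end{align*}
```

Under these assumptions the expected number of sources grows linearly with the ring index i, so outer networks carry more sources than inner ones.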
Data Collector wants a snapshot of network data
Current Work in Network Coding for Data Storage in WSN
Two basic approaches:
– Decentralized erasure codes (1)
- encode k symbols into codewords of length n, which can be decoded from any subset of k symbols within the codeword
- Decoding complexity: $O(k^3)$
– Decentralized fountain codes (2)
- potentially infinitely many codewords (linear combinations of k data blocks); can be decoded from any k independent combinations
- Decoding complexity: almost linear in k
- Abstract (1) or overly expensive (2) random routing techniques
- We propose a structured approach to decrease cost
(1) Dimakis, Prabhakaran, Ramchandran. Decentralized erasure codes for distributed networked storage ('05/6)
(2) Lin, Liang, Li. Data persistence in large-scale sensor networks with decentralized fountain codes ('07)
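Decentralized Erasure Codes
k data nodes: $X_1, X_2, X_3$
n storage nodes: $Y_1 = f_1 X_1 + f_2 X_2$, $Y_2 = f_3 X_2$, $Y_3 = f_4 X_1 + f_5 X_3$, $Y_4 = f_6 X_3$
$$[Y_1\ Y_2\ Y_3\ Y_4] = [X_1\ X_2\ X_3]\, G, \qquad G = \begin{bmatrix} f_1 & 0 & f_4 & 0 \\ f_2 & f_3 & 0 & 0 \\ 0 & 0 & f_5 & f_6 \end{bmatrix}$$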
Want the matrix G as sparse as possible (this decreases dissemination cost)
Now assume only storage nodes 1-3 are queried
To reconstruct the data, it suffices for the queried columns of G to be full rank
- K. Ramchandran, "Making dense sensor networks smarter using randomized in-network processing," NSF workshop "Future Directions in Networked Sensing," May 2006
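The three-source, four-node example can be tried numerically. The following is a minimal sketch, assuming arithmetic over a small prime field; the prime `P`, the data values and the coefficient values standing in for f1..f6 are illustrative choices, not taken from the slides.

```python
# Toy decentralized erasure code over the prime field GF(P).
# P, the data x and the coefficients f1..f6 are illustrative assumptions.
P = 257

def solve_mod_p(A, b, p=P):
    """Solve A z = b over GF(p) by Gauss-Jordan elimination.
    Returns None when the square matrix A is singular."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] % p), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        inv = pow(M[col][col], -1, p)  # modular inverse (Python 3.8+)
        M[col] = [(v * inv) % p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                factor = M[r][col]
                M[r] = [(a - factor * c) % p for a, c in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

f1, f2, f3, f4, f5, f6 = 3, 5, 7, 11, 13, 17
G = [[f1, 0, f4, 0],    # row i: where source X_i is stored
     [f2, f3, 0, 0],
     [0, 0, f5, f6]]    # column j: what storage node Y_j holds
x = [10, 20, 30]
k, n = len(G), len(G[0])

# Encoding: y = xG, one coded symbol per storage node.
y = [sum(x[i] * G[i][j] for i in range(k)) % P for j in range(n)]

def decode(queried):
    """Recover x from the k queried storage nodes (0-based column indices)."""
    A = [[G[i][j] for i in range(k)] for j in queried]
    b = [y[j] for j in queried]
    return solve_mod_p(A, b)

recovered = decode([0, 1, 2])  # query storage nodes 1-3
```

Decoding succeeds exactly when the queried columns form a full-rank matrix; `solve_mod_p` returns `None` otherwise.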
Basic Coding Approach: Random Linear Coding
[Figure: sparse k × n generator matrix G, with rows indexed by the sources 1..k and columns by the storage nodes 1..n]
$xG = y$
Both approaches:
a certain number of packet replicas are randomly diffused from independent sources and stored at random nodes (matrix rows)
Node i holds a codeword of degree d, equal to the number of non-zero entries in column i
Decentralized Fountain: how do you build a code if your data is not in one place?
imposes a probability distribution on codeword degrees
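The relation between the generator matrix, the coded symbols and the codeword degrees can be sketched in a few lines. This is an illustration only: it works over GF(2) (XOR mixing), and the function names, the sizes and the uniform degree rule are assumptions rather than the protocol from the talk.

```python
import random

def random_sparse_generator(k, n, degree_of, rng):
    """Build a k x n binary matrix G column by column.
    Column j (storage node j) mixes degree_of(rng) distinct sources."""
    G = [[0] * n for _ in range(k)]
    for j in range(n):
        d = min(k, max(1, degree_of(rng)))
        for i in rng.sample(range(k), d):
            G[i][j] = 1
    return G

def encode(x, G):
    """y = xG over GF(2): y_j is the XOR of the sources selected by column j."""
    k, n = len(G), len(G[0])
    y = []
    for j in range(n):
        acc = 0
        for i in range(k):
            if G[i][j]:
                acc ^= x[i]
        y.append(acc)
    return y

def column_degrees(G):
    """Degree of node j = number of non-zero entries in column j."""
    return [sum(G[i][j] for i in range(len(G))) for j in range(len(G[0]))]

rng = random.Random(0)
k, n = 6, 10
# Hypothetical degree rule: uniform on {1, 2, 3}.
G = random_sparse_generator(k, n, lambda r: r.randint(1, 3), rng)
x = [rng.getrandbits(8) for _ in range(k)]
y = encode(x, G)
degrees = column_degrees(G)
```

Swapping the `degree_of` callback is where a fountain-style protocol would impose its degree distribution.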
- Light Isometric Networks:
– Random Linear Codes
- Decoding complexity for L light networks: $\sum_{i=1}^{L} k_i^3$
- Heavy Isometric Networks:
– Decentralized Fountain Codes
- Decoding complexity for H heavy networks: $\sum_{i=1}^{H} k_i$
Isometric Networks
[Figure: isometric networks, with radii $R_1$ and $R_2$ marked]
network coding selected according to the number of associated sources
$R_i = i r$
$\sum_{i=1}^{L} k_i^3 < \left( \sum_{i=1}^{L} k_i \right)^3$
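The gain from decoding each light network separately can be checked with a quick calculation. The source counts below are hypothetical and serve only to show the size of the gap between the sum of cubes and the cube of the sum.

```python
def decoding_costs(ks):
    """Return (partitioned, monolithic) costs for cubic-complexity decoding.

    partitioned: each network i is decoded on its own, costing k_i^3.
    monolithic:  all sources are decoded as one code, costing (sum k_i)^3.
    """
    partitioned = sum(k ** 3 for k in ks)
    monolithic = sum(ks) ** 3
    return partitioned, monolithic

ks = [10, 30, 50]  # hypothetical source counts for L = 3 light networks
partitioned, monolithic = decoding_costs(ks)
# 10^3 + 30^3 + 50^3 = 153000, while (10 + 30 + 50)^3 = 729000
```

For positive counts and at least two networks the inequality is strict, since expanding the cube of the sum adds only positive cross terms.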
Dissemination and Storage
Relaying and Overhearing/Combining
Let us assume that
- there is one source per relay (i.e. per squad)
- number of sources (relays) n larger than squad size h
Sources associate with one of the closest relays
Relays disseminate (mix data)
Squad nodes overhear, combine and store data
Storage squad: set of nodes in the range of a relay
Storage and Dissemination Graphs
Mixing over a circular graph with $n_i$ nodes, each of degree 2
Mixing time $O(i^2)$
$n_i = \frac{2 \pi i r}{r_s} \approx 4 \pi i$
- Circular dissemination with network coding in the context of the "wireless multicast advantage"
- Apply network coding for storage in squad nodes that overhear 2 relays
[Figure: circular route of $n_i$ relays, labeled 1, 2, 3, ...]
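The relay count on the i-th circular route can be read as a circumference divided by a relay spacing. The sketch below assumes, beyond what the slide states, that neighboring relays sit about one range $r_s$ apart along a route of radius $R_i = i r$:

```latex
\begin{align*}
n_i &= \frac{2 \pi R_i}{r_s} = \frac{2 \pi i r}{r_s}, \\
n_i &\approx 4 \pi i \quad \text{when } r \approx 2 r_s .
\end{align*}
```

Under this reading the approximation on the slide corresponds to routes spaced roughly two relay ranges apart, and a mixing time that is quadratic in the number of ring nodes is then also quadratic in i.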
Super-Squads: sets of adjacent squads
high energy cost of data dissemination
Mixing time $i^2/2$
Super-squad
Storage in Isometric Networks
MDC collects from SUPER-SQUADS
Assumption: a mobile data collector (MDC) will establish a connection with a random relay
Goal: to have all data of the isometric network available in the vicinity of the selected relay
large number of sources: storage with linear decoding complexity needed
h squad nodes, $h = O(r_s^2)$, $h < n$
[Figure: sparse coding matrix with n rows (sources) and hn columns (storage nodes)]
Storage Protocol controls degree distribution of codewords
COLLECTION
Random Matrix created by Storage Protocol
Or collect a large number where n independent equations exist with high probability?
Collection Strategies
Is this matrix invertible?
A large collection that guarantees decoding (whp) costs a lot: collection energy constraint
Up-front collecting: collect a small super-squad of code symbols locally
fits the energy budget, but is insufficient
On-demand collecting: collect selected code symbols
likely to cost more per symbol, but few of them are needed during the decoding process
TRADE-OFF in collection strategy
Efficient Collection Strategy: Push-Pull Model
Push:
The closest super-squad of size s sends coded packets
- Enough to decode partially (belief propagation decoder)
Pull:
Query for d code symbols which can continue belief-propagation decoding (decoder doping)
[Figure: decoding matrix over n sources with the s pushed symbols marked; the decoder is STUCK before doping and UNSTUCK after]
Doping Cost: $\frac{s + d - n}{n}$
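The push-pull loop can be sketched as a peeling decoder that pulls one symbol whenever it stalls. This is a minimal sketch, not the authors' protocol: packets are XOR combinations, the pull is modeled as fetching one uniformly chosen undecoded source symbol through a hypothetical `fetch_source` callback, and all names and sizes are illustrative.

```python
import random

def peel_with_doping(n, packets, fetch_source, rng):
    """Peeling (belief-propagation) decoder over XOR-coded packets.

    packets: list of (set_of_source_indices, xor_value) pairs (the push).
    fetch_source(i): hypothetical pull returning source symbol i (the doping).
    Returns (decoded symbols, number of doping pulls d).
    """
    residual = [[set(idx), val] for idx, val in packets]
    decoded = [None] * n
    remaining = n
    pulls = 0
    ripple = []  # (index, value) pairs that are known but not yet substituted

    def release():
        # Move every degree-1 packet into the ripple.
        for pkt in residual:
            if len(pkt[0]) == 1:
                (i,) = pkt[0]
                ripple.append((i, pkt[1]))
                pkt[0].clear()

    release()
    while remaining:
        if not ripple:
            # Stuck: dope the decoder with one undecoded source symbol.
            i = rng.choice([j for j in range(n) if decoded[j] is None])
            ripple.append((i, fetch_source(i)))
            pulls += 1
        i, v = ripple.pop()
        if decoded[i] is not None:
            continue  # duplicate release of an already decoded symbol
        decoded[i] = v
        remaining -= 1
        for pkt in residual:
            if i in pkt[0]:
                pkt[0].discard(i)
                pkt[1] ^= v
        release()
    return decoded, pulls

rng = random.Random(1)
n = 8
x = [rng.getrandbits(8) for _ in range(n)]
packets = []
for _ in range(6):  # s = 6 pushed packets, fewer than n on purpose
    idx = set(rng.sample(range(n), rng.randint(1, 3)))
    val = 0
    for i in idx:
        val ^= x[i]
    packets.append((idx, val))
decoded, d = peel_with_doping(n, packets, lambda i: x[i], rng)
```

Because each pushed packet can release at most one symbol, the number of pulls in this model is at least n - s; how far above that bound it lands depends on the degree distribution of the pushed packets.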
[Plot: percentage of doping symbols (min, mean, max) versus n, the number of symbols to decode, for n from 500 to 4000; curves for Ideal Soliton (I) and Robust Soliton (R); desired undecoded rate 0.01; RS constant 0.10]
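The two degree distributions compared in the plot can be written down directly from their standard definitions (Luby's LT codes). In the sketch below, `c = 0.1` is chosen on the assumption that the "RS constant" refers to Luby's constant c, and `delta = 0.5` is a purely illustrative failure-probability parameter that does not come from the slides.

```python
import math

def ideal_soliton(k):
    """Ideal Soliton: rho[1] = 1/k, rho[d] = 1/(d(d-1)) for d = 2..k.
    Index 0 is unused and left at 0."""
    rho = [0.0] * (k + 1)
    rho[1] = 1.0 / k
    for d in range(2, k + 1):
        rho[d] = 1.0 / (d * (d - 1))
    return rho

def robust_soliton(k, c, delta):
    """Robust Soliton mu[d] for d = 0..k: Ideal Soliton plus the
    correction tau, renormalized so that the weights sum to 1."""
    R = c * math.log(k / delta) * math.sqrt(k)
    spike = min(max(int(round(k / R)), 1), k)  # position of the spike, k/R
    tau = [0.0] * (k + 1)
    for d in range(1, spike):
        tau[d] = R / (d * k)
    tau[spike] = R * math.log(R / delta) / k
    rho = ideal_soliton(k)
    beta = sum(rho) + sum(tau)
    return [(r + t) / beta for r, t in zip(rho, tau)]

k = 1000
ideal = ideal_soliton(k)
robust = robust_soliton(k, c=0.1, delta=0.5)  # parameter values are assumptions
```

A degree can then be drawn with `random.choices(range(k + 1), weights=robust)`. The Robust Soliton places more weight on degree 1 than the Ideal Soliton does, which enlarges the decoder's ripple and is one reason the two can stall at different rates.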
How to select degree distribution to collect efficiently?
Random Fountain Encoding of Network Data
Doping Cost Depends on Degree Distribution: Robust Soliton vs. Ideal Soliton
How many coded packets do I need to pull to decode all data?
[Plot: doping percentage, DP (min, mean, max), versus n, the number of symbols to decode, for n from 500 to 4000; curves for fountain coding with degree-2 doping (D) and uniform doping (U)]
Random-access Push-Pull Data Collection and Decoding
Doping Cost Depends on Doping Mechanism: random packet pull vs. "smart" packet pull