Querying Sensor Networks (Sam Madden)


SLIDE 1

Querying Sensor Networks

Sam Madden

SLIDE 2

Sensor Networks

  • Small computers with:
    – Radios
    – Sensing hardware
    – Batteries
  • Remote deployments
    – Long lived
    – 10s, 100s, or 1000s

[Figure: battery pack and smart sensor, aka “Mote”]

SLIDE 3

Motes

Mica Mote
  • 4 MHz, 8-bit Atmel RISC microprocessor
  • 40 kbit/s radio
  • 4 KB RAM, 128 KB program flash, 512 KB data flash
  • AA battery pack
  • Based on TinyOS*

*Hill, Szewczyk, Woo, Culler, & Pister. “Systems Architecture Directions for Networked Sensors.” ASPLOS 2000. http://webs.cs.berkeley.edu/tos

SLIDE 4

Sensor Net Sample Apps

  • Traditional monitoring apparatus
  • Earthquake monitoring in shake-test sites
  • Vehicle detection: sensors along a road collect data about passing vehicles
  • Habitat monitoring: storm petrels on Great Duck Island, microclimates on James Reserve

SLIDE 5

Programming Sensor Nets Is Hard

– Months of lifetime required from small batteries
    » 3-5 days naively; can’t recharge often
    » Interleave sleep with processing
– Lossy, low-bandwidth, short-range communication
    » Nodes coming and going
    » ~20% loss @ 5 m
    » Multi-hop
– Remote, zero-administration deployments
– Highly distributed environment
– Limited development tools
    » Embedded, LEDs for debugging!

200-800 instructions per bit transmitted!

High-level abstraction is needed!

SLIDE 6

A Solution: Declarative Queries

  • Users specify the data they want
    – Simple, SQL-like queries
    – Using predicates, not specific addresses
    – Same spirit as Cougar
    – Our system: TinyDB
  • Challenge is to provide:
    – Expressive & easy-to-use interface
    – High-level operators
      » Well-defined interactions
      » “Transparent optimizations” that many programmers would miss
  • Sensor-net specific techniques
    – Power-efficient execution framework
  • Question: do sensor networks change query processing? Yes!

SLIDE 7

Overview

  • TinyDB: Queries for Sensor Nets
  • Processing Aggregate Queries (TAG)
  • Taxonomy & Experiments
  • Acquisitional Query Processing
  • Other Research
  • Future Directions

SLIDE 8

Overview

  • TinyDB: Queries for Sensor Nets
  • Processing Aggregate Queries (TAG)
  • Taxonomy & Experiments
  • Acquisitional Query Processing
  • Other Research
  • Future Directions

SLIDE 9

TinyDB Demo

SLIDE 10

TinyDB Architecture

[Diagram: stack of TinyOS, Schema, Query Processor, and Multihop Network. Inside the query processor, operators Filter: light > 400 and Agg: avg(temp) run over tables of samples, which are obtained through get(‘temp’) / got(‘temp’) calls that reach getTempFunc(…).]

Queries:
  SELECT AVG(temp) WHERE light > 400

Results:
  T:1, AVG: 225
  T:2, AVG: 250

Schema:
  • “Catalog” of commands & attributes

Example schema entry:
  Name: temp
  Time to sample: 50 us
  Cost to sample: 90 uJ
  Calibration table: 3
  Units: Deg. F
  Error: ± 5 Deg. F
  Get f: getTempFunc()…

TinyDB
  • ~10,000 lines embedded C code
  • ~5,000 lines (PC-side) Java
  • ~3200 bytes RAM (w/ 768 byte heap)
  • ~58 kB compiled code (3x larger than 2nd largest TinyOS program)

SLIDE 11

Declarative Queries for Sensor Networks

  • Examples:

“Find the sensors in bright nests.”

SELECT nodeid, nestNo, light
FROM sensors
WHERE light > 400
EPOCH DURATION 1s

Sensors:

Epoch  Nodeid  nestNo  Light
0      1       17      455
0      2       25      389
1      1       17      422
1      2       25      405

SLIDE 12

Aggregation Queries

“Count the number of occupied nests in each loud region of the island.”

SELECT region, CNT(occupied), AVG(sound)
FROM sensors
GROUP BY region
HAVING AVG(sound) > 200
EPOCH DURATION 10s

Epoch  region  CNT(…)  AVG(…)
0      North   3       360
0      South   3       520
1      North   3       370
1      South   3       520

Regions w/ AVG(sound) > 200

SELECT AVG(sound)
FROM sensors
EPOCH DURATION 10s

SLIDE 13

Overview

  • TinyDB: Queries for Sensor Nets
  • Processing Aggregate Queries (TAG)
  • Taxonomy & Experiments
  • Acquisitional Query Processing
  • Other Research
  • Future Directions

SLIDE 14

Tiny Aggregation (TAG)

  • In-network processing of aggregates
    – Common data analysis operation
      » Aka gather operation or reduction in parallel programming
    – Communication reducing
      » Operator-dependent benefit
    – Across nodes during same epoch
  • Exploit query semantics to improve efficiency!

Madden, Franklin, Hellerstein, Hong. Tiny AGgregation (TAG), OSDI 2002.

SLIDE 15

Query Propagation Via Tree-Based Routing

  • Tree-based routing
    – Used in:
      » Query delivery
      » Data collection
    – Topology selection is important; e.g.
      » Krishnamachari, DEBS 2002; Intanagonwiwat, ICDCS 2002; Heidemann, SOSP 2001
      » LEACH/SPIN, Heinzelman et al., MOBICOM 99
      » SIGMOD 2003
    – Continuous process
      » Mitigates failures

[Figure: network of nodes A-F exchanging query messages (Q: SELECT …) and result messages (R: {…}).]

SLIDE 16

Basic Aggregation

  • In each epoch:
    – Each node samples local sensors once
    – Generates partial state record (PSR) from:
      » local readings
      » readings from children
    – Outputs PSR during assigned comm. interval
  • At end of epoch, PSR for whole network output at root
  • New result on each successive epoch
  • Extras:
    – Predicate-based partitioning via GROUP BY

[Figure: routing tree of sensors 1-5.]
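
The per-epoch steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not TinyDB code: the function name `count_in_network`, the `parent` map, and the 5-node topology are hypothetical, links are assumed lossless, and "deepest nodes transmit first" stands in for the assigned communication intervals.

```python
def count_in_network(parent):
    """Sketch of SELECT COUNT(*) over a routing tree.

    `parent` maps node id -> parent id (the root maps to None).
    Returns (count seen at the root, number of messages sent).
    """
    def depth(node):
        # Number of hops from `node` up to the root.
        hops = 0
        while parent[node] is not None:
            node = parent[node]
            hops += 1
        return hops

    # Each node starts with a PSR that counts only itself.
    psr = {node: 1 for node in parent}
    messages = 0

    # Deepest nodes transmit first, so every parent has heard all of
    # its children before its own turn to transmit.
    for node in sorted(parent, key=depth, reverse=True):
        up = parent[node]
        if up is not None:
            psr[up] += psr[node]   # merge the child's PSR into the parent's
            messages += 1          # one message per node per epoch

    root = next(node for node in parent if parent[node] is None)
    return psr[root], messages


# Hypothetical topology: node 1 is the root.
tree = {1: None, 2: 1, 3: 1, 4: 2, 5: 4}
print(count_in_network(tree))  # (5, 4)
```

Each non-root node sends exactly one message per epoch regardless of how many descendants it has, which is where the communication saving for COUNT comes from.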

SLIDE 17

Illustration: Aggregation

SELECT COUNT(*) FROM sensors

[Figure: routing tree of sensors 1-5 beside a grid of sensor # vs. interval # for one epoch, showing interval 4.]

SLIDE 18

Illustration: Aggregation

SELECT COUNT(*) FROM sensors

[Figure: routing tree of sensors 1-5 beside a grid of sensor # vs. interval #, showing interval 3.]

SLIDE 19

Illustration: Aggregation

SELECT COUNT(*) FROM sensors

[Figure: routing tree of sensors 1-5 beside a grid of sensor # vs. interval #, showing interval 2.]

SLIDE 20

Illustration: Aggregation

SELECT COUNT(*) FROM sensors

[Figure: routing tree of sensors 1-5 beside a grid of sensor # vs. interval #, showing interval 1.]

SLIDE 21

Illustration: Aggregation

SELECT COUNT(*) FROM sensors

[Figure: routing tree of sensors 1-5 beside a grid of sensor # vs. interval #, showing interval 4.]

SLIDE 22

Interval Assignment: An Approach

SELECT COUNT(*)…

  • 4 intervals / epoch
  • Interval # = Level

[Figure: timeline of one epoch for sensors 1-5, arranged by level and divided into comm intervals 4, 3, 2, 1. Each node is marked L, T, or Z in each interval.]

Pipelining: Increase throughput by delaying result arrival until a later epoch

  • CSMA for collision avoidance
  • Time intervals for power conservation
  • Many variations (e.g. Yao & Gehrke, CIDR 2003)
  • Time sync (e.g. Elson & Estrin, OSDI 2002)

Madden, Szewczyk, Franklin, Culler. Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks. WMCSA 2002.

SLIDE 23

Aggregation Framework

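  • As in extensible databases, we support any aggregation function conforming to:

$\mathrm{Agg}_n = \{f_{\mathrm{init}}, f_{\mathrm{merge}}, f_{\mathrm{evaluate}}\}$
$f_{\mathrm{init}}\{a_0\} \to \langle a_0 \rangle$
$f_{\mathrm{merge}}\{\langle a_1 \rangle, \langle a_2 \rangle\} \to \langle a_{12} \rangle$
$f_{\mathrm{evaluate}}\{\langle a_1 \rangle\} \to \text{aggregate value}$

Example: Average

$\mathrm{AVG}_{\mathrm{init}}\{v\} \to \langle v, 1 \rangle$
$\mathrm{AVG}_{\mathrm{merge}}\{\langle S_1, C_1 \rangle, \langle S_2, C_2 \rangle\} \to \langle S_1 + S_2, C_1 + C_2 \rangle$
$\mathrm{AVG}_{\mathrm{evaluate}}\{\langle S, C \rangle\} \to S/C$

Here $\langle S, C \rangle$ is the Partial State Record (PSR).

Restriction: merge must be associative and commutative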
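
A minimal Python sketch of the init / merge / evaluate triple for AVERAGE, written directly from the definitions on this slide. The function names and the sample readings are illustrative assumptions; the PSR is modelled as a plain `(sum, count)` tuple.

```python
from functools import reduce


def avg_init(v):
    """Turn one sensor reading into a PSR <sum, count>."""
    return (v, 1)


def avg_merge(a, b):
    """Combine two PSRs; associative and commutative by construction."""
    return (a[0] + b[0], a[1] + b[1])


def avg_evaluate(psr):
    """Compute the final aggregate from a PSR."""
    total, count = psr
    return total / count


# Hypothetical readings from four nodes.
readings = [10, 20, 30, 40]
psrs = [avg_init(v) for v in readings]

# Merging pairwise (as two subtrees would) gives the same PSR as
# merging left to right, which is what lets TAG combine PSRs in any order.
left = avg_merge(psrs[0], psrs[1])
right = avg_merge(psrs[2], psrs[3])
tree_order = avg_merge(left, right)
linear_order = reduce(avg_merge, psrs)

print(tree_order, avg_evaluate(tree_order))  # (100, 4) 25.0
```

The merge step never needs the raw readings, only the two PSRs, so an interior node can forward a single constant-size record no matter how large its subtree is.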

SLIDE 24

Types of Aggregates

  • SQL supports MIN, MAX, SUM, COUNT, AVERAGE
  • Any function over a set can be computed via TAG
  • In-network benefit for many operations
    – E.g. standard deviation, top/bottom N, spatial union/intersection, histograms, etc.
    – Compactness of PSR

SLIDE 25

Overview

  • TinyDB: Queries for Sensor Nets
  • Processing Aggregate Queries (TAG)
  • Taxonomy & Experiments
  • Acquisitional Query Processing
  • Other Research
  • Future Directions

SLIDE 26

Simulation Environment

  • Evaluated TAG via simulation
  • Coarse-grained, event-based simulator
    – Sensors arranged on a grid
    – Two communication models
      » Lossless: all neighbors hear all messages
      » Lossy: messages lost with probability that increases with distance
  • Communication (message counts) as performance metric

SLIDE 27

Taxonomy of Aggregates

  • TAG insight: classify aggregates according to various functional properties
    – Yields a general set of optimizations that can automatically be applied

Properties:
  • Partial State
  • Monotonicity
  • Exemplary vs. Summary
  • Duplicate Sensitivity

Drives an API!

SLIDE 28

Partial State

  • Growth of PSR vs. number of aggregated values (n)
    – Distributive: |PSR| = 1 (e.g. MIN)
    – Algebraic: |PSR| = c (e.g. AVG)
    – Holistic: |PSR| = n (e.g. MEDIAN)
    – Unique: |PSR| = d (e.g. COUNT DISTINCT)
      » d = # of distinct values
    – Content sensitive: |PSR| < n (e.g. HISTOGRAM)

Property       Examples                           Affects
Partial State  MEDIAN: unbounded, MAX: 1 record   Effectiveness of TAG

“Data Cube”, Gray et al.

SLIDE 29

Benefit of In-Network Processing

Simulation results:
  • 2500 nodes, 50x50 grid
  • Depth = ~10
  • Neighbors = ~20
  • Uniform dist.

  • Aggregate- & depth-dependent benefit!

[Figure: simulation results by aggregate class: Holistic, Unique, Distributive, Algebraic.]

SLIDE 30

Monotonicity & Exemplary vs. Summary

Property               Examples                               Affects
Partial State          MEDIAN: unbounded, MAX: 1 record       Effectiveness of TAG
Monotonicity           COUNT: monotonic, AVG: non-monotonic   Hypothesis Testing, Snooping
Exemplary vs. Summary  MAX: exemplary, COUNT: summary         Applicability of Sampling, Effect of Loss

SLIDE 31

Channel Sharing (“Snooping”)

  • Insight: Shared channel can reduce communication
  • Suppress messages that won’t affect aggregate
    – E.g., MAX
    – Applies to all exemplary, monotonic aggregates
  • Only snoop in listen/transmit slots
    – Future work: explore snooping/listening tradeoffs
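
The suppression idea for MAX can be sketched as follows. This is a simplified illustration, not the TAG implementation: it assumes every node overhears every earlier transmission on a lossless shared channel, and the function name `snooping_max` and the sample values are hypothetical.

```python
def snooping_max(values):
    """Sketch of MAX with channel snooping.

    `values` lists node readings in transmit order. A node stays silent
    when a value it has already overheard is at least as large as its own,
    since its reading cannot change the final MAX.
    Returns (max value, messages actually sent).
    """
    best_heard = None
    sent = 0
    for v in values:
        if best_heard is None or v > best_heard:
            best_heard = v   # this transmission raises the running maximum
            sent += 1
        # otherwise: suppressed, no message sent
    return best_heard, sent


print(snooping_max([3, 7, 5, 7, 9, 2]))  # (9, 3)
```

Without snooping, the same six nodes would send six messages; here only the three readings that raise the running maximum are transmitted.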

SLIDE 32

Hypothesis Testing

  • Insight: Guess from root can be used for suppression
    – E.g. ‘MIN < 50’
    – Works for monotonic & exemplary aggregates
      » Also summary, if imprecision allowed
  • How is hypothesis computed?
    – Blind or statistically informed guess
    – Observation over network subset
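
A rough Python sketch of the ‘MIN < 50’ example: the root's guess is applied at each node, and nodes whose readings cannot beat the guess stay silent. The function name, the readings, and the handling of a wrong guess are assumptions made for illustration; in-network merging of the surviving messages is left out.

```python
def min_with_hypothesis(readings, guess):
    """Sketch of hypothesis testing for MIN.

    The root broadcasts the hypothesis "MIN < guess". Only nodes whose
    reading is below `guess` report. Returns (min, messages sent), or
    (None, 0) if no node reports, meaning the hypothesis was wrong and
    the root would have to retry (retry policy is not modelled here).
    """
    reporting = [v for v in readings if v < guess]
    if not reporting:
        return None, 0
    return min(reporting), len(reporting)


# Hypothetical readings from six nodes.
readings = [72, 48, 91, 55, 31, 60]
print(min_with_hypothesis(readings, 50))  # (31, 2)
```

A tighter guess suppresses more nodes but raises the chance that nobody reports, which is why the slide asks how the hypothesis should be computed.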

SLIDE 33

Experiment: Snooping vs. Hypothesis Testing

  • Uniform value distribution
  • Dense packing
  • Ideal communication

[Figure: comparison of pruning in network vs. pruning at leaves.]

SLIDE 34

Duplicate Sensitivity

Property               Examples                                     Affects
Partial State          MEDIAN: unbounded, MAX: 1 record             Effectiveness of TAG
Monotonicity           COUNT: monotonic, AVG: non-monotonic         Hypothesis Testing, Snooping
Exemplary vs. Summary  MAX: exemplary, COUNT: summary               Applicability of Sampling, Effect of Loss
Duplicate Sensitivity  MIN: dup. insensitive, AVG: dup. sensitive   Routing Redundancy

SLIDE 35

Use Multiple Parents

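  • Use graph structure
    – Increase delivery probability with no communication overhead
  • For duplicate insensitive aggregates, or
  • Aggs expressible as sum of parts
    – Send (part of) aggregate to all parents
      » In just one message, via multicast
    – Assuming independence, decreases variance

SELECT COUNT(*)

One parent (A sends its count c toward root R):

$P(\text{link xmit successful}) = p$
$P(\text{success from } A \to R) = p^2$
$E(\mathrm{cnt}) = c \cdot p^2$
$\mathrm{Var}(\mathrm{cnt}) = c^2 \cdot p^2 \cdot (1 - p^2) \equiv V$

# of parents = n (A sends c/n to each parent):

$E(\mathrm{cnt}) = n \cdot (c/n \cdot p^2)$
$\mathrm{Var}(\mathrm{cnt}) = n \cdot (c/n)^2 \cdot p^2 \cdot (1 - p^2) = V/n$

[Figure: node A reaches root R through parents B and C; with n = 2, c/n is sent along each path.]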
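
The expectation and variance formulas above can be checked numerically. The helper below is a sketch that simply evaluates those formulas; the name `count_stats` and the values c = 10, p = 0.9 are assumptions chosen for illustration, and it relies on the same independence assumption as the slide.

```python
def count_stats(c, p, n=1):
    """Mean and variance of the count received at the root.

    c: count held by the sending node
    p: probability that a single link transmission succeeds
    n: number of parents the count is split across (c/n to each)
    Each part must survive two hops, so it arrives with probability p**2,
    and the n paths are assumed to fail independently.
    """
    q = p * p                        # two-hop success probability
    part = c / n                     # share sent via each parent
    mean = n * (part * q)
    variance = n * part ** 2 * q * (1 - q)
    return mean, variance


mean_1, var_1 = count_stats(10, 0.9)        # single parent
mean_2, var_2 = count_stats(10, 0.9, n=2)   # split across two parents

# Splitting leaves the expected count unchanged and halves the variance.
print(mean_1, mean_2)
print(var_1, var_2)
```

The expected count is the same in both cases; only the spread shrinks, by a factor of n.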

SLIDE 36

Multiple Parents Results

  • Better than previous analysis expected!
  • Losses aren’t independent!
  • Insight: spreads data over many links

[Figure: results with splitting vs. no splitting; a critical link is marked.]

SLIDE 37

Taxonomy Related Insights

  • Communication Reducing
    – In-network Aggregation (Partial State)
    – Hypothesis Testing (Exemplary & Monotonic)
    – Snooping (Exemplary & Monotonic)
    – Sampling
  • Quality Increasing
    – Multiple Parents (Duplicate Insensitive)
    – Child Cache

SLIDE 38

TAG Contributions

  • Simple but powerful data collection language
    – Vehicle tracking: SELECT ONEMAX(mag,nodeid) EPOCH DURATION 50ms
  • Distributed algorithm for in-network aggregation
    – Communication reducing
    – Power aware
      » Integration of sleeping, computation
    – Predicate-based grouping
  • Taxonomy-driven API
    – Enables transparent application of techniques to
      » Improve quality (parent splitting)
      » Reduce communication (snooping, hypo. testing)

SLIDE 39

Overview

  • TinyDB: Queries for Sensor Nets
  • Processing Aggregate Queries (TAG)
  • Taxonomy & Experiments
  • Acquisitional Query Processing
  • Other Research
  • Future Directions

SLIDE 40

Acquisitional Query Processing (ACQP)

  • Closed world assumption does not hold
    – Could generate an infinite number of samples
  • An acquisitional query processor controls
    – when,
    – where,
    – and with what frequency data is collected!
  • Versus traditional systems where data is provided a priori

Madden, Franklin, Hellerstein, and Hong. The Design of an Acquisitional Query Processor. SIGMOD, 2003 (to appear).

SLIDE 41

ACQP: What’s Different?

  • How should the query be processed?
    – Sampling as a first-class operation
    – Event-join duality
  • How does the user control acquisition?
    – Rates or lifetimes
    – Event-based triggers
  • Which nodes have relevant data?
    – Index-like data structures
  • Which samples should be transmitted?
    – Prioritization, summary, and rate control

SLIDE 42

Operator Ordering: Interleave Sampling + Selection

SELECT light, mag
FROM sensors
WHERE pred1(mag) AND pred2(light)
EPOCH DURATION 1s

  • E(sampling mag) >> E(sampling light)
    – 1500 uJ vs. 90 uJ

[Figure: three query plans over mag and light with σ(pred1) and σ(pred2): the traditional DBMS plan and ACQP plans that interleave sampling with selection. The correct ordering (unless pred1 is very selective and pred2 is not) handles the cheap attribute before the costly one.]

At 1 sample / sec, total power savings could be as much as 3.5 mW → comparable to processor!
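
The ordering argument can be made concrete with a small expected-energy calculation. The per-sample costs (1500 uJ for mag, 90 uJ for light) come from this slide; the cost model, the function names, and the selectivity values are assumptions for illustration. Selectivity here means the fraction of epochs in which a predicate passes, and the second attribute is sampled only when the first predicate passes.

```python
E_MAG_UJ = 1500.0   # energy to sample the magnetometer (from the slide)
E_LIGHT_UJ = 90.0   # energy to sample the light sensor (from the slide)


def expected_energy(first_cost, second_cost, first_pass_fraction):
    """Expected energy per epoch when the second sensor is sampled
    only if the predicate on the first sensor passes."""
    return first_cost + first_pass_fraction * second_cost


def cheaper_order(pass_mag, pass_light):
    """Return the cheaper sampling order and its expected energy in uJ.

    pass_mag / pass_light: fraction of epochs in which pred1(mag) /
    pred2(light) is true (hypothetical inputs).
    """
    mag_first = expected_energy(E_MAG_UJ, E_LIGHT_UJ, pass_mag)
    light_first = expected_energy(E_LIGHT_UJ, E_MAG_UJ, pass_light)
    if light_first <= mag_first:
        return "light first", light_first
    return "mag first", mag_first


print(cheaper_order(0.5, 0.5))    # ('light first', 840.0)
print(cheaper_order(0.01, 1.0))   # mag first wins only in this extreme case
```

Sampling light first wins for most selectivities; mag first only pays off when pred1 rejects almost everything and pred2 rejects almost nothing, matching the exception noted on the slide.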

SLIDE 43

Exemplary Aggregate Pushdown

SELECT WINMAX(light,8s,8s)
FROM sensors
WHERE mag > x
EPOCH DURATION 1s

  • Novel, general pushdown technique
  • Mag sampling is the most expensive operation!

[Figure: two query plans over mag and light. Traditional DBMS: σ(mag>x) followed by γWINMAX. ACQP: an added selection σ(light > MAX) alongside σ(mag>x) and γWINMAX.]

SLIDE 44

Lifetime Queries

  • Lifetime vs. sample rate
    – SELECT … EPOCH DURATION 10 s
    – SELECT … LIFETIME 30 days
  • Extra: Allow a MAX SAMPLE PERIOD
    – Discard some samples
    – Sampling cheaper than transmitting
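
One way to think about a LIFETIME clause is as an energy budget that gets converted into an epoch duration. The sketch below shows that arithmetic only; it is not how TinyDB estimates lifetime. The function name, the battery capacity, the per-epoch energy, and the idle-power parameter are all hypothetical.

```python
SECONDS_PER_DAY = 86_400


def epoch_duration_s(battery_j, lifetime_days, energy_per_epoch_j, idle_power_w=0.0):
    """Longest-lived schedule sketch: seconds between epochs such that
    the battery lasts `lifetime_days`.

    battery_j:          usable battery energy in joules (assumed)
    energy_per_epoch_j: energy to sample, process, and transmit once (assumed)
    idle_power_w:       constant background draw while sleeping (assumed)
    """
    lifetime_s = lifetime_days * SECONDS_PER_DAY
    budget_j = battery_j - idle_power_w * lifetime_s
    if budget_j <= 0:
        raise ValueError("idle draw alone exhausts the battery before the lifetime goal")
    affordable_epochs = budget_j / energy_per_epoch_j
    return lifetime_s / affordable_epochs


# Hypothetical node: 10 kJ battery, 10 mJ per epoch, 30-day goal.
print(epoch_duration_s(10_000, 30, 0.01))  # roughly 2.6 seconds per epoch
```

Doubling the lifetime goal doubles the epoch duration under this model, which is the lifetime vs. sample rate tradeoff named in the first bullet.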

SLIDE 45

(Single Node) Lifetime Prediction

SLIDE 46

Overview

  • TinyDB: Queries for Sensor Nets
  • Processing Aggregate Queries (TAG)
  • Taxonomy & Experiments
  • Acquisitional Query Processing
  • Other Research
  • Future Directions

SLIDE 47

Sensor Network Challenge Problems

  • Temporal aggregates
  • Sophisticated, sensor-network-specific aggregates
    – Isobar finding
    – Vehicle tracking
    – Lossy compression
      » Wavelets

[Figure: “Isobar Finding”]

Hellerstein, Hong, Madden, and Stanek. Beyond Average. IPSN 2003 (to appear)

SLIDE 48

Additional Research

  • Sensors, TinyDB, TinyOS
    – This talk:
      » TAG (OSDI 2002)
      » ACQP (SIGMOD 2003)
      » WMCSA 2002
      » IPSN 2003
    – TOSSIM. Levis, Lee, Woo, Madden, & Culler. (In submission)
    – TinyOS contributions: memory allocator, catalog, network reprogramming, OS support, releases, TinyDB

SLIDE 49

Other Research (Cont)

  • Stream Query Processing
    – CACQ (SIGMOD 2002)
      » Madden, Shah, Hellerstein, & Raman
    – Fjords (ICDE 2002)
      » Madden & Franklin
    – Java Experiences Paper (SIGMOD Record, December 2001)
      » Shah, Madden, Franklin, and Hellerstein
    – Telegraph Project, FFF & ACM1 Demos
      » Telegraph Team

SLIDE 50

TinyDB Deployments

  • Initial efforts:
    – Network monitoring
    – Vehicle tracking
  • Ongoing deployments:
    – Environmental monitoring
    – Generic Sensor Kit
    – Building monitoring
    – Golden Gate Bridge

SLIDE 51

Overview

  • TinyDB: Queries for Sensor Nets
  • Processing Aggregate Queries (TAG)
  • Taxonomy & Experiments
  • Acquisitional Query Processing
  • Other Research
  • Future Directions

SLIDE 52

TinyDB Future Directions

  • Expressing lossiness
    – No longer a closed world!
  • Additional operations
    – Joins
    – Signal processing
  • Integration with streaming DBMS
    – In-network vs. external operations
  • Heterogeneous nodes and operators
  • Real deployments

SLIDE 53

Contributions & Summary

  • Declarative queries via TinyDB
    – Simple, data-centric programming abstraction
    – Known to work for monitoring, tracking, mapping
  • Sensor network contributions
    – Network as a single queryable entity
    – Power-aware, in-network query processing
    – Taxonomy: extensible aggregate optimizations
  • Query processing contributions
    – Acquisitional query processing
    – Framework for new issues in acquisitional systems, e.g.:
      » Sampling as an operator
      » Languages, indices, approximations to control when, where, and what data is acquired + processed by the system
  • Consideration of database, network, and device issues

http://telegraph.cs.berkeley.edu/tinydb

SLIDE 54

Questions?