SLIDE 1

STAR: Self-Tuning Aggregation for Scalable Monitoring

Navendu Jain, Dmitry Kit, Prince Mahajan, Praveen Yalagandula†, Mike Dahlin, and Yin Zhang University of Texas at Austin

†HP Labs

[On job market next year]

SLIDE 2

Motivating Application

  • Network traffic monitoring: detect heavy hitters
  • Identify flows that account for a significant fraction (say 0.1%) of the network traffic

(Figure: traffic stream frequency counts, with flows above the 0.1% threshold flagged)

SLIDE 3

Global Heavy Hitters

  • Distributed Heavy Hitter detection
  • Monitor flows that account for a significant fraction of traffic across a collection of routers

(Figure: per-flow frequencies at Node 1 through Node N are summed into an aggregate; flows whose aggregate sum exceeds the 0.1% threshold are global heavy hitters)
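The detection rule on this slide can be sketched in a few lines (a minimal illustration, not the paper's implementation; the function name and data layout are invented for this example):

```python
# Sketch: naive global heavy-hitter detection. Each router reports
# per-flow byte counts; flows whose aggregate count exceeds a fraction
# `threshold` of the total traffic are flagged.
from collections import Counter

def global_heavy_hitters(router_counts, threshold=0.001):
    """router_counts: list of dicts mapping flow id -> byte count."""
    total = Counter()
    for counts in router_counts:          # aggregate sum across routers
        total.update(counts)
    grand_total = sum(total.values())
    return {flow for flow, count in total.items()
            if count >= threshold * grand_total}

# Example: flow "f1" dominates across two routers.
global_heavy_hitters([{"f1": 900, "f2": 1}, {"f1": 600}])  # {"f1"}
```

STAR's point, developed in the rest of the talk, is that computing this aggregate exactly at a central point is needlessly expensive; an approximate sum with per-node filters suffices.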

SLIDE 4

Broader Goal

  • Scalable Distributed Monitoring
  • Monitor, query, and react to changes in global state
  • Examples: network monitoring, grid monitoring, job scheduling, efficient multicast, distributed quota management, sensor monitoring and control, ...

(Figure: application domains: sensor networks, IP traffic, financial apps, multicast, quota management, grids)

SLIDE 5

System Model

Adaptive filters [Olston SIGMOD ’03], Astrolabe [VanRenesse TOCS ’03], TAG [Madden OSDI ’02], TACT [Yu TOCS ’02]

Arithmetic query approximation

  • Exact query answers are not needed!
  • Trade accuracy for communication/processing cost

(Figure: data streams S1, …, Sm pass through filters to a coordinator that monitors Query(S1, …, Sm); the coordinator adjusts the filters, and an update is pushed only when a value falls outside its filter's range)

Key Challenges

  • Large scale: nodes, attributes (e.g., flows)
  • Robustness to dynamic workloads
  • Cost of adjustment

SLIDE 6

Our Contribution: STAR

A scalable self-tuning algorithm to adaptively set the accuracy of aggregate query results

  • Flexible precision-communication cost tradeoffs

Approach

  • Aggregation Hierarchy
    • Split filters flexibly across leaves, internal nodes, root
  • Workload-Aware Approach
    • Use variance, update rate to compute optimal filters
  • Cost-Benefit Analysis
    • Throttle redistribution

SLIDE 7

Talk Outline

  • Motivation
  • STAR Design
    • Aggregation Hierarchy
    • Self-Tuning Filter Budgets
      • Estimate Optimal Budgets
      • Cost-Benefit Throttling
  • Evaluation and Conclusions

SLIDE 8

Background: Aggregation

PIER [Huebsch VLDB ’03], SDIMS [Yalagandula SIGCOMM ’04], Astrolabe [VanRenesse TOCS ’03], TAG [Madden OSDI ’02]

Fundamental abstraction for scalability

  • Sum, count, avg, min, max, select, ...
  • Summary view of global state
  • Detailed view of nearby state and rare events

(Figure: a SUM aggregation tree over physical leaf sensors, levels L0 to L3; leaf values 3, 4, 2, 9, 6, 1, 9, 3 roll up through intermediate sums 7, 11, 7, 12 and subtree sums 18, 19 to a root SUM of 37)
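The tree in the slide's figure can be sketched with a small recursive sum (an assumed structure for illustration, not SDIMS code):

```python
# Sketch of hierarchical SUM aggregation: leaf sensors hold values, each
# internal node holds the sum of its subtree, so the root sees the global
# SUM while subtrees retain a detailed view of nearby state.

def aggregate_sum(node):
    """node is either a leaf value (int) or a list of child subtrees."""
    if isinstance(node, int):
        return node
    return sum(aggregate_sum(child) for child in node)

# The slide's example tree: leaves 3,4 | 2,9 | 6,1 | 9,3.
tree = [[[3, 4], [2, 9]], [[6, 1], [9, 3]]]
aggregate_sum(tree)  # 37, matching the slide's root SUM
```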

SLIDE 9

Setting Filter Budgets

  • Guarantees
    • Given an error budget δ, report a range [L, H] such that (1) the true aggregate value lies in [L, H] and (2) the width H - L is bounded by the budget

(Figure: an aggregation tree over levels L0 to L3; the root budget δroot is split into a local share δroot(self) and child budgets δc1, δc2, δc3; these are split in turn, e.g. into δc1(self), δc2(self), and further child budgets δc4, δc5, δc6)

SLIDE 10

Aggregation Hierarchy

(Figure: Node A reports the range [4,6] and Node B reports [3,4]; root Node R sums them to [4+3, 6+4] = [7,10] and reports [6,11], a range of width δroot = 5)

SLIDE 11

Aggregation Hierarchy

(Figure: Node A's value moves to 6, inside its range [4,6], so the update is filtered; Node B's value moves to 5, outside [3,4], so it sends the updated range [4,5]; Node R's new sum [4+4, 6+5] still lies within its reported range [6,11], so no update leaves the root with δroot = 5)
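The per-node filtering in these two slides can be sketched as follows (assumed mechanics for illustration; the class name and the choice of centering the new range on the value are invented):

```python
# Sketch: a node with filter budget delta reports a range around its value
# and stays silent while the value remains inside that cached range; only
# an out-of-range value triggers a new report, as on slides 10-11.

class Filter:
    def __init__(self, delta):
        self.delta = delta
        self.low = self.high = None   # no range reported yet

    def update(self, value):
        """Return the new range if an update must be sent, else None."""
        if self.low is not None and self.low <= value <= self.high:
            return None               # filtered: value inside cached range
        self.low = value - self.delta
        self.high = value + self.delta
        return (self.low, self.high)

f = Filter(delta=1)
f.update(5)   # first value: sends (4, 6)
f.update(6)   # inside (4, 6): filtered, returns None
f.update(8)   # outside: sends (7, 9)
```

The same logic applies recursively: an internal node filters the sum of its children's ranges against the range it last reported to its own parent.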

SLIDE 12

Talk Outline

  • Motivation
  • STAR Design
    • Aggregation Hierarchy
    • Self-Tuning Error Budgets
      • Estimate Optimal Budgets
      • Cost-Benefit Throttling
  • Evaluation and Conclusions

SLIDE 13

How to Set Budgets?

Goal: Self-tuning

  • Ideal distribution
    • Send budget to where filtering is needed/effective
  • Large variance of inputs --> require more budget to filter
  • Higher update rate of inputs --> higher load to monitor

SLIDE 14

Self-tuning Budgets: Single Node

  • Quantify filtering gain
    • Chebyshev’s inequality
  • Expected message cost

Per-update message load as a function of the error budget:

M(δ) = σ²/δ²   if δ > σ
M(δ) = 1       if δ ≤ σ

(Graph: message load M(δ) vs. error budget δ)
SLIDE 15

Self-tuning Budgets: Hierarchy

  • Single-level tree
  • Estimate optimal filter budgets
  • Optimization problem: minimize the expected message cost u1·M(δc1) + … + un·M(δcn) under a fixed total budget δT
  • Solution: a closed-form split that favors children with high variance and high update rate

(Figure: a parent with total budget δT divides it into child filter budgets δc1, δc2, …, δcn; child i has update rate ui and expected message cost M(δci))
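The transcript elides the slide's closed-form solution. Under the Chebyshev cost model of the previous slide, a Lagrange-multiplier argument (a reconstruction, not verbatim from the talk) allocates each child a budget proportional to (ui·σi²)^(1/3):

```python
# Sketch: divide a parent's budget delta_T among n children to minimize
# sum_i u_i * (sigma_i / delta_i)**2 subject to sum_i delta_i = delta_T.
# Setting the derivative with a Lagrange multiplier to zero gives
# delta_i proportional to (u_i * sigma_i**2) ** (1/3).

def optimal_budgets(delta_T, rates, sigmas):
    weights = [(u * s * s) ** (1.0 / 3.0) for u, s in zip(rates, sigmas)]
    total = sum(weights)
    return [delta_T * w / total for w in weights]

# A child with an 8x higher update rate gets a 2x (= 8**(1/3)) larger budget.
optimal_budgets(10.0, rates=[1, 8], sigmas=[1.0, 1.0])
```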

SLIDE 16

Talk Outline

  • Motivation
  • STAR Design
    • Aggregation Hierarchy
    • Self-Tuning Filter Budgets
      • Estimate Optimal Budgets
      • Cost-Benefit Throttling
  • Evaluation and Conclusions

SLIDE 17

Redistribution Cost

(Figure: the system model again: data streams S1, …, Sm pass through filters to a coordinator monitoring Query(S1, …, Sm); adjusting the filters itself costs messages)

SLIDE 18

When to Redistribute Budgets?

  • More frequent redistribution
    • More closely approximates the ideal distribution (current load)
    • Heavier redistribution overhead

(Graph: message load vs. frequency of budget redistribution; total load is the sum of a decreasing monitoring load and an increasing redistribution load)

SLIDE 19

Cost-Benefit Throttling

  • 1. Load imbalance: M(δcurrent) – M(δideal)
  • 2. Long-lasting imbalance: Tcurrent – Ttime_last_redist

Charge = (M(δcurrent) – M(δideal)) × (Tcurrent – Ttime_last_redist)

Rebalance if Charge > Threshold
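The charge rule above can be sketched directly (a minimal illustration; the function name and parameter names are invented):

```python
# Sketch of cost-benefit throttling: the "charge" grows with both the size
# of the load imbalance and how long it has persisted, so short-lived or
# small imbalances never trigger a costly budget redistribution.

def should_rebalance(load_current, load_ideal, t_now, t_last_redist,
                     threshold):
    charge = (load_current - load_ideal) * (t_now - t_last_redist)
    return charge > threshold

# Imbalance of 20 msgs/sec persisting for 2 sec: charge 40 > threshold 30.
should_rebalance(load_current=30.0, load_ideal=10.0,
                 t_now=12.0, t_last_redist=10.0, threshold=30.0)  # True
```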

SLIDE 20

Talk Outline

  • Motivation
  • STAR Design
    • Aggregation Hierarchy
    • Self-Tuning Filter Budgets
      • Estimate Optimal Budgets
      • Cost-Benefit Throttling
  • Evaluation and Conclusions

SLIDE 21

Experimental Evaluation

STAR prototype

  • Built on top of SDIMS aggregation [Yalagandula ’04]
  • FreePastry as the underlying DHT [Rice Univ./MPI]
  • Testbeds: CS Department, Emulab, and PlanetLab

Questions

  • Does arithmetic approximation reduce load?
  • Does self-tuning yield benefits and approximate the ideal?

SLIDE 22

Methodology

  • Simulations
    • Quantify load reduction due to self-tuning budgets under varying workload distributions
  • App: Distributed Heavy Hitter detection (DHH)
    • Find the top 100 destination IPs receiving the highest traffic
    • Abilene traces for 1 hour (3 routers); 120 nodes
    • NetFlow data logged every 5 minutes

SLIDE 23

Does Throttling Redistribution Benefit?

90/10 synthetic workload

  • Self-tuning: much better than uniform allocation
  • Throttling: adaptive filters [Olston ’03] waste messages on useless adjustments

(Graph: message cost per second vs. error-budget-to-noise ratio for STAR and adaptive filters; STAR achieves a 10x load reduction)

SLIDE 24

Does Self-Tuning Approximate Ideal?

Uniform noise workload

  • Self-tuning approximates uniform allocation
  • Avoids useless readjustments

(Graph: message cost per second vs. error-budget-to-noise ratio for STAR, uniform allocation, and adaptive filters at redistribution frequencies 5, 10, and 50)

SLIDE 25

Abilene Workload

  • 80K flows send about 25 million updates in 1 hour
  • A centralized server would need to process 7K updates/sec
  • Heavy-tailed distribution

(Graphs: CDFs of flow value (KB) and number of updates per flow; 60% of flows send < 1 KB, 40% of flows send 1 IP packet, 99% of flows send < 330 KB and < 2K packets)

SLIDE 26

DHH: Does Self-Tuning Reduce Load?

  • Self-tuning significantly reduces load

(Graph: message cost per second vs. error budget (% of max flow value) for root budget shares of 0%, 50%, 90%, and 100%; self-tuning yields 3x to 10x load reductions, down to 7 msgs/node/sec)

SLIDE 27

STAR Summary

  • Scalable self-tuning setting of filter budgets
  • Hierarchical Aggregation
    • Flexibly divide budgets across leaves, internal nodes, root
  • Workload-Aware Approach
    • Use variance, update rate to estimate optimal budgets
  • Cost-Benefit Throttling
    • Send budgets where needed

SLIDE 28

Thank you!

http://www.cs.utexas.edu/~nav/star
nav@cs.utexas.edu