STAR: Self-Tuning Aggregation for Scalable Monitoring


1. STAR: Self-Tuning Aggregation for Scalable Monitoring [On job market next year]. Navendu Jain, Dmitry Kit, Prince Mahajan, Praveen Yalagandula†, Mike Dahlin, and Yin Zhang. University of Texas at Austin; †HP Labs

2. Motivating Application: Network traffic monitoring, detecting heavy hitters. Identify flows that account for a significant fraction (say 0.1%) of the network traffic. [Figure: a traffic stream is reduced to per-flow frequency counts; flows above the 0.1% threshold are flagged]

3. Global Heavy Hitters: Distributed heavy hitter detection. Monitor flows that account for a significant fraction of traffic across a collection of routers. [Figure: per-flow frequency counts at Node 1 through Node N are combined by an aggregate sum, and the 0.1% threshold is applied to the aggregate]
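A minimal sketch of the aggregate-sum detection this slide describes, in Python (the Counter-based layout and function name are our illustration, not from the talk):

```python
from collections import Counter

def global_heavy_hitters(per_node_counts, threshold=0.001):
    """Flag flows whose summed count across all nodes exceeds a
    fraction `threshold` of total traffic (exact, unfiltered version)."""
    totals = Counter()
    for counts in per_node_counts:      # one Counter per router/node
        totals.update(counts)           # pointwise aggregate sum
    grand_total = sum(totals.values())
    return {flow: c for flow, c in totals.items()
            if c >= threshold * grand_total}

# Example: two routers; flow "f1" is heavy in aggregate.
n1 = Counter({"f1": 900, "f2": 1})
n2 = Counter({"f1": 600, "f3": 2})
print(global_heavy_hitters([n1, n2], threshold=0.1))  # {'f1': 1500}
```

STAR's point is that maintaining this exact sum continuously is too expensive; the rest of the talk trades bounded error for communication.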

4. Broader Goal: Scalable distributed monitoring. Monitor, query, and react to changes in global state. Examples: network monitoring, Grid monitoring, job scheduling, efficient multicast, distributed quota management, sensor monitoring and control, financial apps, IP traffic accounting, ...

5. System Model. Prior work: adaptive filters [Olston SIGMOD '03], Astrolabe [VanRenesse TOCS '03], TAG [Madden OSDI '02], TACT [Yu TOCS '02]. Key challenges: large scale (many nodes and many attributes, e.g., flows); arithmetic query approximation; robustness to dynamic workloads; cost of adjustment. Exact query answers are not needed! Trade accuracy for communication/processing cost. [Figure: data streams S1, ..., Sm pass through per-source filters that push only updates falling outside their ranges to a coordinator; the coordinator evaluates the monitor query Query(S1, ..., Sm) and adjusts the filters]

6. Our Contribution: STAR. A scalable self-tuning algorithm to adaptively set the accuracy of aggregate query results • Flexible precision vs. communication-cost tradeoffs. Approach • Aggregation Hierarchy - Split filter budgets flexibly across leaves, internal nodes, and the root • Workload-Aware Approach - Use variance and update rate to compute optimal filters • Cost-Benefit Analysis - Throttle redistribution

7. Talk Outline: Motivation • STAR Design: Aggregation Hierarchy, Self-Tuning Filter Budgets, Estimate Optimal Budgets, Cost-Benefit Throttling • Evaluation and Conclusions

8. Background: Aggregation. PIER [Huebsch VLDB '03], SDIMS [Yalagandula SIGCOMM '04], Astrolabe [VanRenesse TOCS '03], TAG [Madden OSDI '02]. Aggregation is the fundamental abstraction for scalability • Sum, count, avg, min, max, select, ... • Summary view of global state • Detailed view of nearby state and rare events. [Figure: a SUM aggregation tree over physical nodes (leaf sensors): leaf values 3, 4, 2, 9, 6, 1, 9, 3 at L0 aggregate pairwise to 7, 11, 7, 12 at L1, then 18, 19 at L2, and 37 at the L3 root]
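As a toy illustration of the figure's SUM tree (a minimal sketch; it assumes a power-of-two number of leaves):

```python
def sum_tree(leaves):
    """Build the levels of a binary SUM aggregation tree bottom-up;
    returns [L0, L1, ...] ending in the single root value."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([prev[i] + prev[i + 1] for i in range(0, len(prev), 2)])
    return levels

# The figure's example: leaves 3,4,2,9,6,1,9,3 roll up to root 37.
for lvl, vals in enumerate(sum_tree([3, 4, 2, 9, 6, 1, 9, 3])):
    print(f"L{lvl}: {vals}")   # L1: [7, 11, 7, 12], L2: [18, 19], L3: [37]
```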

9. Setting Filter Budgets. Guarantees: given an error budget δ_root, report a range [L, H] such that (1) the true aggregate value lies in [L, H] and (2) H - L ≤ δ_root. [Figure: the root budget is split down the hierarchy; each node keeps a share δ_self for its own filter and divides the remainder among its children as δ_c1, δ_c2, ..., recursively from L3 down to L0]

10. Aggregation Hierarchy. [Figure: with δ_root = 5, Node A reports the range [4, 6] and Node B reports [3, 4]; root R sums them to [4+3, 6+4] = [7, 10] and reports [6, 11], spending its own budget share to widen the range to width 5]

11. Aggregation Hierarchy. [Figure: Node A receives update 6, which lies inside its range [4, 6], so the update is filtered and nothing is sent. Node B receives update 5, which falls outside its range [3, 4], so it sends the new range [4, 5] upward. Root R's child sum becomes [4+4, 6+5] = [8, 11], still inside its reported [6, 11], so the root filters it as well]
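The per-node filtering in slides 10 and 11 can be sketched as below. Where a node re-anchors its range around a new value is a policy choice not spelled out on the slide; this version places the value at the top of the range to match the example:

```python
class RangeFilter:
    """Filter of width `delta`: an update is pushed to the parent only
    if it falls outside the last reported range [lo, hi]."""
    def __init__(self, delta, lo, hi):
        self.delta, self.lo, self.hi = delta, lo, hi

    def update(self, value):
        if self.lo <= value <= self.hi:
            return None                        # filtered: no message sent
        self.lo, self.hi = value - self.delta, value
        return (self.lo, self.hi)              # new range sent to parent

# Slide 11: Node A sees 6 inside [4, 6] -> filtered; Node B sees 5
# outside [3, 4] -> sends the new range [4, 5].
a, b = RangeFilter(2, 4, 6), RangeFilter(1, 3, 4)
print(a.update(6), b.update(5))                # None (4, 5)
```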

12. Talk Outline: Motivation • STAR Design: Aggregation Hierarchy, Self-Tuning Filter Budgets, Estimate Optimal Budgets, Cost-Benefit Throttling • Evaluation and Conclusions

13. How to Set Budgets? Goal: self-tuning. The ideal distribution sends budget where filtering is needed and effective: • Larger variance of inputs → more budget required to filter • Higher update rate of inputs → higher load to monitor

14. Self-tuning Budgets: Single Node. Quantify the filtering gain via Chebyshev's inequality, P(|X - μ| ≥ δ) ≤ σ²/δ². For a stream with update rate u and standard deviation σ, a filter of width δ passes essentially every update when δ ≤ σ and roughly a σ²/δ² fraction when δ > σ, so the expected message cost is M(δ) ≈ u · min(1, σ²/δ²). [Figure: message load vs. error budget, with the knee at δ = σ]
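Read as a function, the cost model looks roughly like this (a hedged reconstruction from the Chebyshev bound; constant factors are dropped):

```python
def expected_msg_cost(u, sigma, delta):
    """Chebyshev-style estimate of messages/sec escaping a filter of
    width delta, for a stream with update rate u and std. dev. sigma:
    every update escapes when delta <= sigma; otherwise roughly a
    sigma^2 / delta^2 fraction does."""
    return u if delta <= sigma else u * (sigma / delta) ** 2

print(expected_msg_cost(u=100, sigma=2.0, delta=1.0))   # 100.0: no gain
print(expected_msg_cost(u=100, sigma=2.0, delta=10.0))  # 4.0: 25x filtering
```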

15. Self-tuning Budgets: Hierarchy. Single-level tree: estimate the optimal filter budgets. Optimization problem: minimize the total expected message cost Σi M(δ_ci) subject to a fixed total budget Σi δ_ci = δ_T. Solution: δ_ci ∝ (u_i σ_i²)^(1/3), so children with higher variance or update rate receive more budget. [Figure: a parent with budget δ_T over n children with filter budgets δ_c1, ..., δ_cn, update rates u_1, ..., u_n, and expected message costs M(δ_1), ..., M(δ_n)]
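The cube-root weighting follows from a Lagrange-multiplier argument on the cost model above (equalize the derivative of Σi u_i σ_i²/δ_i² across children under Σi δ_i = δ_T); a minimal sketch:

```python
def optimal_budgets(delta_T, rates, sigmas):
    """Split a total budget delta_T across children to minimize
    sum_i u_i * sigma_i^2 / delta_i^2  s.t.  sum_i delta_i = delta_T.
    The Lagrange conditions give delta_i proportional to
    (u_i * sigma_i^2)^(1/3)."""
    weights = [(u * s * s) ** (1 / 3) for u, s in zip(rates, sigmas)]
    total = sum(weights)
    return [delta_T * w / total for w in weights]

# High-rate, high-variance children receive the most budget.
print(optimal_budgets(10.0, rates=[100, 10, 10], sigmas=[4.0, 4.0, 0.5]))
# -> roughly [6.3, 2.9, 0.7]
```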

16. Talk Outline: Motivation • STAR Design: Aggregation Hierarchy, Self-Tuning Filter Budgets, Estimate Optimal Budgets, Cost-Benefit Throttling • Evaluation and Conclusions

17. Redistribution Cost. [Figure: the system model of slide 5; the coordinator evaluating Query(S1, ..., Sm) must send Adjust messages back to the per-stream filters, so redistributing budgets itself consumes messages]

18. When to Redistribute Budgets? More frequent redistribution • more closely approximates the ideal distribution for the current load • but incurs heavier redistribution overhead. [Figure: total load = monitoring load + redistribution load vs. frequency of budget redistribution; monitoring load falls and redistribution load rises with frequency, so total load is minimized in between]

19. Cost-Benefit Throttling. Rebalance only when the imbalance is both (1) large and (2) long-lasting: Charge = (M(δ_current) - M(δ_ideal)) × (T_current - T_last_redist). Rebalance if Charge > Threshold.
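A minimal sketch of this throttling rule (the threshold standing in for the redistribution cost, and the class name, are our illustration):

```python
import time

class RedistributionThrottle:
    """Rebalance only when the accumulated cost of keeping the current
    suboptimal budgets outweighs a fixed redistribution threshold."""
    def __init__(self, threshold):
        self.threshold = threshold
        self.last_redist = time.time()

    def should_rebalance(self, cost_current, cost_ideal, now=None):
        now = time.time() if now is None else now
        # Charge = size of the load imbalance * how long it has persisted.
        charge = (cost_current - cost_ideal) * (now - self.last_redist)
        if charge > self.threshold:
            self.last_redist = now
            return True
        return False

t = RedistributionThrottle(threshold=50.0)
# An imbalance of 2 msgs/sec persisting for 30 s -> charge 60 > 50.
print(t.should_rebalance(cost_current=5.0, cost_ideal=3.0,
                         now=t.last_redist + 30))        # True
```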

20. Talk Outline: Motivation • STAR Design: Aggregation Hierarchy, Self-Tuning Filter Budgets, Estimate Optimal Budgets, Cost-Benefit Throttling • Evaluation and Conclusions

21. Experimental Evaluation. STAR prototype • Built on top of SDIMS aggregation [Yalagandula '04] • FreePastry as the underlying DHT [Rice Univ./MPI] • Testbeds: CS department, Emulab, and PlanetLab. Questions • Does arithmetic approximation reduce load? • Does self-tuning yield benefits and approximate the ideal?

22. Methodology. Simulations • Quantify load reduction due to self-tuning budgets under varying workload distributions. App: Distributed Heavy Hitter detection (DHH) • Find the top-100 destination IPs receiving the highest traffic • Abilene traces for 1 hour (3 routers); 120 nodes - NetFlow data logged every 5 minutes

23. Does Throttling Redistribution Benefit? 90/10 synthetic workload • Self-tuning: much better than uniform allocation • Throttling: adaptive filters [Olston '03] waste messages on useless adjustments. [Figure: message cost per second vs. error-budget-to-noise ratio; STAR achieves about a 10x load reduction over adaptive filters]

24. Does Self-Tuning Approximate Ideal? Uniform noise workload • Self-tuning approximates uniform allocation • Avoids useless readjustments. [Figure: message cost per second (log scale, 1e-04 to 1) vs. error-budget-to-noise ratio (0.1 to 100), comparing uniform allocation, adaptive filters (freq = 5, 10, 50), and STAR]

25. Abilene Workload. 80K flows send about 25 million updates in 1 hr • A centralized server would need to process 7K updates/sec • Heavy-tailed distribution: 60% of flows send < 1 KB and 99% send < 330 KB; 40% of flows send a single IP packet and 99% send < 2K packets. [Figure: CDFs (% of flows) of flow value in KB and of the number of updates per flow]

26. DHH: Does Self-Tuning Reduce Load? Self-tuning significantly reduces load. [Figure: message cost per second vs. error budget (% of max flow value), for root budget shares BW(Root_share = 0%, 50%, 90%, 100%); about 7 msgs/node/sec at zero budget, with 3x and 10x load reductions as the budget grows]

27. STAR Summary. Scalable self-tuning setting of filter budgets • Hierarchical Aggregation - Flexibly divide budgets across leaves, internal nodes, and the root • Workload-Aware Approach - Use variance and update rate to estimate optimal budgets • Cost-Benefit Throttling - Send budgets where needed

28. Thank you! http://www.cs.utexas.edu/~nav/star nav@cs.utexas.edu
