Lube: Mitigating Bottlenecks in Wide Area Data Analytics
Hao Wang* Baochun Li
iQua
HotCloud’17
Wide Area Data Analytics

[Figure: a wide-area deployment. DC #1 hosts the Master and the Namenode; DC #2 through DC #n each host Workers and Datanodes.]

Why wide area data analytics? Data volume.

Problems
[Figure: bandwidth (Mbps, 100-500) measured between 10.8.3.3 (CT), 10.12.3.32 (TR), 10.4.3.5 (WT), 10.2.3.4 (TR) and 10.6.3.3 (VC) over Jan 1-2; bandwidth fluctuates widely over time.]

Measured by iperf on the SAVI testbed: https://www.savinetwork.ca/
[Figure: resource utilization of node_1 through node_4 over roughly 2,100 seconds while running the Berkeley Big Data Benchmark.]

Collected by jvmtop. Nodes in different DCs may have different resource utilizations.
Bottlenecks

Bottlenecks emerge at runtime: bandwidth fluctuation and resource heterogeneity degrade data analytics performance.
“Much of this performance work has been motivated by three widely-accepted mantras about the performance of data analytics — network, disk and straggler.”
— Making Sense of Performance in Data Analytics Frameworks, NSDI’15, Kay Ousterhout

Existing optimization methods do not consider runtime bottlenecks.
Mitigating bottlenecks: match tasks in the task queue against workers in the resource queue, avoiding workers that are in bottleneck.
Three major components:

- Lightweight Performance Monitors: collect network I/O, disk I/O, JVM and more metrics on each worker.
- Online Bottleneck Detector: feeds metrics into a training pool, updates the model online, and reports (worker, intensity) pairs.
- Bottleneck-aware Scheduling: the Lube Master keeps a Bottleneck Info. Cache and an Available Worker Pool; the Lube Scheduler assigns tasks from the Submitted Task Queue; each worker runs a Lube Client.
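The scheduling component above can be sketched as follows. This is a minimal illustration, not Lube's actual implementation: the class and method names are assumptions, and the policy is reduced to "pick the available worker with the lowest reported bottleneck intensity."

```python
from dataclasses import dataclass, field
import heapq

@dataclass(order=True)
class WorkerEntry:
    intensity: float               # predicted bottleneck intensity (lower is better)
    worker_id: str = field(compare=False)

class BottleneckAwareScheduler:
    """Toy sketch: assign each task to the available worker with the
    lowest predicted bottleneck intensity."""
    def __init__(self):
        self.pool = []             # min-heap of WorkerEntry, ordered by intensity

    def report(self, worker_id, intensity):
        # Called when the bottleneck detector emits a (worker, intensity) pair.
        heapq.heappush(self.pool, WorkerEntry(intensity, worker_id))

    def schedule(self, task):
        # Pop the least-bottlenecked worker for this task.
        entry = heapq.heappop(self.pool)
        return task, entry.worker_id

sched = BottleneckAwareScheduler()
sched.report("worker-1", 0.8)      # heavily bottlenecked
sched.report("worker-2", 0.1)      # mostly idle
print(sched.schedule("task-42"))   # ('task-42', 'worker-2')
```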
ARIMA(p, d, q)

Autoregressive (AR) + Moving Average (MA): coefficients are fitted from historical states, and the current state is the input for prediction.

Input: a time series (time_1, mem_util), (time_2, mem_util), …, (time_t-1, mem_util), used to predict (time_t, mem_util); each worker reports {time_stamp: mem, net, cpu, disk}.
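The AR-based prediction can be sketched with a hand-rolled, deliberately simplified model: AR(1) fitted by least squares on the once-differenced series, i.e. roughly ARIMA(1, 1, 0) with no MA term. A real deployment would use a full ARIMA library; the sample values are made up.

```python
def predict_next(series):
    """One-step forecast, ARIMA(1,1,0)-style: difference the series once
    (d=1), fit an AR(1) coefficient on the differences by least squares,
    then integrate the predicted difference back onto the last value."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    x, y = diffs[:-1], diffs[1:]                 # lag-1 pairs
    num = sum(a * b for a, b in zip(x, y))
    den = sum(a * a for a in x) or 1.0
    phi = num / den                              # AR(1) coefficient
    return series[-1] + phi * diffs[-1]

# Hypothetical memory-utilization samples (time_1 .. time_t-1):
mem_util = [0.50, 0.55, 0.61, 0.66, 0.72]
print(round(predict_next(mem_util), 3))
```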
Hidden Markov Model

[Figure: an HMM with hidden states q_1, q_2, …, q_i, q_j in Q, transition probabilities A = (a_ij), and emission probabilities B = (b_j(k)) over observations O_1, O_2, …, O_k, O_d.]

Sliding Hidden Markov Model: to make the HMM online, slide the training window forward and discard outdated observations.
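The sliding-window idea can be illustrated with a much-simplified sketch: bottleneck states are treated as directly observable and only transition probabilities are re-estimated over a fixed-size window. A real sliding HMM also maintains hidden-state estimates and updates emission probabilities incrementally; all names here are assumptions.

```python
from collections import deque

class SlidingTransitionModel:
    """Simplified sketch of the sliding window behind an online HMM:
    keep only the last `window` observations and re-estimate state
    transition probabilities from them. Outdated observations fall
    off the left end of the deque automatically."""
    def __init__(self, window=100):
        self.obs = deque(maxlen=window)

    def observe(self, state):
        self.obs.append(state)

    def transition_prob(self, src, dst):
        # Empirical P(dst | src) over consecutive pairs in the window.
        pairs = list(zip(self.obs, list(self.obs)[1:]))
        from_src = [p for p in pairs if p[0] == src]
        if not from_src:
            return 0.0
        return sum(1 for p in from_src if p[1] == dst) / len(from_src)

m = SlidingTransitionModel(window=4)
for s in ["ok", "ok", "bottleneck", "bottleneck", "ok"]:
    m.observe(s)   # the first "ok" has already slid out of the window
print(m.transition_prob("bottleneck", "ok"))
```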
[Figure: memory utilization of executor processes, network utilization of datanode processes, CPU utilization of executor processes, and disk (SSD) utilization of datanode processes over time.]

Built-in task schedulers: a single worker node is bottlenecked continuously.
Bottleneck-aware scheduler: all nodes are rarely bottlenecked at the same time.
Implementation

Bottleneck Detection Module APIs (Redis):
- Worker: PUBLISH + HSET metric {time: val} (e.g., iotop {time: I/O})
- Master: SUBSCRIBE metric_1 metric_2 …
- Master: HSET worker_id {time: {metric: val_ob, val_inf}}
- Scheduler: HGET worker_id time

Deployment: each worker node runs lightweight monitors (iotop, jvmtop, nethogs) feeding a worker-side Redis server; the master node runs its own Redis server, the bottleneck detection module, and the Lube Scheduler.
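The data flow through those commands can be illustrated with an in-process stand-in. This is not a real Redis client — it mimics only the HSET/HGET and PUBLISH/SUBSCRIBE semantics the slide names, so the flow runs without a live server; all identifiers are made up.

```python
from collections import defaultdict

class MetricStore:
    """In-process stand-in for the Redis commands used by Lube's
    bottleneck detection module: hash writes/reads plus pub/sub."""
    def __init__(self):
        self.hashes = defaultdict(dict)       # key -> {field: value}
        self.subscribers = defaultdict(list)  # channel -> callbacks

    def hset(self, key, field, value):
        self.hashes[key][field] = value

    def hget(self, key, field):
        return self.hashes[key].get(field)

    def subscribe(self, channel, callback):
        self.subscribers[channel].append(callback)

    def publish(self, channel, message):
        for cb in self.subscribers[channel]:
            cb(channel, message)

store = MetricStore()
# Master side: subscribe to a metric channel and cache what workers publish.
store.subscribe("disk_io", lambda ch, msg: store.hset("worker-1", msg[0], msg[1]))
# Worker side: a monitor (e.g. iotop) publishes a {time: val} sample.
store.publish("disk_io", (1000, 83.5))
print(store.hget("worker-1", 1000))  # 83.5
```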
[Figure: bottleneck detection hit rates (60-100%) of ARIMA and SlidHMM for Query-1 through Query-4 under settings a, b and c.]

hit rate = #((time, detection) ∩ (time, observation)) / #(time, detection)

ARIMA ignores nonlinear patterns.
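The hit rate formula above is a straightforward set computation; a minimal sketch, with made-up (time, bottleneck) pairs:

```python
def hit_rate(detections, observations):
    """hit rate = |detections ∩ observations| / |detections|,
    where each element is a (time, bottleneck) pair."""
    detections, observations = set(detections), set(observations)
    if not detections:
        return 0.0
    return len(detections & observations) / len(detections)

# Hypothetical pairs: 2 of 3 detections match an observation.
detected = [(1, "mem"), (2, "net"), (3, "mem")]
observed = [(1, "mem"), (3, "mem"), (4, "disk")]
print(hit_rate(detected, observed))
```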
[Figure: CDFs of task completion times (up to about 4 × 10^5 ms) for Query-1 through Query-4 under Pure Spark, Lube-ARIMA and Lube-SlidHMM.]

Task completion times:
              Average    75th percentile
Lube-ARIMA    12.454 s   22.075 s
Lube-SlidHMM  14.783 s   27.469 s
[Figure: query completion times (800-1800 s) for Query-1 through Query-4 under Pure Spark, ARIMA + Spark, SlidHMM + Spark, Lube-ARIMA and Lube-SlidHMM.]

Query completion times: Lube reduces query completion time by up to 33%. ARIMA + Spark and SlidHMM + Spark serve as control groups for measuring overhead.
Future work:
- Bottleneck detection models: learning-based approaches, e.g. LSTM.
- Bottleneck-aware scheduling: adapting to WAN conditions.