Fingerprinting the datacenter: automated classification of performance crises
Peter Bodík1,3, Moises Goldszmidt3, Armando Fox1, Dawn Woodard4, Hans Andersen2
1RAD Lab, UC Berkeley 2Microsoft 3Research 4Cornell University
[Figure: crisis timeline. System state: OK, OK, CRISIS. Time markers: 3:00 AM, 3:15 AM, 4:15 AM, next day.]
[Figure: every server (server 1, server 2, ..., server 1000) reports the same set of metrics: 1: CPU utilization, 2: workload, ..., 100: latency.]
1: select relevant metrics
2: summarize using quantiles
3: map into hot/normal/cold
4: average over time
[Figure: metrics (1: CPU utilization, 2: workload, ..., 100: latency) plotted over time, with periods labeled OK, OK, CRISIS.]
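Steps 3 and 4 above (map into hot/normal/cold, then average over time) can be illustrated with a minimal sketch. The slides do not say how the hot and cold thresholds are chosen, so the `thresholds` values, the metric names `cpu_p50` and `latency_p95`, and the -1/0/+1 encoding are all assumptions made for illustration, not the authors' implementation.

```python
def discretize(value, cold_below, hot_above):
    """Map a quantile value to -1 (cold), 0 (normal), or +1 (hot)."""
    if value < cold_below:
        return -1
    if value > hot_above:
        return 1
    return 0


def epoch_fingerprint(quantiles, thresholds):
    """Discretize every metric quantile of one epoch, in sorted key order."""
    return [discretize(quantiles[k], *thresholds[k]) for k in sorted(quantiles)]


def crisis_fingerprint(epoch_vectors):
    """Average the per-epoch hot/normal/cold vectors element-wise over time."""
    n = len(epoch_vectors)
    return [sum(column) / n for column in zip(*epoch_vectors)]


# Hypothetical thresholds: (cold_below, hot_above) per metric quantile.
thresholds = {"cpu_p50": (20.0, 80.0), "latency_p95": (50.0, 300.0)}

# Hypothetical quantile summaries for four consecutive epochs of one crisis.
epochs = [
    {"cpu_p50": 45.0, "latency_p95": 120.0},  # normal, normal
    {"cpu_p50": 91.0, "latency_p95": 450.0},  # hot, hot
    {"cpu_p50": 95.0, "latency_p95": 40.0},   # hot, cold
    {"cpu_p50": 88.0, "latency_p95": 500.0},  # hot, hot
]

vectors = [epoch_fingerprint(e, thresholds) for e in epochs]
fingerprint = crisis_fingerprint(vectors)
print(fingerprint)  # [0.75, 0.25]
```

In this sketch a value near +1 means the metric quantile was hot for most of the crisis, a value near -1 means it was mostly cold, and a value near 0 means it was mostly normal or mixed.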
[Figure: metric selection model. Model input: all metrics, per period labeled OK, OK, CRISIS. Model output: binary.]
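One way to picture a model that takes all metrics as input and a binary crisis label as output is to score each metric by how well it separates crisis epochs from OK epochs. The sketch below uses a simple standardized mean difference as a stand-in; this is a swapped-in technique chosen for brevity, not the model used by the authors, and the function name `rank_metrics` and the sample values are hypothetical.

```python
import statistics


def rank_metrics(epochs, labels):
    """Rank metric names by how strongly they separate crisis from OK epochs.

    epochs: list of dicts mapping metric name -> value for one epoch.
    labels: list of bools, True when the epoch is a crisis (the binary output).
    """
    scores = {}
    for name in epochs[0]:
        crisis = [e[name] for e, is_crisis in zip(epochs, labels) if is_crisis]
        ok = [e[name] for e, is_crisis in zip(epochs, labels) if not is_crisis]
        # Fall back to 1.0 when the metric is constant, to avoid dividing by zero.
        spread = statistics.pstdev(crisis + ok) or 1.0
        scores[name] = abs(statistics.mean(crisis) - statistics.mean(ok)) / spread
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical data: two OK epochs followed by two crisis epochs.
epochs = [
    {"cpu": 40.0, "workload": 1000.0, "latency": 100.0},
    {"cpu": 42.0, "workload": 1010.0, "latency": 110.0},
    {"cpu": 41.0, "workload": 1005.0, "latency": 400.0},
    {"cpu": 43.0, "workload": 1000.0, "latency": 420.0},
]
labels = [False, False, True, True]

ranked = rank_metrics(epochs, labels)
relevant = ranked[:1]  # keep only the top-scoring metric in this toy example
print(relevant)  # ['latency']
```

Any classifier that exposes per-feature importance could play the same role; the point of the sketch is only the shape of the problem (many metrics in, one binary label out, a short list of relevant metrics kept).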
[Figure: histogram of CPU utilization across servers. x-axis: CPU utilization, 0% to 100%; y-axis: # servers. Marked quantiles: 25th percentile, 50th percentile (median), 95th percentile.]
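Summarizing one metric across all servers with the three quantiles named in the figure can be sketched with Python's standard library. The function name `summarize_metric` and the sample data are hypothetical, and the choice of the inclusive interpolation method is an assumption; the slides do not specify how the quantiles are computed.

```python
import statistics


def summarize_metric(values):
    """Summarize one metric across servers by its 25th, 50th, and 95th percentiles."""
    # With n=100, quantiles() returns 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return {"p25": cuts[24], "p50": cuts[49], "p95": cuts[94]}


# Hypothetical CPU utilization readings (percent) from 101 servers: 0, 1, ..., 100.
cpu_utilization = [float(i) for i in range(101)]

summary = summarize_metric(cpu_utilization)
print(summary)  # {'p25': 25.0, 'p50': 50.0, 'p95': 95.0}
```

Three numbers per metric replace one value per server, so the size of the summary does not grow with the number of servers.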
[Figure: timeline divided into epochs, with periods labeled OK, OK, CRISIS.]