Resource Availability Prediction in Fine-Grained Cycle Sharing Systems
Xiaojuan Ren, Seyong Lee, Rudolf Eigenmann, Saurabh Bagchi School of ECE, Purdue University Presented by: Saurabh Bagchi
Work supported by National Science Foundation
Slowdown: the effect on the host job due to contention from the guest job
[Figure: observed machine CPU usage (%) over time for a host group, with two thresholds Th1 and Th2. Below Th1: no UEC. Between Th1 and Th2: no UEC when the guest runs at minimized priority. Above Th2: UEC, so terminate the guest.]
– Guest applications: SPEC CPU2000 benchmark suite
– Host workload: Musbus Unix benchmark suite
– 300 MHz Solaris Unix machine with 384 MB physical memory
– Measure host CPU reduction by running a guest application together with a set of host workloads
– Memory thrashing happens when processes demand more memory than the system has
– The impacts of CPU and memory contention can be isolated
– The two thresholds, Th1 and Th2, can still be applied to quantify CPU contention
S1: Machine CPU load is in [0%, Th1]
S2: Machine CPU load is in (Th1, Th2]
S3: Machine CPU load is in (Th2, 100%] -- UEC
S4: Memory thrashing -- UEC
S5: Machine unavailability -- URR
[Figure: state-transition diagram over S1–S5.]
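As a minimal illustration of the five-state classification above, a monitoring sample can be mapped to a state from its observed CPU load, a memory-thrashing flag, and machine liveness. The threshold values and function name here are illustrative assumptions, not the measured thresholds from the experiments.

```python
# Hypothetical sketch of the S1-S5 state classification.
# TH1/TH2 are assumed values; the paper derives the real thresholds
# from the 5%-slowdown experiments.
TH1 = 0.10  # assumed: below this load, a guest causes no UEC
TH2 = 0.50  # assumed: above this load, contention is unacceptable

def classify_state(cpu_load, thrashing, machine_up):
    """Return the state index 1..5 for one monitoring sample.
    cpu_load is the machine CPU load as a fraction in [0, 1]."""
    if not machine_up:
        return 5          # S5: machine unavailable (URR)
    if thrashing:
        return 4          # S4: memory thrashing (UEC)
    if cpu_load <= TH1:
        return 1          # S1: load in [0, Th1]
    if cpu_load <= TH2:
        return 2          # S2: load in (Th1, Th2]
    return 3              # S3: load in (Th2, 1] (UEC)
```
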
The probability that a resource will be available throughout a future time window
Daily patterns of host workloads are comparable among recent days
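Since daily workload patterns repeat across recent days, the semi-Markov parameters can be estimated empirically from observed state traces. The following sketch (function name and fixed-interval sampling are assumptions of this illustration) counts observed transitions and holding times to build the Q and H matrices used later in the prediction.

```python
import numpy as np

def estimate_Q_H(trace, n_states, max_hold):
    """Estimate semi-Markov parameters from a state trace sampled at
    fixed intervals (states numbered 0..n_states-1).
    Q[i, k]    : probability the next transition out of S_i enters S_k.
    H[i, k, l] : probability of holding in S_i exactly l samples before
                 that transition to S_k."""
    counts = np.zeros((n_states, n_states))
    hold = np.zeros((n_states, n_states, max_hold + 1))
    i, start = trace[0], 0
    for t in range(1, len(trace)):
        if trace[t] != i:
            k, l = trace[t], t - start    # transition i -> k after l samples
            counts[i, k] += 1
            if l <= max_hold:
                hold[i, k, l] += 1
            i, start = k, t
    # normalize row-wise, guarding against states never left
    row = counts.sum(axis=1, keepdims=True)
    Q = np.divide(counts, row, out=np.zeros_like(counts), where=row > 0)
    tot = hold.sum(axis=2, keepdims=True)
    H = np.divide(hold, tot, out=np.zeros_like(hold), where=tot > 0)
    return Q, H
```

Separate traces can be kept per day type (weekday vs. weekend) so that a window is predicted from the pattern of comparable recent days.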
S: state space
Q: transition probability matrix
  Q_{i,j} = Pr{ the process that has entered S_i will enter S_j on its next transition }
H: holding-time mass function matrix
  H_{i,j}(m) = Pr{ the process that has entered S_i remains at S_i for m time units before the next transition to S_j }
P: interval transition probability matrix
  P_{i,j}(m) = Pr{ S(t0+m) = j | S(t0) = i }
– Backward Kolmogorov integral equations: too inefficient for online prediction
– Recursive equations (discrete time units):
  P_{i,j}(m) = δ_{i,j} [ 1 − Σ_{k∈S} Σ_{l=1}^{m} Q_{i,k} H_{i,k}(l) ] + Σ_{k∈S} Σ_{l=1}^{m} Q_{i,k} H_{i,k}(l) P_{k,j}(m−l)
TR(W): the probability of not transferring to S3, S4, or S5 within an arbitrary time window W of size T:
  TR(W) = 1 − Σ_{j∈{3,4,5}} P_{init,j}(T)
where "init" is the state occupied at the start of W.
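As a rough illustration (not the paper's implementation), the discrete recursion for the interval transition probabilities P_{i,j}(m), and the availability measure TR derived from it, can be sketched in Python. Function names are assumptions of this sketch; note that for TR to mean "never enters an unavailability state during the window", S3–S5 must be modeled as absorbing in the supplied Q and H.

```python
import numpy as np

def interval_probs(Q, H, T):
    """P[m, i, j] = Pr{S(t0+m)=j | S(t0)=i}, by the discrete recursion.
    Q: (n, n) transition probability matrix.
    H: (n, n, T+1) holding-time mass functions; H[i, k, l] is the
       probability of staying in S_i exactly l units before moving to S_k.
    Cost is O(n^2 * T^2), cheap enough for online use."""
    n = Q.shape[0]
    P = np.zeros((T + 1, n, n))
    P[0] = np.eye(n)
    for m in range(1, T + 1):
        for i in range(n):
            # probability of making no transition out of S_i within m units
            stay = 1.0 - sum(Q[i, k] * H[i, k, 1:m + 1].sum()
                             for k in range(n))
            for j in range(n):
                acc = stay if i == j else 0.0
                for k in range(n):
                    for l in range(1, m + 1):
                        acc += Q[i, k] * H[i, k, l] * P[m - l, k, j]
                P[m, i, j] = acc
    return P

def TR(Q, H, T, init, bad):
    """1 - probability of ending in a 'bad' state after T units.
    With the bad states made absorbing, this equals the probability of
    never transferring into them within the window."""
    P = interval_probs(Q, H, T)
    return 1.0 - sum(P[T, init, j] for j in bad)
```

For the five-state model above, `bad` would be the (zero-based) indices of S3, S4, and S5.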
[Figure: system architecture. A Client submits jobs to a Gateway running the Job Scheduler; each Host Node runs the Resource Monitor, the Predictor, and the Guest Process alongside the Host Processes.]
Categories                Frequency   Percentage
URR                       83–121      19–30%
UEC: CPU contention       283–356     69–79%
UEC: memory contention    3–12        0–3%
Total amount              405–453     100%
* R. Wolski, N. Spring, and J. Hayes, "The Network Weather Service: A Distributed Resource Performance Forecasting Service for Metacomputing", FGCS, 1999.
** Toolset from P. A. Dinda and D. R. O'Hallaron, "An evaluation of linear models for host load prediction", HPDC, 1999.
[Figure: predictor cost vs. time window length (1–10 hr): total computation time and Q and H computation time, both in ms.]
Less than 0.006% overhead to a remote job
[Figure: relative error of predicted TR (%) vs. window length (1–10 hr), for prediction on weekdays and prediction on weekends, with the average marked.]
Relative error = abs(TR_predicted − TR_empirical) / TR_empirical
Predictions over 24 different time windows on 20 machines:
– Accuracy is higher than 86% on average
– Accuracy is higher than 73% in the worst case
Model       Description
LAST        Last measured value
AR(p)       Purely autoregressive models with p coefficients
BM(p)       Mean over the previous N values (N < p)
MA(p)       Moving average models with p coefficients
ARMA(p,q)   Autoregressive moving average models with p+q coefficients
[Figure: relative error of predicted TR vs. time window length (1–10 hr), comparing SMP, AR(8), BM(8), MA(8), ARMA(8,8), and LAST; errors range up to 250%.]
Maximum prediction errors over time windows starting at 8:00 am on weekdays
Resource Prediction System: http://www.cs.cmu.edu/~cmcl/remulac/remos.html
[Figure: prediction discrepancy (0–100%) vs. amount of injected noise, for time windows T = 1, 2, 3, 5, and 10 hr.]
Sahoo et al., ACM SIGKDD, 2003
– CPU-intensive guest applications
– Host groups consisting of multiple processes with diverse CPU usage
– 1.7 GHz Redhat Linux machine
Two settings: all processes have the same priority; the guest process takes the lowest priority.
[Figure: reduction rate of host CPU usage due to resource contention (0–50%) vs. host CPU usage in absence of the guest process (L_H, 0.1–1), for 1–5 host processes; one panel per priority setting. Annotated points: (Th1, 5%) and (Th2, 5%).]
[Figure, left: actual CPU usage of the guest process vs. CPU usage in isolation (Host+Guest), for equal priority and nice −19. Right: degradation of host CPU usage under contention vs. host CPU usage in isolation and guest priority.]
[Figure: reduction rate of host CPU usage due to resource contention (0–40%) for host workloads H1–H6 and guest applications apsi, galgel, bzip2, and mcf; one panel for a guest process with priority 0 and one with priority 19.]
Linux CPU scheduler
AR(p): regresses the current value of the series against one or more prior values of the series; p is the order of the AR model. Linear least squares techniques (Yule-Walker) are used for model fitting.
MA(p): regresses the current value of the series against the white noise or random shocks of one or more prior values of the series. Iterative non-linear fitting procedures (Powell's method) need to be used in place of linear least squares.
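The Yule-Walker fitting mentioned above for AR models can be sketched as follows. This is an illustrative, self-contained implementation (function names are assumptions), solving the Yule-Walker system R a = r built from sample autocovariances.

```python
import numpy as np

def fit_ar_yule_walker(x, p):
    """Fit AR(p) coefficients via the Yule-Walker equations:
    solve R a = r, where R is the (Toeplitz) autocovariance matrix
    of lags 0..p-1 and r holds autocovariances of lags 1..p."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    # biased autocovariance estimates for lags 0..p
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(p + 1)])
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, r[1:])

def predict_next(x, a):
    """One-step-ahead AR prediction from the last p values
    (mean-removed, most recent value first)."""
    p = len(a)
    mu = np.mean(x)
    recent = np.asarray(x[-p:], dtype=float) - mu
    return mu + np.dot(a, recent[::-1])
```

With a long enough trace of host load samples, the fitted coefficients recover the generating AR process closely; nonlinear methods such as Powell's are only needed once MA terms enter the model.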