1
A Study of Deadline Scheduling for Client-Server Systems on the Computational Grid
Atsuko Takefusa, JSPS/TITECH
Henri Casanova, UCSD/SDSC
Satoshi Matsuoka, TITECH/JST
Francine Berman, UCSD/SDSC
http://ninf.is.titech.ac.jp/bricks/
1
2
A promising platform for the
A crucial issue is Scheduling
Most scheduling works aim at improving
3
Grid software which provides a service on the
e.g. Ninf, NetSolve, Nimrod
Client-server architecture
RPC-style programming model
Many high-profile applications from science
Molecular biology, genetic information, operations
4
Resource economy model (e.g. [Zhao and
Nimrod [Abramson ’00] presents a study of
5
Our goal is to minimize
The overall occurrences of deadline misses
The resource cost
Each request comes with a deadline
Deadline-scheduling algorithm under simple
Simulation on Bricks
6
Overview of Bricks and its improvement
More scalable and realistic simulations
A Deadline-scheduling algorithm for multi-
Load Correction mechanism
Fallback mechanism
Experiments in multi-client multi-server
Resource load, resource cost, conservatism of
7
A Grid simulation framework to evaluate
Scheduling algorithms
Scheduling framework components
Bricks provides
Reproducible and controlled Grid evaluation
Flexible setups of simulation environments (Grid
Evaluation environment for external Grid
8
[Diagram: Client, Server, and Network components, with NetworkPredictor and ServerPredictor]
9
[Diagram: multiple Client and Server nodes]
10
Many NES scheduling strategies ?
Deadline-scheduling:
Aims at meeting user-supplied job deadline
11
Wsend, Wrecv, Ws: send/recv data size, and logical comp. cost
Psend, Precv, Pserv: estimated send/recv throughput, and performance
[Figure: estimated job execution time (Send, Comp., Recv) on Server 1, Server 2, and Server 3, shown between now and the Deadline (Tuntil deadline)]
12
[Figure: estimated job execution time (Send, Comp., Recv) on Server 1, Server 2, and Server 3, shown against Ttarget and Tuntil deadline]
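The server choice on these two slides can be sketched in code. This is a minimal sketch, not the authors' implementation. It assumes the estimated job execution time is Wsend/Psend + Ws/Pserv + Wrecv/Precv, that Ttarget = Opt x Tuntil deadline (Opt being the parameter listed with the Deadline strategy in the experiments), and that the scheduler picks the server whose estimate is closest to Ttarget without exceeding it, falling back to the fastest server when none fits. The names Server, estimated_time, and pick_server are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    p_send: float  # Psend: estimated send throughput
    p_recv: float  # Precv: estimated recv throughput
    p_serv: float  # Pserv: estimated server performance


def estimated_time(w_send, w_recv, w_s, server):
    """Assumed estimate: send time + computation time + recv time."""
    return (w_send / server.p_send
            + w_s / server.p_serv
            + w_recv / server.p_recv)


def pick_server(servers, w_send, w_recv, w_s, t_until_deadline, opt):
    """Pick a server name under the assumed rule described above."""
    t_target = opt * t_until_deadline  # assumption: Ttarget = Opt * Tuntil deadline
    times = {s.name: estimated_time(w_send, w_recv, w_s, s) for s in servers}
    fits = [name for name, t in times.items() if t <= t_target]
    if fits:
        # Slowest server that still meets the target, leaving faster ones free.
        return max(fits, key=lambda name: times[name])
    # No server meets the target: take the fastest estimate.
    return min(times, key=lambda name: times[name])


servers = [
    Server("server1", p_send=100.0, p_recv=100.0, p_serv=10.0),
    Server("server2", p_send=50.0, p_recv=50.0, p_serv=5.0),
    Server("server3", p_send=25.0, p_recv=25.0, p_serv=2.5),
]
choice = pick_server(servers, w_send=1000.0, w_recv=1000.0, w_s=100.0,
                     t_until_deadline=100.0, opt=0.8)
```

Under this reading, a smaller Opt makes the scheduler more conservative: it leaves more slack before the deadline, at the price of pushing jobs to faster servers.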
13
Accuracy of predictions is not guaranteed
Monitoring systems do not perceive load
Tasks might be out-of-order in FCFS queues
14
Scheduling decisions will result in an increase
Server can estimate whether it will be able to
15
Modify load predictions from monitoring system,
server Si
NetworkPredictor ServerPredictor
Corrected prediction
16
Server can estimate whether it will be able to
Fallback happens when:
Tsend: Comm. duration (send)
ETexec, ETrecv: Estimated comp. and comm. (recv) duration
Fallback Re-submit
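The fallback condition itself did not survive in this transcript, so the following is only a sketch of one plausible reading: a server rejects a request when the send time already spent (Tsend) plus its estimated execution and receive times (ETexec, ETrecv) exceeds the time remaining until the deadline, and the client then re-submits to another server, up to Nmax times. The function names, the tuple layout, and the rule that the time spent sending to a rejecting server is lost are all assumptions.

```python
def should_fall_back(t_remaining, t_send, et_exec, et_recv):
    """Assumed test: the request cannot finish before the deadline."""
    return t_send + et_exec + et_recv > t_remaining


def run_with_fallback(attempts, t_until_deadline, n_max):
    """Try servers in order; return (accepting server, number of fallbacks).

    attempts: list of (name, t_send, et_exec, et_recv) tuples, in the order
    the scheduler would choose them (hypothetical layout).
    """
    elapsed = 0.0
    fallbacks = 0
    for name, t_send, et_exec, et_recv in attempts:
        remaining = t_until_deadline - elapsed
        if not should_fall_back(remaining, t_send, et_exec, et_recv):
            return name, fallbacks
        if fallbacks == n_max:
            # Fallback budget exhausted: the job stays on this server.
            return name, fallbacks
        elapsed += t_send  # assumption: time spent sending is lost
        fallbacks += 1
    return None, fallbacks


attempts = [("server1", 10.0, 50.0, 10.0), ("server2", 5.0, 20.0, 5.0)]
result = run_with_fallback(attempts, t_until_deadline=60.0, n_max=1)
```

With Nmax = 0 the same request stays on the first server even though it is predicted to miss its deadline, which corresponds to running without the Fallback mechanism.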
17
Experiments in multi-client multi-server
Resource load, resource cost, conservatism of
Performance criteria:
Failure rate: Percentage of requests that missed
Resource cost: Avg. resource cost over all
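The two criteria can be computed from per-request records as in the sketch below. It assumes a request fails when it finishes after its deadline and that the resource cost is averaged over all requests; both bullets are truncated in this transcript, and how the cost of a single request is defined is not shown here. The record layout and function names are hypothetical.

```python
def failure_rate(records):
    """Percentage of requests whose finish time is later than their deadline."""
    missed = sum(1 for finish, deadline, _cost in records if finish > deadline)
    return 100.0 * missed / len(records)


def mean_resource_cost(records):
    """Average resource cost over all requests (cost taken as given)."""
    return sum(cost for _finish, _deadline, cost in records) / len(records)


# Each record is (finish_time, deadline, cost); the values are illustrative only.
records = [
    (10.0, 12.0, 1.0),
    (15.0, 12.0, 3.0),
    (8.0, 8.0, 2.0),
    (9.0, 5.0, 2.0),
]
rate = failure_rate(records)
cost = mean_resource_cost(records)
```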
18
Greedy: Typical NES scheduling strategy
Deadline (Opt = 0.5, 0.6, 0.7, 0.8, 0.9)
Load Correction (on/off)
Fallback (Nmax fallbacks = 0/1/2/3/4/5)
19
Grid Computing Environment (approx. 75 nodes, 5 Grids)
# of local domain: 10, # of local domain nodes: 5-10
Characteristics of client jobs
Send/recv data size: 100-5000 [Mbits]
# of instructions: 1.5-1080 [Gops]
60(high load), 90(medium load), 120(low load) [min]
20
The Presto II cluster:
Dual Pentium III 800MHz
Memory: 640MB
Network: 100Base/TX
Use APST[Casanova ’00] to
24 hour simulation x 2,500 runs
21
[Figure: failure rate [%] for Greedy and D-0.5 to D-0.9; series x/x, L/x, x/F, L/F]
22
“Low” load leads to improved failure rates
All show similar characteristics
[Figure: three panels of failure rate [%] for Greedy and D-0.5 to D-0.9; series x/x, L/x, x/F, L/F]
23
[Figure: results for Greedy and D-0.5 to D-0.9 on an axis from 50 to 500; series x/x, L/x, x/F, L/F]
24
[Figure: failure rate [%] for Greedy and D-0.5 to D-0.9; series 1, 2, 3, 4, 5]
25
[Figure: results for Greedy and D-0.5 to D-0.9 on an axis from 50 to 500; series 1, 2, 3, 4, 5]
26
Economy model:
Nimrod [Abramson ’00]
Uses a self-scheduler
Targets parameter sweep apps. from a single user
Grid performance evaluation systems:
MicroGrid [Song ’00]
Emulates a virtual Globus Grid on an actual cluster
Not appropriate for large numbers of experiments
Simgrid [Casanova ’01]
A trace-based discrete event simulator
Provides primitives for simulation of application scheduling
Lacks the network-modeling feature Bricks provides
27
Proposed a deadline-scheduling algorithm for
Investigated performance in multi-client multi-
The experiments showed
It is possible to make a trade-off between failure-
Load Correction may not be useful
Future NES systems should use deadline-scheduling
28
Make Bricks support more sophisticated
Investigate their feasibility and improve our
Implement the deadline-scheduling algorithm