The Gain of Resource Delegation in Distributed Computing Environments
Alexander Fölling, Christian Grimme, Joachim Lepping, and Alexander Papaspyrou
15th Workshop on Job Scheduling
technische universität dortmund
Robotics Research Institute
Outline
- Motivation
- System Model
- Resource Delegation Policy
- Evaluation
  - Setup
  - Results
- Conclusion and Future Work
Alexander Fölling | April 23, 2010
Motivation
- Distributed computing infrastructures (DCIs) have reached production status
- More and more users draw their computing resources from Grid and Cloud infrastructures
- Many DCIs are heavily utilized and produce significant revenue
- Cloud infrastructures allow easy on-demand provisioning of resources (enlargement of the local resource space)
  - Infrastructure as a Service (IaaS) through virtualization technology
  - Simple access and pricing model
- The temporary extension of the local resource space allows more flexible scheduling decisions
  - Locally, the problem is no longer a traditional parallel job scheduling problem with a fixed set of parallel machines (Pm, Rm, Qm models)
  - On-demand resource leasing may improve scheduling performance
System Model
[Figures: system model diagram, built up over four slides. Recoverable labels: Resource Negotiation, Workload Analysis, Forward Jobs, Remote Access.]
Properties of Resource Delegation
- Different from centralized scheduling with multi-site execution
  - No central scheduling component, only independent sites
- A scheduler cedes full control over resources to another scheduler once resource access is granted (for a certain period)
- Resource leasing enlarges the local resource space
- Scheduling decisions are made exclusively by local schedulers
- Resources may be used immediately or later during the leasing period
- Advanced scheduling strategies may support both
  - local allocations under varying machine sizes
  - planning of future resource requirements
- Each participant in a DCI is both resource consumer and resource provider
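The properties above can be illustrated with a minimal Python sketch of how a time-limited lease changes the machine size seen by each local scheduler. The names `Lease`, `Site`, `capacity_at`, and `delegate` are hypothetical and do not come from the slides, and the sketch assumes that a lease is described only by a CPU count and a time window.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Lease:
    """Hypothetical record of a delegation: CPUs ceded for a time window."""
    cpus: int
    start: int  # lease start time in seconds
    end: int    # lease end time in seconds (exclusive)

    def active_at(self, t: int) -> bool:
        return self.start <= t < self.end


@dataclass
class Site:
    """A participant that is both resource consumer and resource provider."""
    name: str
    own_cpus: int
    leased_in: List[Lease] = field(default_factory=list)
    leased_out: List[Lease] = field(default_factory=list)

    def capacity_at(self, t: int) -> int:
        # Local machine size at time t: own CPUs plus CPUs currently
        # leased from others, minus CPUs currently ceded to others.
        gained = sum(l.cpus for l in self.leased_in if l.active_at(t))
        ceded = sum(l.cpus for l in self.leased_out if l.active_at(t))
        return self.own_cpus + gained - ceded


def delegate(provider: Site, consumer: Site, cpus: int, start: int, end: int) -> Lease:
    # Assumption: the provider has already agreed to the request; this
    # sketch does not check whether the provider can spare the CPUs.
    lease = Lease(cpus, start, end)
    provider.leased_out.append(lease)
    consumer.leased_in.append(lease)
    return lease


site1 = Site("Site 1", own_cpus=4)
site2 = Site("Site 2", own_cpus=8)
delegate(provider=site2, consumer=site1, cpus=2, start=0, end=100)
print(site1.capacity_at(50), site2.capacity_at(50))
```

During the lease window the consumer's local scheduler plans with the enlarged machine and the provider's with the reduced one; after the window both return to their own size.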
Submission Triggered Resource Delegation Policy (ST-RDP)
[Figures: flowchart and two-site example, built up over eight slides.]
Flowchart labels, in order of appearance:
- Forward Job To LRMS
- Check Resource Availability
  - [Enough]
  - [Not Enough]: Try to Lend Resource Deficiency (request the missing resources from another site)
    - [Request Accepted]: Prioritize New Job
    - [Request Denied]
Example labels: a Queue and one Schedule each for Site 1 and Site 2, with jobs 1, 2, and 3 already present. A new job 4 requires 4 CPUs for 100 seconds while 2 CPUs are idle, so a request for 2 CPUs for 100 seconds is issued. On the final slide the job appears as 4a and 4b across the schedules of Site 1 and Site 2.
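The decision flow labeled on the ST-RDP slides can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `Job`, `on_submission`, and the `request_lease` callback are hypothetical, and the slides do not say what happens after [Enough] or [Request Denied], so the sketch only reports which branch was taken.

```python
from dataclasses import dataclass
from typing import Callable

# Branch labels taken from the flowchart.
ENOUGH = "enough"
PRIORITIZE = "prioritize new job"
DENIED = "request denied"


@dataclass
class Job:
    job_id: int
    cpus: int        # number of CPUs the job requires
    runtime_s: int   # requested runtime in seconds


# Hypothetical callback: (cpus, duration_s) -> True if the remote site
# accepts the delegation request.
LeaseRequest = Callable[[int, int], bool]


def on_submission(job: Job, idle_cpus: int, request_lease: LeaseRequest) -> str:
    """Run the submission-triggered check for one newly submitted job."""
    # Check Resource Availability: how many CPUs are missing locally?
    deficiency = job.cpus - idle_cpus
    if deficiency <= 0:
        return ENOUGH
    # Not enough: try to lease exactly the missing CPUs for the job's runtime.
    if request_lease(deficiency, job.runtime_s):
        return PRIORITIZE
    return DENIED


# Example from the slides: job 4 needs 4 CPUs for 100 seconds, 2 CPUs are idle.
sent_requests = []

def accepting_site(cpus: int, duration_s: int) -> bool:
    sent_requests.append((cpus, duration_s))
    return True

decision = on_submission(Job(4, cpus=4, runtime_s=100), idle_cpus=2,
                         request_lease=accepting_site)
print(decision, sent_requests)
```

With the slide's numbers the sketch issues a single request for 2 CPUs for 100 seconds, matching the "Request 2 CPUs for 100 seconds" label.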
Evaluation Setup
- Input data
  - Real workload traces from the Parallel Workloads Archive
  - KTH, CTC, SDSC05
  - ~100 – 1600 CPUs, ~28000 – 74000 jobs (first 11 months)
- Local resource management system (LRMS)
  - EASY backfilling
- Evaluation objectives
  - Improvements in average weighted response time (AWRT)
  - Reconfiguration behavior
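The slides do not define AWRT. The Python sketch below assumes the common definition in which each job's response time (completion time minus submission time) is weighted by its resource consumption (runtime times number of CPUs). The names `FinishedJob`, `awrt`, and `improvement_percent` are hypothetical, and the numbers in the usage lines are made up for illustration, not taken from the evaluation.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FinishedJob:
    submit_time: float      # seconds
    completion_time: float  # seconds
    runtime: float          # seconds
    cpus: int


def awrt(jobs: Iterable[FinishedJob]) -> float:
    """Average weighted response time, weighted by runtime * cpus (assumed)."""
    jobs = list(jobs)
    total_weight = sum(j.runtime * j.cpus for j in jobs)
    if total_weight == 0:
        raise ValueError("AWRT is undefined without any resource consumption")
    weighted_sum = sum(
        j.runtime * j.cpus * (j.completion_time - j.submit_time) for j in jobs
    )
    return weighted_sum / total_weight


def improvement_percent(baseline: float, with_delegation: float) -> float:
    """Relative AWRT improvement in percent; positive means AWRT got smaller."""
    return 100.0 * (baseline - with_delegation) / baseline


# Illustrative values only.
example_jobs = [
    FinishedJob(submit_time=0, completion_time=100, runtime=100, cpus=4),
    FinishedJob(submit_time=0, completion_time=300, runtime=100, cpus=2),
]
print(awrt(example_jobs))
print(improvement_percent(baseline=200.0, with_delegation=150.0))
```

Weighting by resource consumption keeps a large job from counting the same as a tiny one, so splitting or merging jobs does not by itself change the metric much.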
Results: ST-RDP Performance
[Figure: AWRT improvements in % (axis ticks from 5 to 25) for Setup 1, Setup 2, and Setup 3; series KTH-11, CTC-11, SDSC05-11.]
Results: Reconfiguration behavior
- KTH and CTC over 11 months with ST-RDP
[Figure: two panels, KTH and CTC, plotting the number of CPUs (axis ticks between 100 and 500) against time in months (ticks 2 to 10).]
Conclusion
- Proposed a new concept for resource delegation in DCIs
  - Parallel job scheduling problems under varying machine sizes
  - Resource requirements can be negotiated flexibly among participants
- Evaluated a simple resource delegation method
  - No need for further information exchange
  - Robust in changing environments
- Results show significant benefits for local scheduling (improvement in AWRT)
- During operation, many resources are delegated among sites
Future Work
- Application to larger DCI environments
- Consider additional location policies that decide which site to ask first for delegation
- Long-term planning of resource leasing/delegation
  - Not only single-job decisions
  - Decisions should be based on workload records (user behavior, submission patterns, etc.)
  - Eventually, base decisions on predicted user behavior
- Consider additional parameters such as local queue/schedule status
Thank You