Dynamic Fractional Resource Scheduling for HPC Workloads

Mark Stillwell (1), Frédéric Vivien (2, 1), Henri Casanova (1)
(1) Department of Information and Computer Sciences, University of Hawai’i at Mānoa
(2) INRIA, France
Invited Talk, October 8, 2009
Outline: Scheduling, DFRS, Heuristics, Experiments, Conclusions

Formalization: HPC Job Scheduling Problem
- N > 0 homogeneous nodes
- J > 0 jobs; each job j has:
  - an arrival time r_j ≥ 0
  - a number of tasks t_j, with 0 < t_j ≤ N
  - a compute time c_j > 0
- J is not known
- r_j and t_j are not known before r_j
- c_j is not known until j completes

Formalization: Schedule Evaluation
- makespan: not relevant for unrelated jobs
- flow time: over-emphasizes very long jobs
- stretch: re-balances in favor of short jobs
- average stretch: prone to starvation
- max stretch: helps with the average while bounding the worst case
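The metrics on this slide can be illustrated with a small sketch. It assumes the usual definitions, which the slide does not spell out: flow time is completion time minus arrival time, and stretch is flow time divided by the compute time c_j. The job data and all names below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FinishedJob:
    arrival: float     # r_j
    compute: float     # c_j, time needed on a dedicated system
    completion: float  # time at which the job actually finished


def flow_time(job):
    """Time the job spent in the system."""
    return job.completion - job.arrival


def stretch(job):
    """Flow time normalized by compute time (assumed definition)."""
    return flow_time(job) / job.compute


def average_stretch(jobs):
    return sum(stretch(j) for j in jobs) / len(jobs)


def max_stretch(jobs):
    return max(stretch(j) for j in jobs)


# A long job runs first; a short job arriving just after it has to wait.
jobs = [
    FinishedJob(arrival=0.0, compute=100.0, completion=100.0),
    FinishedJob(arrival=1.0, compute=1.0, completion=101.0),
]
print(flow_time(jobs[1]), stretch(jobs[1]))
print(average_stretch(jobs), max_stretch(jobs))
```

Both jobs have the same flow time here, but the short job's stretch is 100 times larger, which is the sense in which stretch re-balances in favor of short jobs.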

Current Approaches
Batch Scheduling, which no one likes
- usually FCFS with backfilling
- backfilling needs (unreliable) compute time estimates
- unbounded wait times
- poor resource utilization
- no particular objective
Gang Scheduling, which no one uses
- globally coordinated time sharing
- complicated and slow
- memory pressure a concern

Dynamic Fractional Resource Scheduling: VM Technology
- basically, time sharing
- pooling of discrete resources (e.g., multiple CPUs)
- hard limits on resource consumption
- job preemption and task migration

Dynamic Fractional Resource Scheduling: Problem Formulation
- extends the basic HPC problem
- jobs now have a per-task CPU need α_j and memory requirement m_j
- multiple tasks can run on one node if their total memory requirement is ≤ 100%
- all tasks of a job must be assigned equal amounts of CPU resource
- assigning less than the need results in a proportional slowdown
- assigned allocations can change
- there are no run-time estimates, so we need another metric to optimize

Dynamic Fractional Resource Scheduling: Yield
Definition: the yield y_j(t) of job j at time t is the ratio of the CPU allocation given to the job to the job’s CPU need.
- requires no knowledge of flow or compute times
- can be optimized for at each scheduling event
- maximizing the minimum yield is related to minimizing the maximum stretch
How do we keep track of job progress when the yield can vary?
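A minimal sketch of the yield definition, assuming CPU allocations and needs are expressed as fractions of one node's CPU. The function names and the example numbers are hypothetical.

```python
def job_yield(cpu_allocation, cpu_need):
    """Yield = CPU allocation given to the job / the job's CPU need."""
    if cpu_need <= 0:
        raise ValueError("CPU need must be positive")
    return cpu_allocation / cpu_need


def minimum_yield(allocations, needs):
    """Smallest yield over all jobs, the quantity a scheduler would maximize."""
    return min(job_yield(allocations[name], needs[name]) for name in needs)


# Two hypothetical jobs sharing one node's CPU.
needs = {"a": 0.5, "b": 1.0}
allocations = {"a": 0.25, "b": 0.75}
print({name: job_yield(allocations[name], needs[name]) for name in needs})
print(minimum_yield(allocations, needs))
```

Computing the yield only needs the current allocation and the declared need, so it can be evaluated at every scheduling event without knowing how long any job will run.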

Dynamic Fractional Resource Scheduling: Virtual Time
Definition: the virtual time v_j(t) of job j at time t is the subjective time experienced by the job.
v_j(t) = \int_{r_j}^{t} y_j(\tau) \, d\tau
- the job completes when v_j(t) = c_j
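Since allocations only change at scheduling events, the yield is piecewise constant and the integral reduces to a sum. The sketch below assumes the yield history is stored as sorted, non-overlapping (start, end, yield) segments beginning at r_j; this representation and the function names are assumptions for illustration.

```python
def virtual_time(segments, t):
    """Integrate a piecewise-constant yield from the first segment start to t.

    segments: sorted list of (start, end, yield) tuples.
    """
    total = 0.0
    for start, end, y in segments:
        if t <= start:
            break
        total += (min(end, t) - start) * y
    return total


def completion_time(segments, compute_time):
    """Earliest t with virtual time equal to compute_time, or None if not reached."""
    done = 0.0
    for start, end, y in segments:
        gained = (end - start) * y
        if y > 0 and done + gained >= compute_time:
            return start + (compute_time - done) / y
        done += gained
    return None


# Hypothetical job with r_j = 0 and c_j = 5: half speed, paused, then full speed.
history = [(0.0, 4.0, 0.5), (4.0, 6.0, 0.0), (6.0, 10.0, 1.0)]
print(virtual_time(history, 8.0))
print(completion_time(history, 5.0))
```

A paused (preempted) job simply has a segment with yield 0, so its virtual time stops advancing while its flow time keeps growing.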

Dynamic Fractional Resource Scheduling: The Need for Preemption
- the final goal is to minimize the maximum stretch
- without preemption, the stretch of non-clairvoyant on-line algorithms is unbounded
- consider 2 jobs:
  - both require all of the system resources
  - one has c_j = 1
  - the other has c_j = ∆
- we need criteria to decide which jobs should be preempted
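The two-job example can be worked through numerically. The sketch assumes one particular arrival order that the slide does not state: the long job arrives at time 0 and the short job at a small time eps afterwards, with eps < ∆. It also assumes stretch is flow time divided by compute time. All names are hypothetical.

```python
def max_stretch_without_preemption(delta, eps):
    """Long job (c = delta) starts at 0 and cannot be interrupted."""
    long_stretch = delta / delta             # runs undisturbed
    short_completion = delta + 1.0           # short job starts only at time delta
    short_stretch = (short_completion - eps) / 1.0
    return max(long_stretch, short_stretch)


def max_stretch_with_preemption(delta, eps):
    """Long job is paused at eps, the short job runs, then the long job resumes."""
    short_stretch = 1.0                      # runs immediately on arrival
    long_stretch = (delta + 1.0) / delta     # delayed by the short job's one time unit
    return max(long_stretch, short_stretch)


for delta in (10.0, 100.0, 1000.0):
    print(delta,
          max_stretch_without_preemption(delta, eps=0.5),
          max_stretch_with_preemption(delta, eps=0.5))
```

Under these assumptions the maximum stretch without preemption grows linearly with ∆, while with preemption it approaches 1 as ∆ grows. A non-clairvoyant scheduler cannot tell in advance which job is the long one, so it cannot avoid this case by ordering alone.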

Dynamic Fractional Resource Scheduling: Priority
Jobs should be preempted in order of increasing priority.
- newly arrived jobs may have infinite priority
- 1 / v_j(t): performs well, but is subject to starvation
- (t − r_j) / v_j(t): avoids starvation, but does not perform well
- (t − r_j) / (v_j(t))^2: seems a reasonable compromise
- other possibilities exist
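The three candidate priority functions can be compared on a small example. The sketch treats a job with zero virtual time as having infinite priority, which is one reading of the "newly arrived jobs" bullet; the job data and function names are hypothetical.

```python
import math


def prio_inverse_vt(t, r, v):
    """1 / v_j(t)."""
    return math.inf if v == 0 else 1.0 / v


def prio_flow_over_vt(t, r, v):
    """(t - r_j) / v_j(t)."""
    return math.inf if v == 0 else (t - r) / v


def prio_flow_over_vt_squared(t, r, v):
    """(t - r_j) / v_j(t)^2."""
    return math.inf if v == 0 else (t - r) / (v * v)


def preemption_order(jobs, t, priority):
    """jobs: name -> (r_j, v_j(t)). Returns names, lowest priority first."""
    return sorted(jobs, key=lambda name: priority(t, jobs[name][0], jobs[name][1]))


# At t = 100: A has waited long and made some progress, B is recent, "new" just arrived.
jobs = {"A": (0.0, 10.0), "B": (95.0, 2.0), "new": (100.0, 0.0)}
for fn in (prio_inverse_vt, prio_flow_over_vt, prio_flow_over_vt_squared):
    print(fn.__name__, preemption_order(jobs, 100.0, fn))
```

In this example 1 / v_j(t) would preempt the long-waiting job A first, (t − r_j) / v_j(t) protects A and preempts B first, and the squared variant again preempts A first but by a much smaller margin.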

Greedy Scheduling Heuristics
- GREEDY: put each task on the host with the lowest CPU demand on which it can fit into memory; new jobs may have to be resubmitted using bounded exponential backoff.
- GREEDY-PMTN: like GREEDY, but older tasks may be preempted.
- GREEDY-PMTN-MIGR: like GREEDY-PMTN, but older tasks may be migrated as well as preempted.
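A minimal sketch of the plain GREEDY placement rule, without preemption or migration. It assumes each host tracks its total CPU demand and used memory as fractions of one node, and that a job is placed all-or-nothing. The backoff helper shows one common form of bounded exponential backoff; its base and cap values are assumptions, as are all names.

```python
def greedy_place(num_tasks, cpu_need, mem_need, hosts):
    """Place all tasks of one job, or none.

    hosts: list of dicts with 'cpu' (total CPU demand) and 'mem' (memory in use).
    Returns the chosen host index per task, or None if some task does not fit.
    """
    trial = [dict(h) for h in hosts]  # work on a copy so failure leaves hosts intact
    placement = []
    for _ in range(num_tasks):
        fits = [i for i, h in enumerate(trial) if h["mem"] + mem_need <= 1.0]
        if not fits:
            return None
        best = min(fits, key=lambda i: trial[i]["cpu"])  # lowest CPU demand
        trial[best]["cpu"] += cpu_need
        trial[best]["mem"] += mem_need
        placement.append(best)
    for host, updated in zip(hosts, trial):
        host.update(updated)
    return placement


def backoff_delay(attempt, base=1.0, cap=64.0):
    """Delay before resubmission number `attempt` (0-based), bounded by cap."""
    return min(cap, base * 2 ** attempt)


hosts = [
    {"cpu": 0.8, "mem": 0.5},
    {"cpu": 0.2, "mem": 0.75},
    {"cpu": 0.5, "mem": 0.25},
]
print(greedy_place(2, cpu_need=0.5, mem_need=0.5, hosts=hosts))
print(greedy_place(1, cpu_need=0.5, mem_need=0.75, hosts=hosts))
```

CPU demand on a host may exceed 100% after placement; in that case the tasks there run at a yield below 1, whereas memory is a hard constraint.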

MCB Heuristics: Connection to Multi-Capacity Bin Packing
For each discrete scheduling event:
- the problem is similar to multi-capacity (vector) bin packing, but has an optimization target and variable CPU allocations
- it can be formulated as an MILP [Stillwell et al., 2009] (NP-complete)
- relaxed LP heuristics are slow and give low-quality solutions

MCB Heuristics: Applying MCB Heuristics
- yield is continuous, so choose a granularity (0.01)
- perform a binary search on the yield, seeking to maximize it
- for each fixed yield, set the CPU requirement and apply the heuristic
- the yield found is the maximized minimum; leftover CPU is used to improve the average
- if a solution cannot be found at any yield, remove the lowest-priority job and try again
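The search procedure on this slide can be sketched as follows. The packing heuristic is passed in as a function; the `first_fit` used here is only a simple stand-in, not MCB8. Testing yield 0 to detect the "no solution at any yield" case, the job tuple layout, and all names are assumptions. The leftover-CPU step is not covered.

```python
def first_fit(items, num_hosts):
    """Stand-in packer: True if all (cpu, mem) items fit on num_hosts unit hosts."""
    hosts = [[0.0, 0.0] for _ in range(num_hosts)]
    for cpu, mem in items:
        for h in hosts:
            if h[0] + cpu <= 1.0 + 1e-9 and h[1] + mem <= 1.0 + 1e-9:
                h[0] += cpu
                h[1] += mem
                break
        else:
            return False
    return True


def max_min_yield(jobs, num_hosts, pack, granularity=0.01):
    """jobs: list of (num_tasks, cpu_need, mem_need). Returns best yield or None."""
    def feasible(y):
        # At a fixed yield, every task asks for y times its CPU need.
        items = [(y * cpu, mem) for n, cpu, mem in jobs for _ in range(n)]
        return pack(items, num_hosts)

    if not feasible(0.0):          # memory alone does not fit
        return None
    lo, hi = 0, round(1.0 / granularity)   # yields in units of granularity
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if feasible(mid * granularity):
            lo = mid
        else:
            hi = mid - 1
    return lo * granularity


def schedule(jobs_lowest_priority_first, num_hosts, pack):
    """Drop lowest-priority jobs until some yield is feasible."""
    remaining = list(jobs_lowest_priority_first)
    while remaining:
        y = max_min_yield(remaining, num_hosts, pack)
        if y is not None:
            return remaining, y
        remaining.pop(0)
    return [], None


jobs = [(1, 1.0, 0.75), (1, 1.0, 0.5), (1, 1.0, 0.5)]
print(schedule(jobs, num_hosts=1, pack=first_fit))
```

In the example the three jobs cannot share one host's memory, so the lowest-priority job is removed and the remaining two run at a yield of about 0.5.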

MCB Heuristics: MCB8 Heuristic
Based on [Leinberger et al., 1999], simplified to the 2-dimensional case:
1. Put job tasks in two lists: CPU-intensive and memory-intensive.
2. Sort the lists by “some criterion” (MCB8: descending order by maximum).
3. Starting with the first host, pick tasks that fit, in order, from the list that goes against the current imbalance. Example: if the current host's tasks total 50% CPU and 60% memory, assign the next task that fits from the list of CPU-intensive jobs.
4. When no tasks can fit on a host, go to the next host.
5. If all tasks can be placed, then success; otherwise failure.
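The five steps above can be sketched for fixed (CPU, memory) requirements expressed as fractions of one host. Several details are assumptions not stated on the slide: ties between CPU and memory load go to the CPU-intensive list, a task with equal CPU and memory needs counts as CPU-intensive, and if the preferred list has no task that fits, the other list is tried before moving to the next host. The function names are hypothetical.

```python
EPS = 1e-9  # tolerance for floating-point capacity checks


def _take_first_fit(tasks, cpu_used, mem_used):
    """Remove and return the first task in the list that still fits, else None."""
    for i, (cpu, mem) in enumerate(tasks):
        if cpu_used + cpu <= 1.0 + EPS and mem_used + mem <= 1.0 + EPS:
            return tasks.pop(i)
    return None


def mcb8_pack(tasks, num_hosts):
    """tasks: list of (cpu, mem). Returns per-host task lists, or None on failure."""
    # Step 1: split into CPU-intensive and memory-intensive lists.
    cpu_list = [t for t in tasks if t[0] >= t[1]]
    mem_list = [t for t in tasks if t[0] < t[1]]
    # Step 2: sort each list in descending order of the larger requirement.
    cpu_list.sort(key=max, reverse=True)
    mem_list.sort(key=max, reverse=True)

    hosts = []
    for _ in range(num_hosts):
        cpu_used = mem_used = 0.0
        placed = []
        while True:
            # Step 3: prefer the list that counters the current imbalance.
            if cpu_used <= mem_used:
                first, second = cpu_list, mem_list
            else:
                first, second = mem_list, cpu_list
            task = _take_first_fit(first, cpu_used, mem_used)
            if task is None:
                task = _take_first_fit(second, cpu_used, mem_used)
            if task is None:
                break  # Step 4: nothing fits, go to the next host.
            cpu_used += task[0]
            mem_used += task[1]
            placed.append(task)
        hosts.append(placed)
    # Step 5: success only if every task was placed.
    return hosts if not cpu_list and not mem_list else None


tasks = [(0.6, 0.2), (0.2, 0.6), (0.5, 0.3), (0.3, 0.5)]
print(mcb8_pack(tasks, 2))
print(mcb8_pack(tasks, 1))
```

Alternating between the two lists tends to pair a CPU-heavy task with a memory-heavy one on the same host, so neither dimension fills up long before the other.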
