Scheduling Many-Task Workloads on Supercomputers: Dealing with Trailing Tasks (PowerPoint presentation)




SLIDE 1

Dealing with Trailing Tasks

Scheduling Many-Task Workloads on Supercomputers

Timothy G. Armstrong, Zhao Zhang (Department of Computer Science, University of Chicago)
Daniel S. Katz, Michael Wilde, Ian T. Foster (Computation Institute, University of Chicago & Argonne National Laboratory)

SLIDE 2

Many-tasks on a Supercomputer

Multi-level scheduling

Metrics: time to solution and utilization

[Diagram: many inputs feed many independent tasks, producing many outputs]

SLIDE 3

Utilization using 160,000 cores on a molecular docking workload

935,000 independent tasks; the last task completes at 7,828 seconds

The “Trailing Task” Problem
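To make the utilization metric concrete, here is a minimal sketch (with made-up toy numbers, not the workload's actual data) of utilization as busy core-time over allocated core-time:

```python
# Sketch: utilization = completed core-seconds / allocated core-seconds.
# The task runtimes below are hypothetical, for illustration only.

def utilization(runtimes, workers, makespan):
    """Fraction of allocated core-seconds spent running tasks."""
    return sum(runtimes) / (workers * makespan)

# Toy example: 3 workers, tasks of 4s, 4s, 2s, 2s.
# One schedule: worker A runs 4s, B runs 4s, C runs 2s + 2s -> makespan 4s.
print(utilization([4, 4, 2, 2], workers=3, makespan=4))  # 1.0 (no idle time)
```

A single trailing task stretches the makespan while most workers sit idle, which is exactly why utilization collapses at the end of the run.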

SLIDE 4

Nearly symmetrical distribution of runtimes (another DOCK workload)

Task runtimes can follow various distributions

Often highly skewed: possibly power-law or otherwise heavy-tailed

Many-task computing systems should gracefully handle long-running tasks

Runtimes with same mean as above but log-normal distribution

Runtime Distributions
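The contrast above can be sketched numerically. This illustrative example (its parameters are assumptions, not the DOCK workload's actual fit) draws a symmetric and a log-normal distribution with the same mean and compares their tails:

```python
import math
import random

# Illustrative sketch: two runtime distributions with the same mean,
# one nearly symmetric and one log-normal (heavy-tailed).
random.seed(0)
n = 100_000
mean = 60.0  # target mean runtime, seconds (an assumed value)

symmetric = [random.gauss(mean, 5.0) for _ in range(n)]

# For X ~ LogNormal(mu, sigma), E[X] = exp(mu + sigma^2 / 2),
# so matching the mean requires mu = ln(mean) - sigma^2 / 2.
sigma = 1.0
mu = math.log(mean) - sigma ** 2 / 2
heavy_tailed = [random.lognormvariate(mu, sigma) for _ in range(n)]

# Same mean, very different tails: the longest log-normal task runs
# many times longer than the longest near-symmetric one.
print(max(symmetric), max(heavy_tailed))
```

With a heavy tail, a handful of tasks run far longer than the mean, and those are the tasks that strand an allocation at the end of a run.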

SLIDE 5

Obstacles to Shedding Workers

Can't always just return unneeded worker CPUs to a pool

Reasons:

Scheduler support

Policy restrictions

Scheduler not designed for tracking many small allocations

Schedule fragmentation

Network topology; spatial fragmentation of machine

Resource provisioning granularity: thousands

Task scheduling granularity: one

SLIDE 6

We only consider workloads with no dependencies: the number of “ready” tasks decreases monotonically

Most many-task applications look like this when winding down

Also similar to a single stage of a many-task application with a parallel barrier after the stage, e.g. the MapReduce pattern

“Bag of Tasks” Workloads

[Diagram: many inputs feed many independent tasks, producing many outputs]

SLIDE 7

Fixed Worker Count

With a fixed worker count, minimizing time to solution is equivalent to maximizing utilization

NP-complete optimization problem with known heuristics*:

Arbitrarily assigning tasks to idle workers (random): within 2x of optimal

Assigning longest-running tasks first (sorted): within 4/3x of optimal

[Diagram: allocation duration vs. workers, with wastage shaded]
*Many more heuristics in the scheduling literature

Average-case behavior of both heuristics is better than these worst-case bounds
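Both heuristics reduce to the same greedy list scheduler: hand the next task to the earliest-idle worker, with tasks taken either in arbitrary order or longest-first. A minimal sketch (with made-up skewed runtimes, not the slide's data):

```python
import heapq
import random

def makespan(runtimes, workers):
    """Greedy list scheduling: each task goes to the first idle worker."""
    finish = [0.0] * workers  # min-heap of per-worker finish times
    heapq.heapify(finish)
    for t in runtimes:
        heapq.heappush(finish, heapq.heappop(finish) + t)
    return max(finish)

random.seed(1)
# Hypothetical skewed runtimes for illustration.
tasks = [random.lognormvariate(3, 1) for _ in range(1000)]

arbitrary = makespan(tasks, workers=64)                     # worst case 2x opt.
longest_first = makespan(sorted(tasks, reverse=True), 64)   # worst case 4/3x opt.
print(arbitrary, longest_first)  # longest-first is typically no worse
```

Longest-first helps because the small tasks placed last level the workers' finish times, while arbitrary order can drop a long task onto an already-loaded worker at the very end.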

SLIDE 8

In an ideal world we would know the runtime of each task; in practice that is an unrealistic assumption

Typically, random scheduling is what we must live with

Living with Unknown Runtimes

SLIDE 9

Trade-off with Fixed Worker Count

With random scheduling, there is an unavoidable trade-off between utilization and time to solution

SLIDE 10

Chopping off the Tail of Tasks

When utilization gets too low, switch to a smaller allocation

No special scheduler/system support required

[Figure: worker occupancy with no tail-chopping vs. chopping off the tail]

SLIDE 11

Tail-Chopping: Worthwhile?

Tail-chopping promises to provide:

A better trade-off between TTS and Utilization

High utilization more robust to changes in worker count

But has costs:

Overhead of migrating to new partition

Loss of task progress (unless tasks can be checkpointed)

Slower progress on smaller partition

Delays in requesting allocation

Assumptions for study:

Tail-chopping means the progress of incomplete tasks is lost

Fixed delay in acquiring new partition

SLIDE 12

Simulation Design

Task/worker ratio decides how many workers to request

Threshold % of idle workers triggers tail-chopping

Sweep over parameter values to find trade-off curves for different idle thresholds
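The simulation above can be sketched in a few dozen lines. The policy and parameters here are assumptions for illustration, not the authors' actual simulator: tasks go to idle workers in arbitrary order; once no tasks are waiting and the idle fraction crosses the threshold, in-progress tasks are abandoned (progress lost) and restarted on a smaller allocation after a fixed acquisition delay.

```python
import heapq
import random

def simulate(runtimes, workers, idle_threshold, shrink=0.25, delay=60.0):
    """Return (time to solution, utilization) under tail-chopping."""
    pending = list(runtimes)
    clock = 0.0
    busy = 0.0       # core-seconds of completed tasks
    allocated = 0.0  # core-seconds of allocation held
    while pending:
        running = []  # min-heap of (finish_time, runtime)
        while pending and len(running) < workers:
            t = pending.pop(0)
            heapq.heappush(running, (clock + t, t))
        start = clock
        while running:
            idle_frac = 1.0 - len(running) / workers
            if not pending and idle_frac > idle_threshold:
                break  # chop the tail
            finish, t = heapq.heappop(running)
            clock, busy = finish, busy + t
            if pending:
                t = pending.pop(0)
                heapq.heappush(running, (clock + t, t))
        allocated += workers * (clock - start)
        if running:  # abandoned tasks restart from scratch on fewer workers
            pending = [t for _, t in running]
            workers = max(1, int(workers * shrink))
            clock += delay
    return clock, busy / allocated

random.seed(2)
tasks = [random.lognormvariate(3, 1.5) for _ in range(2000)]  # assumed runtimes
tts_chop, util_chop = simulate(tasks, 256, idle_threshold=0.5)
tts_fixed, util_fixed = simulate(tasks, 256, idle_threshold=1.1)  # never chops
print(util_chop > util_fixed)  # chopping buys utilization
```

Sweeping `idle_threshold` and the initial worker count in this sketch traces out the trade-off curves the slides describe: aggressive chopping wastes restarted work and incurs delays, but stops paying for hundreds of idle cores.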

SLIDE 13

Simulation Data

Runtimes of 935,000 molecular docking tasks

Skewed distribution of runtimes

Available allocation sizes are those offered on the Blue Gene/P Intrepid

SLIDE 14

Simulation Results (1) - Sorted

Sorted scheduling

[Figure panels: effect on time to solution, on utilization, and on the trade-off]

SLIDE 15

Simulation Results (2) - Random

Random scheduling

[Figure panels: effect on time to solution, on wastage, and on the trade-off]

SLIDE 16

Experiment on Blue Gene/P

Proof of concept on Blue Gene/P Intrepid at Argonne National Laboratory using Falkon task dispatcher

Provisions machine partitions using task/worker ratio.

Chops off the tail when the fraction of idle workers exceeds a 50% threshold

SLIDE 17

Possible Improvements

“Warm up” partition before canceling old one

Scheduler support for shedding workers

Task migration

Better heuristic for when to chop tail: use available information about task runtimes

SLIDE 18

Conclusions

Need to consider scheduling when running a many-task application on a supercomputer, especially if task runtimes are highly variable

Favorable utilization/time to solution trade-off not always possible with fixed worker count

Tail-chopping can give better results for time to solution and utilization

Tail-chopping delivers robustly high utilization

SLIDE 19

Questions?

SLIDE 20

The “Straggler” Problem

Described in MapReduce literature; related but different

“Straggler”: a task that is running slowly due to poor hardware/software performance

Standard solution: replicate the task on other machines

Only works if the long-running tasks don't intrinsically involve more computation
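The replication idea can be shown with a tiny sketch (the timings are hypothetical): launch a backup copy of a slow task and take whichever finishes first.

```python
# Sketch of speculative execution, the standard straggler mitigation:
# the slow original and a later backup race; the first finisher wins.
# All numbers here are made up for illustration.

def finish_time(launch, duration):
    return launch + duration

# Original copy starts at t=0 on a machine running 10x slower than normal.
original = finish_time(0, 100 * 10)
# Backup copy launched at t=300 on a healthy machine.
backup = finish_time(300, 100)
print(min(original, backup))  # 400: the backup wins

# If the task intrinsically needs 1000s of computation, replication
# cannot help: it takes 1000s on any machine, so it is a trailing
# task rather than a straggler.
```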