SLIDE 1

Job Coscheduling on Coupled High-End Computing Systems

Wei Tang*, Narayan Desai#, Venkatram Vishwanath#, Daniel Buettner#, Zhiling Lan*
* Illinois Institute of Technology   # Argonne National Laboratory

SLIDE 2

Outline

  • Background & Motivations
  • Problem Statement
  • Solutions
  • Evaluations
SLIDE 3

Background

  • Coupled systems are commonly used

– Large-scale systems: computation, simulation, etc.
– Special-purpose systems: data analysis, visualization, etc.

  • Coupled applications:

– Simulation / computing applications
– Visualization / data analysis applications
– Examples: FLASH & vl3, PHASTA & ParaView

SLIDE 4

Coupled systems examples

  • Intrepid & Eureka @ANL

– Intrepid: IBM Blue Gene/P with 163,840 cores (#13 in Top500)
– Eureka: 100-node cluster with 200 GPUs (largest GPU installation)

  • Ranger & Longhorn @TACC

– Ranger: SunBlade with 62,976 cores (#15 in Top500)
– Longhorn: 256-node Dell cluster, 128 GPUs

  • Jaguar & Lens @ORNL

– Jaguar: Cray XT5 with 224,162 cores (#3 in Top500)
– Lens: 32-node Linux cluster, 2 GPUs

  • Kraken & Verne @ NICS/UTK

– Kraken: Cray XT5 with 98,928 cores (#8 in Top500)
– Verne: 5-node Dell cluster

  • And so on …
SLIDE 5

Motivation

  • Post-hoc execution

– Computing applications write data to the storage system; analysis applications then read the data back and process it
– I/O is time-consuming

  • Co-execution is increasingly in demand:

– Saves I/O time by transferring data directly from the simulation application to the visualization / data analysis application (an ongoing project named GLEAN)
– Enables monitoring and debugging of simulations at runtime
– Supports heterogeneous computing

SLIDE 6

Problem statement

  • Systems A and B run parallel jobs

– Job schedulers / scheduling policies are independent
– Job queues are independent

  • Some jobs on A have associated (mate) jobs on B (see the sketch below)

– Mate jobs are in pairs: one on A, the other on B

  • Co-scheduling Goal:

– Guarantee that the mate jobs in a pair start at the same time on their respective host systems, without manual reservation
– Limit the negative impact on system performance and utilization
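To make the pairing concrete, here is a minimal sketch of how a mate-job pair could be represented; the Job class and its fields are illustrative assumptions, not the Cobalt data model.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Job:
    job_id: str
    system: str                    # "A" (e.g. Intrepid) or "B" (e.g. Eureka)
    nodes: int                     # number of nodes requested
    mate_id: Optional[str] = None  # id of the mate job on the other system, if any

    @property
    def is_paired(self) -> bool:
        return self.mate_id is not None

# A paired submission: a simulation job on A and its visualization mate on B,
# which the coscheduler must start at the same time.
sim = Job("a1", system="A", nodes=4096, mate_id="b1")
vis = Job("b1", system="B", nodes=32, mate_id="a1")
```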

SLIDE 7

Related work

  • Meta scheduling

– Managing jobs on multiple clusters via a single instance
– Moab by Adaptive Computing Inc., LoadLeveler by IBM
– Our work is more distributed: different schedulers, each running in an independent resource management domain, coordinate job scheduling

  • Co-Reservation

– Co-allocation of compute and network resources by reservation
– HARC (Highly-Available Resource Co-allocator) by LSU
– Our work doesn't involve manual reservation; coscheduling is coordinated automatically

SLIDE 8

Basic schemes

  • When a job can start to run on one machine while its mate job on the remote machine cannot, it may either "hold" or "yield" (see the sketch below)

  • Hold

– Hold the allocated resources (nodes), which cannot be used by others, until the mate job can run

  • Yield

– Give up the turn to run without holding any resources
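As a rough illustration of the hold/yield decision (a sketch only, not the authors' Cobalt/Qsim implementation; helper methods such as can_allocate, mate_can_start, and hold_nodes are assumed):

```python
def try_start(job, local_system, remote_system, scheme="hold"):
    """Try to start `job`, coordinating with its mate on the remote system.

    scheme: "hold"  - keep the allocated nodes idle until the mate can start
            "yield" - give up this turn and leave the nodes to other jobs
    """
    if not local_system.can_allocate(job):
        return "wait"                          # not enough free nodes locally

    if not job.is_paired:
        local_system.start(job)                # unpaired jobs run normally
        return "started"

    if remote_system.mate_can_start(job.mate_id):
        local_system.start(job)                # both sides ready: start together
        remote_system.start_mate(job.mate_id)
        return "started"

    if scheme == "hold":
        local_system.hold_nodes(job)           # reserve nodes, unusable by others
        return "holding"
    return "yielded"                           # skip this turn, try again later
```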

SLIDE 9

Algorithm

SLIDE 10

Flowchart

SLIDE 11

Strategy combinations

  • Hold-Hold

– Good for syncing up mated jobs
– Bad for system utilization
– May cause deadlock

  • Yield-Yield

– Does not hurt system utilization
– Bad for the waiting time of mated jobs

  • Hold-Yield (or Yield-Hold)

– Each system behaves according to its own scheme

SLIDE 12

Deadlock

  • Coupled systems A & B, both using the "hold" scheme

  • Circular wait (a1 → b1 → b2 → a2 → a1): a1 holds nodes while waiting for its mate b1; b1 cannot start because b2 is holding nodes on B; b2 waits for its mate a2; a2 cannot start because a1 is holding nodes on A
SLIDE 13

Enhancements

  • Solving Deadlock

– Release all the held nodes periodically (e.g. every 20 minutes)

  • Reduce overhead

– Threshold on the number of times a job may yield

  • Fault-tolerance consideration

– A job won't wait forever when the remote machine is down (see the sketch below)
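A sketch of how these enhancements might sit on top of the basic scheme above. The 20-minute release interval comes from the slide; the yield threshold value, the remote-heartbeat timeout, and all helper names are assumptions for illustration, and "switch to hold after too many yields" is one plausible reading of the threshold.

```python
import time

HOLD_RELEASE_INTERVAL = 20 * 60   # release held nodes every 20 minutes (breaks circular wait)
MAX_YIELD_COUNT = 10              # assumed threshold on how often a job may yield
REMOTE_TIMEOUT = 5 * 60           # assumed: treat a silent remote system as down

def coschedule_step(job, local_system, remote_system, scheme):
    # Fault tolerance: if the remote system looks dead, run the job alone
    # instead of waiting forever for its mate.
    if time.time() - remote_system.last_heartbeat > REMOTE_TIMEOUT:
        local_system.start(job)
        return "started_without_mate"

    # Deadlock resolution: periodically release nodes held by a "hold" job.
    if job.state == "holding" and job.hold_elapsed() > HOLD_RELEASE_INTERVAL:
        local_system.release_nodes(job)
        job.state = "queued"
        return "released"

    # Overhead reduction: after yielding too many times, hold instead.
    if job.state == "yielded" and job.yield_count >= MAX_YIELD_COUNT:
        return try_start(job, local_system, remote_system, scheme="hold")

    return try_start(job, local_system, remote_system, scheme)
```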

SLIDE 14

Evaluation

  • Event-driven simulation using real job traces from production supercomputers

  • Qsim, along with the Cobalt resource manager
SLIDE 15

Experiment goals

  • Investigate the impact of varying system load
  • Investigate the impact of varying the proportion of paired jobs

SLIDE 16

Job traces

  • Intrepid (real trace)

– One month, 9,220 jobs, system utilization 70%

  • Eureka (half-synthetic, packed into one month)

– Trace 1: 5,079 jobs, system utilization = 25%
– Trace 2: 11,000 jobs, system utilization = 50%
– Trace 3: 14,430 jobs, system utilization = 75%
– Synthetic: 9,220 jobs, system utilization = 48%

SLIDE 17

Evaluation Metrics

  • Avg. waiting time

– Start time minus submission time
– Averaged over all jobs

  • Avg. slowdown

– (wait time + runtime) / runtime
– Averaged over all jobs (see the sketch below)

  • Mated job sync-up overhead

– How many extra minutes a job waits under coscheduling
– Averaged over all paired jobs

  • Loss of computing capability

– Node-hours
– System utilization rate
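The first two metrics are straightforward to compute from a job trace; below is a small sketch (field names such as submit_time, start_time, and runtime are assumed, not the Qsim log format):

```python
def average_wait_and_slowdown(jobs):
    """jobs: non-empty iterable of records with submit_time, start_time, runtime (seconds)."""
    waits, slowdowns = [], []
    for j in jobs:
        wait = j.start_time - j.submit_time               # waiting time
        waits.append(wait)
        slowdowns.append((wait + j.runtime) / j.runtime)  # >= 1 by definition
    n = len(waits)
    return sum(waits) / n, sum(slowdowns) / n
```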

SLIDE 18

Average wait by system utilization

[Figure: average wait time under schemes HH (Hold-Hold), HY (Hold-Yield), YH (Yield-Hold), and YY (Yield-Yield) on Intrepid-Eureka, with Eureka system utilization at 25%, 50%, and 75%.]

SLIDE 19

Slowdown by system utilization

SLIDE 20

Coscheduling overhead by system utilization

[Figures: Intrepid job sync-up overhead (average) and Eureka sync-up overhead (average), in minutes, by Eureka configuration (system utilization 25%/50%/75% and hold/yield scheme).]

Using yield costs more sync-up overhead than using hold

SLIDE 21

Loss of computing capability by system utilization

[Figures: Intrepid and Eureka loss of computing capability (node-hours and lost system utilization rate) by Eureka configuration (system utilization 25%/50%/75% and hold/yield scheme).]

Utilization loss is caused only by the "hold" scheme

SLIDE 22
Average wait by proportion of paired jobs
SLIDE 23

Slowdown by proportion of paired jobs

SLIDE 24

Overhead by proportion of paired jobs

[Figures: Intrepid and Eureka job sync-up overhead (average, in minutes) by mate-job ratio (2.5%, 5%, 10%, 20%, 33%) and remote scheme (hold/yield).]

SLIDE 25

Loss of computing capability by proportion of paired jobs

[Figures: Eureka and Intrepid loss of computing capability (node-hours and lost system utilization rate) by mate-job ratio (2.5%, 5%, 10%, 20%, 33%) and remote scheme.]
SLIDE 26

Summary

  • Designed and implemented a coscheduling algorithm that starts associated (mate) jobs at the same time, in order to fulfill the needs of certain applications, such as reducing I/O overhead in a coupled HEC environment
  • Evaluated the coscheduling impact on system performance and the overhead incurred by jobs needing coscheduling
  • Conclusion: coscheduling works with acceptable overhead across different system utilization rates and proportions of mated jobs

SLIDE 27

Thank you!