Workload Management: NQE/LSF Status & Plans


Slide 1

Workload Management: NQE/LSF Status & Plans

Jack Thompson

Marketing Product Manager

SGI jt@sgi.com

41st Cray User Group Conference Minneapolis, Minnesota

Brian MacDonald

Technical Relationship Manager

Platform Computing brian@platform.com

Slide 2

Agenda

• NQE Transition & Status
• Migration Program
• Status of LSF on SGI and Cray Systems
• LSF Plans
• Q&A

Slide 3

NQE Transition

NQE 3.3
• Final feature release

Next Steps
• ISV solutions prevalent
  – Core competency issue
  – Multi-vendor environment
• Partner solution best choice
• Platform Computing's LSF

Slide 4

NQE Status

• Supported on SGI and Cray systems
  – Support through year-end 2004
  – Critical bugs fixed
  – Call center support
• Available for Cray SV1 systems
• Retired on non-SGI systems

Slide 5

LSF Migration Program

• Discounted pricing for systems licensed for NQE before February 1, 1999
  – Available through January 31, 2000
• Migration Guide
  – Developed jointly by Platform and SGI
• Professional services available
• Inclusion of key NQE features in LSF

Strong relationship between the SGI and Platform Computing engineering teams

Slide 6

LSF on SGI Systems

Current release is LSF 3.2
• Now available on IRIX, UNICOS, and UNICOS/mk
  – Including Cray SV1
• Also on NT and Linux
• Available from SGI
  – LSF Standard Edition, LSF Parallel, LSF Client
• Available from Platform Computing
  – LSF Analyzer, LSF MultiCluster, LSF JobScheduler, LSF Make

Slide 7

Data Center Requirements

Environments for High Performance
• Single point of control and administration
• Logically present a single system image to users, applications, and networks
• Application of policies across the consolidated platform
  – Uniform across all machines
• Uniform policies to satisfy workload performance objectives in terms of throughput, turnaround, and response time
• Improved application availability, both for failures and planned outages
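Uniform policy of this kind is typically expressed in LSF's queue configuration. A minimal sketch of an `lsb.queues` stanza, assuming hypothetical group names and share values:

```
# Hypothetical lsb.queues stanza: one queue whose fairshare policy
# applies uniformly to every host in the cluster.
Begin Queue
QUEUE_NAME  = normal
PRIORITY    = 30
FAIRSHARE   = USER_SHARES[[groupA, 70] [groupB, 30]]
HOSTS       = all            # same policy on all machines
DESCRIPTION = Default queue with uniform fairshare policy
End Queue
```

Because the queue, not the individual host, carries the policy, every machine in the consolidated platform enforces the same shares.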

Slide 8

Defining Capacity Goals

LSF can be focused on throughput guarantees
• Run as much workload on the box as possible; absolute performance is not the primary goal

[Figure: 12 jobs using 900 MB of memory, with lots of disk activity or network disk access, running on a system with 8 CPUs, 1 GB of memory, and 6 I/O channels]

Slide 9

Thresholds for Execution

[Figure: CPU utilization thresholds. At 85%, critical and lower-priority jobs run; at 90%, the queue stops accepting new jobs and low-priority jobs are suspended or migrated; at 100%, high-priority, critical workload continues]
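In LSF, utilization thresholds like these map onto per-queue scheduling (loadSched) and suspending (loadStop) limits on a load index. A hedged sketch of an `lsb.queues` fragment, assuming the 85%/90% figures from the slide:

```
Begin Queue
QUEUE_NAME = low_priority
PRIORITY   = 10
# loadSched/loadStop pair for the CPU utilization (ut) index:
# stop dispatching new jobs above 85% utilization,
# suspend running jobs above 90%.
ut         = 0.85/0.90
MIG        = 5              # migrate a suspended job after 5 minutes
End Queue
```

A high-priority queue would simply omit these thresholds, so its critical workload keeps running at 100% utilization.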

Slide 10

Defining Capability Computing
Clearly Stated Performance Goals

• Get my job done as quickly as possible using all necessary dedicated resources
• Avoid sharing and contention at all costs
• Problems can be tackled that otherwise could not be considered
• Mission-critical applications gain the undivided attention of the computing infrastructure

Slide 11

Defining Capability Computing
Supporting the Exclusive Execution Model

• Multi-box parallelism (Origin 2000)
• Mixed operation on large machines
• Optimum support for Cray T3E
• Committed product development in support of partitioning mechanisms
  – Miser (Q4 99)
  – Miser CPU sets (Q4 99)
  – OS service follow-on (XRS)

Slide 12

Resource-Based Job Placement

Selection
  – Match necessary conditions
Ordering
  – Choose the best from eligible candidates
Reservation
  – Adjust load values for selected hosts
Spanning
  – Define locality of parallel jobs
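These four phases map directly onto the sections of an LSF resource requirement string passed to `bsub -R`. A brief example (the application name `myapp` is illustrative):

```
# select:  match hosts with more than 512 MB free memory   (Selection)
# order:   prefer hosts with the lowest CPU utilization    (Ordering)
# rusage:  reserve 512 MB on each chosen host              (Reservation)
# span:    place all parallel tasks on a single host       (Spanning)
bsub -n 4 -R "select[mem>512] order[ut] rusage[mem=512] span[hosts=1]" myapp
```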

Slide 13

Single Processing Image

[Figure: submission hosts feed batch queues; the Scheduler, driven by resource information from the LIM, dispatches jobs to server hosts]
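Under this single-image model, users interact with the whole cluster as one system from any submission host, using the standard LSF commands (queue and job names are illustrative):

```
lsload                    # LIM view: load and resource information for all hosts
bqueues                   # batch queues, visible cluster-wide
bsub -q normal sleep 60   # submit from any submission host
bjobs                     # the scheduler's view of the job, wherever it runs
```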

Slide 14

System Level Integration

Parallel Application Manager / Remote Execution Server
• Placement
• Control (signals, limits, messages)
• Consolidated accounting
• SGI Array Session
  – Task startup and control
  – ASH returned to PAM
  – ASH sent to RES, used to discover per-job usage
• MPT 1.3 plug-in

Slide 15

Solutions Through Integration

MPT 1.3 + LSF Parallel 3.2
• Application checkpoint/restart
• Transparent host selection
• Accounting for ISV applications

ISV, custom scientific, and commercial applications transparently gain access to resource management services without changing their code

Slide 16

LSF 4.0 Enhancements

Scheduler
– Scalability improvements with all the bells and whistles turned on (fairshare + backfilling)
  • 20,000+ jobs
– Dynamic reconfiguration without restart
  • lim and mbatchd
– Client query scalability
  • support for thousands of clients
– Adaptive dispatch for high-throughput, short-running jobs
– Time-dependent configuration for queues
  • different configuration at night, same queue
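Time-dependent queue behavior of this sort is expressed with run windows in `lsb.queues`. A hedged sketch, assuming a hypothetical night-time window:

```
Begin Queue
QUEUE_NAME = batch
PRIORITY   = 40
# Dispatch and run jobs from this queue only between 19:00 and 07:00;
# outside the window, jobs stay pending in the same queue.
RUN_WINDOW = 19:00-7:00
End Queue
```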

Slide 17

LSF 4.0 Enhancements

Job Execution
– Improved input/output handling support
  • I/O spooling
  • Admin-defined spool directory
  • Job-level CWD discovery enhancements
– Integrated FTA supported within LSF
– Job flow
– Kill and re-queue

Administrative Improvements
– Non-shared daemon configuration support
– Automatic host type and model detection