SLIDE 1

A Micro-Benchmark Evaluation of Catamount and Cray Linux Environment (CLE) Performance

Jeff Larkin, Cray Inc. <larkin@cray.com>
Jeff Kuehn, ORNL <kuehn@ornl.gov>

SLIDE 2

THE BIG QUESTION!

Does CLE waddle like a penguin, or run like a catamount?

SLIDE 3

Overview

• Background
  • Motivation
  • Catamount and CLE
  • Benchmarks
  • Benchmark System
• Benchmark Results
  • HPCC
  • IMB
• Conclusions

SLIDE 4

BACKGROUND

SLIDE 5

Motivation

• Last year at CUG, “CNL” was in its infancy. Since CUG07:
  • Significant effort has been spent scaling on large machines
  • CNL reached GA status in Fall 2007
  • Compute Node Linux (CNL) was renamed the Cray Linux Environment (CLE)
  • A significant number of sites have already made the change
  • Many codes have already been ported from Catamount to CLE
• Catamount’s scalability has always been touted, so how does CLE compare?
  • Fundamentals of communication performance: HPCC, IMB
• What should sites/users know before they switch?

SLIDE 6

Background: Catamount

• Developed by Sandia for Red Storm
• Adopted by Cray for the XT3
• Extremely lightweight:
  • Simple memory model: no virtual memory, no mmap
  • Reduced set of system calls
  • Single threaded
  • No Unix sockets
  • No dynamic libraries
  • Few interrupts to user code
• Virtual Node (VN) mode added for dual-core

SLIDE 7

Background: CLE

• First, we tried a full SUSE Linux kernel.
• Then, we “put Linux on a diet.”
• With the help of ORNL and NERSC, we began running at large scale.
• By Fall 2007, we released Linux for the compute nodes.
• What did we gain?
  • Threading (see the sketch below)
  • Unix sockets
  • I/O buffering
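One concrete example of the first gain: CLE’s threaded compute-node OS can run hybrid MPI + OpenMP codes, which single-threaded Catamount could not. The sketch below is a minimal, hypothetical illustration (not from the talk); it assumes an MPI library with funneled thread support and an OpenMP-capable compiler.

    /* Hypothetical hybrid MPI + OpenMP "hello": runs on CLE's threaded
     * compute nodes, but could not run under single-threaded Catamount. */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided, rank;

        /* Request funneled thread support: only the main thread makes MPI calls. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        #pragma omp parallel
        {
            /* Each OpenMP thread reports its place within the MPI rank. */
            printf("rank %d: thread %d of %d\n",
                   rank, omp_get_thread_num(), omp_get_num_threads());
        }

        MPI_Finalize();
        return 0;
    }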

SLIDE 8

Background: Benchmarks

HPCC
• Suite of several benchmarks, released as part of the DARPA HPCS program
• Measures MPI performance, and performance for varied temporal and spatial localities
• Benchmarks are run in 3 modes:
  • SP – one node runs the benchmark
  • EP – every node runs a copy of the same benchmark
  • Global – all nodes run the benchmark together

Intel MPI Benchmarks (IMB) 3.0
• Formerly the Pallas benchmarks
• Benchmarks standard MPI routines at varying scales and message sizes (a ping-pong sketch follows below)
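As a rough illustration of what the simplest IMB test (PingPong) measures, the sketch below times round trips of a fixed-size message between two ranks and reports one-way latency and bandwidth. It is a minimal stand-in, not the IMB source; the message size and repetition count are arbitrary examples.

    /* Minimal ping-pong sketch (illustrative only, not the IMB implementation). */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        const int nbytes = 1024;      /* example message size */
        const int reps   = 1000;      /* example repetition count */
        char *buf = malloc(nbytes);
        int rank;
        double t0, one_way;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        for (int i = 0; i < reps; i++) {
            if (rank == 0) {
                MPI_Send(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        one_way = (MPI_Wtime() - t0) / (2.0 * reps);   /* half the round-trip time */

        if (rank == 0)
            printf("latency %.2f usec, bandwidth %.1f MB/s\n",
                   one_way * 1e6, nbytes / one_way / 1e6);

        free(buf);
        MPI_Finalize();
        return 0;
    }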

SLIDE 9

Background: Benchmark System

• All benchmarks were run on the same system, “Shark,” with the latest OS versions as of Spring 2008.
• System basics:
  • Cray XT4
  • 2.6 GHz dual-core Opterons (able to run on up to 1280 cores)
  • DDR2-667 memory, 2 GB/core
• Configurations tested:
  • Catamount (1.5.61)
  • CLE, MPT2 (2.0.50)
  • CLE, MPT3 (2.0.50, xt-mpt 3.0.0.10)

SLIDE 10

BENCHMARK RESULTS

SLIDE 11

HPCC

SLIDE 12

Parallel Transpose (Cores)

[Chart: bandwidth (GB/s) vs. Processor Cores; series: Catamount SN, Catamount VN, CLE MPT2 N1, CLE MPT2 N2, CLE MPT3 N1, CLE MPT3 N2]

SLIDE 13

Parallel Transpose (Sockets)

[Chart: bandwidth (GB/s) vs. Sockets; series: Catamount SN, Catamount VN, CLE MPT2 N1, CLE MPT2 N2, CLE MPT3 N1, CLE MPT3 N2]

SLIDE 14

MPI Random Access

[Chart: GUP/s vs. Processor Cores; series: Catamount SN, Catamount VN, CLE MPT2 N1, CLE MPT2 N2, CLE MPT3 N1, CLE MPT3 N2]

SLIDE 15

MPI-FFT (cores)

[Chart: GFlops/s vs. Processor Cores; series: Catamount SN, Catamount VN, CLE MPT2 N1, CLE MPT2 N2, CLE MPT3 N1, CLE MPT3 N2]

SLIDE 16

MPI-FFT (Sockets)

[Chart: GFlops/s vs. Sockets; series: Catamount SN, Catamount VN, CLE MPT2 N1, CLE MPT2 N2, CLE MPT3 N1, CLE MPT3 N2]

SLIDE 17

Naturally Ordered Latency

Naturally ordered ring latency at a 512-core scale, Time (usec):
  Catamount SN   6.41346
  CLE MPT2 N1    9.08375
  CLE MPT3 N1    9.41753
  Catamount VN   12.3024
  CLE MPT2 N2    13.8044
  CLE MPT3 N2    9.799

SLIDE 18

Naturally Ordered Bandwidth

Naturally ordered ring bandwidth at a 512-core scale (MB/s):
  Catamount SN   1.07688
  CLE MPT2 N1    0.900693
  CLE MPT3 N1    0.81866
  Catamount VN   0.171141
  CLE MPT2 N2    0.197301
  CLE MPT3 N2    0.329071

SLIDE 19

IMB

SLIDE 20

IMB Ping Pong Latency (N1)

[Chart: Time (usec) vs. Message Size (B); series: Catamount, CLE MPT2, CLE MPT3]

SLIDE 21

IMB Ping Pong Latency (N2)

[Chart: average Time (usec) vs. Message Size (Bytes); series: Catamount, CLE MPT2, CLE MPT3]

SLIDE 22

IMB Ping Pong Bandwidth

[Chart: bandwidth (MB/s) vs. Message Size (Bytes); series: Catamount, CLE MPT2, CLE MPT3]

SLIDE 23

MPI Barrier (Lin/Lin)

[Chart: Time (usec) vs. Processor Cores, linear scales; series: Catamount, CLE MPT2, CLE MPT3]
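For reference, the barrier numbers on this and the next two slides come from IMB's Barrier test, which (roughly speaking) averages the cost of repeated MPI_Barrier calls. A minimal, hypothetical version of that measurement loop (not the IMB source) would look like this:

    /* Rough sketch of a barrier timing loop (illustrative, not the IMB source). */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        const int reps = 1000;   /* example repetition count */
        int rank, size;
        double t0, per_call;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        MPI_Barrier(MPI_COMM_WORLD);          /* warm up and synchronize the start */
        t0 = MPI_Wtime();
        for (int i = 0; i < reps; i++)
            MPI_Barrier(MPI_COMM_WORLD);
        per_call = (MPI_Wtime() - t0) / reps;

        if (rank == 0)
            printf("%d ranks: %.2f usec per barrier\n", size, per_call * 1e6);

        MPI_Finalize();
        return 0;
    }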

SLIDE 24

MPI Barrier (Lin/Log)

[Chart: Time (usec, linear) vs. Processor Cores (log scale); series: Catamount, CLE MPT2, CLE MPT3]

SLIDE 25

MPI Barrier (Log/Log)

[Chart: Time (usec, log scale) vs. Processor Cores (log scale); series: Catamount, CLE MPT2, CLE MPT3]

SLIDE 26

SendRecv (Catamount/CLE MPT2)

SLIDE 27

SendRecv (Catamount/CLE MPT3)

SLIDE 28

Broadcast (Catamount/CLE MPT2)

SLIDE 29

Broadcast (Catamount/CLE MPT3)

SLIDE 30

Allreduce (Catamount/CLE MPT2)

SLIDE 31

Allreduce (Catamount/CLE MPT3)

SLIDE 32

AlltoAll (Catamount/CLE MPT2)

SLIDE 33

AlltoAll (Catamount/CLE MPT3)

SLIDE 34

CONCLUSIONS

SLIDE 35

What we saw

• Catamount
  • Handles single-core (SN/N1) runs slightly better
  • Seems to handle small messages and small core counts slightly better
• CLE
  • Does very well on dual-core
  • Likes large messages and large core counts
• MPT3 helps performance and closes the gap between QK (Catamount) and CLE

SLIDE 36

What’s left to do?

• We’d really like to try this again on a larger machine.
  • Does CLE continue to beat Catamount above 1024 cores, or will the lines converge or cross?
• What about I/O?
  • Linux adds I/O buffering; how does this affect I/O performance at scale?
• How does this translate into application performance?
  • See “Cray XT4 Quadcore: A First Look,” Richard Barrett, et al., Oak Ridge National Laboratory (ORNL)

SLIDE 37

CLE RUNS LIKE A BIG CAT!

Does CLE waddle like a penguin, or run like a catamount?

SLIDE 38

Acknowledgements

This research used resources of the National Center for Computational Sciences at Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

Thanks to Steve, Norm, Howard, and others for help investigating and understanding these results.
