SLIDE 1

Introduction to Cheyenne

3 November 2016
Consulting Services Group
Brian Vanderwende

SLIDE 2

Topics we will cover

  • Technical specs of the Cheyenne supercomputer and expanded GLADE file systems

  • The Cheyenne computing environment
  • Accessing software on Cheyenne
  • Compilers
  • MPI/Parallelism
  • Submitting batch jobs using the PBS scheduler
  • Data storage
  • Q&A
SLIDE 3

User-facing hardware specifications

  • 4032 dual-socket nodes
  • 18-core 2.3 GHz Intel Xeon (Broadwell) processors
  • 36 total cores per node (16 on Yellowstone)
  • Hyperthreading supported for up to 72 virtual CPUs
  • Regular and high-memory nodes
  • 3164 nodes with 64 GB of memory
  • 864 nodes with 128 GB of memory
  • Laramie test system has 70 usable nodes with 64 GB memory
  • InfiniBand interconnects for message passing
  • Six login nodes with 256 GB of memory
SLIDE 4

The GLADE file systems will be expanded accordingly

  • Will continue to use IBM GPFS/Spectrum Scale technology
  • Existing capacity: 16 PB
  • New capacity to be added: 21 PB
  • Total capacity of 37 PB, with potential for expansion to 58 PB in future upgrades
  • Data transfer rates will be more than doubled
  • Home, work, and scratch spaces will be shared between Yellowstone and Cheyenne!

SLIDE 5

Cheyenne is an evolutionary increase from Yellowstone

Yellowstone

  • 1.5 petaflops peak compute
  • 72,256 cores
  • 145 TB total memory
  • 56 GB/s interconnects

Cheyenne

  • 5.34 petaflops peak compute
  • 145,152 cores
  • 313 TB total memory
  • 100 GB/s interconnects

1 Yellowstone core-hour = 0.82 Cheyenne core-hours

SLIDE 6

Timeline for HPC/Cheyenne

  • 1. Test system (Laramie) in place since July
  • 2. Cheyenne assembled in August
  • 3. Cheyenne shipped to NWSC in September
  • 4. Acceptance testing and integration with file systems in the fall
  • 5. NCAR acceptance in December
  • 6. Start of production on Cheyenne: January 2017
      a. Accelerated Scientific Discovery (ASD) projects begin (early user access in December 2016)
  • 7. Yellowstone production ends: December 2017
SLIDE 7

Logging into the new systems

  • As before, use your authentication token (YubiKey) along with your username to log in:
      ssh -X -l username cheyenne.ucar.edu
  • You will then be on one of six login nodes
  • Your default shell is tcsh, but others are available through SAM
  • SUSE Linux OS provides typical UNIX commands
  • Users of the test system should replace “cheyenne” with “laramie” where appropriate (see the example below)
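
The same command works for the test system; a minimal example, assuming the Laramie login nodes are reached at laramie.ucar.edu:

    # Log in to the Laramie test system with X11 forwarding enabled
    ssh -X -l username laramie.ucar.edu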

SLIDE 8

The login nodes are a shared resource - use them lightly!

  • As with Yellowstone, the six login nodes on Cheyenne will be a shared space
  • Your processes will compete with those of 10-100s of other users for processing and memory
  • So limit your usage to:
  • Reading and writing text/code
  • Compiling programs
  • Performing small data transfers
  • Interacting with the job scheduler
  • Programs that use excessive resources on the login nodes will be terminated

SLIDE 9

CISL builds software for users to load with environment modules

  • We build programs and libraries that you can enable by loading an environment module
  • Compilers, MPI, NetCDF, MKL, Python, etc.
  • Modules configure your computing environment so you can find binaries/executables, libraries and headers to compile with, and manuals to reference
  • Modules are also used to prevent conflicting software from being loaded
  • You don’t need to use modules, but they simplify things greatly, and we recommend their use

SLIDE 10

Note that Yellowstone and Cheyenne will each have their own module/software tree!

SLIDE 11

The Cheyenne module tree will add choice and clarity

[Diagram: module trees on the two systems]
  • Yellowstone tree: Compiler (Intel, GNU) → MKL, netCDF, pnetCDF
  • Cheyenne tree: Compiler (Intel 16.0.3, Intel 17.0.0, GNU 6.2.0) → MPI (SGI MPT 2.15, Intel MPI 5.1.3.210, OpenMPI 10.2.0) → MKL, netCDF, pnetCDF (e.g., software built with Intel 16.0.3 and SGI MPT 2.15)

SLIDE 12

Some useful module commands

  • module add/remove/load/unload <software>
  • module avail - show all community software installed on the system
  • module list - show all software currently loaded within your environment
  • module purge - clear your environment of all loaded software
  • module save/restore <name> - create or load a saved set of software
  • module show <software> - show the commands a module runs to configure your environment
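
As an illustration, a typical session might look like the sketch below; the specific module names and versions (intel/17.0.0, mpt/2.15) are assumptions for the example, so check module avail for what is actually installed:

    module purge                 # start from a clean environment
    module load intel/17.0.0     # load a compiler (version assumed for illustration)
    module load mpt/2.15         # load a matching MPI library (SGI MPT)
    module load netcdf           # load a library built for that compiler/MPI pair
    module list                  # confirm what is loaded
    module save my_defaults      # save this set; bring it back later with "module restore my_defaults"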

SLIDE 13

Compiling software on Cheyenne

  • We will support Intel, GCC, and PGI
  • As on Yellowstone, wrapper scripts are loaded by default (ncarcompilers module) which make including code and linking to libraries much easier
  • Building with netCDF using the wrappers:
      ifort model.f90 -o model
  • Building with netCDF without the wrappers:
      setenv NETCDF /path/to/netcdf
      ifort -I${NETCDF}/include model.f90 -L${NETCDF}/lib -lnetcdff -o model
  • Do not expect a parallel program compiled with one MPI library to run using a different library!
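
For MPI programs the same pattern applies; a minimal sketch, assuming the loaded MPI module provides the usual mpif90 wrapper and that you are building against SGI MPT:

    # Compile an MPI Fortran program with the currently loaded compiler and MPI library
    mpif90 mpi_model.f90 -o mpi_model

    # Run it (inside a batch job) under the same MPI library it was built with - SGI MPT here
    mpiexec_mpt -n 144 ./mpi_model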

SLIDE 14

Where you compile code depends on where you intend to run it

  • Cheyenne has newer Intel processors than Yellowstone and Caldera, which in turn have newer chips than Geyser
  • If you must run a code across systems, either:
  • 1. Compile for the oldest system you want to use, to ensure that results are consistent
  • 2. For best performance, make copies of the code and compile separately for each system (see the sketch below)
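
With the Intel compiler, one way to express these two options is through the target instruction set; the flags below are illustrative assumptions about which vector instructions the older and newer systems support:

    # Option 1: one portable binary, targeting the oldest instruction set you
    # plan to run on (AVX is assumed here as the lowest common denominator)
    ifort -xAVX model.f90 -o model_portable

    # Option 2: best performance, a separate copy per system
    # (AVX2 for Cheyenne's Broadwell processors)
    ifort -xCORE-AVX2 model.f90 -o model_cheyenne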

SLIDE 15

To access compute resources, use the PBS job manager

PBS (Cheyenne)

    #!/bin/bash
    #PBS -N WRF_PBS
    #PBS -A <project>
    #PBS -q regular
    #PBS -l walltime=00:30:00
    #PBS -l select=4:ncpus=36:mpiprocs=36
    #PBS -j oe
    #PBS -o log.oe

    # Run WRF with SGI MPT
    mpiexec_mpt -n 144 ./wrf.exe

LSF (Yellowstone)

    #!/bin/bash
    #BSUB -J WRF_PBS
    #BSUB -P <project>
    #BSUB -q regular
    #BSUB -W 30:00
    #BSUB -n 144
    #BSUB -R "span[ptile=16]"
    #BSUB -o log.oe
    #BSUB -e log.oe

    # Run WRF with IBM MPI
    mpirun.lsf ./wrf.exe

SLIDE 16

A (high-memory) shared queue will be available on Cheyenne

Queue name   Priority   Wall clock (hours)   Nodes         Queue factor   Description
capability   1          12                   1153 - 4032   1.0            Execution window: midnight Friday to 6 a.m. Monday
premium      1          12                   ≤ 1152        1.5
share        1          6                    1             2.0            Interactive use for debugging and other tasks on a single, shared, 128-GB node
small        1.5        2                    ≤ 18          1.5            Interactive and batch use for testing, debugging, profiling; no production workloads
regular      2          12                   ≤ 1152        1.0
economy      3          12                   ≤ 1152        0.7
standby      4          12                   ≤ 1152        0.0            Do not submit to standby; used when you have exceeded usage or allocation limits

SLIDE 17

Submitting jobs to and querying information from PBS

  • To submit a job to PBS, use qsub:
  • Script: qsub job_script.pbs
  • Interactive: qsub -I -l select=1:ncpus=36:mpiprocs=36 -l walltime=10:00 -q share -A <project>

  • qstat <job_id> - query information about the job
  • qstat -u $USER - summary of your active jobs
  • qstat -Q <queue> - show status of specified or all queues
  • qdel <job_id> - delete and/or kill the specified job

It is not possible to search for backfill windows in PBS!
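
Putting these commands together, a typical workflow looks like the sketch below; the job ID shown is a placeholder, as PBS prints the real one when you submit:

    qsub job_script.pbs      # PBS returns a job ID (e.g., 123456.<server>, placeholder)
    qstat -u $USER           # summary of your queued and running jobs
    qstat 123456             # details for that job, using the numeric part of the ID
    qdel 123456              # remove the job if it is no longer needed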

SLIDE 18

Using threads/OpenMP to exploit shared-memory parallelism

Only OpenMP

    #!/bin/tcsh
    #PBS -N OPENMP
    #PBS -A <project>
    #PBS -q small
    #PBS -l walltime=10:00
    #PBS -l select=1:ncpus=10
    #PBS -j oe
    #PBS -o log.oe

    # Run program with 10 threads
    ./executable_name

Hybrid MPI/OpenMP

    #!/bin/tcsh
    #PBS -N HYBRID
    #PBS -A <project>
    #PBS -q small
    #PBS -l walltime=10:00
    #PBS -l select=2:ncpus=36:mpiprocs=1:ompthreads=36
    #PBS -j oe
    #PBS -o log.oe

    ### Make sure threads are distributed across the node
    setenv MPI_OPENMP_INTEROP 1

    # Run program with one MPI task and 36 OpenMP
    # threads per node (two nodes)
    mpiexec_mpt ./executable_name

SLIDE 19

Pinning threads to CPUs with SGI MPT’s omplace command

  • Normally threads will migrate across available CPUs throughout execution
  • Sometimes it is advantageous to “pin” threads to a particular CPU (e.g., OpenMP across a socket)

    #PBS -l select=2:ncpus=36:mpiprocs=2:ompthreads=18

    # Need to turn off Intel affinity management as it interferes with omplace
    setenv KMP_AFFINITY disabled

    # Run program with one MPI task and 18 OpenMP threads per socket
    # (two per node with two nodes)
    mpiexec_mpt omplace ./executable_name

SLIDE 20

Managing your compute time allocation

  • After compiling a program, try running small test jobs before your large simulation
  • For single-core jobs, use the share queue to avoid being charged for unused core-hours (a worked example follows this list):
      Exclusive: wall-clock hours ✖ nodes used ✖ 36 cores per node ✖ queue factor
      Shared: core-seconds / 3600 ✖ queue factor
  • Use the DAV clusters for R and Python scripts as well as interactive visualization (VAPOR)
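
As a worked example with hypothetical job sizes, the two charging formulas give:

    Exclusive: a 2-hour run on 4 nodes in the regular queue (factor 1.0)
        2 hours ✖ 4 nodes ✖ 36 cores per node ✖ 1.0 = 288 core-hours

    Shared: one core used for 2 hours (7200 core-seconds) in the share queue (factor 2.0)
        7200 / 3600 ✖ 2.0 = 4 core-hours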

SLIDE 21

How to store data on Cheyenne

File space   Path                    Quota          Data safety             Description
Home         /glade/u/home/$USER     50 GB          Backups & snapshots     Store settings, code, and other valuables
Work         /glade/p/work/$USER     512 GB         Stable but no backups   Good place for keeping run directories and input data
Project      /glade/p/project        Varies         Stable but no backups
HPSS         hsi -> /home/$USER      TB/yr charge   Stable but no backups   Storage limits depend on your allocation; data cannot be used interactively
Scratch      /glade/scratch/$USER    10 TB          At-risk! Purged!        Use as temporary data storage only; manually back up files (e.g., to HPSS)

SLIDE 22

Storage tips

  • Keep track of your allocations using “gladequota”
  • Archive large numbers of small files to limit wasted space on GLADE spaces
  • If data is not needed for immediate access, move it to the HPSS tape archive:
  • hsi cput <filename>
  • hsi cget <filename>
  • Large collections of files can be combined while transferring to HPSS using HTAR. Efficient!

  • htar -cvf <archive.tar> <directory>
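
A short sketch of the full round trip; the file and directory names are placeholders, and cput/cget and the -c/-t/-x options are the standard hsi and HTAR operations:

    hsi cput run_output.nc            # copy a file to HPSS if not already archived
    hsi cget run_output.nc            # retrieve it later to the current directory
    htar -cvf wrf_run.tar ./wrf_run   # bundle a run directory into one archive on HPSS
    htar -tvf wrf_run.tar             # list the archive contents without retrieving it
    htar -xvf wrf_run.tar             # extract the archive back onto GLADE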
SLIDE 23

The future of the DAV systems

  • Geyser and Caldera will continue to serve as data analysis and visualization machines
  • Integration with Cheyenne is still TBD
  • Current plan is to make 4 of the 12 Geyser nodes available within Cheyenne using the SLURM scheduler
  • Caldera will likely be accessible only from Yellowstone
  • In early stages of a procurement for a Geyser replacement and a many-core system (target: 2018)

SLIDE 24

Things to keep in mind...

  • Yellowstone, Geyser, and Caldera will continue to run the LSF scheduler. Keep your job scripts organized.
  • The shared file systems should make data management easier, but pay attention to where you have compiled programs.
  • If you want settings in your startup files that are specific to Yellowstone or Cheyenne, make sure they only run on that system...

SLIDE 25

How to make .tcshrc/.profile machine-specific

~/.profile (bash)

    alias rm="rm -i"
    PS1="\u@\h:\w> "
    if [[ $HOSTNAME == yslogin* ]]; then
        # Yellowstone settings
        alias bstat="bjobs -u all"
    else
        # Cheyenne settings
        alias qjobs="qstat -u $USER"
    fi

~/.tcshrc

    tty > /dev/null
    if ( $status == 0 ) then
        alias rm "rm -i"
        set prompt = "%n@%m:%~"
        if ( $HOSTNAME =~ yslogin* ) then
            # Yellowstone settings
            alias bstat "bjobs -u all"
        else
            # Cheyenne settings
            alias qjobs "qstat -u $USER"
        endif
    endif

SLIDE 26

CISL Helpdesk/Consulting

https://www2.cisl.ucar.edu/user-support/getting-help

  • Walk-in: ML 1B Suite 55
  • Email: cislhelp@ucar.edu
  • Phone: 303-497-2400

Specific questions from today and/or feedback:

  • Email: vanderwb@ucar.edu

For science questions (e.g., running CESM/WRF), consult relevant support resources