Introduction to Cheyenne
3 November, 2016
Consulting Services Group
Brian Vanderwende
Topics we will cover
- Technical specs of the Cheyenne supercomputer
and expanded GLADE file systems
- The Cheyenne computing environment
- Accessing software on Cheyenne
- Compilers
- MPI/Parallelism
- Submitting batch jobs using the PBS scheduler
- Data storage
- Q&A
User-facing hardware specifications
- 4032 dual-socket nodes
- 18-core 2.3 GHz Intel Xeon (Broadwell) processors
- 36 total CPUs per node (16 on Yellowstone)
- Hyperthreading supported for up to 72 virtual CPUs
- Regular and high-memory nodes
- 3168 nodes with 64 GB of memory
- 864 nodes with 128 GB of memory
- Laramie test system has 70 usable nodes with 64 GB memory
- InfiniBand interconnects for message passing
- Six login nodes with 256 GB of memory
The GLADE file systems will be expanded accordingly
- Will continue to use IBM GPFS/Spectrum Scale
technology
- Existing capacity: 16 PB
- New capacity to be added: 21 PB
- Total capacity of 37 PB, with potential for
expansion to 58 PB in future upgrades
- Data transfer rates will be more than doubled
- Home, work, and scratch spaces will be shared
between Yellowstone and Cheyenne!
Cheyenne is an evolutionary increase from Yellowstone
                 Yellowstone      Cheyenne
Peak compute     1.5 petaflops    5.34 petaflops
Cores            72,256           145,152
Total memory     145 TB           313 TB
Interconnects    56 Gb/s          100 Gb/s
1 Yellowstone core-hour = 0.82 Cheyenne core-hours
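By this conversion, for example, a project that used 10,000 core-hours on Yellowstone would need about 10,000 × 0.82 = 8,200 core-hours to do equivalent work on Cheyenne.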
Timeline for HPC/Cheyenne
- 1. Test system (Laramie) in place since July
- 2. Cheyenne assembled in August
- 3. Cheyenne shipped to NWSC in September
- 4. Acceptance testing and integration with file
systems in the fall
- 5. NCAR acceptance in December
- 6. Start of production on Cheyenne: January 2017
a. Accelerated Scientific Discovery (ASD) projects begin (early user access in December 2016)
- 7. Yellowstone production ends: December 2017
Logging into the new systems
- As before, use your authentication token (yubikey)
along with your username to log in:
  ssh -X -l username cheyenne.ucar.edu
- You will then be on one of six login nodes
- Your default shell is tcsh, but others are available
through SAM
- SUSE Linux OS provides typical UNIX commands
- Users of the test system should replace “cheyenne”
with “laramie” where appropriate
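If you log in frequently, an entry in your SSH client configuration can shorten the command; a minimal sketch (the "cheyenne" host alias and username are illustrative):
  # ~/.ssh/config
  Host cheyenne
      HostName cheyenne.ucar.edu
      User username
      ForwardX11 yes    # equivalent to the -X flag
With this in place, "ssh cheyenne" is equivalent to the full command above.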
The login nodes are a shared resource - use them lightly!
- As with Yellowstone, the six login nodes on
Cheyenne will be a shared space
- Your processes will compete with those of 10s-100s
of other users for processor time and memory
- So limit your usage to:
- Reading and writing text/code
- Compiling programs
- Performing small data transfers
- Interacting with the job scheduler
- Programs that use excessive resources on the
login nodes will be terminated
CISL builds software for users to load with environment modules
- We build programs and libraries that you can
enable by loading an environment module
- Compilers, MPI, NetCDF, MKL, Python, etc.
- Modules configure your computing environment so
you can find binaries/executables, libraries and headers to compile with, and manuals to reference
- Modules are also used to prevent conflicting
software from being loaded
- You don’t need to use modules, but they simplify
things greatly, and we recommend their use
Note that Yellowstone and Cheyenne will each have their own module/software tree!
The Cheyenne module tree will add choice and clarity
- Yellowstone tree: compiler (Intel, GNU) -> software built with that compiler (MKL, netCDF, pnetCDF, ...)
- Cheyenne tree: compiler (Intel 16.0.3, Intel 17.0.0, GNU 6.2.0) -> MPI library (SGI MPT 2.15, Intel MPI 5.1.3.210, OpenMPI 10.2.0) -> software built with that compiler/MPI pair (MKL, netCDF, pnetCDF, ...; e.g., netCDF built with Intel 16.0.3 and MPT 2.15)
Some useful module commands
- module add/remove/load/unload <software>
- module avail - show all community software
installed on the system
- module list - show all software currently loaded
within your environment
- module purge - clear your environment of all
loaded software
- module save/restore <name> - create or load a
saved set of software
- module show <software> - show the commands
a module runs to configure your environment
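Putting these commands together, a typical session might look like this (a sketch; the module versions follow the tree above and may differ on your system):
  module purge                # start from a clean environment
  module load intel/16.0.3    # choose a compiler...
  module load mpt/2.15        # ...then an MPI library built with it
  module load netcdf          # ...then libraries built with both
  module list                 # confirm what is loaded
  module save mydefaults      # snapshot; recover later with "module restore mydefaults"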
Compiling software on Cheyenne
- We will support Intel, GCC, and PGI
- As on Yellowstone, wrapper scripts are loaded by
default (ncarcompilers module) which make including code and linking to libraries much easier
- Building with netCDF using the wrappers:
- ifort model.f90 -o model
- Building with netCDF without the wrappers:
- setenv NETCDF /path/to/netcdf
  ifort -I${NETCDF}/include model.f90 -L${NETCDF}/lib -lnetcdff -o model
- Do not expect a parallel program compiled with one
MPI library to run using a different library!
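In that spirit, a minimal sketch of building an MPI program with the wrappers loaded (the source file name is illustrative; mpif90 resolves to whichever compiler and MPI modules you have loaded):
  module load intel mpt          # compiler + MPI (versions omitted)
  mpif90 mpi_model.f90 -o mpi_model
If you later switch MPI modules (e.g., from MPT to Intel MPI), recompile before running.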
Where you compile code depends on where you intend to run it
- Cheyenne has newer Intel processors than
Yellowstone and Caldera, which in turn have newer chips than Geyser
- If you must run a code across systems, either:
- 1. Compile for the oldest system you want to use,
to ensure that results are consistent
- 2. For best performance, make copies of the code
and compile separately for each system
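One illustrative way to do this with the Intel compiler (the specific -x target is an assumption; match it to the oldest chip you must support):
  # Option 1: target an older instruction set so one binary runs everywhere
  ifort -xSSE4.2 model.f90 -o model_portable
  # Option 2: tune for the host you compile on; build one copy per system
  ifort -xHost model.f90 -o model_fast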
To access compute resources, use the PBS job manager
PBS (Cheyenne):

  #!/bin/bash
  #PBS -N WRF_PBS
  #PBS -A <project>
  #PBS -q regular
  #PBS -l walltime=00:30:00
  #PBS -l select=4:ncpus=36:mpiprocs=36
  #PBS -j oe
  #PBS -o log.oe

  # Run WRF with SGI MPT
  mpiexec_mpt -n 144 ./wrf.exe

LSF (Yellowstone):

  #!/bin/bash
  #BSUB -J WRF_PBS
  #BSUB -P <project>
  #BSUB -q regular
  #BSUB -W 30:00
  #BSUB -n 144
  #BSUB -R "span[ptile=16]"
  #BSUB -o log.oe
  #BSUB -e log.oe

  # Run WRF with IBM MPI
  mpirun.lsf ./wrf.exe
A (high-memory) shared queue will be available on Cheyenne
Queue name  Priority  Wall clock (hours)  Nodes        Queue factor  Description
capability  1         12                  1153 - 4032  1.0           Execution window: midnight Friday to 6 a.m. Monday
premium     1         12                  ≤ 1152       1.5
share       1         6                   1            2.0           Interactive use for debugging and other tasks on a single, shared, 128-GB node
small       1.5       2                   ≤ 18         1.5           Interactive and batch use for testing, debugging, profiling; no production workloads
regular     2         12                  ≤ 1152       1.0
economy     3         12                  ≤ 1152       0.7
standby     4         12                  ≤ 1152       0.0           Do not submit to standby; it is used when you have exceeded usage or allocation limits
Submitting jobs to and querying information from PBS
- To submit a job to PBS, use qsub:
- Script: qsub job_script.pbs
- Interactive: qsub -I -l select=1:ncpus=36:mpiprocs=36 -l
walltime=10:00 -q share -A <project>
- qstat <job_id> - query information about the job
- qstat -u $USER - summary of your active jobs
- qstat -Q <queue> - show status of specified or all
queues
- qdel <job_id> - delete the specified job, killing it first if it is running
It is not possible to search for backfill windows in PBS!
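Putting the commands above together, a typical round trip looks like this (the job ID shown is illustrative):
  qsub run_model.pbs    # submission prints a job ID, e.g., 123456
  qstat 123456          # check on that job
  qstat -u $USER        # or summarize all of your jobs
  qdel 123456           # remove the job if something is wrong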
Using threads/OpenMP to exploit shared-memory parallelism
Only OpenMP:

  #!/bin/tcsh
  #PBS -N OPENMP
  #PBS -A <project>
  #PBS -q small
  #PBS -l walltime=10:00
  #PBS -l select=1:ncpus=10
  #PBS -j oe
  #PBS -o log.oe

  # Run program with 10 threads
  ./executable_name

Hybrid MPI/OpenMP:

  #!/bin/tcsh
  #PBS -N HYBRID
  #PBS -A <project>
  #PBS -q small
  #PBS -l walltime=10:00
  #PBS -l select=2:ncpus=36:mpiprocs=1:ompthreads=36
  #PBS -j oe
  #PBS -o log.oe

  ### Make sure threads are distributed across the node
  setenv MPI_OPENMP_INTEROP 1

  # Run program with one MPI task and 36 OpenMP
  # threads per node (two nodes)
  mpiexec_mpt ./executable_name
Pinning threads to CPUs with SGI MPT’s omplace command
- Normally threads will migrate across available
CPUs throughout execution
- Sometimes it is advantageous to “pin” threads to a
particular CPU (e.g., OpenMP across a socket)
  #PBS -l select=2:ncpus=36:mpiprocs=2:ompthreads=18

  # Need to turn off Intel affinity management as it interferes with omplace
  setenv KMP_AFFINITY disabled

  # Run program with one MPI task and 18 OpenMP threads per socket
  # (two per node with two nodes)
  mpiexec_mpt omplace ./executable_name
Managing your compute time allocation
- After compiling a program, try running small test
jobs before your large simulation
- For single-core jobs, use the share queue to avoid
being charged for unused core-hours. Charges are computed as follows (a worked example follows this list):
  Exclusive: wall-clock hours × nodes used × 36 cores per node × queue factor
  Shared: (core-seconds / 3600) × queue factor
- Use the DAV clusters for R and Python scripts as
well as interactive visualization (VAPOR)
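To illustrate the charging formulas with made-up numbers: a 3-hour exclusive job on 4 nodes in the regular queue (factor 1.0) is charged 3 × 4 × 36 × 1.0 = 432 core-hours, while a single-core share-queue task running for 3 hours is charged (3 × 3600) / 3600 × 2.0 = 6 core-hours.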
How to store data on Cheyenne
File space  Path                   Quota         Data safety            Description
Home        /glade/u/home/$USER    50 GB         Backups & snapshots    Store settings, code, and other valuables
Work        /glade/p/work/$USER    512 GB        Stable but no backups  Good place for keeping run directories and input data
Project     /glade/p/project       Varies        Stable but no backups
HPSS        hsi -> /home/$USER     TB/yr charge  Stable but no backups  Storage limits depend on your allocation; data cannot be used interactively
Scratch     /glade/scratch/$USER   10 TB         At-risk: purged!       Use as temporary data storage only; manually back up files (e.g., to HPSS)
Storage tips
- Keep track of your allocations using “gladequota”
- Archive large numbers of small files to limit wasted
space on the GLADE file systems
- If data is not needed for immediate access, move
to the HPSS tape archive:
- hsi cput <filename>
- hsi cget <filename>
- Large collections of files can be combined while
transferring to HPSS using HTAR. Efficient!
- htar -cvf <archive.tar> <directory>
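- The complementary command extracts an archive stored on HPSS back to disk:
- htar -xvf <archive.tar>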
The future of the DAV systems
- Geyser and Caldera will continue to serve as data
analysis and visualization machines
- Integration with Cheyenne is still TBD
- Current plan is to make 4 of the 12 Geyser nodes available
within Cheyenne using the SLURM scheduler
- Caldera will likely be accessible only from Yellowstone
- We are in the early stages of a procurement for a
Geyser replacement and a many-core system (target: 2018)
Things to keep in mind...
- Yellowstone, Geyser, and Caldera will continue to
run the LSF scheduler. Keep your job scripts
organized.
- The shared file systems should make data
management easier, but pay attention to where you have compiled programs.
- If you configure settings in your startup files that are
specific to Yellowstone or Cheyenne, make sure they run only on the intended system...
How to make .tcshrc/.profile machine specific
~/.tcshrc:

  tty > /dev/null
  if ( $status == 0 ) then
      alias rm "rm -i"
      set prompt = "%n@%m:%~"
      if ( $HOSTNAME =~ yslogin* ) then
          # Yellowstone settings
          alias bstat "bjobs -u all"
      else
          # Cheyenne settings
          alias qjobs "qstat -u $USER"
      endif
  endif

~/.profile (bash):

  alias rm="rm -i"
  PS1="\u@\h:\w> "

  if [[ $HOSTNAME == yslogin* ]]; then
      # Yellowstone settings
      alias bstat="bjobs -u all"
  else
      # Cheyenne settings
      alias qjobs="qstat -u $USER"
  fi
CISL Helpdesk/Consulting
https://www2.cisl.ucar.edu/user-support/getting-help
- Walk-in: ML 1B Suite 55
- Email: cislhelp@ucar.edu
- Phone: 303-497-2400
Specific questions from today and/or feedback:
- Email: vanderwb@ucar.edu