Cray Programming Environment Update & Roadmap
This Presentation May Contain Some Preliminary Information, Subject To Change
Luiz DeRose, Programming Environment Director, Cray Inc.
May 6, 2008 Cray Inc. Proprietary
Cray Programming Environment Focus
It is the role of the Programming Environment to close the gap between
The Cray Programming Environment addresses issues of scale and
User productivity is enhanced with
Programming Languages
Fortran, C, C++, Chapel (#), Java (service nodes)
Programming models
Distributed Memory
Shared Memory
PGAS
Tools
Environment setup
Debuggers
Performance analysis
Optimized Math Libraries
LibSci
Cray PETSc
CRAFFT (2, #)
Fast-mv (2, #)
Notes: 1 = X2 only; 2 = XT only; # = under development
[Figure: Programming Environment release roadmap by quarter, 2008 Q1 through 2011 Q4. Version markers per row:]
Cascade Debugger (CDB): 1.0, 1.1, 2.0
Cray Performance Tools (CPT): 4.2, 5.0, 5.1, 6.0, 4.3
Message Passing Toolkit (MPT): 3.0, 3.1, 4.0, 5.1, 4.1, 5.0
Scientific Libraries (LibSci): 10.2.1, 10.3, 11.0, 11.1, 12.0, 10.4, 12.1
Chapel: 0.7, 1.0, 1.2, 2.0, 1.1, 2.1, 3.0, 3.1
Cray Compiling Environment (CCE): 7.0, 7.1, 7.2, 8.0
Other chart labels: PE 6.0; Brule, Calhoun, Diamond, Eagle, Alpine
PGI
PathScale
GNU
UPC
Cray MPI 3.0 (released in April 2008)
Cray XT MPI 3.0 uses Cray X2 MPI as its base, merged with MPICH 1.0.5
Single copy
Activated for messages of 128K bytes and above
Huge improvements for small to medium messages
SMP aware collective
default
43% gain in the Barotropic phase
Must be easy and flexible to use
Integrated performance tools solution
# You can edit this file, if desired, and use it
# to reinstrument the program for tracing like this:
#
#     pat_build -O ft.ind.B.2+pat+5257-770sdt.apa
#
# These suggested trace options are based on data from:
#
#     /work/users/luizd/COE_Workshop/run/ft.ind.B.2+pat+5257-770sdt.xf
# ----------------------------------------------------------------------
# HWPC group to collect by default.
# ----------------------------------------------------------------------
# Libraries to trace.
# ----------------------------------------------------------------------
# User-defined functions to trace, sorted by % of samples.
# Limited to top 200. A function is commented out if it has < 1%
# of samples, or if a cumulative threshold of 90% has been reached.
# Note: -u should NOT be specified as an additional option.
# 37.70%
# 26.23%
# 9.37%
# 8.96%
# 7.82%
# Functions below this point account for less than 10% of samples.
# 6.43%
# -T transpose2_finish_
# 2.72%
# -T cfftz_
# 0.48%
# -T vranlc_
# 0.28%
# -T compute_indexmap_
# ----------------------------------------------------------------------
/work/users/luizd/COE_Workshop/bin/ft.ind.B.2 # Original program.
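The selection rule stated in the file's comments (comment out a function if it has under 1% of samples, or once a cumulative 90% has been reached) can be sketched in a few lines. This is only one plausible reading of that rule, not the tool's actual implementation. The function `select_traced` and the placeholder names `func_1` to `func_5` are hypothetical, since the listing does not show the names of the top five functions; the percentages are taken from the listing.

```python
def select_traced(samples, min_pct=1.0, cumulative_pct=90.0):
    """Split (name, pct) pairs into traced and commented-out names.

    Assumes `samples` is sorted by descending percentage. A function is
    commented out if it has fewer than `min_pct` percent of samples, or
    if the functions listed before it already reach `cumulative_pct`.
    """
    traced, commented = [], []
    running = 0.0  # cumulative percentage of the functions seen so far
    for name, pct in samples:
        if pct < min_pct or running >= cumulative_pct:
            commented.append(name)
        else:
            traced.append(name)
        running += pct
    return traced, commented


# Percentages from the listing; func_1..func_5 are placeholder names.
samples = [
    ("func_1", 37.70), ("func_2", 26.23), ("func_3", 9.37),
    ("func_4", 8.96), ("func_5", 7.82),
    ("transpose2_finish_", 6.43), ("cfftz_", 2.72),
    ("vranlc_", 0.48), ("compute_indexmap_", 0.28),
]

traced, commented = select_traced(samples)
print("traced:   ", traced)
print("commented:", commented)
```

Under this reading the first five functions add up to 90.08% of samples, so the threshold is crossed by the fifth entry and the remaining four (9.91% in total) are left commented out, which matches the listing.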
Released LibSci 10.2.0 (and 10.2.1)
Released PETSc 2.3.3
Released IRT 2.0 automatic interfaces
LibSci 10.3.0 will contain considerable performance improvements
Solves linear systems in single precision whilst obtaining solutions
Serial and parallel versions of LU, Cholesky, and QR
With LibSci-10.2.0, there are now 2 ways to use the library
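The general technique of mixed-precision iterative refinement can be illustrated with a small, self-contained sketch: factor and solve in single precision, then correct the solution with residuals computed in double precision. This is not the IRT interface, whose signatures are not shown in these slides. All function names here (`f32`, `lu_factor_f32`, `lu_solve_f32`, `refine_solve`) are hypothetical, the test matrix is made up for illustration, and single precision is simulated by rounding through the standard `struct` module.

```python
import struct


def f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(a):
    """In-place style LU with partial pivoting, every result rounded to float32."""
    n = len(a)
    lu = [[f32(v) for v in row] for row in a]
    perm = list(range(n))  # perm[i] = original index of the row now at position i
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] = f32(lu[i][k] / lu[k][k])
            for j in range(k + 1, n):
                lu[i][j] = f32(lu[i][j] - f32(lu[i][k] * lu[k][j]))
    return lu, perm


def lu_solve_f32(lu, perm, b):
    """Forward and back substitution in simulated single precision."""
    n = len(lu)
    y = [f32(b[perm[i]]) for i in range(n)]
    for i in range(n):                      # solve L y = P b (unit diagonal)
        for j in range(i):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
    for i in reversed(range(n)):            # solve U x = y
        for j in range(i + 1, n):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
        y[i] = f32(y[i] / lu[i][i])
    return y


def refine_solve(a, b, tol=1e-12, max_iter=20):
    """Single-precision solve, then corrections driven by double residuals."""
    lu, perm = lu_factor_f32(a)
    x = lu_solve_f32(lu, perm, b)           # single-precision first guess
    for _ in range(max_iter):
        # Residual r = b - A x, computed in double precision.
        r = [bi - sum(aij * xj for aij, xj in zip(row, x))
             for row, bi in zip(a, b)]
        if max(abs(v) for v in r) <= tol * max(abs(v) for v in b):
            break
        d = lu_solve_f32(lu, perm, r)       # reuse the single-precision factors
        x = [xi + di for xi, di in zip(x, d)]
    return x


# Illustrative, well-conditioned system with known solution (0.1, 0.2, 0.3).
A = [[4.0, 1.0, 2.0], [1.0, 3.0, 0.0], [2.0, 0.0, 5.0]]
x_true = [0.1, 0.2, 0.3]
b = [sum(aij * xj for aij, xj in zip(row, x_true)) for row in A]

x_single = lu_solve_f32(*lu_factor_f32(A), b)
x_refined = refine_solve(A, b)
err_single = max(abs(u - v) for u, v in zip(x_single, x_true))
err_refined = max(abs(u - v) for u, v in zip(x_refined, x_true))
print("single-precision error:", err_single)
print("refined error:         ", err_refined)
```

The expensive factorization is done once in the lower precision and reused for every correction step, which is where the speed-up over a full-precision solver would come from. The correction loop only converges when the matrix is reasonably well conditioned relative to single precision.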
[Figure: speed-up of IRT over the full-precision solver versus matrix size (500 to 5000), for LU, Cholesky, and QR; speed-up axis from 0.8 to 2]
[Figure: speed-up versus condition number (1.E+1 to 1.E+9) for n=1000, n=2000, and n=3000, measured with irt_lu_real_serial; speed-up axis from 0.2 to 1.6. Annotated regions: IRT works well, IRT may help, IRT will not help]
rf heating in tokamak
Maxwell-Boltzmann Eqns
FFT
Dense linear system
Calc Quasi-linear op
INCITE: “High Power
Courtesy Richard Barrett
[Figure: scientific library roadmap by quarter, 1Q07 through 1Q09, with "today" marked; platforms XT4, XT5, Baker; rows LibSci, ACML, PETSc, FFTW, CASK, CRAFFT, Fast Libm]
LibSci labels: 10.1.0; IRT2.0; 10.2.1; Quad Core Tuning; 10.2.0; LibGoto BLAS, LAPACK; Mixed-mode ScaLAPACK; 10.3.0; 11.0.0; Baker Support; CAF-ScaLAPACK
PETSc labels: 2.3.3; 2.3.4; PETSc + CASK
Other version labels: 3.0, 3.6, 3.2, 4.1 (after the row names); 1.0, 1.1 (after CASK); 1.0, 3.1 (after CRAFFT); 1.0 (after Fast Libm)
May 6, 2008 Cray Inc. Proprietary Slide 26
May 6, 2008 Cray Inc. Proprietary Slide 27