Leveraging modern supercomputing infrastructure for tensor contractions in large electronic-structure calculations
Ilya A. Kaliman
University of Southern California
September 18-19, 2017
Tensors in Quantum Chemistry
ĤΨ = EΨ
Coupled Cluster Equations
– Permutational symmetry, e.g. a_ji = −a_ij
– Spin symmetry
– Molecular point-group symmetry
Block types:
– Canonical tensor blocks
– Non-canonical blocks (computed from canonical blocks)
– Zero blocks
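As a minimal sketch of how a non-canonical block could be obtained from a canonical one, assume a two-index antisymmetric tensor, a_ji = −a_ij, whose canonical block (I, J) is stored as a dense row-major array. The function name and the storage layout are illustrative assumptions, not code from the presentation.

```c
#include <stddef.h>

/*
 * Build the non-canonical block (J, I) of an antisymmetric two-index tensor
 * from its canonical block (I, J), using a_ji = -a_ij.
 * canon is rows x cols (row-major); out is cols x rows (row-major).
 */
void noncanonical_from_canonical(const double *canon, size_t rows,
                                 size_t cols, double *out)
{
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            out[j * rows + i] = -canon[i * cols + j];
}
```

Only the canonical block has to be stored; the permuted, sign-flipped copy can be produced on demand when a contraction needs it.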
[Figure: tensors A, B, and C partitioned into blocks A11 to A33, B11 to B33, and C11 to C33]
Only the canonical blocks (orange) need to be computed; they can be computed independently, in parallel.
C11 = A11⊗B11 + A21⊗B12
C12 = A12⊗B11 + A22⊗B12
⊗: unfolding + BLAS/BLIS
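The block formulas above can be sketched in code under one simplifying assumption: each tensor block has already been unfolded into a row-major matrix, so that ⊗ becomes a matrix multiplication accumulated into the result block. The naive triple loop below is only a stand-in for a BLAS/BLIS `dgemm` call, and the function names are hypothetical.

```c
#include <stddef.h>

/* c (m x n) += a (m x k) * b (k x n), all row-major.
 * Stand-in for a BLAS/BLIS dgemm call on the unfolded blocks. */
void gemm_accumulate(size_t m, size_t n, size_t k,
                     const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < m; i++)
        for (size_t p = 0; p < k; p++) {
            double aip = a[i * k + p];
            for (size_t j = 0; j < n; j++)
                c[i * n + j] += aip * b[p * n + j];
        }
}

/*
 * Compute one result block as a sum of block products, e.g.
 * C11 = A11 (x) B11 + A21 (x) B12 with nterms = 2.
 * Each a_blocks[t] is m x k, each b_blocks[t] is k x n, c is m x n.
 */
void contract_block(size_t nterms,
                    const double *const *a_blocks,
                    const double *const *b_blocks,
                    size_t m, size_t n, size_t k, double *c)
{
    for (size_t i = 0; i < m * n; i++)
        c[i] = 0.0;
    for (size_t t = 0; t < nterms; t++)
        gemm_accumulate(m, n, k, a_blocks[t], b_blocks[t], c);
}
```

Because each result block depends only on its own list of input block pairs, separate calls to `contract_block` do not share output memory and can run concurrently. If the contracted indices are not adjacent in memory, a block would first need to be permuted before it can be treated as a matrix; that step is omitted here.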
[Diagram: four CPUs accessing canonical tensor blocks held in shared memory]
[Diagram: six compute nodes accessing canonical tensor blocks stored on a shared filesystem]
[Diagram: six compute nodes accessing canonical tensor blocks stored on a shared filesystem, with a fast cache (SSD, etc.) in front of it]
http://www.nersc.gov/users/computational-systems/cori/burst-buffer/burst-buffer/
6.5 GB/s read/write bandwidth
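To make the idea of keeping tensor blocks on a filesystem or fast cache concrete, here is a minimal sketch of a disk-backed block store. It assumes all blocks live in a single file and are addressed by byte offset, with space handed out by a simple bump pointer. The `block_store` type and the `store_*` functions are hypothetical and are not the allocator described in the talk.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical disk-backed store: blocks live in one file, addressed by offset. */
struct block_store {
    FILE *fp;
    long next_offset;   /* bump pointer: first unreserved byte in the file */
};

/* Open (and truncate) the backing file; returns 0 on success. */
int store_open(struct block_store *s, const char *path)
{
    s->fp = fopen(path, "w+b");
    s->next_offset = 0;
    return s->fp ? 0 : -1;
}

/* Reserve space for n doubles; returns the byte offset of the new block. */
long store_alloc(struct block_store *s, size_t n)
{
    long off = s->next_offset;
    s->next_offset += (long)(n * sizeof(double));
    return off;
}

/* Write n doubles from buf to the block at byte offset off. */
int store_write(struct block_store *s, long off, const double *buf, size_t n)
{
    if (fseek(s->fp, off, SEEK_SET) != 0)
        return -1;
    return fwrite(buf, sizeof(double), n, s->fp) == n ? 0 : -1;
}

/* Read n doubles into buf from the block at byte offset off. */
int store_read(struct block_store *s, long off, double *buf, size_t n)
{
    if (fseek(s->fp, off, SEEK_SET) != 0)
        return -1;
    return fread(buf, sizeof(double), n, s->fp) == n ? 0 : -1;
}

void store_close(struct block_store *s)
{
    if (s->fp)
        fclose(s->fp);
    s->fp = NULL;
}
```

Pointing `path` at a burst-buffer or SSD mount instead of the shared filesystem would change where the blocks live without changing the calling code, which is the appeal of addressing blocks by offset rather than by memory pointer.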
– xm_contract(1.0, A, B, 2.0, C, "abcd", "ijcd", "ijab");
– xm_add(1.0, A, 2.0, B, "ij", "ji");
– ...
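The index strings in these calls can be read through a dense reference sketch. This is an interpretation, not the library's code: it assumes the add call means A_ij ← 1.0·A_ij + 2.0·B_ji, and that the contract call means C_ijab ← 1.0·Σ_cd A_abcd·B_ijcd + 2.0·C_ijab. The `dense_*` functions below are hypothetical and operate on plain row-major arrays, with the index pairs (ab), (cd), and (ij) each unfolded into a single composite index.

```c
#include <stddef.h>

/* Assumed meaning of an add with index strings "ij" and "ji":
 * a[i][j] = alpha * a[i][j] + beta * b[j][i]; a is ni x nj, b is nj x ni. */
void dense_add_ij_ji(double alpha, double *a, double beta, const double *b,
                     size_t ni, size_t nj)
{
    for (size_t i = 0; i < ni; i++)
        for (size_t j = 0; j < nj; j++)
            a[i * nj + j] = alpha * a[i * nj + j] + beta * b[j * ni + i];
}

/* Assumed meaning of a contraction with index strings "abcd", "ijcd", "ijab":
 * c[ij][ab] = alpha * sum_cd a[ab][cd] * b[ij][cd] + beta * c[ij][ab],
 * where each index pair is unfolded into one composite index. */
void dense_contract_unfolded(double alpha, const double *a, const double *b,
                             double beta, double *c,
                             size_t n_ab, size_t n_cd, size_t n_ij)
{
    for (size_t r = 0; r < n_ij; r++)
        for (size_t p = 0; p < n_ab; p++) {
            double sum = 0.0;
            for (size_t q = 0; q < n_cd; q++)
                sum += a[p * n_cd + q] * b[r * n_cd + q];
            c[r * n_ab + p] = alpha * sum + beta * c[r * n_ab + p];
        }
}
```

Indices that appear in both input strings (here c and d) are summed over, and the remaining indices label the result.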
– MPI-aware disk-backed memory allocator
– Code for tensor operations
– Auxiliary routines
– Static load balancing between the nodes (MPI)
– Dynamic load balancing within a node (OpenMP)
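The two levels of load balancing can be sketched as follows. The sketch assumes that static balancing means giving each rank a contiguous, near-equal range of block indices; the talk does not specify the actual partitioning scheme. In a real MPI program the rank and rank count would come from `MPI_Comm_rank` and `MPI_Comm_size`; they are plain arguments here so the example stays self-contained. All function names are hypothetical.

```c
#include <stddef.h>

/* Static partition (between nodes): rank owns blocks [*first, *last).
 * The remainder is spread over the lowest-numbered ranks. */
void static_block_range(size_t nblocks, size_t rank, size_t nranks,
                        size_t *first, size_t *last)
{
    size_t base = nblocks / nranks;
    size_t extra = nblocks % nranks;
    *first = rank * base + (rank < extra ? rank : extra);
    *last = *first + base + (rank < extra ? 1 : 0);
}

/* Dynamic balancing (within a node): idle threads pick up the next block.
 * Without OpenMP enabled the pragma is ignored and the loop runs serially. */
void process_local_blocks(size_t first, size_t last,
                          void (*work)(size_t block, void *ctx), void *ctx)
{
    #pragma omp parallel for schedule(dynamic)
    for (long b = (long)first; b < (long)last; b++)
        work((size_t)b, ctx);
}

/* Example work item: mark a block as done (each block is touched once). */
void mark_block_done(size_t block, void *ctx)
{
    ((unsigned char *)ctx)[block] = 1;
}
```

A dynamic schedule suits the inner level because block sizes, and therefore per-block cost, can differ widely, while the static outer split needs no communication between nodes.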
Total tensor data size is over 2 TB; times are in seconds, with speedup relative to one node in parentheses.
– Prof. Anna Krylov, USC
– Dr. Evgeny Epifanovsky, Q-Chem