Department of Materials
Accelerated Sparse Matrix Multiplication for Quantum Chemistry with CP2K
on Hybrid Supercomputers
Ole Schütt
ole.schuett@mat.ethz.ch
Nanoscale Simulations
SCF Iteration
Guess initial density ρ
Calculate matrix H from ρ
    Costs: O(N), but dominates for small systems
Dense linear algebra: calculate eigenvectors ψ_i of H, then calculate new density ρ = Σ_i |ψ_i|²
    Costs: O(N³)
Sparse linear algebra: calculate ρ directly as matrix function of H
    Costs: O(N)
Calculate energy from ρ
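
To make the "calculate ρ directly as matrix function of H" step concrete, here is a minimal sketch using McWeeny purification, which needs only matrix multiplications. This choice of scheme is an assumption made for illustration and is not stated on the slide; the function names (`matmul`, `initial_guess`, `purify`) and the chemical potential parameter `mu` are hypothetical. Dense lists are used for brevity, whereas the linear-scaling cost relies on the matrices being sparse.

```python
# Illustrative sketch only: McWeeny purification is ONE way to obtain the
# density matrix as a matrix function of H. It is an assumption here and
# not necessarily the scheme used in CP2K.

def matmul(a, b):
    """Dense product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def initial_guess(h, mu):
    """Map H linearly so that all eigenvalues land in [0, 1].

    Eigenvalues of H below mu (occupied states) end up above 0.5.
    Assumes H - mu*I is not the zero matrix.
    """
    n = len(h)
    shifted = [[h[i][j] - (mu if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    # Gershgorin bound: no eigenvalue exceeds the largest absolute row sum.
    bound = max(sum(abs(x) for x in row) for row in shifted)
    return [[(0.5 if i == j else 0.0) - shifted[i][j] / (2.0 * bound)
             for j in range(n)] for i in range(n)]


def purify(h, mu, steps=30):
    """Iterate P <- 3 P^2 - 2 P^3, which drives eigenvalues to 0 or 1."""
    p = initial_guess(h, mu)
    n = len(p)
    for _ in range(steps):
        p2 = matmul(p, p)
        p3 = matmul(p2, p)
        p = [[3.0 * p2[i][j] - 2.0 * p3[i][j] for j in range(n)]
             for i in range(n)]
    return p
```

Each iteration costs two matrix multiplications, which is why the speed of sparse matrix multiplication determines the cost of this kind of approach.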
[Figure: Wall time [min] versus number of atoms, comparing diagonalization with linear scaling. DFT on 46,656 cores; DFTB on 9,216 cores.]
[Figure: water molecules (H-O-H)]
Cluster Node GPU
MPI Parallelization
Cache Optimization
Stack generation
CPU/GPU Load balancing
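
As a rough illustration of what "stack generation" can mean for a block-sparse product C = A · B, the sketch below first lists every small block product that contributes to C and then processes that list in one pass. The data layout (a dict from block coordinates to small dense blocks) and the function names `generate_stack` and `process_stack` are assumptions made for this example, not the data structures used in CP2K.

```python
# Hypothetical layout: a block-sparse matrix is a dict that maps
# (block_row, block_col) -> small dense block stored as a list of rows.

def generate_stack(a_blocks, b_blocks):
    """List every small product A[i,k] * B[k,j] that contributes to C[i,j]."""
    b_cols_by_row = {}
    for (k, j) in b_blocks:
        b_cols_by_row.setdefault(k, []).append(j)
    stack = []
    for (i, k) in a_blocks:
        for j in b_cols_by_row.get(k, []):
            stack.append(((i, k), (k, j), (i, j)))
    return stack


def process_stack(stack, a_blocks, b_blocks, c_blocks):
    """Execute all small block products in the stack, accumulating into C."""
    for a_key, b_key, c_key in stack:
        a, b = a_blocks[a_key], b_blocks[b_key]
        rows, inner, cols = len(a), len(b), len(b[0])
        # Create the target block on first use, filled with zeros.
        c = c_blocks.setdefault(c_key, [[0.0] * cols for _ in range(rows)])
        for r in range(rows):
            for s in range(cols):
                c[r][s] += sum(a[r][t] * b[t][s] for t in range(inner))
    return c_blocks
```

Separating the two phases means the list of products can be built on the CPU and the batch of small multiplications can then be handed to whichever device processes it.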
[Figure: timeline (time axis) with rows Host Buffer 1, Device Buffer 1, Host Buffer 2, Device Buffer 2. Repeated stages: MPI send / MPI receive, generate stacks, host to device, process stacks.]
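
The two host and two device buffers in the timeline suggest a double-buffering scheme in which the preparation of the next batch overlaps with the processing of the current one. The sketch below shows that idea with a single background worker thread. It is only an analogy under stated assumptions: `run_pipeline`, `prepare`, and `process` are hypothetical names, `prepare` stands in for the communication, stack generation, and host-to-device stages, and `process` stands in for processing the stacks.

```python
from concurrent.futures import ThreadPoolExecutor


def run_pipeline(num_steps, prepare, process):
    """Overlap prepare(step + 1) with process(step) using two buffers.

    prepare(step) returns the data for one step; process(data) consumes it.
    Both callables are placeholders supplied by the caller.
    """
    if num_steps == 0:
        return []
    buffers = [None, None]  # the two alternating buffers
    results = []
    with ThreadPoolExecutor(max_workers=1) as worker:
        pending = worker.submit(prepare, 0)
        for step in range(num_steps):
            current = step % 2
            buffers[current] = pending.result()  # wait until this buffer is filled
            if step + 1 < num_steps:
                # Start filling the other buffer while this one is processed.
                pending = worker.submit(prepare, step + 1)
            results.append(process(buffers[current]))
    return results
```

For example, `run_pipeline(4, lambda s: [s, s + 1], sum)` prepares a small list per step in the background and sums each one in the main thread.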
[Figure: Performance [GFlop/s] versus parameter set number]
[Figure: Performance [GFlop/s] versus number of cores (2 to 12)]
[Figure: Total performance [TFlop/s] versus number of nodes (200 to 1200)]