Advancing first-principle symmetry-guided nuclear modeling for studies of nucleosynthesis and fundamental symmetries in nature
NCSA Blue Waters Symposium for Petascale Science and Beyond, 2019
Residual strong force between quarks → highly complex two-, three- and four-body forces
Solve the Schrödinger equation for a system of interacting nucleons
[Figure: energy-level diagram with states labeled 1+, 2+, 0+, 4+]
Nuclear Hamiltonian (operator of energy): the Lanczos algorithm is used to obtain its eigenvalues and eigenvectors
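As an illustration of the Lanczos step named above, here is a minimal sketch in C++. It is not taken from LSU3shell: the names `Mat`, `Vec`, `Tridiagonal` and `lanczos` are hypothetical, the matrix is stored densely, and no reorthogonalization is done, so it is only suitable for tiny examples. It reduces a real symmetric matrix H to a small tridiagonal matrix whose eigenvalues approximate the extreme eigenvalues of H.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // dense symmetric matrix, for illustration only

// Coefficients of the Lanczos tridiagonal matrix T.
struct Tridiagonal {
    Vec alpha;  // diagonal entries
    Vec beta;   // off-diagonal entries
};

static double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

static Vec matvec(const Mat& H, const Vec& v) {
    Vec w(v.size(), 0.0);
    for (std::size_t i = 0; i < H.size(); ++i)
        for (std::size_t j = 0; j < v.size(); ++j) w[i] += H[i][j] * v[j];
    return w;
}

// Plain three-term Lanczos recurrence without reorthogonalization.
// Runs at most `steps` iterations, starting from the vector v.
Tridiagonal lanczos(const Mat& H, Vec v, std::size_t steps) {
    assert(!H.empty() && H.size() == v.size());
    const double norm = std::sqrt(dot(v, v));
    assert(norm > 0.0);
    for (double& x : v) x /= norm;  // normalize the starting vector

    Tridiagonal T;
    Vec v_prev(v.size(), 0.0);
    double beta = 0.0;
    for (std::size_t k = 0; k < steps; ++k) {
        Vec w = matvec(H, v);
        const double alpha = dot(w, v);
        // Remove the components along the current and previous Lanczos vectors.
        for (std::size_t i = 0; i < w.size(); ++i)
            w[i] -= alpha * v[i] + beta * v_prev[i];
        T.alpha.push_back(alpha);
        beta = std::sqrt(dot(w, w));
        if (k + 1 == steps || beta < 1e-12) break;  // done, or invariant subspace found
        T.beta.push_back(beta);
        v_prev = v;
        for (std::size_t i = 0; i < w.size(); ++i) v[i] = w[i] / beta;
    }
    return T;
}
```

Calling `lanczos(H, v0, m)` with m much smaller than the matrix dimension gives an m-by-m tridiagonal problem that a standard dense solver can diagonalize. A production code would store H in a sparse, distributed form and guard against loss of orthogonality.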
Limits application of ab initio studies to the lightest nuclei
Use partial symmetries of nuclear collective motion to adopt smaller, physically relevant model spaces
Large aggregate memory and large memory per node (64 GB)
High peak memory bandwidth (102.4 GB/s)
[courtesy of Pieter Maris]
Number of harmonic oscillator excitations; total proton, total neutron and total intrinsic spins; deformation; rotation
Computational effort: 90% computing matrix elements, 10% solving the eigenvalue problem
C++/Fortran code parallelized using hybrid MPI/OpenMP
Open source: https://sourceforge.net/p/lsu3shell/home/Home/
Original density structure
15 processes, 378 processes, 37,950 processes
[Figure: histogram panels of percentage contributions, with bars labeled by quantum-number pairs such as (0 0), (1 1), (2 0), (4 0), up to (12 0); panel scales range from 0 to 0.11% up to 0 to 60%]
Legend (Sp, Sn, S): Sp=1/2 Sn=3/2 S=2; Sp=3/2 Sn=1/2 S=2; Sp=3/2 Sn=3/2 S=3; Sp=1/2 Sn=1/2 S=1; remaining
60.77% 18.82% 11.63% 5.37% 2.28% 0.85% 0.27%
Dytrych, Launey, Draayer, et al., PRL 111 (2013) 252501
Low spin, large deformation
Probability to find cluster structure
Nuclear reaction:
Astrophysical simulation
Nuclear response to an external probe (photon, neutrino, etc.)
New approach: SA-NCSM + Lorentz Integral Transform Method
Response functions: input for neutrino experiments
Nuclear input: 2nd largest source of uncertainties
SA-NCSM + LIT: preliminary results
Component of neutrino detectors
Dynamic allocation: slow and dependent on the malloc implementation.
tcmalloc (Google), jemalloc (Facebook), tbbmalloc (Intel), litemalloc, LLAlloc, SuperMalloc
Matrix construction involves a lot of concurrent small allocations
tcmalloc: best performance and decreased memory footprint
Resulting speedup
[Figure: speedup plot for 20Ne J=0, 20Ne J=2 and 16O J=0, with "legacy code" among the labels; speedup axis from 0.2 to 2]
Memory pooling
Allocating a large number of small objects of constant size is inefficient
Solution: memory pooling
Boost.Pool: best performance
Small buffer optimizations
A small static buffer for a small number of elements, and dynamic memory over the specified threshold.
For more results see Martin Kocicka's MSc thesis: https://dspace.cvut.cz/handle/10467/80473
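The memory-pooling idea can be sketched as follows. This is a hypothetical, single-threaded illustration written for this transcript, not the Boost.Pool implementation and not code from LSU3shell; the class name `FixedPool` and its parameters are assumptions. It carves fixed-size slots out of large chunks and recycles freed slots through a free list, so that creating many small objects of one type does not hit the general-purpose allocator each time.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>
#include <vector>

// Hypothetical fixed-size object pool (not thread-safe).
// Objects still alive when the pool is destroyed are NOT destructed;
// the caller must destroy() them first.
template <typename T, std::size_t SlotsPerChunk = 1024>
class FixedPool {
    union Slot {
        Slot* next;                                   // link while the slot is free
        alignas(T) unsigned char storage[sizeof(T)];  // payload while in use
    };

    std::vector<std::unique_ptr<Slot[]>> chunks_;  // owns all chunk memory
    Slot* free_ = nullptr;                         // head of the free list

    // One large allocation provides SlotsPerChunk slots at once.
    void grow() {
        chunks_.push_back(std::make_unique<Slot[]>(SlotsPerChunk));
        Slot* chunk = chunks_.back().get();
        for (std::size_t i = 0; i < SlotsPerChunk; ++i) {
            chunk[i].next = free_;
            free_ = &chunk[i];
        }
    }

public:
    // Construct a T in a free slot, growing the pool only when it is empty.
    template <typename... Args>
    T* create(Args&&... args) {
        if (free_ == nullptr) grow();
        Slot* s = free_;
        free_ = s->next;
        return new (s->storage) T(std::forward<Args>(args)...);
    }

    // Destroy the object and push its slot back onto the free list.
    void destroy(T* p) {
        assert(p != nullptr);
        p->~T();
        Slot* s = reinterpret_cast<Slot*>(p);
        s->next = free_;
        free_ = s;
    }

    std::size_t chunks() const { return chunks_.size(); }
};
```

For example, `FixedPool<double, 4> pool;` followed by `pool.create(1.5)` serves the first four objects from a single chunk allocation. In a hybrid MPI/OpenMP code each thread would need its own pool, or the pool would need locking.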
Description of 99.9% of the mass of the Universe
Ultimate source of energy in the Universe
Aggregate memory and high memory bandwidth
Many papers in top journals, reaching beyond what competing theories could accomplish
Excellent support and guidance as needed
Training students in using HPC resources
Codes and results publicly available
https://sourceforge.net/p/lsu3shell/home/Home/