Porting the Plasma Simulation PIConGPU to Heterogeneous Architectures with Alpaka


  1. Porting the Plasma Simulation PIConGPU to Heterogeneous Architectures with Alpaka. René Widera (1), Erik Zenker (1,2), Guido Juckeland (1), Benjamin Worpitz (1,2), Axel Huebl (1,2), Andreas Knüpfer (2), Wolfgang E. Nagel (2), Michael Bussmann (1). (1) Helmholtz-Zentrum Dresden – Rossendorf, (2) Technische Universität Dresden. www.hzdr.de

  2. PIConGPU. Electron Acceleration with Lasers: Compact X-Ray sources. Ion Acceleration with Lasers: Tumor Therapy. Plasma Instabilities: Astrophysics.
     Member of the Helmholtz Association. René Widera, Erik Zenker, Guido Juckeland · Computational Radiation Physics · www.hzdr.de/crp · { r.widera, e.zenker, g.juckeland }@hzdr.de

  3. Domain Decomposition: Field and Particle Domain. [Figure: positively (+) and negatively (─) charged particles distributed over the simulation domain.]

  4. Domain Decomposition: Field and Particle Domain. Moving particles create fields. Fields act back on particles. Particles change cells.
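
The three interactions named on slide 4 can be illustrated with one time step of a toy 1D particle-in-cell loop. This is a minimal sketch and not PIConGPU's algorithm: the nearest-cell field lookup, the simple current deposit, the periodic domain, and all names (`Particle`, `cell_of`, `pic_step`) are assumptions made for the example.

```python
from dataclasses import dataclass
import math


@dataclass
class Particle:
    x: float        # position
    v: float        # velocity
    q: float        # charge (+1 or -1 in this toy model)
    m: float = 1.0  # mass


def cell_of(x, dx, n_cells):
    """Index of the cell containing position x on a periodic domain."""
    return int(math.floor(x / dx)) % n_cells


def pic_step(particles, e_field, dx, dt):
    """Advance all particles by one step; return (current, cell changes)."""
    n_cells = len(e_field)
    length = n_cells * dx
    current = [0.0] * n_cells
    changed = 0
    for p in particles:
        old_cell = cell_of(p.x, dx, n_cells)
        # Fields act back on particles: accelerate with the local field.
        p.v += (p.q / p.m) * e_field[old_cell] * dt
        # Move the particle and wrap it around the periodic domain.
        p.x = (p.x + p.v * dt) % length
        new_cell = cell_of(p.x, dx, n_cells)
        # Particles change cells.
        if new_cell != old_cell:
            changed += 1
        # Moving particles create fields: deposit q*v as a field source.
        current[new_cell] += p.q * p.v
    return current, changed
```

In a full simulation the deposited current would feed a field solver before the next step; that part is omitted here.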

  5. Creating Vectorized Data Structures for Particles and Fields. [Figure: field domain with cells 1 to 4 next to the particle domain with particles (+).]

  6. Creating Vectorized Data Structures for Particles and Fields. [Figure: as on slide 5, with each particle labeled by its cell (Cell 1, Cell 1, Cell 2, Cell 4).]

  7. Creating Vectorized Data Structures for Particles and Fields. Field domain: chunked in supercells, line-wise aligned.

  8. Creating Vectorized Data Structures for Particles and Fields. Field domain: chunked in supercells, line-wise aligned. Particle domain: fixed-size frames, struct of aligned arrays.
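
A minimal sketch of the particle layout these slides name: fixed-size frames stored as a struct of arrays and grouped per supercell. The frame size of 4, the attribute set (`x`, `px`, `cell`), and the class names are hypothetical and chosen only for illustration; the real data structures are not shown on the slides.

```python
FRAME_SIZE = 4  # hypothetical number of particle slots per frame


class Frame:
    """Fixed-size struct of arrays: one array per particle attribute."""

    def __init__(self):
        self.x = [0.0] * FRAME_SIZE    # positions
        self.px = [0.0] * FRAME_SIZE   # momenta
        self.cell = [0] * FRAME_SIZE   # cell index inside the supercell
        self.used = 0                  # number of filled slots


class Supercell:
    """Holds the particles of a block of cells as a list of frames."""

    def __init__(self):
        self.frames = []

    def add(self, x, px, cell):
        # Open a new frame only when the last one is full, so that
        # at most one frame per supercell is partially filled.
        if not self.frames or self.frames[-1].used == FRAME_SIZE:
            self.frames.append(Frame())
        frame = self.frames[-1]
        slot = frame.used
        frame.x[slot], frame.px[slot], frame.cell[slot] = x, px, cell
        frame.used += 1

    def count(self):
        """Total number of particles stored in this supercell."""
        return sum(f.used for f in self.frames)
```

Because each attribute lives in its own contiguous array, a loop over one attribute touches neighboring memory locations, which is the property that vectorized or coalesced access relies on.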

  10. Algorithm Driven Cache Strategy. [Figure: cells 1 to 4 and their particles (Cell 1, Cell 1, Cell 2, Cell 4).]

  11. Algorithm Driven Cache Strategy. [Figure: as on slide 10, located in Global Memory.]

  12. Algorithm Driven Cache Strategy. [Figure: as on slide 10, with Global Memory and Shared Memory.]

  13. Algorithm Driven Cache Strategy. [Figure: as on slide 10, with Shared Memory and Global Memory.]
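
The slides only show the two memory levels, so the following is a hedged sketch of one common way such a cache strategy works: the field values of one supercell are copied once from global memory into a small local buffer (standing in for shared memory), and every particle of that supercell is then served from the copy. `CountingArray`, `gather_direct`, and `gather_staged` are hypothetical names, and the read counter exists only to make the saving visible.

```python
class CountingArray:
    """List wrapper that counts element reads; stands in for global memory."""

    def __init__(self, values):
        self._values = list(values)
        self.reads = 0

    def __getitem__(self, i):
        self.reads += 1
        return self._values[i]


def gather_direct(global_field, particle_cells):
    """Every particle reads its field value straight from global memory."""
    return [global_field[c] for c in particle_cells]


def gather_staged(global_field, first_cell, n_cells, particle_cells):
    """Stage the supercell's field values once, then read from the copy."""
    # One global read per cell of the supercell.
    shared = [global_field[first_cell + i] for i in range(n_cells)]
    # All particle lookups hit the local buffer instead of global memory.
    return [shared[c - first_cell] for c in particle_cells]
```

With ten particles in a four-cell supercell, the direct version issues ten global reads and the staged version four; the gap grows with the number of particles per cell.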

  14. High Utilization of Threads. [Figure: cells 1 to 4 and their particles in Shared Memory and Global Memory, processed by a THREAD BLOCK with THREAD 1 to THREAD 4.]
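
One way to quantify thread utilization, under the assumption (not stated on the slide) that particles are processed frame by frame with one thread per frame slot and that only the last frame of a supercell may be partially filled. The function name and the numbers in the usage note are hypothetical.

```python
def thread_utilization(n_particles, block_size):
    """Fraction of thread slots that process a real particle.

    Assumes one thread per frame slot and frames of block_size slots,
    with only the last frame partially filled.
    """
    if n_particles == 0:
        return 0.0
    n_frames = -(-n_particles // block_size)  # ceiling division
    return n_particles / (n_frames * block_size)
```

For example, 1000 particles handled by blocks of 256 threads need four frames, so 1000 of 1024 thread slots do useful work.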

  15. Task-Parallel Execution of Kernels + Asynchronous Communication
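
The dependency pattern behind overlapping computation with communication can be sketched as follows. This is an illustration with Python's standard thread pool, not the mechanism PIConGPU uses: `exchange_halo` is a stand-in for a real neighbor exchange, and the three-point average is a placeholder kernel. In CPython the two tasks do not truly run in parallel; the point is the ordering of submit, independent work, and wait.

```python
from concurrent.futures import ThreadPoolExecutor


def exchange_halo(left_neighbor_edge, right_neighbor_edge):
    """Stand-in for communication: returns the halo values from neighbors."""
    return left_neighbor_edge, right_neighbor_edge


def smooth_interior(cells):
    """Three-point average for the cells that need no halo data."""
    return [(cells[i - 1] + cells[i] + cells[i + 1]) / 3.0
            for i in range(1, len(cells) - 1)]


def step(cells, left_neighbor_edge, right_neighbor_edge):
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Start the communication task without waiting for it.
        halo = pool.submit(exchange_halo, left_neighbor_edge, right_neighbor_edge)
        # The interior does not depend on the halo, so compute it meanwhile.
        interior = smooth_interior(cells)
        # Border cells depend on the halo, so wait for it only now.
        left, right = halo.result()
    first = (left + cells[0] + cells[1]) / 3.0
    last = (cells[-2] + cells[-1] + right) / 3.0
    return [first] + interior + [last]
```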

  16. PIConGPU: Scales up to 16,384 GPUs. [Figure: strong scaling, speedup vs. number of GPUs, both axes logarithmic from 1 to 10000; series: ideal, 1 to 32, 8 to 256, 64 to 2048, 512 to 16384, 4096 to 16384.]

  17. PIConGPU: Scales up to 16,384 GPUs. [Figure: strong scaling plot as on slide 16, plus weak scaling efficiency, efficiency [%] from 80 to 105 vs. number of GPUs from 1 to 10000; series: ideal, PIConGPU.]

  18. PIConGPU: Scales up to 16,384 GPUs. [Figure: as on slide 17.] Annotation: Efficiency >95%.

  19. PIConGPU: Scales up to 16,384 GPUs. [Figure: as on slide 17.] Annotations: Efficiency >95%; 6.9 PFlop/s (SP, single precision).
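
For readers unfamiliar with the plotted quantities, the standard definitions of strong-scaling speedup and of strong- and weak-scaling efficiency are sketched below. The function names are hypothetical, and the slides do not state how the reported values were computed, so treat this as the textbook form only.

```python
def strong_scaling_speedup(t_base, t_n):
    """Speedup for a fixed total problem size: baseline time / time on n GPUs."""
    return t_base / t_n


def strong_scaling_efficiency(t_base, n_base, t_n, n):
    """Achieved speedup divided by the ideal speedup n / n_base."""
    return strong_scaling_speedup(t_base, t_n) / (n / n_base)


def weak_scaling_efficiency(t_base, t_n):
    """Fixed problem size per GPU: ideal run time stays constant."""
    return t_base / t_n
```

With made-up timings (not measurements from the talk), a run that takes 100 s on 1 GPU and 12.5 s on 16 GPUs has a speedup of 8 and a strong-scaling efficiency of 0.5.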

  20. More Physics, More Computations, More Power! [Figure: particles (+, ─) carrying atom states $s_1$ and $s_2$.]

  21. More Physics, More Computations, More Power! Old atom state of $s_1$: $(s_{1,1}, s_{1,2}, s_{1,3}, \dots, s_{n,m})^{T}$.

  22. More Physics, More Computations, More Power! Atom-physical effects, a matrix with entries $t_{1,1}, \dots, t_{n,n}$, applied to the old atom state.

  23. More Physics, More Computations, More Power! Atom-physical effects applied to the old atom state give the new atom state:

      $$
      \underbrace{\begin{pmatrix}
      t_{1,1} & t_{1,2} & t_{1,3} & \cdots & t_{1,n} \\
      t_{2,1} & \ddots  &         &        & \vdots  \\
      t_{3,1} &         & \ddots  &        & \vdots  \\
      \vdots  &         &         & \ddots & \vdots  \\
      t_{n,1} & \cdots  & \cdots  & \cdots & t_{n,n}
      \end{pmatrix}}_{\text{atom-physical effects}}
      \underbrace{\begin{pmatrix} s_{1,1} \\ s_{1,2} \\ s_{1,3} \\ \vdots \\ s_{n,m} \end{pmatrix}}_{\text{old atom state}}
      =
      \underbrace{\begin{pmatrix} s_{1,1} \\ s_{1,2} \\ s_{1,3} \\ \vdots \\ s_{n,m} \end{pmatrix}}_{\text{new atom state}}
      $$

  24. More Physics, More Computations, More Power! [Figure and equation as on slide 23.] Really Big Data Task:
      ◾ Random access on big amounts of data > 100 GB
      ◾ Good job for powerful CPUs
      ◾ Efficient CPU/GPU cooperation
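
The state update on these slides is a matrix-vector product, which can be sketched in a few lines. The function name, the 2x2 matrix in the usage note, and the interpretation of the entries as transition weights are assumptions for illustration; the real tables are far larger, which is where the random access on more than 100 GB comes from.

```python
def apply_transitions(t, s_old):
    """Return s_new with s_new[i] = sum over j of t[i][j] * s_old[j]."""
    if any(len(row) != len(s_old) for row in t):
        raise ValueError("each matrix row must match the state vector length")
    return [sum(t_ij * s_j for t_ij, s_j in zip(row, s_old)) for row in t]
```

For example, with the hypothetical matrix `[[0.5, 0.0], [0.5, 1.0]]` and old state `[4.0, 2.0]`, half of the first state's population moves to the second state, giving `[2.0, 4.0]`.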
