Finite Element Multigrid Solvers for PDE Problems on GPUs and GPU Clusters

Robert Strzodka (Integrative Scientific Computing, Max-Planck-Institut Informatik) · Dominik Göddeke (Institute for Applied Mathematics, Technical University of Dortmund)


SLIDE 1

Finite Element Multigrid Solvers for PDE Problems

  • on GPUs and GPU Clusters

Robert Strzodka, Integrative Scientific Computing, Max-Planck-Institut Informatik, www.mpi-inf.mpg.de/~strzodka
Dominik Göddeke, Institute for Applied Mathematics, Technical University of Dortmund, www.mathematik.tu-dortmund.de/~goeddeke

SLIDE 2
  • Structure of Double Lecture (2 × 90 min)

PART 1: Parallelism · Grid Discretization · Multigrid & Smoothers · Mixed-Precision · Data Layout
PART 2: FEM on GPU Clusters · New MPI Application (Rewrite) · Legacy MPI Code (Accelerate)

SLIDE 3
  • Overview

  • Levels of Parallelism
  • Grid Discretizations of PDEs
  • Multigrid and Strong Smoothers
  • Mixed Precision Iterative Refinement
  • Layout of Multi-valued Data

SLIDE 4
  • Parallelism

Sequential execution is an illusory software concept:

All transistors always do something in parallel!

Billions of transistors in modern CPUs (>0.5B) and GPUs (>2B).

Old: implicit parallelism with caches, ILP, speculation

  • diminishing returns, power constraints

New: explicit parallelism on multiple levels

  • much more efficient & natural
SLIDE 5
  • SIMD Parallelism

It is impossible to execute just one instruction (add, nop, nop, nop, …): the hardware always issues a whole SIMD group.

Penalty for ignoring SIMD: scalar code occupies one lane of each SIMD unit and leaves the rest idle, as sketched below.

SLIDE 6
  • Many-Core Parallelism

[Block diagram: Host and Input Assembler feed the Execution Manager; many cores, each with its own cache and local memory, reach Global Memory through load/store units]

Penalty for ignoring many-cores: code confined to a single core leaves all the other cores idle, as sketched below.

SLIDE 7
  • Intra-Node Parallelism (multiple CPUs/GPUs per PC)

Penalty for ignoring intra-node parallelism: only one of the node's processors contributes to the computation.

SLIDE 8
  • Inter-Node Parallelism in a Cluster

Penalty for ignoring inter-node parallelism: the application remains limited to a single node of the cluster.

SLIDE 9
  • Bandwidth in a CPU-GPU System

[Diagram: bandwidths in a CPU-GPU system — 2 TB/s and 200 GB/s on the GPU (on-chip and device memory), 40 GB/s and 20 GB/s on the CPU side, 4 GB/s and 2 GB/s across the PCIe and network links]

SLIDE 10
  • Overview

  • Levels of Parallelism
  • Grid Discretizations of PDEs
  • Multigrid and Strong Smoothers
  • Mixed Precision Iterative Refinement
  • Layout of Multi-valued Data

SLIDE 11

  • Generalized Poisson Problem

−∇·(σ∇u) = f in Ω, with values of u or of the flux σ ∂u/∂ν prescribed on the boundary

SLIDE 12
  • Discretization Grids

[Figure: two structured grid types whose node locations and neighborhoods follow from the grid structure, giving direct access]

SLIDE 13
  • Discretization Grids

[Figure: two further grid types whose node locations must be stored explicitly]

SLIDE 14
  • nD Arrays

[Tensor-product grids stored as nD arrays: direct access, simple index arithmetic, precious data locality]

SLIDE 15
  • Deformation Adaptivity

This grid is a tensor-product! It is easier to accelerate in hardware than resolution-adaptive grids. The anisotropy level determines the optimal solver.
SLIDE 16
  • nD Arrays

[Unstructured data in nD arrays: explicit neighbor storage, local access with some data locality]

SLIDE 17
  • Hash

[Hash-based grid storage: arbitrary global access via perfect hashes]

SLIDE 18
  • Tree

[Tree-based grid storage: global access, supports dynamic changes]

SLIDE 19
  • Structured and Unstructured Sparse MatVec

[reference on sparse MatVec, 2009]
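A sketch of a sparse matrix-vector kernel in the ELL format that the FE-gMG slides use later (the layout and padding convention here are my assumptions): each row holds at most K entries, stored column-major so that consecutive threads read consecutive memory.

```cuda
// ELLPACK SpMV: y = A*x for a matrix with at most K nonzeros per row.
// col/val are n-by-K arrays stored column-major; col < 0 marks padding.
__global__ void spmv_ell(int n, int K, const int* col, const float* val,
                         const float* x, float* y) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= n) return;
    float sum = 0.0f;
    for (int k = 0; k < K; ++k) {
        int c = col[k * n + row];          // coalesced: thread 'row' reads
        if (c >= 0)                        // element 'row' of column slab k
            sum += val[k * n + row] * x[c];
    }
    y[row] = sum;
}
```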

SLIDE 20
  • Overview

  • Levels of Parallelism
  • Grid Discretizations of PDEs
  • Multigrid and Strong Smoothers
  • Mixed Precision Iterative Refinement
  • Layout of Multi-valued Data

SLIDE 21

  • Generalized Poisson Problem

−∇·(σ∇u) = f in Ω, with values of u or of the flux σ ∂u/∂ν prescribed on the boundary

SLIDE 22
  • Discretization Approach

[The weak form of the PDE is discretized with finite elements, yielding a sparse linear system Ax = b]

SLIDE 23
  • Geometric Multigrid Method

Smoothing damps the high-frequency error on the fine grid; the remaining smooth error is corrected on the coarse grid:

d_k = b − A x_k   (defect, fine grid)
A c_k = d_k   (correction, coarse grid)
x_{k+1} = x_k + c_k   (update, fine grid)
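A host-side sketch of the cycle above (the names Level, smooth, defect, restrict_defect, coarse_solve and prolong_add are hypothetical helpers, not the authors' API):

```cuda
struct Level { int n; float *b, *x, *d; };       // one grid level (assumed)
void smooth(Level&);                             // a few smoother sweeps
void defect(Level&);                             // d = b - A x
void restrict_defect(const Level& fine, Level& coarse); // coarse.b = R * fine.d
void coarse_solve(Level&);                       // recurse or solve exactly
void prolong_add(const Level& coarse, Level& fine);     // fine.x += P * coarse.x

// One two-grid correction step; applied recursively it becomes a V-cycle.
void twogrid(Level& fine, Level& coarse) {
    smooth(fine);                  // pre-smoothing damps high frequencies
    defect(fine);                  // d_k = b - A x_k
    restrict_defect(fine, coarse);
    coarse_solve(coarse);          // A c_k = d_k on the coarse grid
    prolong_add(coarse, fine);     // x_{k+1} = x_k + c_k
    smooth(fine);                  // post-smoothing
}
```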

SLIDE 24
  • Multigrid Transfers: Restriction

Restriction gathers fine-grid values into the coarse array: coarse entry i reads fine neighbors 2i−1, 2i, 2i+1, with indices adjusted at the boundary of the output region.
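A 1D restriction kernel matching the index pattern above (the full-weighting stencil and a fine grid of 2·nc+1 points are my assumptions):

```cuda
// Full weighting: coarse[i] = 0.25*fine[2i-1] + 0.5*fine[2i] + 0.25*fine[2i+1].
__global__ void restrict1d(int nc, const float* fine, float* coarse) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nc) return;
    int f = 2 * i;
    float left = (f > 0) ? fine[f - 1] : 0.0f;   // adjust index at boundary
    coarse[i] = 0.25f * left + 0.5f * fine[f] + 0.25f * fine[f + 1];
}
```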

SLIDE 25
  • Multigrid Transfers: Prolongation

[Prolongation interpolates coarse values back to the fine grid: even fine points copy their coarse parent, odd points average the two neighboring parents]
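The matching 1D prolongation sketch (linear interpolation; again my assumption rather than the slide's exact stencil):

```cuda
// Even fine points copy their coarse parent; odd points average two parents.
__global__ void prolong1d(int nc, const float* coarse, float* fine) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nc) return;
    fine[2 * i] = coarse[i];
    if (i + 1 < nc)
        fine[2 * i + 1] = 0.5f * (coarse[i] + coarse[i + 1]);
}
```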

SLIDE 26
  • Preconditioners

Preconditioned defect correction: x_{k+1} = x_k + ω C⁻¹ (b − A x_k)

The preconditioner C collects parts of A:

  • Jacobi: the diagonal D
  • Gauss-Seidel: D plus the lower-triangular part
  • ADI-TRIDI: the tridiagonal part, coupling each grid line implicitly
  • ADI-TRIGS: the tridiagonal part plus the lower line-to-line coupling, i.e. lines swept in Gauss-Seidel fashion
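Not the authors' code, but a minimal sketch of the simplest case above: one damped-Jacobi defect-correction step for the 5-point Poisson stencil (grid layout and boundary handling are illustrative assumptions).

```cuda
// x_new = x + omega * D^{-1} (b - A x); for the 5-point stencil D = 4*I.
__global__ void jacobi_step(int nx, int ny, float omega,
                            const float* b, const float* x, float* xnew) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i < 1 || j < 1 || i >= nx - 1 || j >= ny - 1) return; // fixed boundary
    int id = j * nx + i;
    float Ax = 4.0f * x[id] - x[id-1] - x[id+1] - x[id-nx] - x[id+nx];
    xnew[id] = x[id] + omega * (b[id] - Ax) * 0.25f;
}
```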

SLIDE 27
  • CPU Numerical and Runtime Efficiency

[Chart: numerical and runtime efficiency of the smoothers on the CPU]

SLIDE 28
  • Gauss-Seidel Preconditioner

x_{k+1} = x_k + ω C⁻¹ (b − A x_k), with C_GS = D + L, the diagonal plus the lower-triangular part of A

SLIDE 29
  • Multi-Colored Gauss-Seidel

[Figure: grid nodes partitioned into colors; nodes of one color are mutually independent and can be updated in parallel]
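A two-color (red-black) Gauss-Seidel sketch for the 5-point stencil (my illustration of the multi-coloring idea): nodes of one color depend only on the other color, so each color sweep is fully parallel.

```cuda
__global__ void gs_color(int nx, int ny, int color, const float* b, float* x) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i < 1 || j < 1 || i >= nx - 1 || j >= ny - 1) return;
    if (((i + j) & 1) != color) return;     // update only this color's nodes
    int id = j * nx + i;
    x[id] = 0.25f * (b[id] + x[id-1] + x[id+1] + x[id-nx] + x[id+nx]);
}
// Host: one red sweep, then one black sweep per Gauss-Seidel iteration:
//   gs_color<<<grid, block>>>(nx, ny, 0, d_b, d_x);
//   gs_color<<<grid, block>>>(nx, ny, 1, d_b, d_x);
```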

SLIDE 30
  • ADI-TRIDI Preconditioner

x_{k+1} = x_k + ω C⁻¹ (b − A x_k), with C_TRIDI the tridiagonal part of A: each grid line is coupled implicitly and solved with a tridiagonal solver; alternating direction implicit (ADI) sweeps alternate between rows and columns

SLIDE 31
  • SIMD Parallelism: Cyclic Reduction

[Diagram: the tridiagonal system shrinks through strides 2, 4, 8 in the reduction phase and is expanded back through strides 8, 4, 2; O(N·log N)]
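For illustration, a parallel cyclic reduction (PCR) variant matching the O(N·log N) work noted on the slide, solving one diagonally dominant system per thread block; this is a generic sketch, not the memory-friendly scheme of the next slide.

```cuda
// Launch: pcr<<<1, n, 4*n*sizeof(float)>>>(n, a, b, c, d, x), n = blockDim.x.
__global__ void pcr(int n, const float* a, const float* b,
                    const float* c, const float* d, float* x) {
    extern __shared__ float s[];
    float *sa = s, *sb = s + n, *sc = s + 2 * n, *sd = s + 3 * n;
    int i = threadIdx.x;
    sa[i] = a[i]; sb[i] = b[i]; sc[i] = c[i]; sd[i] = d[i];
    __syncthreads();
    for (int st = 1; st < n; st *= 2) {      // distance to eliminated neighbors
        bool lo = (i - st >= 0), hi = (i + st < n);
        float k1 = lo ? sa[i] / sb[i - st] : 0.0f;  // eliminate x_{i-st}
        float k2 = hi ? sc[i] / sb[i + st] : 0.0f;  // eliminate x_{i+st}
        float na = lo ? -k1 * sa[i - st] : 0.0f;
        float nc = hi ? -k2 * sc[i + st] : 0.0f;
        float nb = sb[i] - (lo ? k1 * sc[i - st] : 0.0f)
                         - (hi ? k2 * sa[i + st] : 0.0f);
        float nd = sd[i] - (lo ? k1 * sd[i - st] : 0.0f)
                         - (hi ? k2 * sd[i + st] : 0.0f);
        __syncthreads();                     // all reads done before writing
        sa[i] = na; sb[i] = nb; sc[i] = nc; sd[i] = nd;
        __syncthreads();
    }
    x[i] = sd[i] / sb[i];                    // system is now diagonal
}
```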

SLIDE 32
  • Memory Friendly Cyclic Reduction

[Göddeke, Strzodka 2011]

SLIDE 33
  • ADI-TRIGS Preconditioner

x_{k+1} = x_k + ω C⁻¹ (b − A x_k), with C_TRIGS combining the tridiagonal part with the lower line-to-line coupling: lines are solved tridiagonally and swept in Gauss-Seidel order

SLIDE 34
  • Many-Core Parallelism

[Block diagram: Host and Input Assembler feed the Execution Manager; many cores with per-core caches and local memories reach Global Memory through load/store units]

SLIDE 35
  • CPU vs. GPU Numerical Efficiency

[Chart: numerical efficiency of the CPU and GPU smoothers]

SLIDE 36
  • CPU vs. GPU Runtime Efficiency

[Chart: runtime efficiency of the CPU and GPU smoothers]

SLIDE 37
  • Multigrid on Refined Unstructured Grid

FE-gMG: geometric multigrid on finite element grids

Order & Storage (ELL): the numbering of the unknowns and the sparse matrix storage format determine performance

SLIDE 38
  • FE-gMG Results with SPAI Preconditioner

SLIDE 39
  • Overview

  • Levels of Parallelism
  • Grid Discretizations of PDEs
  • Multigrid and Strong Smoothers
  • Mixed Precision Iterative Refinement
  • Layout of Multi-valued Data

SLIDE 40
  • Precision Comparison

[Table: numerical algorithms vs. GPU hardware precision over the years]

SLIDE 41
  • Hardware Precision

float: s23e8 (23-bit mantissa) · double: s52e11 (52-bit mantissa)

[Examples of data and roundoff error: float resolves roughly 7 decimal digits (eps ≈ 1.2·10⁻⁷), double roughly 16 (eps ≈ 2.2·10⁻¹⁶)]

SLIDE 42
  • The Erratic Roundoff Error

[Plot: the roundoff error fluctuates erratically around the precision-dependent accuracy floor]

SLIDE 43
  • Numerical Accuracy

Condition of Ax = b: input perturbations are amplified by the condition number, ‖δx‖/‖x‖ ≤ κ(A) · ‖δb‖/‖b‖

Discretization error: refining the grid reduces the error of the discretized −∇·(σ∇u) = f, but κ(A) grows at the same time, so fine grids exhaust float (s23e8, 23-bit mantissa) long before double (s52e11, 52-bit mantissa)

SLIDE 44
  • Mixed Precision Iterative Refinement

Iterative refinement for Ax = b:

d_k = b − A x_k   (compute in high precision, double s52e11)
A c_k = d_k   (solve approximately in low precision, float s23e8)
x_{k+1} = x_k + c_k   (correct in high precision)
iterate k → k+1 until convergence

The low-precision solver only has to gain a fixed number of digits per iteration; the high-precision defect and correction recover full double accuracy.
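A self-contained sketch of the loop above (my simplification: a diagonal stand-in system, so the slides' inner float multigrid cycle is replaced by an exact float solve):

```cuda
#include <cstdio>
#include <cmath>

int main() {
    const int n = 4;
    double A[n] = {4.0, 3.0, 2.0, 1.0};    // stand-in: diagonal matrix
    double b[n] = {1.0, 1.0, 1.0, 1.0};
    double x[n] = {0.0, 0.0, 0.0, 0.0};
    for (int k = 0; k < 10; ++k) {
        double d[n], res = 0.0;
        for (int i = 0; i < n; ++i) {      // d_k = b - A x_k   (double)
            d[i] = b[i] - A[i] * x[i];
            res += d[i] * d[i];
        }
        if (std::sqrt(res) < 1e-14) break; // converged to double accuracy
        for (int i = 0; i < n; ++i) {
            float c = (float)d[i] / (float)A[i]; // A c_k = d_k   (float solve)
            x[i] += (double)c;             // x_{k+1} = x_k + c_k   (double)
        }
    }
    printf("x[0] = %.16f (exact 0.25)\n", x[0]);
    return 0;
}
```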

SLIDE 45
  • Mixed Precision Multigrid on GPU

[Chart: performance of mixed precision multigrid across GPU generations]

SLIDE 46
  • Overview

  • Levels of Parallelism
  • Grid Discretizations of PDEs
  • Multigrid and Strong Smoothers
  • Mixed Precision Iterative Refinement
  • Layout of Multi-valued Data

SLIDE 47
  • Multi-Valued Data

Multi-valued data is ubiquitous:

[Examples: complex-valued data, coordinate and velocity vectors, multi-component physical quantities]

SLIDE 48
  • AoS and SoA

Array of Structs (AoS): struct S { float a, b; }; S data[N];
Struct of Arrays (SoA): struct S { float a[N], b[N]; }; S data;

SLIDE 49
  • Parallel Access in AoS and SoA

[Diagram: parallel threads access AoS with a stride (uncoalesced), SoA contiguously (coalesced)]
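A CUDA sketch of the two access patterns (my illustration): with AoS, a warp's loads are strided by the struct size; with SoA they are contiguous and coalesce.

```cuda
struct PointAoS { float x, y, z; };                 // array of structs
struct PointsSoA { float *x, *y, *z; };             // struct of arrays

// AoS: thread i touches p[i].x -> stride of 3 floats between threads.
__global__ void scale_aos(int n, float s, PointAoS* p) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) { p[i].x *= s; p[i].y *= s; p[i].z *= s; }
}

// SoA: thread i touches p.x[i] -> consecutive threads, consecutive floats.
__global__ void scale_soa(int n, float s, PointsSoA p) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) { p.x[i] *= s; p.y[i] *= s; p.z[i] *= s; }
}
```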

SLIDE 50
  • Operating on Container Elements

In-place update of single and indexed structs works naturally with AoS, e.g. s.a++ and data[i].a++.

This is not possible with standard SoA/C++ syntax: there is no addressable struct data[i] whose members can be updated through a single reference.

SLIDE 51
  • Abstraction: AoS + SoA = ASA

Array of Structs (AoS): struct S { float a, b; }; S data[N];

Array of Structs of Arrays (ASA): struct T { float a[W], b[W]; }; T data[N/W]; — small tiles of width W store each member contiguously, combining AoS semantics with SoA memory access

[Strzodka 2011]
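A minimal ASA sketch (the names Tile, Ref and element are hypothetical, not the cited chapter's interface): tiles of SIMD width keep each member contiguous, so kernels get SoA-style coalescing while code keeps struct-style element access.

```cuda
constexpr int W = 32;                         // tile width = warp size
struct Tile { float x[W]; float y[W]; };      // struct of small arrays

struct Ref {                                  // proxy object for "data[k]"
    Tile* t; int i;
    __device__ float& x() const { return t->x[i]; }
    __device__ float& y() const { return t->y[i]; }
};
__device__ Ref element(Tile* tiles, int k) {
    return Ref{ &tiles[k / W], k % W };       // tile index, slot in tile
}

__global__ void scale(int n, float s, Tile* tiles) {
    int k = blockIdx.x * blockDim.x + threadIdx.x;
    if (k >= n) return;
    Ref r = element(tiles, k);                // AoS-like syntax...
    r.x() *= s;                               // ...SoA-like coalesced access
    r.y() *= s;
}
```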

SLIDE 52
  • Overview

  • Levels of Parallelism
  • Grid Discretizations of PDEs
  • Multigrid and Strong Smoothers
  • Mixed Precision Iterative Refinement
  • Layout of Multi-valued Data

SLIDE 53

Questions?

Robert Strzodka, Integrative Scientific Computing, Max-Planck-Institut Informatik, www.mpi-inf.mpg.de/~strzodka
Dominik Göddeke, Institute for Applied Mathematics, Technical University of Dortmund, www.mathematik.tu-dortmund.de/~goeddeke