
A Scalable Multi-level Preconditioner for Matrix-Free µ-Finite Element Analysis of Human Bone Structures



  1. µFE Modeling, Mathematical Model, System solving, AMG preconditioning, Experiments, Conclusions. A Scalable Multi-level Preconditioner for Matrix-Free µ-Finite Element Analysis of Human Bone Structures. Peter Arbenz, Institute of Computational Science, ETH Zürich. Computational Methods with Applications (CMA), Harrachov, August 20–24, 2007.

  2. Coworkers. Institute of Computational Science, ETH Zürich: Uche Mennel, Marzio Sala, Cyril Flaig. Institute for Biomechanics, ETH Zürich: Harry van Lenthe, Ralph Müller, Andreas Wirth. IBM Research Division, Zürich Research Lab: Costas Bekas, Alessandro Curioni.

  3. Outline of the talk: (1) µFE modeling of trabecular bone structures; (2) The mathematical model; (3) Solving the system of equations; (4) Algebraic multilevel preconditioning; (5) Numerical experiments; (6) Conclusions.

  4. The need for µFE analysis of bones: Osteoporosis is a disease characterized by low bone mass and deterioration of bone microarchitecture. The lifetime risk of osteoporotic fractures is estimated at close to 40% in women and 13% in men. The impact on individuals, society, and health care systems is enormous (as a health care problem, second only to cardiovascular diseases). Since global parameters such as bone density do not suffice to predict the fracture risk, patients have to be treated in a more individual way. Today's approach combines 3D high-resolution CT scans of individual bones with a micro-finite element (µFE) analysis.

  5. Cortical vs. trabecular bone (figure).

  6. In vivo assessment of bone strength. pQCT: Peripheral Quantitative Computed Tomography.

  7. The mathematical model: Equations of linearized 3D elasticity (pure displacement formulation). Find the displacement field $u$ that minimizes the total potential energy
     \[ \int_\Omega \left[ \mu\, \varepsilon(u) : \varepsilon(u) + \frac{\lambda}{2} (\operatorname{div} u)^2 - f^T u \right] d\Omega \; - \int_{\Gamma_N} g_S^T u \, d\Gamma, \]
     with Lamé's constants $\lambda, \mu$, volume forces $f$, boundary tractions $g_S$, and symmetric strain tensor
     \[ \varepsilon(u) := \tfrac{1}{2} \left( \nabla u + (\nabla u)^T \right). \]
     The domain $\Omega$ is a union of voxels.
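
To make the integrand of the energy functional concrete, the following is a minimal sketch that evaluates the strain tensor and the strain energy density $\mu\,\varepsilon:\varepsilon + \frac{\lambda}{2}(\operatorname{div} u)^2$ for a constant displacement gradient. The function names and the sample values of $\lambda$ and $\mu$ are illustrative assumptions, not taken from the talk.

```python
def strain(grad_u):
    """Symmetric strain tensor eps = (grad_u + grad_u^T) / 2 for a 3x3 gradient."""
    return [[0.5 * (grad_u[i][j] + grad_u[j][i]) for j in range(3)] for i in range(3)]


def energy_density(grad_u, lam, mu):
    """Integrand mu * eps:eps + lam/2 * (div u)^2 (volume force term omitted)."""
    eps = strain(grad_u)
    double_dot = sum(eps[i][j] ** 2 for i in range(3) for j in range(3))
    div_u = sum(grad_u[i][i] for i in range(3))
    return mu * double_dot + 0.5 * lam * div_u ** 2


# An infinitesimal rotation has a skew-symmetric gradient, hence zero strain.
rotation = [[0.0, -0.01, 0.0], [0.01, 0.0, 0.0], [0.0, 0.0, 0.0]]
# A uniform dilation stretches all three directions equally.
dilation = [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.01]]

w_rotation = energy_density(rotation, lam=3.0, mu=2.0)
w_dilation = energy_density(dilation, lam=3.0, mu=2.0)
```

The rotation stores no energy, which is why rigid body modes form the near-kernel that the preconditioner later has to represent on the coarse levels.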

  8. Discretization using µFE: Each voxel has 8 nodes (vertices). Each node carries 3 degrees of freedom, the displacements in the x-, y-, and z-directions, for a total of 24 degrees of freedom per voxel. Finite element approximation: the displacements $u$ are represented by piecewise trilinear polynomials; strains and stresses are computable from the nodal displacements.
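
The trilinear approximation on one voxel can be sketched as follows. The reference voxel $[-1,1]^3$, the node ordering, and the function names are assumptions made for illustration only.

```python
from itertools import product

# Corner coordinates of the reference voxel [-1, 1]^3 (ordering chosen arbitrarily).
CORNERS = list(product((-1, 1), repeat=3))


def shape_functions(xi, eta, zeta):
    """Values of the 8 trilinear shape functions at a point of the reference voxel."""
    return [(1 + a * xi) * (1 + b * eta) * (1 + c * zeta) / 8.0
            for (a, b, c) in CORNERS]


def interpolate(nodal_u, xi, eta, zeta):
    """Displacement (ux, uy, uz) at a point from 8 nodal displacement triples."""
    weights = shape_functions(xi, eta, zeta)
    return tuple(sum(weights[i] * nodal_u[i][d] for i in range(8)) for d in range(3))


# 8 nodes times 3 displacement components per node.
DOFS_PER_VOXEL = len(CORNERS) * 3
```

Each shape function equals one at its own corner and zero at the other seven, so the nodal values are exactly the displacements at the voxel vertices.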

  9. Solving the system of equations I: In the system of equations $K x = b$, the matrix $K$ is large (actually HUGE), sparse, and symmetric positive definite. Approach of the ETH Biomechanics group: the preconditioned conjugate gradient (PCG) algorithm with element-by-element (EBE) matrix multiplication,
     \[ K = \sum_{e=1}^{n_{el}} T_e K_e T_e^T, \tag{1} \]
     (note: all element matrices are identical!) and diagonal (Jacobi) preconditioning. This is very memory economical, but convergence is slow as problems get big.
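
The combination of EBE matrix multiplication and Jacobi-preconditioned CG can be sketched on a toy problem. The sketch below assumes a 1D chain of two-node elements clamped at one end, which is a stand-in for the 3D voxel mesh; the names `ebe_matvec`, `ebe_diagonal`, and `pcg_jacobi` are hypothetical.

```python
from math import sqrt

KE = [[1.0, -1.0], [-1.0, 1.0]]  # the single element matrix shared by all elements


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def ebe_matvec(x, n_el):
    """y = K x with K = sum_e T_e K_e T_e^T, without ever storing K.
    Unknown i belongs to node i + 1; node 0 is clamped (zero displacement)."""
    y = [0.0] * len(x)
    for e in range(n_el):
        nodes = (e, e + 1)
        xe = [x[n - 1] if n > 0 else 0.0 for n in nodes]        # gather: T_e^T x
        ye = [sum(KE[r][c] * xe[c] for c in range(2)) for r in range(2)]
        for r, n in enumerate(nodes):                            # scatter-add: T_e y_e
            if n > 0:
                y[n - 1] += ye[r]
    return y


def ebe_diagonal(n_el):
    """diag(K), accumulated element by element for the Jacobi preconditioner."""
    d = [0.0] * n_el
    for e in range(n_el):
        for r, n in enumerate((e, e + 1)):
            if n > 0:
                d[n - 1] += KE[r][r]
    return d


def pcg_jacobi(b, n_el, tol=1e-10, max_iter=1000):
    """Conjugate gradients preconditioned with the inverse diagonal of K."""
    dinv = [1.0 / d for d in ebe_diagonal(n_el)]
    x = [0.0] * len(b)
    r = list(b)
    z = [di * ri for di, ri in zip(dinv, r)]
    p = list(z)
    rz = dot(r, z)
    for it in range(1, max_iter + 1):
        kp = ebe_matvec(p, n_el)
        alpha = rz / dot(p, kp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ki for ri, ki in zip(r, kp)]
        if sqrt(dot(r, r)) < tol:
            return x, it
        z = [di * ri for di, ri in zip(dinv, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


n_el = 10
load = [0.0] * (n_el - 1) + [1.0]   # unit force at the free end
u, iterations = pcg_jacobi(load, n_el)
```

Only one 2 × 2 element matrix and a handful of vectors are stored, which mirrors the memory economy noted on the slide. For this load the exact displacement of node i is i.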

  10. Solving the system of equations II: Our new approach is PCG with smoothed aggregation AMG preconditioning (it is known that this works, see Adams et al. [3]). It requires assembling $K$ and parallelization for distributed memory machines. Software employed: Trilinos (Sandia National Laboratories). In particular we use distributed (multi)vectors and (sparse) matrices (Epetra); domain decomposition (load balance) with ParMETIS; iterative solvers and preconditioners (AztecOO); the smoothed aggregation AMG preconditioner (ML); and a direct solver on the coarsest level (AMESOS).

  11. Setup procedure for an abstract multigrid solver:
      Define the number of levels, $L$
      for level $\ell = 0, \ldots, L-1$ do
        if $\ell < L-1$ then
          Define prolongator $P_\ell$;
          Define restriction $R_\ell = P_\ell^T$;
          $K_{\ell+1} = R_\ell K_\ell P_\ell$;
          Define smoother $S_\ell$;
        else
          Prepare for solving with $K_\ell$;
        end if
      end for
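
The setup loop can be sketched with small dense matrices. The pairwise prolongator and the Jacobi smoother data below are placeholder choices (assumptions for illustration); the talk uses aggregation-based prolongators and polynomial smoothers.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def pairwise_prolongator(n):
    """Hypothetical prolongator: unknowns 2j and 2j+1 map to coarse unknown j."""
    p = [[0.0] * ((n + 1) // 2) for _ in range(n)]
    for i in range(n):
        p[i][i // 2] = 1.0
    return p


def multigrid_setup(k0, num_levels):
    """Build the hierarchy K_0, K_1, ... with Galerkin products K_{l+1} = R_l K_l P_l."""
    levels = []
    k = k0
    for level in range(num_levels):
        if level < num_levels - 1:
            p = pairwise_prolongator(len(k))
            r = transpose(p)                       # restriction R_l = P_l^T
            k_next = matmul(r, matmul(k, p))       # coarse operator
            smoother = {"type": "jacobi", "diag": [k[i][i] for i in range(len(k))]}
            levels.append({"K": k, "P": p, "R": r, "S": smoother})
            k = k_next
        else:
            levels.append({"K": k})                # coarsest level: direct solve
    return levels


n = 8
laplacian = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
             for i in range(n)]
hierarchy = multigrid_setup(laplacian, num_levels=3)
```

For this 1D model matrix the Galerkin product reproduces the same tridiagonal stencil on each coarser level, with sizes 8, 4, and 2.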

  12. Smoothed aggregation (SA) AMG preconditioner I: (1) Build the adjacency graph $G_0$ of $K_0 = K$, taking the 3 × 3 block structure into account. (2) Group the graph vertices into contiguous subsets, called aggregates. Each aggregate represents a coarser grid vertex. Typical aggregates: 3 × 3 × 3 nodes (of the graph), up to 5 × 5 × 5 nodes (if aggressive coarsening is used); ParMETIS. Note: the matrices $K_1, K_2, \ldots$ need much less memory space than $K_0$! Typical operator complexity for SA: 1.4 (!!!).
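
As a rough illustration of aggregation and of operator complexity, the sketch below groups the nodes of a regular grid into 3 × 3 × 3 blocks and computes the ratio of the nonzeros stored on all levels to the nonzeros of the finest matrix. The regular grid is a simplifying assumption (bone geometries are irregular and the aggregates are built on the graph), and the nonzero counts in the example are made up.

```python
from collections import Counter


def aggregate_id(i, j, k, nx, ny, size=3):
    """Aggregate index of grid node (i, j, k) when nodes are grouped in size^3 blocks."""
    ax = -(-nx // size)   # number of aggregates in x (ceiling division)
    ay = -(-ny // size)   # number of aggregates in y
    return (i // size) + ax * ((j // size) + ay * (k // size))


def operator_complexity(nnz_per_level):
    """Nonzeros of all level matrices divided by the nonzeros of the finest matrix."""
    return sum(nnz_per_level) / nnz_per_level[0]


nx = ny = nz = 9
sizes = Counter(aggregate_id(i, j, k, nx, ny)
                for i in range(nx) for j in range(ny) for k in range(nz))

# Hypothetical nonzero counts for K_0, K_1, K_2.
complexity = operator_complexity([100, 30, 10])
```

A complexity of 1.4 means that all coarse matrices together add only 40% to the storage of $K_0$.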

  13. Smoothed aggregation (SA) AMG preconditioner II: (3) Define a grid transfer operator. The low-energy modes, in our case the rigid body modes (near-kernel), are 'chopped' according to the aggregation:
      \[ B_\ell = \begin{bmatrix} B_1^{(\ell)} \\ \vdots \\ B_{n_{\ell+1}}^{(\ell)} \end{bmatrix}, \qquad B_j^{(\ell)} = \text{rows of } B_\ell \text{ corresponding to grid points assigned to the } j\text{th aggregate}. \]
      Let $B_j^{(\ell)} = Q_j^{(\ell)} R_j^{(\ell)}$ be the QR factorization of $B_j^{(\ell)}$. Then
      \[ B_\ell = \tilde{P}_\ell B_{\ell+1}, \qquad \tilde{P}_\ell^T \tilde{P}_\ell = I_{n_{\ell+1}}, \]
      with
      \[ \tilde{P}_\ell = \operatorname{diag}\left( Q_1^{(\ell)}, \ldots, Q_{n_{\ell+1}}^{(\ell)} \right) \quad \text{and} \quad B_{\ell+1} = \begin{bmatrix} R_1^{(\ell)} \\ \vdots \\ R_{n_{\ell+1}}^{(\ell)} \end{bmatrix}. \]
      The columns of $B_{\ell+1}$ span the near kernel of $K_{\ell+1}$. Notice: the matrices $K_\ell$ are not used in constructing the tentative prolongators $\tilde{P}_\ell$, the near kernels $B_\ell$, and the graphs $G_\ell$.
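
The construction of the tentative prolongator from per-aggregate QR factorizations can be sketched as below. The sketch assumes a scalar 1D problem whose near kernel is spanned by a constant and a linear vector (in 3D elasticity it would be the six rigid body modes), uses modified Gram-Schmidt in place of a library QR, and requires every aggregate to have at least as many rows as the near kernel has columns. All names are hypothetical.

```python
from math import sqrt


def thin_qr(a):
    """Thin QR of a tall, full-column-rank matrix (list of rows), modified Gram-Schmidt."""
    m, n = len(a), len(a[0])
    q = [row[:] for row in a]
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        r[j][j] = sqrt(sum(q[i][j] ** 2 for i in range(m)))
        for i in range(m):
            q[i][j] /= r[j][j]
        for k in range(j + 1, n):
            r[j][k] = sum(q[i][j] * q[i][k] for i in range(m))
            for i in range(m):
                q[i][k] -= r[j][k] * q[i][j]
    return q, r


def tentative_prolongator(b_fine, aggregates):
    """Return (P_tilde, B_coarse) with B_fine = P_tilde * B_coarse."""
    n, m = len(b_fine), len(b_fine[0])
    p = [[0.0] * (m * len(aggregates)) for _ in range(n)]
    b_coarse = []
    for j, rows in enumerate(aggregates):
        q, r = thin_qr([b_fine[i] for i in rows])      # B_j = Q_j R_j
        for local, i in enumerate(rows):
            p[i][j * m:(j + 1) * m] = q[local]         # Q_j becomes diagonal block j
        b_coarse.extend(r)                             # R_j is stacked into B_coarse
    return p, b_coarse


B_fine = [[1.0, float(i)] for i in range(6)]           # constant and linear mode
aggs = [[0, 1, 2], [3, 4, 5]]
P_tent, B_coarse = tentative_prolongator(B_fine, aggs)
```

Note that the sketch never touches a stiffness matrix, in line with the remark that $K_\ell$ is not needed for this step.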

  14. Smoothed aggregation (SA) AMG preconditioner III: (4) For elliptic problems, it is advisable to perform an additional step to obtain smoothed aggregation (SA). The smoothed prolongator is
      \[ P_\ell = \left( I_\ell - \omega_\ell D_\ell^{-1} K_\ell \right) \tilde{P}_\ell, \qquad \omega_\ell = \frac{4/3}{\lambda_{\max}(D_\ell^{-1} K_\ell)}. \]
      In non-smoothed aggregation, $P_\ell = \tilde{P}_\ell$. (5) Smoother $S_\ell$: polynomial smoother. Choose a Chebyshev polynomial that is small on the upper part of the spectrum of $K_\ell$ (Adams, Brezina, Hu, Tuminaro, 2003). It parallelizes perfectly, and its quality is independent of the number of processors.
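
A minimal sketch of the two ingredients on this slide: estimating $\lambda_{\max}(D^{-1}K)$ to obtain the damping parameter $\omega$, and a Chebyshev smoother that targets the upper part of the spectrum. The 1D stencil, the power iteration, the interval bounds $[\lambda_{\max}/4,\ 1.1\,\lambda_{\max}]$, and all function names are assumptions for illustration; the recurrence is the standard Chebyshev iteration applied to the Jacobi-scaled system.

```python
from math import sqrt, sin, pi


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def stencil_matvec(x):
    """Stand-in for K: 1D stencil (-1, 2, -1)."""
    n = len(x)
    return [2.0 * x[i] - (x[i - 1] if i > 0 else 0.0) - (x[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]


def estimate_lambda_max(matvec, dinv, n, iters=100):
    """Power iteration on D^{-1} K; returns an estimate of its largest eigenvalue."""
    v = [1.0 + 0.1 * i for i in range(n)]          # arbitrary nonzero start vector
    lam = 0.0
    for _ in range(iters):
        w = [d * y for d, y in zip(dinv, matvec(v))]
        norm_w = sqrt(dot(w, w))
        lam = norm_w / sqrt(dot(v, v))
        v = [wi / norm_w for wi in w]
    return lam


def chebyshev_smooth(matvec, dinv, b, x, lam_lo, lam_hi, steps=3):
    """Chebyshev iteration for D^{-1} K x = D^{-1} b, damping modes in [lam_lo, lam_hi]."""
    theta = 0.5 * (lam_hi + lam_lo)
    delta = 0.5 * (lam_hi - lam_lo)
    sigma = theta / delta
    rho = 1.0 / sigma
    r = [d * (bi - yi) for d, bi, yi in zip(dinv, b, matvec(x))]
    step = [ri / theta for ri in r]
    for _ in range(steps):
        x = [xi + si for xi, si in zip(x, step)]
        r = [ri - d * yi for ri, d, yi in zip(r, dinv, matvec(step))]
        rho_new = 1.0 / (2.0 * sigma - rho)
        step = [rho_new * rho * si + 2.0 * rho_new / delta * ri
                for si, ri in zip(step, r)]
        rho = rho_new
    return x


n = 8
dinv = [0.5] * n                                        # inverse diagonal of the stencil
lam_max = estimate_lambda_max(stencil_matvec, dinv, n)
omega = (4.0 / 3.0) / lam_max                           # prolongator damping parameter

rough = [sin(8 * pi * (i + 1) / 9) for i in range(n)]   # most oscillatory error mode
smoothed = chebyshev_smooth(stencil_matvec, dinv, [0.0] * n, rough,
                            lam_max / 4.0, 1.1 * lam_max)
```

With a zero right-hand side the iterate is the error itself, so comparing the norms of `rough` and `smoothed` shows how strongly three smoothing steps damp the oscillatory mode.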

  15. 'Matrix-free' multigrid: We do NOT form $K = K_0$ but do an element-by-element (EBE) matrix multiplication,
      \[ K = \sum_{e=1}^{n_{el}} T_e K_e T_e^T. \]
      In our implementation, $P_0$ is not smoothed. The matrices $K_1, K_2, \ldots$ are formed. All graphs, including $G_0$, are constructed. Memory savings (crude approximation): $1.4 / 0.4 = 3.5$. Clever formation of $K_1$.
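
One way to read the crude memory estimate, assuming it refers to matrix storage only and to the operator complexity of 1.4 quoted for SA: with the storage of $K_0$ normalized to 1, all levels together cost about 1.4, while the matrix-free variant keeps only the coarse matrices.

```latex
\[
  \frac{\operatorname{mem}(K_0) + \operatorname{mem}(K_1) + \operatorname{mem}(K_2) + \cdots}
       {\operatorname{mem}(K_1) + \operatorname{mem}(K_2) + \cdots}
  \approx \frac{1.4}{1.4 - 1} = \frac{1.4}{0.4} = 3.5
\]
```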

  16. Procedure I: (1) Definition of the aggregates on $G_0$. (2) Definition of the (tentative) prolongator $P_0$. This requires the aggregates defined in step 1 and the 'near null space'. (3) Computation of the $(i,j)$ block-elements of $K_1$ for non-smoothed aggregation:
      \[ K_1(i,j) = \Phi_i^T K_0 \Phi_j, \]
      where $\Phi_i$ is the $i$-th block column of $P_0$. If two columns $\Phi_j$ and $\Phi_k$ are "far away" from each other, we can group them together in $\Phi' = \Phi_j + \Phi_k$ and then compute $K_0 \Phi'$ with one matvec.
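
The grouping of far-away columns can be illustrated on a 1D chain. The sketch assumes a matrix-free stencil as a stand-in for the EBE product with $K_0$, scalar unknowns, and a piecewise-constant $P_0$ with three nodes per aggregate; the names are hypothetical.

```python
def k0_matvec(x):
    """Matrix-free stand-in for K_0: 1D stencil (-1, 2, -1)."""
    n = len(x)
    return [2.0 * x[i] - (x[i - 1] if i > 0 else 0.0) - (x[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]


def phi(j, n, size=3):
    """Column j of a piecewise-constant P_0: ones on the nodes of aggregate j."""
    return [1.0 if i // size == j else 0.0 for i in range(n)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


n, n_agg = 12, 4

# Reference: one matvec with K_0 per column of P_0.
K1_ref = [[dot(phi(i, n), k0_matvec(phi(j, n))) for j in range(n_agg)]
          for i in range(n_agg)]

# Grouped: aggregates 0 and 3 have no common neighbouring aggregate, so the
# products K_0 * phi_0 and K_0 * phi_3 can be separated after a single matvec.
group = [0, 3]
phi_sum = [a + b for a, b in zip(phi(0, n), phi(3, n))]
y = k0_matvec(phi_sum)
K1_grouped = {}
for j in group:
    for i in (j - 1, j, j + 1):            # aggregate j and its neighbours in the chain
        if 0 <= i < n_agg:
            K1_grouped[(i, j)] = dot(phi(i, n), y)
```

Aggregates 0 and 2 could not be grouped this way: aggregate 1 neighbours both, so $\Phi_1^T K_0 (\Phi_0 + \Phi_2)$ would mix the entries $K_1(1,0)$ and $K_1(1,2)$.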

  17. Procedure II (figure). Courtesy Radim Blaheta, U. of Ostrava.

  18. Procedure III: (4) Building $K_1$: construct (in parallel) the graph $G_1$ of $K_1$ by working on $G_0$; color $G_1$ using a (parallel) distance-2 coloring; apply $K_0$ to all $\Phi_j$ belonging to the same color. Non-smoothed aggregation needs fewer colors (typically from 15 to 25 colors). (5) Smoother for level 0: the Chebyshev polynomials need $D_0 = \operatorname{diag}(K_0)$, which is determined with a distance-1 coloring.
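
Both uses of coloring can be sketched with a serial greedy algorithm. This is an illustrative stand-in for the parallel coloring mentioned on the slide; the chain graph, the 1D stencil, and the function names are assumptions.

```python
def greedy_coloring(adj, distance=1):
    """Greedy colouring: vertices within the given graph distance get different colours."""
    color = {}
    for v in sorted(adj):
        near = set(adj[v])
        if distance == 2:
            for u in adj[v]:
                near.update(adj[u])
        near.discard(v)
        used = {color[u] for u in near if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


def chain(n):
    """Adjacency lists of a path graph with n vertices."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def stencil_matvec(x):
    """Matrix-free stand-in for K_0: 1D stencil (-1, 2, -1)."""
    n = len(x)
    return [2.0 * x[i] - (x[i - 1] if i > 0 else 0.0) - (x[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]


def probe_diagonal(matvec, adj):
    """Recover diag(K) with one matvec per colour of a distance-1 colouring."""
    color = greedy_coloring(adj, distance=1)
    n = len(adj)
    diag = [0.0] * n
    for c in set(color.values()):
        x = [1.0 if color[i] == c else 0.0 for i in range(n)]
        y = matvec(x)
        for i in range(n):
            if color[i] == c:
                diag[i] = y[i]      # same-coloured vertices are not adjacent
    return diag


g1 = chain(7)
colors_d2 = greedy_coloring(g1, distance=2)   # groups columns for building K_1
diag_k0 = probe_diagonal(stencil_matvec, chain(8))
```

On the chain, the distance-2 coloring needs three colors, so three grouped matvecs replace seven individual ones, and the diagonal is recovered with two matvecs.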

  19. Weak scalability test: The problem size scales with the number of processors. Computations were done on the Cray XT3 at the Swiss National Supercomputer Center (CSCS) and on the IBM Blue Gene/L at the Zürich Research Lab.
