
Simulating Human Aorta Material Behavior Using a GPU Explicit Finite Element Solver



  1. Simulating Human Aorta Material Behavior Using a GPU Explicit Finite Element Solver
     Vukasin Strbac†, David M. Pierce‡, Jos Vander Sloten†, Nele Famaey†
     †Biomechanics Section, Mechanical Engineering, KULeuven, Leuven, BE
     ‡Mechanical Engineering, Biomedical Engineering, Mathematics, Interdisciplinary Mechanics Lab, University of Connecticut, Storrs, CT, US
     Vukasin Strbac, GTC2016

  2. Introduction: general biomech. motivation
     • Accelerating FE analysis provides new clinical opportunities:
       - pre-operative (e.g. faster custom stent design)
       - intra-operative stress monitoring
       - post-operative damage monitoring/fatigue estimation at lower cost
     • Ever-advancing capabilities of modern hardware, e.g. GPGPUs, offer opportunities to accelerate established algorithms
     [Figure labels: angioplasty; stenting; heterogeneous composition, aorta; tissue behavior]
     Vukasin Strbac, GTC2016, 14.04.16

  3. Introduction: core facts
     • Explicit FE is pleasingly parallel (for the most part)
     • Explicit FE is sensitive to material and geometric parameters
     • A complex material model is necessary for accurate results
     • GPUs are sensitive to the floating point precision used
     • What can we expect?
     • How does anisotropy affect GPU explicit FE?
     • How do hexahedral element formulations affect GPU explicit FE?
       - Particularly in terms of Gaussian integration schemes
     • How does that affect our research?

  4. Introduction: GPU-based FE solver
     Equation of motion: $[M]\{\ddot{u}\} + c_d [M]\{\dot{u}\} + \{F(u)\} = \{R\}$
     • Nonlinear, explicit, large strain, central differences
     • Trilinear hexahedral elements, unstructured grid
     • Templated: single/double precision, textures, output, etc.
     • Boundary conditions: kinematic, constant force, pressure
     • Materials: see following slides (linear, nonlinear)
     • Pre-processing
       - Custom input file structure for geometry, material and BCs
     • Post-processing
       - Binary .vtu files + Paraview
       - Real-time rendering
     • Validated against
       - Abaqus (Dassault Systèmes) and
       - FEAP (University of California, Berkeley)
     Solver loop (flowchart):
       - Assign boundary conditions
       - Compute stress, integrate stress (per element)
       - Assemble global internal force vector (per node)
       - Forward time-marching step (per node)
       - Check energy balance
       - Converged? If no, repeat the loop; if yes, end
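
The central-difference time marching named on slide 4 can be sketched for a single degree of freedom. This is a minimal illustration, not the presenters' CUDA code: it assumes the damped equation of motion has the form M ü + c_d M u̇ + F(u) = R, and the function name, the linear spring and all numerical values are hypothetical.

```python
def central_difference(internal_force, external_force, mass, c_d, dt, n_steps, u0=0.0):
    """March one DOF forward in time with explicit central differences.

    Assumed equation: mass*u'' + c_d*mass*u' + internal_force(u) = external_force.
    The start is at rest, so the displacement before t=0 equals u0.
    """
    u_prev = u0
    u = u0
    a = 0.5 * c_d * dt  # damping contribution to the update
    for _ in range(n_steps):
        residual = external_force - internal_force(u)
        # Solve the central-difference stencil for the next displacement.
        u_next = (dt * dt * residual / mass + 2.0 * u - (1.0 - a) * u_prev) / (1.0 + a)
        u_prev, u = u, u_next
    return u


# Hypothetical example: linear spring k*u, damped towards the static solution R/k.
STIFFNESS = 100.0
LOAD = 10.0
u_end = central_difference(lambda u: STIFFNESS * u, LOAD, mass=1.0, c_d=5.0,
                           dt=0.01, n_steps=2000)
print(u_end, LOAD / STIFFNESS)
```

No system of equations is solved in the update: each new displacement follows from the two previous ones and the current residual, which is why the per-node step maps well onto one GPU thread per node. The time step has to stay below the stability limit (2/ω for the undamped linear oscillator, here 0.2).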

  5. Element technology: Biofidelic materials (flowchart stages: compute stress, integrate stress)
     • Linear elastic model (Hookean), H: $\sigma_{ij} = f(\varepsilon_{ij}) = \lambda \delta_{ij} \varepsilon_{kk} + 2\mu \varepsilon_{ij} = C\varepsilon$
     • Nonlinear elastic model, isotropic (neo-Hookean), NH: $\sigma = f\left(\frac{\partial \Psi}{\partial F}\right)$
     • Nonlinear elastic, anisotropic (fiber-reinforced arterial tissue model [Gasser et al., 2006]), GHO
       - Anisotropic constituent [Weisbecker et al., 2012]
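
To make the step from the isotropic NH model to the anisotropic GHO model concrete, here is a minimal sketch of the isochoric strain energy in the form published by Gasser et al. (2006): a neo-Hookean matrix plus two fiber families with dispersion parameter κ. The function names and parameter values are hypothetical, the volumetric term is omitted, and the tension-only switch on the fibers is an assumption of this sketch. It is not taken from the slides.

```python
from math import exp


def neo_hookean_energy(i1_bar, mu):
    """Isochoric neo-Hookean energy, mu/2 * (I1 - 3)."""
    return 0.5 * mu * (i1_bar - 3.0)


def goh_fiber_energy(i1_bar, i4_bar, k1, k2, kappa):
    """Energy of one dispersed fiber family.

    kappa = 0 gives perfectly aligned fibers, kappa = 1/3 an isotropic
    distribution in which the fiber invariant i4_bar drops out.
    """
    e_fiber = kappa * (i1_bar - 3.0) + (1.0 - 3.0 * kappa) * (i4_bar - 1.0)
    if e_fiber <= 0.0:
        return 0.0  # assumption: fibers carry load only in tension
    return k1 / (2.0 * k2) * (exp(k2 * e_fiber ** 2) - 1.0)


def goh_isochoric_energy(i1_bar, i4_bar, i6_bar, mu, k1, k2, kappa):
    """Matrix plus two fiber families (invariants i4_bar and i6_bar)."""
    return (neo_hookean_energy(i1_bar, mu)
            + goh_fiber_energy(i1_bar, i4_bar, k1, k2, kappa)
            + goh_fiber_energy(i1_bar, i6_bar, k1, k2, kappa))


# Hypothetical parameters, chosen only to exercise the functions.
print(goh_isochoric_energy(3.1, 1.2, 1.2, mu=0.02, k1=0.1, k2=10.0, kappa=0.2))
```

The exponential fiber term is what makes the anisotropic model more expensive per integration point than the neo-Hookean one.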

  6. Element technology: Gaussian integration (flowchart stages: compute stress, integrate stress)
     • Under-integration (UI): arithmetic expense 1x, memory expense 1x
       - Fast
       - Inaccurate
       - Hourglassing
       - No volumetric locking
       - No shear locking
       - Not appropriate for anisotropic materials with low mesh density
     • Full integration (FI): arithmetic expense ~8x, memory expense ~3x
       - Slow
       - Very accurate
       - Volumetric locking
       - Shear locking
     • Selective reduced (SR): arithmetic expense ~9x, memory expense ~4x
       - Very slow
       - Very accurate
       - No volumetric locking
       - Shear locking
     [Figures: hexahedral element with local axes ξ, µ, ζ for each scheme]
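
The difference between under-integration and full integration can be seen on the reference cube [-1, 1]³. The sketch below compares a single-point Gauss rule with the 2x2x2 rule; the function names and the test integrand are hypothetical and only illustrate the quadrature, not the element routines of the solver.

```python
from itertools import product
from math import sqrt


def gauss_rule(n):
    """Gauss-Legendre points and weights on [-1, 1] for n = 1 or n = 2."""
    if n == 1:
        return [(0.0, 2.0)]
    if n == 2:
        g = 1.0 / sqrt(3.0)
        return [(-g, 1.0), (g, 1.0)]
    raise ValueError("only 1- and 2-point rules are provided in this sketch")


def integrate_hex(f, n):
    """Integrate f(x, y, z) over the reference cube with an n x n x n rule."""
    rule = gauss_rule(n)
    total = 0.0
    for (x, wx), (y, wy), (z, wz) in product(rule, repeat=3):
        total += wx * wy * wz * f(x, y, z)
    return total


def quadratic(x, y, z):
    return x * x + y * y + z * z  # exact integral over the cube is 8


print(integrate_hex(quadratic, 1))  # single point at the centroid
print(integrate_hex(quadratic, 2))  # eight points
```

The single-point rule only sees the value at the centroid, so the quadratic integrand integrates to zero, while the 2x2x2 rule recovers the exact value at the price of eight integrand evaluations instead of one.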

  7. Ideal case: extension-inflation test
     • Extension 5% + systolic pressure
     • Reference solutions
       - FEAP & Abaqus
     • We implement the same materials in all solvers
     • We solve using 3 different GPU generations: Fermi, Kepler and Maxwell (no optimization)
     • GHO material (+ neo-Hooke for reference)
     • Scaling
     • Convergence criteria based on reference solutions
       - RMS < 0.0005 mm
       - deltaRMS < 0.0001 mm
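
The two convergence thresholds can be turned into a small check. The slide does not define the quantities, so this sketch assumes that RMS is the root-mean-square difference between computed and reference nodal displacements, and that deltaRMS is the change of that value between two successive evaluations. The function names and sample numbers are hypothetical.

```python
from math import sqrt

RMS_TOL_MM = 0.0005        # threshold from the slide
DELTA_RMS_TOL_MM = 0.0001  # threshold from the slide


def rms_difference(u, u_ref):
    """Root-mean-square difference between two displacement lists (mm)."""
    if len(u) != len(u_ref):
        raise ValueError("displacement lists must have the same length")
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, u_ref)) / len(u))


def is_converged(rms_history):
    """Apply both criteria to the last two entries of an RMS history."""
    if len(rms_history) < 2:
        return False
    current, previous = rms_history[-1], rms_history[-2]
    return current < RMS_TOL_MM and abs(current - previous) < DELTA_RMS_TOL_MM


# Hypothetical RMS values (mm) from three successive evaluations.
history = [0.002, 0.0004, 0.00035]
print(is_converged(history[:2]), is_converged(history))
```

With these sample values the second evaluation already satisfies the RMS threshold but still changes too much, so convergence is only declared at the third.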

  8. Ideal case: extension-inflation test
     [Figure panels: under-integration; full integration; selective-reduced integration]

  9. Ideal case: extension-inflation test
     [Figure: results on Fermi (C2075)]

  10. Ideal case: extension-inflation test
      [Figure: results on Kepler (K20c)]

  11. Ideal case: extension-inflation test
      [Figure: results on Maxwell (GTX980)]

  12. Ideal case: extension-inflation test
      [Figure panels: anisotropy cost (GHO/NH); integration cost (SR/UI)]

  13. Ideal case: conclusions
      • Speed-ups are considerable
      • Difficult to say exactly why one GPU is faster in a specific scenario
      • No architecture-specific considerations are employed, so the speedup comes for free
      • Useful for
        - Parameter-fitting and geometry identification
        - Sensitivity analyses
        - …anything made possible by large numbers of FE simulations
      • Not a clinically accurate scenario

  14. Near incompressibility and floating point precision
      [Figure: results in MPa; panels: single precision, double precision]

  15. [Figure panels: FI, UI, SR; double and single precision]

  16. Clinically relevant test case: AAA inflation
      [Figure: patient-specific FE meshes of abdominal aortic aneurysms, labeled p1 to p5 [Tarjuelo-Gutierrez et al., 2014]]

  17. Clinically relevant test case: AAA inflation
      • The ‘silent killer’
      • Peak Wall Stress (PWS) estimate needed
      • Thrombus:
        - Separation
        - Different material
      • Layer-specific material properties
      [Figure labels: thrombus; aorta]


  19. Clinically relevant test case: AAA inflation

                 p1      p2      p3      p4      p5
      FEAP [h]   21.12   22.79   21.01   21.52   21.86
      CUDA [h]   2.93    1.22    2.75    3.03    1.31
      factor     x7.2    x18.7   x7.6    x7.1    x16.8
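
The factor row of slide 19 is the ratio of the FEAP and CUDA run times. The short check below recomputes it from the tabulated hours; the dictionary and function names are hypothetical, the numbers are those on the slide.

```python
FEAP_HOURS = {"p1": 21.12, "p2": 22.79, "p3": 21.01, "p4": 21.52, "p5": 21.86}
CUDA_HOURS = {"p1": 2.93, "p2": 1.22, "p3": 2.75, "p4": 3.03, "p5": 1.31}
LISTED_FACTORS = {"p1": 7.2, "p2": 18.7, "p3": 7.6, "p4": 7.1, "p5": 16.8}


def speedup(reference_hours, accelerated_hours):
    """Speed-up factor as the ratio of the two run times."""
    return reference_hours / accelerated_hours


# Recompute each factor from the tabulated hours, rounded to one decimal.
factors = {case: round(speedup(FEAP_HOURS[case], CUDA_HOURS[case]), 1)
           for case in FEAP_HOURS}

for case in FEAP_HOURS:
    print(case, factors[case], LISTED_FACTORS[case])
```

Four of the five recomputed factors match the slide exactly. For p5 the tabulated hours give 16.7 rather than the listed 16.8, a difference that could come from the hours being rounded before they were put on the slide.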

  20. Poisson = 0.4995
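
A Poisson ratio of 0.4995 makes the first Lamé parameter about 999 times larger than the shear modulus, since λ/μ = 2ν/(1 − 2ν). The sketch below shows one plausible way this interacts with floating point precision; the slides do not state the mechanism, so this is an assumption. For a small-strain uniaxial stress state, the volumetric strain is a small difference of much larger strain components, and single precision loses digits in that difference before it is multiplied by the large λ. Single precision is emulated with the standard `struct` module; all names and values are hypothetical.

```python
import struct


def to_f32(x):
    """Round a Python float (double) to the nearest IEEE 754 single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lame_parameters(young, poisson):
    """Lame parameters (lambda, mu) from Young's modulus and Poisson ratio."""
    lam = young * poisson / ((1.0 + poisson) * (1.0 - 2.0 * poisson))
    mu = young / (2.0 * (1.0 + poisson))
    return lam, mu


def volumetric_strain(axial_strain, poisson, single):
    """Trace of the small-strain tensor for uniaxial stress.

    With single=True every stored value and every sum is rounded to float32.
    """
    rnd = to_f32 if single else (lambda x: x)
    exx = rnd(axial_strain)
    eyy = rnd(-poisson * axial_strain)  # lateral strain, same in y and z
    return rnd(rnd(exx + eyy) + eyy)


NU = 0.4995
lam, mu = lame_parameters(1.0, NU)
exact = 0.01 * (1.0 - 2.0 * NU)
err_single = abs(volumetric_strain(0.01, NU, single=True) - exact) / exact
err_double = abs(volumetric_strain(0.01, NU, single=False) - exact) / exact
print(lam / mu, err_single, err_double)
```

In this example the relative error of the volumetric strain is on the order of 1e-5 in emulated single precision and many orders of magnitude smaller in double precision. An explicit solver repeats such evaluations at every element and time step.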

  21. Conclusion
      • We maintain significant speedup (x7.1 to x18.7 relative to FEAP in the AAA cases) even when using state-of-the-art materials, high-order integration and double precision on GPUs, with no compromise whatsoever on accuracy, even for less-than-ideal meshes.
      • Single precision quickly becomes ineffective, and how quickly depends on the Poisson ratio. Double precision is necessary.
      • Practical opportunities, enabling technology:
        - FE sensitivity analysis
        - Inverse FE simulations
        - Indications of clinical use
      • Generally:
        - Memory-bound algorithm
        - Lots of random reads and atomic writes due to unstructured grid
      • For details on implementation/optimization see: S4497, Strbac GTC2014
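
The random reads and atomic writes mentioned in the conclusion come from assembling element contributions into shared nodes. A serial sketch of that scatter-add is given below, with one scalar force per node for brevity; the function name and the two-element example are hypothetical. In a CUDA kernel with one thread per element, the accumulation line would need an atomic operation such as `atomicAdd`, because several elements can write to the same node at once.

```python
def assemble_internal_forces(n_nodes, connectivity, element_forces):
    """Scatter-add per-element nodal forces into a global force vector.

    connectivity[e] lists the global node indices of element e and
    element_forces[e] the matching nodal force contributions.
    """
    global_force = [0.0] * n_nodes
    for nodes, forces in zip(connectivity, element_forces):
        for node, value in zip(nodes, forces):
            # Indirect, data-dependent write; on a GPU this is the atomic add.
            global_force[node] += value
    return global_force


# Hypothetical example: two 2-node elements sharing node 1.
connectivity = [(0, 1), (1, 2)]
element_forces = [(-1.0, 1.0), (-2.0, 2.0)]
global_force = assemble_internal_forces(3, connectivity, element_forces)
print(global_force)  # [-1.0, -1.0, 2.0]
```

On an unstructured grid the node indices in `connectivity` follow no regular pattern, so both the reads of nodal data and the accumulating writes are scattered in memory.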

  22. Thank you for your attention. Questions?
