

slide-1
SLIDE 1

Simulating Human Aorta Material Behavior Using a GPU Explicit Finite Element Solver

Vukasin Strbac†, David M. Pierce‡, Jos Vander Sloten†, Nele Famaey†
†Biomechanics Section, Mechanical Engineering, KULeuven, Leuven, BE
‡Mechanical Engineering, Biomedical Engineering, Mathematics, Interdisciplinary Mechanics Lab, University of Connecticut, Storrs, CT, US

Vukasin Strbac GTC2016

slide-2
SLIDE 2

Introduction: general biomech. motivation

2/21 Vukasin Strbac GTC2016

  • Accelerating FE analysis provides new clinical opportunities:
      • pre-operative (e.g. faster custom stent design)
      • intra-operative stress monitoring
      • post-operative damage monitoring/fatigue estimation at lower cost
  • Ever-advancing capabilities of modern hardware, e.g. GPGPUs, offer opportunities to accelerate established algorithms

14.04.16

[Figure labels: angioplasty; stenting; heterogeneous composition, aorta; tissue behavior]
slide-3
SLIDE 3

Introduction: core facts

  • Explicit FE is pleasingly parallel (for the most part)
  • Explicit FE is sensitive to material and geometric parameters
  • A complex material model is necessary for accurate results
  • GPUs are sensitive to the floating point precision used
  • What can we expect?
      • How does anisotropy affect GPU explicit FE?
      • How do hexahedral element formulations affect GPU explicit FE, particularly in terms of Gaussian integration schemes?
      • How does that affect our research?

slide-4
SLIDE 4

Introduction: GPU-based FE solver

  • Nonlinear, explicit, large strain, central differences
  • Trilinear hexahedral elements, unstructured grid
  • Templated: single/double precision, textures, output, etc.
  • Boundary conditions: kinematic, constant force, pressure
  • Materials: linear and nonlinear (see following slides)
  • Pre-processing
      • Custom input file structure for geometry, material and BCs
  • Post-processing
      • Binary .vtu files + ParaView
      • Real-time rendering
  • Validated against
      • Abaqus (Dassault Systèmes) and
      • FEAP (University of California, Berkeley)

Equation of motion:

$[M]\{\ddot{u}\} + c_d [M]\{\dot{u}\} + \{F(u)\} = \{R\}$

Solver flowchart (steps run per node or per element):

  • Assign boundary conditions
  • Compute stress, integrate stress
  • Assemble global internal force vector
  • Forward time-marching step
  • Check energy balance
  • Converged? If no (n), continue time marching; if yes (y), end
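
The damped equation of motion on this slide, read here as M ü + c_d M u̇ + F(u) = R (an interpretation of the slide's notation), can be marched forward with central differences until it relaxes to the static solution. The following is a minimal sketch under stated assumptions: the two-spring chain, all parameter values, and the names `internal_force` and `relax` are illustrative and are not taken from the presented solver, which performs these updates in parallel on the GPU.

```python
def internal_force(u, k):
    """Internal force of a wall-spring-node-spring-node chain (illustrative problem)."""
    return [k * (2.0 * u[0] - u[1]), k * (u[1] - u[0])]


def relax(k=100.0, m=1.0, load=1.0, c_d=10.0, dt=0.01, steps=5000):
    """Central-difference marching of M*a + c_d*M*v + F(u) = R with lumped mass m."""
    r = [0.0, load]            # external load acts on the free end only
    u_prev = [0.0, 0.0]        # displacement at step n-1
    u = [0.0, 0.0]             # displacement at step n
    a = 1.0 + 0.5 * c_d * dt   # coefficient multiplying u_{n+1}
    b = 1.0 - 0.5 * c_d * dt   # coefficient multiplying u_{n-1}
    for _ in range(steps):
        f = internal_force(u, k)
        # Each degree of freedom updates independently of the others,
        # which is what makes the explicit scheme easy to parallelize.
        u_next = [
            (dt * dt * (r[i] - f[i]) / m + 2.0 * u[i] - b * u_prev[i]) / a
            for i in range(2)
        ]
        u_prev, u = u, u_next
    return u


if __name__ == "__main__":
    print(relax())  # approaches the static solution [load/k, 2*load/k]
```

The time step must stay below 2/ω_max for stability; for the values assumed here ω_max is about 16.2 rad/s, so dt = 0.01 is well inside the limit.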
slide-5
SLIDE 5

Element technology: Biofidelic materials

  • Linear elastic model (Hookean, H)
      • $\sigma_{ij} = f(\varepsilon_{ij}) = \lambda \delta_{ij} \varepsilon_{kk} + 2\mu \varepsilon_{ij}$, i.e. $\sigma = C\varepsilon$
  • Nonlinear elastic model, isotropic (neo-Hookean, NH)
      • $\sigma = f\left(\frac{\partial \Psi}{\partial F}\right)$
  • Nonlinear elastic, anisotropic (GHO; fiber-reinforced arterial tissue model [Gasser et al., 2006])
      • Anisotropic constituent [Weisbecker et al., 2012]

Flowchart step affected: compute stress, integrate stress (H, NH, GHO)
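
As a small worked example of the simplest of the three material models, the sketch below evaluates isotropic linear elasticity in its standard index form, σ_ij = λ δ_ij ε_kk + 2μ ε_ij. The function name `hooke_stress` and the numerical values are assumptions for illustration; the nonlinear NH and GHO models replace this stress function in the same place in the solver loop.

```python
def hooke_stress(strain, lam, mu):
    """Return the 3x3 stress for a 3x3 small-strain tensor (nested lists)."""
    trace = strain[0][0] + strain[1][1] + strain[2][2]  # volumetric strain eps_kk
    stress = []
    for i in range(3):
        row = []
        for j in range(3):
            delta = 1.0 if i == j else 0.0              # Kronecker delta
            row.append(lam * trace * delta + 2.0 * mu * strain[i][j])
        stress.append(row)
    return stress


if __name__ == "__main__":
    # Hypothetical strain state: 1% uniaxial stretch plus a small shear.
    eps = [[0.01, 0.005, 0.0],
           [0.005, 0.0, 0.0],
           [0.0, 0.0, 0.0]]
    for row in hooke_stress(eps, lam=2.0, mu=1.0):
        print(row)
```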
slide-6
SLIDE 6

Element technology: Gaussian integration

  • Under-integration (UI)
      • Fast
      • Inaccurate
      • Hourglassing
      • No volumetric locking
      • No shear locking
      • Arithmetic expense 1x, memory expense 1x
  • Full integration (FI)
      • Slow
      • Very accurate
      • Volumetric locking
      • Shear locking
      • Arithmetic expense ~8x, memory expense ~3x
  • Selective reduced (SR)
      • Very slow
      • Very accurate
      • No volumetric locking
      • Shear locking
      • Arithmetic expense ~9x, memory expense ~4x
  • (Not appropriate for anisotropic materials with low mesh density)

[Figures: integration point layouts in the element natural coordinates ξ, η, ζ for each scheme]

Flowchart step affected: compute stress, integrate stress (UI, FI, SR)
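
The accuracy and cost trade-off between the schemes can be illustrated with the underlying quadrature rules on the reference hexahedron [-1, 1]^3: one point for under-integration and 2x2x2 points for full integration. This is a sketch with assumed names (`gauss_rule`, `integrate`, `integrand`) and an arbitrary test polynomial, not code from the presented solver. Selective reduced integration is commonly implemented by evaluating the volumetric part at the single point and the deviatoric part at the 2x2x2 points, which would be consistent with the ~9x arithmetic figure, although the slide does not spell this out.

```python
import itertools
import math


def gauss_rule(points_per_axis):
    """Tensor-product Gauss-Legendre rule on the reference hexahedron."""
    if points_per_axis == 1:
        axis = [(0.0, 2.0)]                      # (coordinate, weight)
    elif points_per_axis == 2:
        g = 1.0 / math.sqrt(3.0)
        axis = [(-g, 1.0), (g, 1.0)]
    else:
        raise ValueError("only 1- and 2-point rules are sketched here")
    rule = []
    for (x, wx), (y, wy), (z, wz) in itertools.product(axis, repeat=3):
        rule.append(((x, y, z), wx * wy * wz))
    return rule


def integrate(f, points_per_axis):
    """Approximate the integral of f over [-1, 1]^3."""
    return sum(w * f(*p) for p, w in gauss_rule(points_per_axis))


def integrand(xi, eta, zeta):
    """Arbitrary test polynomial; its exact integral over the cube is 32/3."""
    return 1.0 + xi * xi + eta * zeta


if __name__ == "__main__":
    print("1 point   :", integrate(integrand, 1))
    print("2x2x2     :", integrate(integrand, 2))
    print("exact     :", 32.0 / 3.0)
```

The single-point rule sees only the constant term and misses the quadratic contribution, while the eight-point rule reproduces the exact value at eight times the number of function evaluations.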
slide-7
SLIDE 7

Ideal case: extension-inflation test

  • Extension 5% + systolic pressure
  • Reference solutions
      • FEAP & ABAQUS
  • We implement the same materials in all solvers
  • We solve on 3 different GPU generations: Fermi, Kepler and Maxwell (no optimization)
  • GHO material (+ neo-Hooke for reference)
  • Scaling
  • Convergence criteria based on reference solutions
      • RMS < 0.0005 mm
      • deltaRMS < 0.0001 mm
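
One way such a convergence check could be coded is sketched below. The slide gives only the two thresholds, so the interpretation is an assumption: RMS is taken as the root-mean-square nodal displacement difference against the reference solution, and deltaRMS as the change in that value between two successive checks. The names `rms_error` and `is_converged` are hypothetical.

```python
import math

RMS_TOL_MM = 0.0005        # threshold from the slide
DELTA_RMS_TOL_MM = 0.0001  # threshold from the slide


def rms_error(displacements, reference):
    """RMS of nodal displacement differences; inputs are lists of (x, y, z) in mm."""
    total = 0.0
    for (ux, uy, uz), (rx, ry, rz) in zip(displacements, reference):
        total += (ux - rx) ** 2 + (uy - ry) ** 2 + (uz - rz) ** 2
    return math.sqrt(total / len(displacements))


def is_converged(rms, previous_rms):
    """Assumed reading: both the error and its change must fall below tolerance."""
    return rms < RMS_TOL_MM and abs(rms - previous_rms) < DELTA_RMS_TOL_MM


if __name__ == "__main__":
    current = rms_error([(0.0, 0.0, 0.0003)], [(0.0, 0.0, 0.0)])
    print(current, is_converged(current, previous_rms=0.00035))
```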


slide-8
SLIDE 8

Ideal case: extension-inflation test

[Figure panels: under-integration; full integration; selective-reduced integration]
slide-9
SLIDE 9

Ideal case: extension-inflation test

[Figure: Fermi (C2075)]
slide-10
SLIDE 10

Ideal case: extension-inflation test

[Figure: Kepler (K20c)]
slide-11
SLIDE 11

Ideal case: extension-inflation test

[Figure: Maxwell (GTX980)]
slide-12
SLIDE 12

Ideal case: extension-inflation test

[Figure panels: anisotropy cost (GHO/NH); integration cost (SR/UI)]
slide-13
SLIDE 13

Ideal case: conclusions

  • Speed-ups are considerable
  • Difficult to say exactly why one GPU is faster in a specific scenario
  • No architecture-specific optimizations are employed, so the speedup comes for free
  • Useful for
      • Parameter-fitting and geometry identification
      • Sensitivity analyses
      • …anything made possible by large numbers of FE simulations
  • Not a clinically accurate scenario

slide-14
SLIDE 14

Near incompressibility and floating point precision

[Figure: results in MPa, single precision vs. double precision]
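
One mechanism by which single precision can fail for nearly incompressible materials is sketched below: the volume ratio J stays very close to 1, and a small volume change can vanish entirely when J is stored in 32 bits. This is an illustration under assumptions, not the analysis shown in the talk: the pressure law p = κ (J - 1), the bulk modulus value and the helper names are all hypothetical, and single precision is emulated on the CPU with the standard `struct` module.

```python
import struct


def to_float32(x):
    """Round a Python float (double) to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def volumetric_pressure(jacobian, bulk_modulus):
    """Simple assumed volumetric law: p = kappa * (J - 1)."""
    return bulk_modulus * (jacobian - 1.0)


def pressures(volume_change, bulk_modulus):
    """Return (single-precision-J pressure, double-precision-J pressure)."""
    j64 = 1.0 + volume_change
    j32 = to_float32(j64)  # J as it would be stored in a 32-bit float
    return (volumetric_pressure(j32, bulk_modulus),
            volumetric_pressure(j64, bulk_modulus))


if __name__ == "__main__":
    p32, p64 = pressures(volume_change=1.0e-8, bulk_modulus=1000.0)  # MPa, hypothetical
    print("single:", p32)
    print("double:", p64)
```

A relative volume change of 1e-8 is below the spacing of single-precision numbers near 1 (about 1.2e-7), so the single-precision pressure is exactly zero while the double-precision value is not.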
slide-15
SLIDE 15

[Figure panels: double vs. single precision for FI, UI and SR]
slide-16
SLIDE 16

Clinically relevant test case: AAA inflation

  • Patient-specific FE meshes of abdominal aortic aneurysms (AAA) [Tarjuelo-Gutierrez et al., 2014]

[Figure: meshes p1, p2, p3, p4, p5]
slide-17
SLIDE 17

Clinically relevant test case: AAA inflation

  • The ‘silent killer’
  • Peak Wall Stress (PWS) estimate needed
  • Thrombus:
      • Separation
      • Different material
  • Layer-specific material properties

[Figure labels: thrombus; aorta]
slide-18
SLIDE 18


slide-19
SLIDE 19

Clinically relevant test case: AAA inflation

Case   FEAP [h]   CUDA [h]   Speed-up factor
p1     21.12      2.93       x7.2
p2     22.79      1.22       x18.7
p3     21.01      2.75       x7.6
p4     21.52      3.03       x7.1
p5     21.86      1.31       x16.8
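
The speed-up factors follow directly from the two timing rows. The short sketch below recomputes them from the rounded hours on the slide, assuming the five values map to cases p1 to p5 in the order listed; the variable and function names are illustrative.

```python
FEAP_HOURS = [21.12, 22.79, 21.01, 21.52, 21.86]   # from the slide
CUDA_HOURS = [2.93, 1.22, 2.75, 3.03, 1.31]        # from the slide
SLIDE_FACTORS = [7.2, 18.7, 7.6, 7.1, 16.8]        # from the slide
CASES = ["p1", "p2", "p3", "p4", "p5"]             # assumed ordering


def speedups(reference_hours, gpu_hours):
    """Ratio of reference run time to GPU run time for each case."""
    return [ref / gpu for ref, gpu in zip(reference_hours, gpu_hours)]


if __name__ == "__main__":
    for case, factor, quoted in zip(CASES, speedups(FEAP_HOURS, CUDA_HOURS), SLIDE_FACTORS):
        print(f"{case}: recomputed x{factor:.2f}, slide x{quoted}")
```

The recomputed values agree with the slide to within rounding. For p5 the rounded hours give roughly x16.7 rather than x16.8, presumably because the slide factor was computed from unrounded timings.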
slide-20
SLIDE 20


Poisson = 0.4995
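
To see why a Poisson ratio this close to 0.5 is demanding, the standard isotropic relations can be evaluated: λ = Eν / ((1 + ν)(1 - 2ν)) and μ = E / (2(1 + ν)), so λ/μ = 2ν / (1 - 2ν). The sketch below uses a hypothetical Young's modulus of 1 MPa (the ratio does not depend on it) and an assumed function name `lame_parameters`.

```python
def lame_parameters(young, poisson):
    """Lamé parameters (lambda, mu) of an isotropic linear elastic material."""
    lam = young * poisson / ((1.0 + poisson) * (1.0 - 2.0 * poisson))
    mu = young / (2.0 * (1.0 + poisson))
    return lam, mu


if __name__ == "__main__":
    for nu in (0.3, 0.49, 0.4995):
        lam, mu = lame_parameters(1.0, nu)  # E = 1 MPa is a placeholder value
        print(f"nu = {nu}: lambda/mu = {lam / mu:.1f}")
```

At ν = 0.4995 the volumetric stiffness exceeds the shear stiffness by about three orders of magnitude, so small errors in the computed volume change are strongly amplified in the stress.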


slide-21
SLIDE 21

Conclusion

  • We maintain significant speedup (x7.1 to x18.7 over FEAP in the AAA cases) even using state-of-the-art materials, high-order integration and double precision on GPUs, with no compromise whatsoever on accuracy, even for less-than-ideal meshes.
  • Single precision becomes ineffective quickly, depending on the Poisson ratio. Double precision is necessary.
  • Practical opportunities, enabling technology:
      • FE sensitivity analysis
      • Inverse FE simulations
      • Indications of clinical use
  • Generally:
      • Memory-bound algorithm
      • Lots of random reads and atomic writes due to unstructured grid
      • For details on implementation/optimization see: S4497, Strbac GTC2014
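
The random reads and atomic writes come from scattering element contributions into shared nodes of an unstructured mesh. The sequential sketch below shows the access pattern only; the function name, the tiny two-element mesh and the data layout are assumptions, and whether the presented solver assembles in exactly this way is not stated on the slide (the referenced GTC2014 talk covers the actual implementation).

```python
def assemble_internal_forces(num_nodes, connectivity, element_forces):
    """Scatter-add per-element nodal forces into a global nodal force vector.

    connectivity[e][a] is the global node index of local node a in element e;
    element_forces[e][a] is the matching force contribution (scalar here).
    """
    global_force = [0.0] * num_nodes
    for elem, nodes in enumerate(connectivity):
        for local, node in enumerate(nodes):
            # The node index is read indirectly (random access), and several
            # elements may add to the same node. With one GPU thread per
            # element this addition must be atomic (e.g. CUDA atomicAdd).
            global_force[node] += element_forces[elem][local]
    return global_force


if __name__ == "__main__":
    # Two 1D elements sharing node 1 (hypothetical data).
    print(assemble_internal_forces(3, [[0, 1], [1, 2]], [[1.0, 2.0], [3.0, 4.0]]))
```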


slide-22
SLIDE 22


  • Thank you for your attention.
  • Questions?
