Dynamics Framework on the GPU Daniel Melanz, Luning Fang, Ang Li, - - PowerPoint PPT Presentation

dynamics framework on the gpu
SMART_READER_LITE
LIVE PREVIEW

Dynamics Framework on the GPU Daniel Melanz, Luning Fang, Ang Li, - - PowerPoint PPT Presentation

GPU TECHNOLOGY CONFERENCE: S5400: Chrono::SPIKE A Nonsmooth Contact Dynamics Framework on the GPU Daniel Melanz, Luning Fang, Ang Li, Hammad Mazhar, Radu Serban, Dan Negrut Simulation-Based Engineering Laboratory University of Wisconsin -


slide-1
SLIDE 1

GPU TECHNOLOGY CONFERENCE:

S5400: Chrono::SPIKE – A Nonsmooth Contact Dynamics Framework on the GPU

Daniel Melanz, Luning Fang, Ang Li, Hammad Mazhar, Radu Serban, Dan Negrut Simulation-Based Engineering Laboratory University of Wisconsin - Madison

slide-2
SLIDE 2

Overview

1)

Nonsmooth Contact Dynamics

2)

Quadratic Optimization w/ Conic Constraints

3)

Preconditioning with SPIKE

4)

Numerical Results

5)

Conclusions & Future Work

3/19/2015 University of Wisconsin 2

slide-3
SLIDE 3

Nonsmooth Contact Dynamics

3/19/2015 University of Wisconsin 3

slide-4
SLIDE 4

Nonsmooth Dynamics

3/19/2015 University of Wisconsin 4

slide-5
SLIDE 5

Nonsmooth Dynamics: Frictionless Case

The Signorini Conditions: Every relative velocity should be zero

  • r separating

Every contact impulse should be non- attractive No impulse at separating contacts:

Antonio Signorini

3/19/2015 University of Wisconsin 5

Tonge, 2012

slide-6
SLIDE 6

Nonsmooth Dynamics: Frictionless Case

The Signorini Conditions: This is a compact way to write the three conditions in

  • ne line of math

3/19/2015 University of Wisconsin 6

Tonge, 2012

Antonio Signorini

slide-7
SLIDE 7

Nonsmooth Dynamics: Frictionless Case

The final model can be expressed by these equations:

3/19/2015 University of Wisconsin 7

Tonge, 2012

slide-8
SLIDE 8

Nonsmooth Dynamics: Friction Case

Stewart and Trinkle, 1996

3/19/2015 University of Wisconsin 8

slide-9
SLIDE 9

Nonsmooth Dynamics: Friction Case

Anitescu and Hart, 2004

3/19/2015 University of Wisconsin 9

slide-10
SLIDE 10

Nonsmooth Dynamics: The Cone Complementarity Problem (CCP)

where

3/19/2015 University of Wisconsin 10

slide-11
SLIDE 11

Nonsmooth Dynamics: The Quadratic Programming Angle…

  • The CCP captures the first-order optimality condition for a quadratic optimization problem

with conic constraints:

  • Notation used:

3/19/2015 University of Wisconsin 11

slide-12
SLIDE 12

Quadratic Optimization w/ Conic Constraints (CCQO’s)

3/19/2015 University of Wisconsin 12

slide-13
SLIDE 13

CCQO’s: First Order Methods

3/18/2015 13

slide-14
SLIDE 14

CCQO’s: Second Order Methods

3/18/2015 14

  • Original problem:
  • Reformulation via an indicator function:
  • Approximation via logarithmic barrier:

where

  • therwise
slide-15
SLIDE 15

Interior Point

3/18/2015 15

slide-16
SLIDE 16

Numerical Results

3/19/2015 University of Wisconsin 16

slide-17
SLIDE 17

Results: Physical Model

  • Several numerical experiments were performed using a model of spheres falling

into a bucket

3/19/2015 University of Wisconsin 17

slide-18
SLIDE 18

Time [s]

2 2.2 2.4 2.6 2.8 3 3.2

Weight [N]

  • 16000
  • 14000
  • 12000
  • 10000
  • 8000
  • 6000
  • 4000
  • 2000

1e-1 1e-2 1e-3 1e-4 1e-5

Time [s]

2 2.2 2.4 2.6 2.8 3 3.2

Weight [N]

  • 16000
  • 14000
  • 12000
  • 10000
  • 8000
  • 6000
  • 4000
  • 2000

1e-1 1e-2 1e-3 1e-4 1e-5

Results: Comparison of Solver Results

  • Simulations of the filling simulation were performed for 3 seconds with a step size,

h=10-3 seconds using the APGD and PDIP solvers

APGD PDIP

3/19/2015 University of Wisconsin 18

slide-19
SLIDE 19

Time [s]

0.5 1 1.5 2 2.5 3 3.5

Iterations [#]

50 100 150 200 250 300 350 400 450 500

1e-1 1e-2 1e-3 1e-4 1e-5

Time [s]

0.5 1 1.5 2 2.5 3 3.5

Iterations [#]

10 20 30 40 50 60

1e-1 1e-2 1e-3 1e-4 1e-5

Results: Comparison of Solver Iterations

  • Simulations of the filling simulation were performed for 3 seconds with a step size,

h=10-3 seconds using the APGD and PDIP solvers

APGD PDIP

3/19/2015 University of Wisconsin 19

slide-20
SLIDE 20

Results: Comparison of Solver Execution Time

  • Simulations of the filling simulation were performed for 3 seconds with a step size,

h=10-3 seconds using the APGD and PDIP solvers

APGD PDIP

3/19/2015 University of Wisconsin 20

slide-21
SLIDE 21

Time [s]

0.5 1 1.5 2 2.5 3 3.5

Iterations [#]

50 100 150 200 250 300 350 400 450 500

1e-1 1e-2 1e-3 1e-4 1e-5

Results: Comparison of Solvers

  • Simulations of the filling simulation were performed for 3 seconds with a step size,

h=10-3 seconds using the APGD and PDIP solvers

APGD PDIP

3/19/2015 University of Wisconsin 21

slide-22
SLIDE 22

Preconditioning with SPIKE

3/19/2015 University of Wisconsin 22

slide-23
SLIDE 23

The SPIKE algorithm

  • SPIKE: a divide-and-conquer approach to solving banded dense systems.
  • Proposed by A. H. Sameh and D. J. Kuck in 1978.

(see also E. Polizzi and A. H. Sameh, Parallel Computing 32(2), 2006)

  • Basic idea:
  • Partition the matrix A.
  • Factorize A to isolate independent blocks.
  • Solve a reduced system to account for coupling information.
  • Recover solution of original system.
  • SPIKE comes in two main flavors:
  • Full-SPIKE: recursively solve an exact reduced system (direct solver for banded matrices).
  • Truncated-SPIKE: solve an approximate reduced system in one step (needs iterative refinement).

3/19/2015 University of Wisconsin 23

slide-24
SLIDE 24

SPIKE: algorithmic details

Partitioning and Factorization

  • Partition and factorize A into block diagonal matrix D and spike matrix S.

3/19/2015 University of Wisconsin 24

slide-25
SLIDE 25

SPIKE: algorithmic details

Solving Dg=b

  • Reduced to solving P independent (banded dense) linear systems.
  • Map these systems to P blocks on GPU.
  • Apply classical LU (or UL) methods to each sub-system.

3/19/2015 University of Wisconsin 25

slide-26
SLIDE 26

SPIKE factorization in plain math

  • The right (Vi) and left (Wi) spike blocks can be obtained through the solution of P

independent multiple-RHS banded linear systems.

3/19/2015 University of Wisconsin 26

slide-27
SLIDE 27

SPIKE: algorithmic details

Solving Sx=g (full SPIKE)

  • Combine all coupling blocks into a reduced matrix
  • (Recursively) solve the reduced system
  • Recover solution from reduced solution

Combine coupling blocks

3/19/2015 University of Wisconsin 27

slide-28
SLIDE 28

SPIKE: algorithmic details

Solving Sx=g (truncated SPIKE)

  • Justified for diagonally dominant systems only.
  • All spike blocks W and V are approximated by their top and bottom parts, respectively.
  • Results in a decoupling of the reduced matrix into (P-1) small independent systems (2K x 2K).

Truncate spike blocks

3/19/2015 University of Wisconsin 28

slide-29
SLIDE 29

Truncated SPIKE as a preconditioner

  • Fundamental idea:
  • Reorder a sparse matrix to obtain a banded matrix with as “heavy” a diagonal as possible.
  • Drop small entries far from the main diagonal in an attempt to produce an even narrower band.
  • Use truncated SPIKE on resulting banded matrix.
  • Sparse matrix reordering
  • Reordering is critical
  • Non-zeroes can spread while we prefer them to gather around diagonals.
  • Both truncated SPIKE and BiCGStab(2) prefer diagonal elements with large absolute values.
  • Reordering strategies
  • Use row permutations to maximize product of absolute diagonal values: A  QA
  • Apply symmetric RCM for bandwidth reduction: QA + AT QT  P ( QA + AT QT ) PT

3/19/2015 University of Wisconsin 29

slide-30
SLIDE 30

Numerical Results

3/19/2015 University of Wisconsin 30

slide-31
SLIDE 31

Results: Preconditioned PDIP (P-PDIP)

  • Adding preconditioning to the search direction computation drastically improves

computation time

3/19/2015 University of Wisconsin 31

slide-32
SLIDE 32

Results: Effect of Problem Size

  • A series of simulations on filling models of increasing size were performed to

estimate how the solver performance scales with problem dimension

3/19/2015 University of Wisconsin 32

slide-33
SLIDE 33

Conclusions & Future Work

3/19/2015 University of Wisconsin 33

slide-34
SLIDE 34

Conclusions

  • Interior point methods require much less iterations than gradient

descent methods, but each iteration is much more computationally expensive

  • Preconditioning is responsible for an four-fold reduction in run times

when simulating nonsmooth contact problems

  • Although used with the nonsmooth dynamics, this speed-up is

independent of the specific formalism adopted for the formulation of the equations of motion

3/19/2015 University of Wisconsin 34

slide-35
SLIDE 35

Future Work

  • Investigate improvements to the interior point algorithm
  • Investigate SPIKE update strategies and preconditioner re-use
  • Investigate the effectiveness of spectral reordering methods
  • Understand and gauge the software implementation effort and

simulation efficiency trade-offs related to moving from the GPU to parallel multi-core CPU architectures

3/19/2015 University of Wisconsin 35

slide-36
SLIDE 36

Thank you.

  • Source available for download under BSD-3

http://spikegpu.sbel.org/

  • For all of our animations, please visit

https://vimeo.com/uwsbel

  • For more information about the Simulation-

Based Engineering Laboratory, please visit http://sbel.wisc.edu/

3/19/2015 University of Wisconsin 36

slide-37
SLIDE 37

Thank You.

melanz@wisc.edu Simulation Based Engineering Lab Wisconsin Applied Computing Center

3/19/2015 University of Wisconsin 37