Chrono::SPIKE – A Nonsmooth Contact Dynamics Framework on the GPU


  1. GPU Technology Conference, S5400: Chrono::SPIKE – A Nonsmooth Contact Dynamics Framework on the GPU
     Daniel Melanz, Luning Fang, Ang Li, Hammad Mazhar, Radu Serban, Dan Negrut
     Simulation-Based Engineering Laboratory, University of Wisconsin-Madison

  2. Overview
     1) Nonsmooth Contact Dynamics
     2) Quadratic Optimization w/ Conic Constraints
     3) Preconditioning with SPIKE
     4) Numerical Results
     5) Conclusions & Future Work
     3/19/2015, University of Wisconsin

  3. Nonsmooth Contact Dynamics

  4. Nonsmooth Dynamics

  5. Nonsmooth Dynamics: Frictionless Case
     The Signorini conditions (after Antonio Signorini; Tonge, 2012):
     • Every relative velocity should be zero or separating.
     • Every contact impulse should be non-attractive.
     • No impulse at separating contacts.

  6. Nonsmooth Dynamics: Frictionless Case
     The Signorini conditions (Tonge, 2012): the three conditions can be written compactly in a single line of math.
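
The equations on slides 5 and 6 did not survive in this transcript. As a hedged reconstruction, the three verbal conditions correspond to the usual velocity-level complementarity statement. The symbols are assumptions introduced here, not the slides' notation: $v_n$ is the normal relative velocity at an active contact and $\gamma_n$ is the normal contact impulse.

```latex
% Assumed notation: v_n = normal relative velocity, \gamma_n = normal contact impulse
v_n \ge 0, \qquad \gamma_n \ge 0, \qquad \gamma_n \, v_n = 0
\quad\Longleftrightarrow\quad
0 \le \gamma_n \;\perp\; v_n \ge 0
```

The left-hand side lists the three conditions in the order given on slide 5; the right-hand side is the one-line complementarity form of the kind slide 6 refers to.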

  7. Nonsmooth Dynamics: Frictionless Case
     The final model can be expressed by these equations (Tonge, 2012):

  8. Nonsmooth Dynamics: Friction Case (Stewart and Trinkle, 1996)

  9. Nonsmooth Dynamics: Friction Case (Anitescu and Hart, 2004)

  10. Nonsmooth Dynamics: The Cone Complementarity Problem (CCP) where

  11. Nonsmooth Dynamics: The Quadratic Programming Angle…
      • The CCP captures the first-order optimality condition for a quadratic optimization problem with conic constraints:
      • Notation used:
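
The optimization problem itself is missing from the transcript. A sketch of the form such a cone-constrained quadratic program usually takes in the CCP literature is given below. All symbols are assumptions made for illustration ($N$ a symmetric positive semidefinite matrix, $r$ a vector, $\gamma_i = (\gamma_{i,n}, \gamma_{i,u}, \gamma_{i,w})$ the impulse at contact $i$, $\mu_i$ its friction coefficient, $N_c$ the number of contacts); the slide's own notation may differ.

```latex
% Assumed notation; not taken from the slide
\min_{\gamma} \; q(\gamma) = \tfrac{1}{2}\, \gamma^{T} N \gamma + r^{T} \gamma
\qquad \text{subject to} \qquad
\gamma_i \in \Upsilon_i, \quad i = 1, \dots, N_c,
\\[4pt]
\Upsilon_i = \left\{ (\gamma_{i,n}, \gamma_{i,u}, \gamma_{i,w}) \;:\;
\sqrt{\gamma_{i,u}^{2} + \gamma_{i,w}^{2}} \le \mu_i \, \gamma_{i,n} \right\}
```

Each $\Upsilon_i$ is a second-order (friction) cone, which is what makes the constraints conic rather than linear.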

  12. Quadratic Optimization w/ Conic Constraints (CCQO’s)

  13. CCQO’s: First Order Methods
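
The content of this slide is not preserved. As a minimal sketch of what a first-order method looks like for a cone-constrained quadratic problem, the code below runs plain projected gradient descent on a single contact. It is not the APGD solver named in the results slides (which adds acceleration); the function names, the matrix `N`, the vector `r`, and the friction coefficient `mu` are all hypothetical values chosen for illustration.

```python
import math


def project_onto_friction_cone(gamma, mu):
    """Euclidean projection of (n, u, w) onto the cone sqrt(u^2 + w^2) <= mu * n."""
    n, u, w = gamma
    t = math.hypot(u, w)            # tangential magnitude
    if t <= mu * n:                 # already inside the cone
        return [n, u, w]
    if mu * t <= -n:                # inside the polar cone: projects to the apex
        return [0.0, 0.0, 0.0]
    n_proj = (n + mu * t) / (1.0 + mu * mu)   # closest point on the cone surface
    scale = mu * n_proj / t
    return [n_proj, u * scale, w * scale]


def projected_gradient(N, r, mu, iterations=200):
    """Minimize 0.5*g^T N g + r^T g over one friction cone (hypothetical helper)."""
    size = len(r)
    # Step 1/L, with L bounded by the largest absolute row sum of N.
    step = 1.0 / max(sum(abs(v) for v in row) for row in N)
    gamma = [0.0] * size
    for _ in range(iterations):
        grad = [sum(N[i][j] * gamma[j] for j in range(size)) + r[i]
                for i in range(size)]
        trial = [gamma[i] - step * grad[i] for i in range(size)]
        gamma = project_onto_friction_cone(trial, mu)
    return gamma


# Illustrative data (assumed): the unconstrained minimizer (0.5, 2, 0) violates the cone.
N = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
r = [-1.0, -2.0, 0.0]
gamma = projected_gradient(N, r, mu=0.5)
print(gamma)
```

Each iteration costs one matrix-vector product and one projection per contact, which is why first-order methods have cheap iterations but may need many of them.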

  14. CCQO’s: Second Order Methods
      • Original problem:
      • Reformulation via an indicator function (defined piecewise):
      • Approximation via logarithmic barrier:
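
The three formulas named on this slide are missing from the transcript. The generic textbook pattern behind these three steps is sketched below, under assumed notation: $f$ is the objective, $c_i(x) \le 0$ are the inequality constraints, and $t > 0$ is the barrier parameter. The slide's actual constraint functions and symbols are not known.

```latex
% Generic barrier-method pattern; symbols are assumptions, not the slide's notation
\text{Original problem:} \quad
\min_{x} \; f(x) \quad \text{subject to} \quad c_i(x) \le 0, \; i = 1, \dots, m
\\[6pt]
\text{Indicator reformulation:} \quad
\min_{x} \; f(x) + \sum_{i=1}^{m} I\big(c_i(x)\big),
\qquad
I(u) = \begin{cases} 0, & u \le 0 \\ \infty, & \text{otherwise} \end{cases}
\\[6pt]
\text{Logarithmic barrier:} \quad
\min_{x} \; f(x) - \frac{1}{t} \sum_{i=1}^{m} \log\big(-c_i(x)\big)
```

As $t$ grows, the smooth barrier term approaches the indicator function, so the barrier problem approaches the original one while remaining differentiable inside the feasible region.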

  15. Interior Point

  16. Numerical Results

  17. Results: Physical Model
      • Several numerical experiments were performed using a model of spheres falling into a bucket.

  18. Results: Comparison of Solver Results
      • The filling model was simulated for 3 seconds with a step size of $h = 10^{-3}$ seconds using the APGD and PDIP solvers.
      [Figure: two panels (PDIP, APGD) plotting Weight [N] against Time [s]; legend entries 1e-1, 1e-2, 1e-3, 1e-4, 1e-5]

  19. Results: Comparison of Solver Iterations
      • The filling model was simulated for 3 seconds with a step size of $h = 10^{-3}$ seconds using the APGD and PDIP solvers.
      [Figure: two panels (PDIP, APGD) plotting Iterations [#] against Time [s]; legend entries 1e-1, 1e-2, 1e-3, 1e-4, 1e-5]

  20. Results: Comparison of Solver Execution Time
      • The filling model was simulated for 3 seconds with a step size of $h = 10^{-3}$ seconds using the APGD and PDIP solvers.
      [Figure: two panels, PDIP and APGD]

  21. Results: Comparison of Solvers
      • The filling model was simulated for 3 seconds with a step size of $h = 10^{-3}$ seconds using the APGD and PDIP solvers.
      [Figure: Iterations [#] against Time [s] for PDIP and APGD; legend entries 1e-1, 1e-2, 1e-3, 1e-4, 1e-5]

  22. Preconditioning with SPIKE

  23. The SPIKE algorithm
      • SPIKE: a divide-and-conquer approach to solving banded dense systems.
      • Proposed by A. H. Sameh and D. J. Kuck in 1978 (see also E. Polizzi and A. H. Sameh, Parallel Computing 32(2), 2006).
      • Basic idea:
          • Partition the matrix A.
          • Factorize A to isolate independent blocks.
          • Solve a reduced system to account for coupling information.
          • Recover the solution of the original system.
      • SPIKE comes in two main flavors:
          • Full SPIKE: recursively solve an exact reduced system (direct solver for banded matrices).
          • Truncated SPIKE: solve an approximate reduced system in one step (needs iterative refinement).
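
The four steps of the basic idea can be sketched on the smallest possible case: a tridiagonal system (half-bandwidth 1) split into two partitions, solved serially in pure Python. This is an illustrative sketch only, not the GPU implementation from the talk; the function names and the test matrix are assumptions.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for one block; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def spike_two_partitions(lower, diag, upper, rhs):
    """Solve a tridiagonal system with a two-partition SPIKE decomposition."""
    n = len(diag)
    m = n // 2
    # 1) Partition: two diagonal blocks plus one coupling entry on each side.
    block1 = (lower[:m], diag[:m], upper[:m])
    block2 = (lower[m:], diag[m:], upper[m:])
    B = upper[m - 1]   # couples the last row of block 1 to the first unknown of block 2
    C = lower[m]       # couples the first row of block 2 to the last unknown of block 1
    # 2) Independent block solves: modified right-hand side g and the spikes V, W.
    g1 = solve_tridiagonal(*block1, rhs[:m])
    g2 = solve_tridiagonal(*block2, rhs[m:])
    V = solve_tridiagonal(*block1, [0.0] * (m - 1) + [B])
    W = solve_tridiagonal(*block2, [C] + [0.0] * (n - m - 1))
    # 3) Reduced 2x2 system for the two unknowns next to the partition interface.
    det = 1.0 - V[-1] * W[0]
    x1_last = (g1[-1] - V[-1] * g2[0]) / det
    x2_first = (g2[0] - W[0] * g1[-1]) / det
    # 4) Recover the full solution from the interface values.
    x1 = [g - v * x2_first for g, v in zip(g1, V)]
    x2 = [g - w * x1_last for g, w in zip(g2, W)]
    return x1 + x2


# Illustrative 8x8 system (assumed data): diagonal 4, off-diagonals -1.
lower = [0.0] + [-1.0] * 7
diag = [4.0] * 8
upper = [-1.0] * 7 + [0.0]
rhs = [float(k) for k in range(1, 9)]
x_spike = spike_two_partitions(lower, diag, upper, rhs)
x_direct = solve_tridiagonal(lower, diag, upper, rhs)
print(max(abs(a - b) for a, b in zip(x_spike, x_direct)))
```

The four block solves in step 2 are independent of each other, which is the property that lets the partitions be processed in parallel.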

  24. SPIKE: algorithmic details (Partitioning and Factorization)
      • Partition A and factorize it as A = DS, where D is a block-diagonal matrix and S is the spike matrix.

  25. SPIKE: algorithmic details (Solving Dg = b)
      • Reduced to solving P independent (banded dense) linear systems.
      • Map these systems to P blocks on the GPU.
      • Apply classical LU (or UL) methods to each sub-system.

  26. SPIKE factorization in plain math
      • The right ($V_i$) and left ($W_i$) spike blocks can be obtained through the solution of P independent multiple-RHS banded linear systems.
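
The formulas on this slide are not preserved. In the standard presentation of SPIKE, the banded systems that define the spike blocks look as follows. The symbols $A_i$ (the $i$-th diagonal block), $B_i$ and $C_i$ (the $K \times K$ super- and sub-diagonal coupling blocks, with $K$ the half-bandwidth) are assumed names; only $V_i$ and $W_i$ appear in the transcript.

```latex
% Assumed notation: A_i diagonal block, B_i / C_i coupling blocks of size K x K
A_i V_i = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ B_i \end{bmatrix},
\qquad
A_i W_i = \begin{bmatrix} C_i \\ 0 \\ \vdots \\ 0 \end{bmatrix}
```

Each right-hand side has $K$ columns, hence "multiple-RHS", and the systems for different partitions $i$ share no data, so they can be solved independently.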

  27. SPIKE: algorithmic details (Solving Sx = g, full SPIKE)
      • Combine all coupling blocks into a reduced matrix.
      • (Recursively) solve the reduced system.
      • Recover the solution from the reduced solution.
      [Figure label: Combine coupling blocks]

  28. SPIKE: algorithmic details (Solving Sx = g, truncated SPIKE)
      • Justified for diagonally dominant systems only.
      • All spike blocks W and V are approximated by their top and bottom parts, respectively.
      • Results in a decoupling of the reduced matrix into $P-1$ small independent systems of size $2K \times 2K$.
      [Figure label: Truncate spike blocks]
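
A small numerical check can illustrate why the truncation is tied to diagonal dominance: for a diagonally dominant block, the entries of a spike decay quickly away from the coupling, so only the part nearest the interface matters. The sketch below computes one right spike for a tridiagonal block; the block size, matrix entries, and function name are assumptions chosen for illustration.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for one block; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Hypothetical diagonally dominant block: diagonal 4, off-diagonals -1, size 16.
m = 16
lower = [0.0] + [-1.0] * (m - 1)
diag = [4.0] * m
upper = [-1.0] * (m - 1) + [0.0]
B = -1.0  # coupling entry to the next partition (assumed value)

# Right spike V: solve A_1 V = (0, ..., 0, B)^T.
V = solve_tridiagonal(lower, diag, upper, [0.0] * (m - 1) + [B])

print("bottom entry of V:", V[-1])
print("top entry of V:   ", V[0])
```

The top entry of `V` is many orders of magnitude smaller than the bottom entry, so dropping it changes the reduced system very little. Without diagonal dominance this decay is not guaranteed, which matches the slide's caveat.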

  29. Truncated SPIKE as a preconditioner
      • Fundamental idea:
          • Reorder a sparse matrix to obtain a banded matrix with as “heavy” a diagonal as possible.
          • Drop small entries far from the main diagonal in an attempt to produce an even narrower band.
          • Use truncated SPIKE on the resulting banded matrix.
      • Sparse matrix reordering:
          • Reordering is critical.
          • Non-zeroes can spread, while we prefer them to gather around the diagonal.
          • Both truncated SPIKE and BiCGStab(2) prefer diagonal elements with large absolute values.
      • Reordering strategies:
          • Use row permutations to maximize the product of absolute diagonal values: $A \to QA$
          • Apply symmetric RCM for bandwidth reduction: $QA + A^T Q^T \to P (QA + A^T Q^T) P^T$
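
The first reordering strategy can be illustrated on a tiny matrix by brute force. The sketch below tries every row permutation and keeps the one that maximizes the product of absolute diagonal values. This exhaustive search is only for illustration and is an assumption of this example; practical codes solve the same problem with a weighted bipartite matching, since trying all permutations is infeasible beyond a handful of rows. The function name and the matrix are hypothetical.

```python
from itertools import permutations


def best_row_permutation(A):
    """Return (perm, product) where row i of QA is row perm[i] of A.

    Exhaustive search, usable only for very small matrices.
    """
    n = len(A)
    best_perm, best_product = None, -1.0
    for perm in permutations(range(n)):
        product = 1.0
        for i in range(n):
            product *= abs(A[perm[i]][i])   # diagonal entry of the permuted matrix
        if product > best_product:
            best_perm, best_product = perm, product
    return best_perm, best_product


# Illustrative matrix (assumed data) with zeros on its diagonal.
A = [[0.0, 2.0, 0.0],
     [3.0, 0.0, 1.0],
     [0.0, 1.0, 5.0]]

perm, product = best_row_permutation(A)
QA = [A[p] for p in perm]
print(perm, product)
for row in QA:
    print(row)
```

After this step the diagonal of `QA` has no zeros, which is the "heavy diagonal" the slide asks for; the symmetric RCM pass would then be applied to the pattern of $QA + A^T Q^T$ to reduce bandwidth.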

  30. Numerical Results

  31. Results: Preconditioned PDIP (P-PDIP)
      • Adding preconditioning to the search direction computation drastically reduces computation time.

  32. Results: Effect of Problem Size
      • A series of simulations on filling models of increasing size was performed to estimate how solver performance scales with problem dimension.

  33. Conclusions & Future Work

  34. Conclusions
      • Interior point methods require far fewer iterations than gradient descent methods, but each iteration is much more computationally expensive.
      • Preconditioning is responsible for a four-fold reduction in run times when simulating nonsmooth contact problems.
      • Although used here with nonsmooth dynamics, this speed-up is independent of the specific formalism adopted for formulating the equations of motion.

  35. Future Work
      • Investigate improvements to the interior point algorithm.
      • Investigate SPIKE update strategies and preconditioner re-use.
      • Investigate the effectiveness of spectral reordering methods.
      • Understand and gauge the software implementation effort and simulation efficiency trade-offs related to moving from the GPU to parallel multi-core CPU architectures.

  36. Thank you.
      • Source available for download under BSD-3: http://spikegpu.sbel.org/
      • For all of our animations, please visit https://vimeo.com/uwsbel
      • For more information about the Simulation-Based Engineering Laboratory, please visit http://sbel.wisc.edu/

  37. Thank You.
      melanz@wisc.edu
      Simulation Based Engineering Lab, Wisconsin Applied Computing Center
