High Performance Multiscale Simulation for Crack Propagation


1. High Performance Multiscale Simulation for Crack Propagation. Guillaume Anciaux, Olivier Coulaud and Jean Roman, ScAlApplix Project. HPSEC 2006, 18th August.

2. Outline. 1. Introduction • Motivations • State of the art. 2. Our Approach • Presentation of the method • Coupling algorithm • Parallel algorithms and implementation. 3. Results • 1D & 2D cases: wave propagation • 2D case: crack. 4. Conclusion.

3. Introduction

4. Introduction: context. Collaboration with the CEA DIF-DPTA (G. Zerah). Goal: produce a tool for the study of impact on laser optics. CEA: 1. better understanding of microscale phenomena; 2. reduce the computing time of molecular dynamics simulations. ScAlApplix: 1. analysis of coupling algorithms; 2. parallel scaling on HPC systems; 3. development of a framework for multiscale computation; 4. genericity: use of legacy codes.

5. Introduction: atomistic approach. Simulation with a molecular dynamics tool: $M \frac{d^2 x(t)}{dt^2} = -\nabla V_M(x(t))$, where $V_M$ is an empirical potential (one-, two- or three-body interactions). Fine description of the studied system: all structural phenomena are captured. Teams: A. Nakano (Louisiana State University), H. Gao (Max Planck Institute for Metals Research), ...
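To make the atomistic side concrete, here is a minimal velocity-Verlet integration step in C++ (the language of the simulator); the Atom structure and the force functor are illustrative stand-ins for this sketch, not the interfaces of the MD codes actually used.

```cpp
#include <vector>

// Hypothetical minimal atom container (not the data layout of the real MD codes).
struct Atom { double x[3], v[3], f[3], m; };

// One velocity-Verlet step: advance x(t) and v(t) using forces f = -grad V_M(x).
// computeForces() is assumed to evaluate the empirical potential (e.g. Lennard-Jones).
template <class ForceFunctor>
void velocityVerletStep(std::vector<Atom>& atoms, double dt, ForceFunctor computeForces) {
  for (Atom& a : atoms)                         // half kick + drift
    for (int k = 0; k < 3; ++k) {
      a.v[k] += 0.5 * dt * a.f[k] / a.m;
      a.x[k] += dt * a.v[k];
    }
  computeForces(atoms);                         // f_i = -dV_M/dx_i at the new positions
  for (Atom& a : atoms)                         // second half kick
    for (int k = 0; k < 3; ++k)
      a.v[k] += 0.5 * dt * a.f[k] / a.m;
}
```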

6. Introduction: continuum approach. Elastodynamics on a domain $\Omega$, Lagrangian formulation with displacement field $y(x,t)$: $\rho \ddot{y} = \mathrm{div}\, T + b$, with a constitutive law for the material $T = T(\nabla y)$. Discretization: P1 finite elements; variational problem (energy minimization). Allows handling huge objects with a well-known technology.

7. Difficulties and limitations of the methods.
MD limitations: 1. Space scale: 100 nm³ of silicon crystal contains nearly a billion atoms. 2. Time scale: the time step is a femtosecond (10⁻¹⁵ s). 3. Boundary conditions (periodic). 4. Huge data volumes: ~1 Terabyte per step.
FE limitations: 1. Discontinuities (crack), [[y]] ≠ 0: the constitutive law is no longer valid near the fracture, so the model needs to be extended (XFEM, ...). 2. A fine mesh is needed to capture the pertinent information.

8. Introduction: Multiscale approach (1). Idea: use the advantages of both models. Continuum model: reduce the size of the domain; take into account complex boundary conditions. Discrete model: near the discontinuities. How to couple these two models?

9. Introduction: Multiscale approach (2). Multi-scale approaches: Junction: QC method (Tadmor et al. 1996), static simulations at T = 0; Macroscopic, Atomistic and Ab initio Dynamics (MAAD) (Abraham et al. 1998). Bridging (duplication of the data): Bridging Method (T. Belytschko); Bridging Scale Method (Liu).

10. Introduction: Bridging approach. Numerical difficulties: avoiding the requirement that the mesh size h equal the inter-atomic distance; different time/length scales, which lead to mechanical wave reflections. Algorithmic difficulties: 1. need for smart data handling at the interface; 2. efficient computation of the FE shape functions in the overlapping zone; 3. domain decomposition; 4. load balancing. Problems 3 and 4 are tied together.

11. Our approach

12. Discrete/continuum coupling. The Bridging Method, introduced by T. Belytschko & S. Xiao. Idea: impose equivalent displacements on the atom sites; introduce Lagrange multipliers; weight the computation of the multipliers (parameter α) to keep the predominance of each model.

13. Discrete/continuum coupling. The constraint: $g_i = y(X_i) - x_i = 0 \;\; \forall i$, enforced on the velocities through the linear system $A\lambda = rhs$, with
$A_{i,j} = \Delta t \left[ \sum_J \frac{\varphi_J(X_i)\,\varphi_J(X_j)}{\alpha_J M_J} + \frac{\delta_{i,j}}{(1-\alpha_i)\, m_i} \right]$ and $rhs_i = \sum_J \varphi_J(X_i)\,\dot{y}_J - \dot{x}_i$.
$A$ is condensed on its diagonal: $A_i = \sum_j A_{i,j}$. Then the correction of the velocities:
$\dot{y}_I \leftarrow \dot{y}_I - \frac{\Delta t}{\alpha_I M_I} \sum_j \lambda_j\,\varphi_I(X_j)$, $\qquad \dot{x}_i \leftarrow \dot{x}_i + \frac{\Delta t}{(1-\alpha_i)\, m_i}\,\lambda_i$.
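A minimal serial sketch of the lumped constraint solve and velocity correction written out above, in C++; the BridgingAtom/FENode containers and the way the shape values are stored are assumptions for the example, not the simulator's actual data structures.

```cpp
#include <vector>
#include <cstddef>

// One bridging-zone atom i: weight alpha_i, mass m_i, one velocity component, and
// the FE nodes J whose shape functions phi_J are nonzero at its site X_i.
struct BridgingAtom {
  double alpha, m, v;            // 1D velocity for brevity; loop per component in 2D/3D
  std::vector<std::size_t> node; // indices J of the supporting FE nodes
  std::vector<double> phi;       // precomputed phi_J(X_i), same length as node
};

struct FENode { double alpha, M, v; };

// Lumped (row-sum) Lagrange-multiplier solve and velocity correction.
void correctVelocities(std::vector<BridgingAtom>& atoms,
                       std::vector<FENode>& nodes, double dt) {
  // Pre-pass for the row-sum lumping: S_J = sum over bridging atoms j of phi_J(X_j).
  std::vector<double> S(nodes.size(), 0.0);
  for (const BridgingAtom& a : atoms)
    for (std::size_t k = 0; k < a.node.size(); ++k)
      S[a.node[k]] += a.phi[k];

  std::vector<double> lambda(atoms.size());
  for (std::size_t i = 0; i < atoms.size(); ++i) {
    const BridgingAtom& a = atoms[i];
    double rhs = -a.v;                           // rhs_i = sum_J phi_J(X_i) v_J - v_i
    double Ai  = 1.0 / ((1.0 - a.alpha) * a.m);  // A_i = sum_j A_{i,j}, dt factored out
    for (std::size_t k = 0; k < a.node.size(); ++k) {
      const FENode& n = nodes[a.node[k]];
      rhs += a.phi[k] * n.v;
      Ai  += a.phi[k] * S[a.node[k]] / (n.alpha * n.M);
    }
    lambda[i] = rhs / (dt * Ai);                 // lambda_i = rhs_i / A_i
  }

  // Corrections: FE nodes receive -dt * lambda_j * phi_I(X_j) / (alpha_I M_I),
  // atoms receive +dt * lambda_i / ((1 - alpha_i) m_i).
  for (std::size_t i = 0; i < atoms.size(); ++i) {
    BridgingAtom& a = atoms[i];
    for (std::size_t k = 0; k < a.node.size(); ++k) {
      FENode& n = nodes[a.node[k]];
      n.v -= dt * lambda[i] * a.phi[k] / (n.alpha * n.M);
    }
    a.v += dt * lambda[i] / ((1.0 - a.alpha) * a.m);
  }
}
```

The diagonal condensation keeps each multiplier local to one atom site, so no global linear solve spanning the two codes is needed at every time step.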

14. Solution to the algorithmic difficulties. We need to identify the atoms lying in a given element. A naive double loop costs O(N_atoms × N_elements). Introduction of a grid: place atoms and elements in the grid, then map atoms to elements box by box. The complexity drops to O(N_atoms × N_box-elements), as sketched below.
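A possible sketch of that grid-based mapping, assuming a uniform grid, registration of elements into every cell their bounding box overlaps, and a pointInElement() predicate supplied by the FE code; names and layout are illustrative only.

```cpp
#include <algorithm>
#include <vector>
#include <cstddef>

struct Point { double x, y; };

// Uniform background grid covering the bridging zone: each cell stores the indices
// of the elements whose bounding box overlaps that cell.
struct Grid {
  Point origin;
  double h;        // cell size
  int nx, ny;      // number of cells in each direction
  std::vector<std::vector<std::size_t>> cellElems;  // size nx*ny

  int cellOf(const Point& p) const {
    int ix = std::clamp(static_cast<int>((p.x - origin.x) / h), 0, nx - 1);
    int iy = std::clamp(static_cast<int>((p.y - origin.y) / h), 0, ny - 1);
    return iy * nx + ix;
  }
};

// Map each atom to the element that contains it. Only the elements registered in the
// atom's cell are tested, so the cost is O(N_atoms * N_box-elements) rather than
// O(N_atoms * N_elements). pointInElement() is assumed to come from the FE side.
std::vector<long> mapAtomsToElements(
    const std::vector<Point>& atoms, const Grid& grid,
    bool (*pointInElement)(std::size_t elem, const Point& p)) {
  std::vector<long> owner(atoms.size(), -1);  // -1: atom not inside any bridging element
  for (std::size_t i = 0; i < atoms.size(); ++i) {
    for (std::size_t e : grid.cellElems[grid.cellOf(atoms[i])]) {
      if (pointInElement(e, atoms[i])) { owner[i] = static_cast<long>(e); break; }
    }
  }
  return owner;
}
```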

15. Initialization of the bridging zone (2). Pre-computation of the shape functions at all atom sites. The shape values are stored in an appropriate data structure which, through the atom/element mapping, gives constant-time access for any given atom site.
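One way such a structure could look, assuming the per-element shape evaluation routine shapeValues() is provided by the FE library; this is a sketch, not the actual implementation.

```cpp
#include <vector>
#include <cstddef>

// Precomputed shape data for one atom site: the element that contains it,
// the element's node indices and the values phi_J(X_i).
struct ShapeEntry {
  std::size_t element = 0;
  std::vector<std::size_t> nodes;
  std::vector<double> phi;
};

// Indexed by the atom's local index in the bridging zone, so a lookup during the
// coupling step is a single vector access (constant time).
using ShapeCache = std::vector<ShapeEntry>;

// Fill the cache once at initialization, using the atom -> element mapping built
// with the grid (owner[i] < 0 means the atom lies outside the bridging zone).
ShapeCache buildShapeCache(
    const std::vector<long>& owner, std::size_t nAtoms,
    ShapeEntry (*shapeValues)(std::size_t elem, std::size_t atom)) {
  ShapeCache cache(nAtoms);
  for (std::size_t i = 0; i < nAtoms; ++i)
    if (owner[i] >= 0)
      cache[i] = shapeValues(static_cast<std::size_t>(owner[i]), i);
  return cache;
}
```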

16. Mapping the codes to processors. Strategy: distinct processor sets for each model. (Figure: a molecular dynamics set and a continuum mechanics set, with the MD weighting, the FE weighting and the coupling interaction between them.)

17. Diagram for the coupling model of the parallel codes (SPMD). Two columns run side by side: parallel molecular dynamics and parallel continuum elasticity. Steps: (a/b) initialization of each code and of the bridging zone (times T_i); then, for each time step: (1a/1b) position update (T_s), (2a/2b) force computation (T_s), computation of the Lagrange multipliers (T_c), (3a/3b) velocity update (T_s); loop until T = T_max.
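A schematic C++/MPI version of that loop, assuming the two processor sets are obtained with MPI_Comm_split and that the per-step routines are empty stubs standing in for the MD and FE codes.

```cpp
#include <mpi.h>

// Stubs standing in for the real MD and FE codes.
void initializeCode(bool /*md*/)                       { /* read input, build data      */ }
void initializeBridgingZone(MPI_Comm, bool /*md*/)     { /* grid + shape precomputation  */ }
void positionUpdate(bool /*md*/)                       { /* step 1a / 1b                 */ }
void forceComputation(bool /*md*/)                     { /* step 2a / 2b                 */ }
void computeLagrangeMultipliers(MPI_Comm, bool /*md*/) { /* coupling exchange + solve    */ }
void velocityUpdate(bool /*md*/)                       { /* step 3a / 3b                 */ }

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  int rank = 0, size = 1;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  // Split the processors into an MD set and an FE set (e.g. 7 MD + 1 FE in the 2D run).
  const int  nMD  = size > 1 ? size - 1 : 1;
  const bool isMD = rank < nMD;
  MPI_Comm setComm;  // communicator of my own set (used inside the real codes)
  MPI_Comm_split(MPI_COMM_WORLD, isMD ? 0 : 1, rank, &setComm);

  initializeCode(isMD);                                // a / b   (T_i)
  initializeBridgingZone(MPI_COMM_WORLD, isMD);        //         (T_i)

  const int nSteps = 100;                              // until T = T_max
  for (int step = 0; step < nSteps; ++step) {
    positionUpdate(isMD);                              // 1a / 1b (T_s)
    forceComputation(isMD);                            // 2a / 2b (T_s)
    computeLagrangeMultipliers(MPI_COMM_WORLD, isMD);  //         (T_c)
    velocityUpdate(isMD);                              // 3a / 3b (T_s)
  }

  MPI_Comm_free(&setComm);
  MPI_Finalize();
  return 0;
}
```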

18. Details on the computation of the Lagrange multipliers (each step has cost $T_c$, run by both codes):
1a/1b. Computation of the RHS contributions: the MD side contributes $-\dot{x}_i$, the FE side $\sum_J \varphi_J(X_i)\,\dot{y}_J$.
2. Summing the contributions: $rhs_i = \sum_J \varphi_J(X_i)\,\dot{y}_J - \dot{x}_i$.
3. Solving the condensed constraint system on each side: $\lambda_i = rhs_i / A_i$.
4a/4b. Correcting the velocities: $\dot{x}_i^{new} = \dot{x}_i + \frac{\Delta t}{(1-\alpha_i)\,m_i}\,\lambda_i$ on the parallel molecular dynamics side, $\dot{y}_I^{new} = \dot{y}_I - \frac{\Delta t}{\alpha_I M_I}\sum_j \lambda_j\,\varphi_I(X_j)$ on the parallel continuum elasticity side.

19. Constraint system data redistribution. To illustrate, we consider the following distribution over the processors: an atomic zone, a bridging zone and a continuum zone.

20. Constraint system data redistribution. When the two models are mapped onto two distinct sets of processors, both sets own a parallel vector of the Lagrangian unknowns.
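One simple way such a shared vector could be kept coherent is to let every contributing processor fill its local entries and then sum them over a communicator spanning both sets; the sketch below illustrates this with a plain MPI_Allreduce, which is an assumption for the example rather than the redistribution scheme actually implemented.

```cpp
#include <mpi.h>
#include <vector>
#include <cstddef>

// Parallel vector of the Lagrangian unknowns over the bridging zone.
// Every processor of both sets holds the full bridging-zone index range and
// fills in only the entries it contributes to; an all-reduce sums them.
struct BridgingVector {
  std::vector<double> value;  // one entry per bridging-zone atom (local contributions)
  MPI_Comm comm;              // communicator spanning the processors of both sets

  BridgingVector(std::size_t n, MPI_Comm c) : value(n, 0.0), comm(c) {}

  // Make the vector coherent on every processor of the two sets.
  void assemble() {
    MPI_Allreduce(MPI_IN_PLACE, value.data(), static_cast<int>(value.size()),
                  MPI_DOUBLE, MPI_SUM, comm);
  }
};
```

For large bridging zones this global reduction becomes a bottleneck, which is exactly why the talk introduces a dedicated redistribution scheme restricted to the processors that actually own bridging-zone data.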

21. Dynamical effects: atom migration management (1). Coherency of the redistribution scheme is maintained by a protocol involving the processors of the bridging zone.

22. Dynamical effects: atom migration management (2). Coherency of the redistribution scheme is maintained by a protocol involving communication between the two sets: on top of the standard MD atom migration, the new owner is announced and the new position is sent.
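A minimal illustration of such a migration notification, assuming a plain point-to-point MPI message; the MigrationMsg layout, the destination rank and the tag are invented for the example.

```cpp
#include <mpi.h>

// Invented message layout: global atom id, new owner rank, new position.
struct MigrationMsg {
  long atomId;
  int  newOwner;
  double pos[3];
};

enum { MIGRATION_TAG = 100 };  // tag chosen arbitrarily for the example

// Sender side (old owner in the MD set): announce that 'msg.atomId' now lives on
// 'msg.newOwner' and forward its position, so the bridging-zone bookkeeping
// (shape cache, multiplier vector) can be updated on the other set.
void notifyMigration(MigrationMsg msg, int couplingRank, MPI_Comm world) {
  MPI_Send(&msg, sizeof(MigrationMsg), MPI_BYTE,
           couplingRank, MIGRATION_TAG, world);
}

// Receiver side (a processor of the bridging zone, on either set).
MigrationMsg receiveMigration(int sourceRank, MPI_Comm world) {
  MigrationMsg msg;
  MPI_Recv(&msg, sizeof(MigrationMsg), MPI_BYTE, sourceRank, MIGRATION_TAG,
           world, MPI_STATUS_IGNORE);
  return msg;
}
```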

23. Results

24. Wave reflections on the 1D model. Wave reflections are caused by: the reduction of the number of degrees of freedom; the overlap-zone impedance, which depends on the wave frequencies and the wave phase. (Test setup: 400 atoms, 40 elements.)

25. Reproduction of the absorption results. (Figure: the initial condition, then the wave after crossing the overlap with an unadapted and with an adapted impedance.) The impedance depends on: the overlapping size; the element size; the settings of the projection.

26. 2D model: wave propagation. Domain: 580 Å × 150 Å. Some numbers: atoms: 53 743 (Lennard-Jones); finite elements: 2 749 nodes, 5 249 elements; overlap zone: 29 868 atoms (~55%), 523 nodes (19%), 891 elements (~17%). Simulation: 7 + 1 processors; 20 000 time steps; 358 ms per time step; time distribution: 62% atoms, 34% elasticity, 4% coupling.

27. 2D example: crack propagation. Box of 600 nm × 800 nm. Numbers: 91 556 atoms (Lennard-Jones); 1 129 nodes, 2 082 elements. Overlapping zone: 45 732 atoms; 912 nodes and 1 582 elements. Crack: ellipse of 50 Å × 1 Å.

28. Computational time repartition for a 2D sequential simulation. (Pie chart with six slices: the atom part and the elasticity part dominate at 57.06% and 30.29%; the coupling tasks, building the RHS, correcting, solving the constraint and the surface effect, share the remaining 7.45%, 4.68%, 0.37% and 0.15%.) Deduced optimal mapping: 20 MD / 16 FE processors.

29. Simulation times of the different tasks on 36 processors for 100 simulation time steps. (Plot: time in seconds, 0 to 4 s, of Ts(c), Ts(a) and Tc versus the number of processors assigned to molecular dynamics: 4, 8, 16, 20, 24, 32.)

30. Domain decomposition issues. (Figures: comparison of 2x8 vs 2x10 and of 4x5 vs 2x8 decompositions.)

31. Overhead time due to coupling on 36 processors during 100 timesteps. (Plot: time in seconds, 0 to 3 s, of Tc(1a,1b), Tc(2), Tc(3) and Tc(4a,4b) versus the number of processors assigned to molecular dynamics: 4, 8, 16, 20, 24, 32.)

32. 3D case of real size: under construction. 2 662 400 atoms; 36 450 elements; 332 800 atoms and 8 100 elements in the overlapping zone.

33. Conclusion

34. Conclusion. First version of the simulator: better understanding of the multi-scale problems; wave reflection management. Results in 2D: waves; crack. Next steps: going to 3D simulations; enhancing the parallelism (the molecular dynamics codes are the limitation for the 3D cases); domain decomposition in boxes; taking cost functions into account to map tasks to processors.

35. Our simulator. Based on the T. Belytschko model; 1D, 2D and 3D tests; parallel version based on the MPI communication paradigm; C++ code. Interfaced with: finite elements: libMesh; molecular dynamics: Stamp (CEA), LAMMPS (Sandia); visualization and steering: EPSN (ScAlApplix, INRIA).
