High Performance Multiscale Simulation for Crack Propagation


SLIDE 1

High Performance Multiscale Simulation for Crack Propagation

HPSEC 2006 - 18th August

Guillaume Anciaux, Olivier Coulaud and Jean Roman (ScAlApplix Project)

SLIDE 2

Outline

  • 1. Introduction
    – Motivations
    – State of the art
  • 2. Our Approach
    – Presentation of the method
    – Coupling algorithm
    – Parallel algorithms and implementation
  • 3. Results
    – 1D & 2D cases: wave propagation
    – 2D case: crack
  • 4. Conclusion

SLIDE 3

Introduction

SLIDE 4

Introduction: context

Collaboration with the CEA DIF-DPTA (G. Zerah). Goal: produce a tool for:

  • CEA
    – 1. Better understanding of microscale phenomena
    – 2. Reducing the computing time of molecular dynamics simulations
  • ScAlApplix
    – 1. Analysis of coupling algorithms
    – 2. Scalable parallel algorithms (HPC)
    – 3. Development of a framework for multiscale computation
    – 4. Genericity: reuse of legacy codes

Impact on laser optics

SLIDE 5

Introduction: atomistic approach

Simulation with a molecular dynamics tool: an empirical potential (one-, two- or three-body interactions) gives a fine description of the studied system, so all structural phenomena are captured.

Newton's equations of motion, with mass matrix $M$, potential $V$ and atomic positions $x(t)$:

$$M \frac{d^2 x(t)}{dt^2} = -\nabla V(x(t))$$

Teams:

  • A. Nakano (Louisiana State University)
  • H. Gao (Max Planck Institute for Metals Research)

SLIDE 6

Introduction: continuum approach

Elastodynamics:

  • Lagrangian formulation
  • Constitutive law for the material
  • Discretization: P1 finite elements
  • Variational problem: energy minimization
  • Allows handling very large domains
  • Well-known technology

$$\rho\,\ddot{y} = \operatorname{div}\,T + b, \qquad T = T(\nabla y), \qquad y = y(x, t), \quad x \in \Omega$$
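For completeness (this is the standard form, implied rather than written out on the slide), the P1 discretization solves the variational problem: find $y$ such that, for every test field $w$,

$$\int_\Omega \rho\,\ddot{y}\cdot w\,d\Omega + \int_\Omega T(\nabla y) : \nabla w\,d\Omega = \int_\Omega b\cdot w\,d\Omega.$$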

SLIDE 7

Difficulties and limitations of the methods

MD limitations:

1. Space scale: 100 nm³ of silicon crystal contains nearly a billion atoms.
2. Time scale: the time step is of the order of a femtosecond (10⁻¹⁵ s).
3. Boundary conditions (periodic).
4. Huge data volumes: ~1 terabyte per step.

FE limitations:

1. Discontinuities (crack): the constitutive law is not valid anymore near the fracture, so the model needs to be extended (XFEM, ...); the displacement field jumps across the crack, $[\![\,y\,]\!] \neq 0$.
2. Needs a fine mesh to capture the pertinent information.

SLIDE 8

Introduction: Multiscale approach (1)

Idea: use the advantages of both models

  • Continuum model
    – Reduces the size of the domain.
    – Takes complex boundary conditions into account.
  • Discrete model
    – Near the discontinuities.

How do we couple these two models?

SLIDE 9

Introduction: Multiscale approach (2)

Multiscale approaches:

  • Junction
    – QC method (Tadmor et al., 1996): static simulations at T = 0
    – Macroscopic, Atomistic and Ab initio Dynamics (MAAD) (Abraham et al., 1998)
  • Bridging: duplication of the data
    – Bridging Method (T. Belytschko)
    – Bridging Scale Method (Liu)

SLIDE 10

Introduction: Bridging approach

Numerical difficulties:

  • Avoiding having to reduce the mesh size h to the inter-atomic distance
  • Different time/length scales
    ➢ mechanical wave reflections

Algorithmic difficulties:

  • 1. Need for smart data handling at the interface
  • 2. Efficient computation of the FE shape functions in the overlapping zone
  • 3. Domain decomposition
  • 4. Load balancing

Problems 3 and 4 are tied together.

SLIDE 11

Our approach

SLIDE 12

Discrete/continuum coupling

The Bridging Method, introduced by T. Belytschko & S. Xiao. Idea:

  • Impose equivalent displacements at the atom sites
  • Introduce Lagrange multipliers
  • Weight the computation of the multipliers, through a weight function α, to preserve the predominance of each model in its own zone
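Schematically (notation adapted here; the slide does not spell this out), the method blends the two energies with the weight α and enforces the displacement constraint with the multipliers λ:

$$E = \alpha\,E_{\mathrm{FE}} + (1-\alpha)\,E_{\mathrm{MD}} + \sum_i \lambda_i\,g_i, \qquad g_i = y(X_i) - d_i,$$

where α goes from 1 in the pure continuum zone to 0 in the pure atomistic zone.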

SLIDE 13

Discrete/continuum coupling

The constraint, imposed at every atom site $X_i$ of the overlap:

$$g_i = y(X_i) - d_i = y(X_i) - x_i = 0$$

The multipliers solve the constraint system $A\,\lambda = \mathrm{rhs}$ with:

$$\forall i,\quad \mathrm{rhs}_i = \sum_J \varphi_J(X_i)\,\dot{y}_J - \dot{x}_i, \qquad A_{i,j} = \sum_J \frac{\varphi_J(X_i)\,\varphi_J(X_j)}{\alpha_J M_J} + \frac{\delta_{i,j}}{(1-\alpha_i)\,m_i}, \qquad A_i = \sum_j A_{i,j}$$

$A$ is condensed (lumped) on its diagonal. The correction of the velocities is then:

$$\dot{y}_I \leftarrow \dot{y}_I + \frac{\Delta t}{\alpha_I M_I}\,\sum_j \lambda_j\,\varphi_I(X_j), \qquad \dot{x}_i \leftarrow \dot{x}_i - \frac{\Delta t}{(1-\alpha_i)\,m_i}\,\lambda_i$$
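A minimal serial sketch of this lumped solve and correction step, assuming the shape values φ_J(X_i) have been pre-computed; the data layout and all names are illustrative, not the original code:

```cpp
#include <cstddef>
#include <vector>

// Pre-computed shape value phi of FE node 'node' at one atom site.
struct ShapeEntry { int node; double phi; };

void correct_velocities(
    const std::vector<std::vector<ShapeEntry>>& shapes, // per constrained atom
    const std::vector<double>& A,       // lumped diagonal A_i
    const std::vector<double>& alpha,   // weight at each atom site
    const std::vector<double>& m,       // atomic masses
    const std::vector<double>& M,       // FE lumped nodal masses
    const std::vector<double>& alphaN,  // weight at each FE node
    std::vector<double>& vAtom,         // atomic velocities  (dot x)
    std::vector<double>& vNode,         // nodal velocities   (dot y)
    double dt)
{
    std::size_t n = vAtom.size();
    std::vector<double> lambda(n);
    for (std::size_t i = 0; i < n; ++i) {
        // rhs_i = sum_J phi_J(X_i) * ydot_J - xdot_i
        double rhs = -vAtom[i];
        for (const ShapeEntry& s : shapes[i]) rhs += s.phi * vNode[s.node];
        lambda[i] = rhs / A[i];          // lumped (diagonal) solve
    }
    for (std::size_t i = 0; i < n; ++i) {
        // atomic side: xdot_i -= dt * lambda_i / ((1 - alpha_i) m_i)
        vAtom[i] -= dt * lambda[i] / ((1.0 - alpha[i]) * m[i]);
        // FE side (scatter): ydot_J += dt * phi_J(X_i) * lambda_i / (alpha_J M_J)
        for (const ShapeEntry& s : shapes[i])
            vNode[s.node] += dt * lambda[i] * s.phi / (alphaN[s.node] * M[s.node]);
    }
}
```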

SLIDE 14

Solution to algorithmic difficulties

Need: identify the atoms contained in a given element.

  • Naive double loop: O(Natoms × Nelements)
  • Introduction of a grid:
    – Place atoms and elements in the grid.
    – Map atoms to elements.
  • Complexity becomes O(Natoms × Nbox-elements)
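A sketch of such a grid-based mapping, written in 1D for brevity (the real setting is 2D/3D boxes); the cell size and the point-in-element predicate are illustrative assumptions:

```cpp
#include <cmath>
#include <vector>

// Uniform grid: each cell holds the ids of the atoms and elements it overlaps.
struct Grid {
    double x0, cell;                       // origin and cell size
    std::vector<std::vector<int>> atoms;   // atom ids per cell
    std::vector<std::vector<int>> elems;   // element ids per cell

    int cellOf(double x) const { return int(std::floor((x - x0) / cell)); }
};

// Map each atom to the element containing it, scanning only the elements
// registered in the atom's cell instead of all elements. 'contains' is a
// hypothetical point-in-element test.
std::vector<int> map_atoms_to_elements(
    const Grid& g, const std::vector<double>& x,
    bool (*contains)(int elem, double x))
{
    std::vector<int> owner(x.size(), -1);   // -1: atom outside the mesh
    for (std::size_t i = 0; i < x.size(); ++i) {
        int c = g.cellOf(x[i]);
        for (int e : g.elems[c])
            if (contains(e, x[i])) { owner[i] = e; break; }
    }
    return owner;
}
```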

SLIDE 15

Initialization of the bridging zone (2)

Pre-computation of the shape functions at all atom sites:

  • The shape values are stored in an appropriate data structure
  • Through the atom/element mapping, they are accessed in constant time for any given atom site
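One plausible layout (an assumption for illustration, not the talk's actual structure): store, per atom, the owner element and the shape values of its nodes evaluated once at the atom site, indexed by atom id for O(1) access:

```cpp
#include <array>
#include <vector>

// Per-atom record for a 2D P1 triangle: owner element, its node ids, and
// the shape values evaluated at the atom site during initialization.
struct AtomShape {
    int elem;                    // owner element id
    std::array<int, 3> nodes;    // the element's node ids
    std::array<double, 3> phi;   // shape values at the atom site
};

using ShapeTable = std::vector<AtomShape>;   // indexed by atom id
```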

SLIDE 16

Mapping the codes to processors

Strategy: distinct processor sets for each model.

(Figure: the molecular dynamics processor set and the continuum mechanics processor set, connected by the coupling interaction; the MD and FE weighting zones overlap in the bridging region.)
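In MPI terms such a split can be set up with MPI_Comm_split; this is a sketch (the rank assignment and names are assumptions), using for instance the 20 MD / 16 FE split quoted later in the talk:

```cpp
#include <mpi.h>

// Split MPI_COMM_WORLD into two disjoint processor sets, one running the
// MD code and one the FE code (SPMD style): the first nMD ranks do
// molecular dynamics, the rest continuum elasticity.
int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, nMD = 20;                    // e.g. 20 MD / 16 FE on 36 procs
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int color = (rank < nMD) ? 0 : 1;      // 0 = MD set, 1 = FE set
    MPI_Comm model;                        // communicator of my own set
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &model);

    // ... run the MD or FE solver on 'model'; couple via MPI_COMM_WORLD ...

    MPI_Comm_free(&model);
    MPI_Finalize();
    return 0;
}
```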

SLIDE 17

Diagram for the coupling model of parallel codes (SPMD)

(Flowchart: the two parallel codes run side by side in SPMD fashion. Each side performs its initialization, the bridging zone is initialized, and then both loop over force computation, position update and velocity update until T = Tmax; after the position updates, the Lagrange multipliers are computed jointly and both velocity updates are corrected. The stages carry timing labels T_i^{a,b,c} for the parallel molecular dynamics, T_s^{1a..3b} for the parallel continuum elasticity, and T_c for the coupling.)
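The loop structure can be summarized in pseudo-C++; every function name below is an illustrative placeholder for the corresponding box of the diagram:

```cpp
// Placeholder stages for the diagram's boxes (bodies intentionally empty).
void initialize() {}
void initialize_bridging_zone() {}
void md_forces() {}  void md_positions(double) {}  void md_velocities(double) {}
void fe_forces() {}  void fe_positions(double) {}  void fe_velocities(double) {}
void compute_lagrange_multipliers_and_correct(double) {}

// Coupled SPMD time loop: each processor set advances its own model, then
// both sets jointly compute the multipliers and correct the velocities.
void coupled_run(bool iAmMD, double tMax, double dt) {
    initialize();
    initialize_bridging_zone();      // shape functions, atom/element map
    for (double t = 0.0; t < tMax; t += dt) {
        if (iAmMD) { md_forces(); md_positions(dt); md_velocities(dt); }
        else       { fe_forces(); fe_positions(dt); fe_velocities(dt); }
        compute_lagrange_multipliers_and_correct(dt);
    }
}
```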

SLIDE 18

Details on the computation of the Lagrange Multipliers

The computation proceeds in parallel on the molecular dynamics and continuum elasticity sides:

1. Computation of the RHS contributions ($T_c^{1a}$, $T_c^{1b}$): the MD side provides $-\dot{x}_i$, the FE side provides $\sum_J \varphi_J(X_i)\,\dot{y}_J$.
2. Summing the contributions ($T_c^{2}$): $\mathrm{rhs}_i = \sum_J \varphi_J(X_i)\,\dot{y}_J - \dot{x}_i$.
3. Solving the constraint system on each side ($T_c^{3}$): $\lambda_i = \mathrm{rhs}_i / A_i$.
4. Correcting the velocities ($T_c^{4a}$, $T_c^{4b}$):

$$\dot{y}_I^{new} = \dot{y}_I + \frac{\Delta t}{\alpha_I M_I}\,\sum_j \lambda_j\,\varphi_I(x_j), \qquad \dot{d}_i^{new} = \dot{d}_i - \frac{\Delta t}{(1-\alpha_i)\,m_i}\,\lambda_i$$
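One way to realize the summation step over the two processor sets is a global reduction; this is a sketch under that assumption, not necessarily the project's actual exchange scheme:

```cpp
#include <mpi.h>
#include <vector>

// Each side fills its local contribution for the bridging-zone atoms it
// owns (MD ranks: -xdot_i; FE ranks: sum of phi_J(X_i)*ydot_J, zero
// elsewhere). A reduction over all ranks then assembles
// rhs_i = sum_J phi_J(X_i)*ydot_J - xdot_i everywhere.
void assemble_rhs(const std::vector<double>& contrib,  // my side's terms
                  std::vector<double>& rhs)            // assembled result
{
    MPI_Allreduce(contrib.data(), rhs.data(), (int)contrib.size(),
                  MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
}
```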

SLIDE 19

Constraint system data redistribution

To illustrate, we consider the following distribution over the processors:

(Figure: the atomic zone, the bridging zone and the continuum zone laid out across the processors.)

SLIDE 20

Constraint system data redistribution

When the two models are mapped onto two distinct sets of processors, each set owns a parallel vector of the Lagrange multiplier unknowns.

SLIDE 21

Dynamic effects: managing atom migrations (1)

Coherency of the redistribution scheme is maintained by a protocol involving the processors of the bridging zone.

SLIDE 22

Dynamic effects: managing atom migrations (2)

Coherency of the redistribution scheme is maintained by a protocol involving communication between the two processor sets, as sketched below.

(Figure: on top of the standard MD atom migration, a "send new owner" message updates the constraint bookkeeping and a "send new position" message updates the other set.)
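A minimal sketch of what the "send new owner" notification might look like in MPI; the message layout, tag and routing are assumptions for illustration only:

```cpp
#include <mpi.h>

// When a bridging-zone atom migrates to another MD processor, the other
// side must learn who now owns it so the constraint rows stay consistent.
struct MigrationNotice {
    int atomId;     // global id of the migrating atom
    int newOwner;   // rank of the MD processor now owning it
};

void notify_other_set(const MigrationNotice& n, int destRank) {
    // the struct is two contiguous ints; tag 42 chosen arbitrarily
    MPI_Send(&n, 2, MPI_INT, destRank, 42, MPI_COMM_WORLD);
}
```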

SLIDE 23

Results

SLIDE 24

Wave reflections in the 1D model

Wave reflections are caused by:

  • The reduction in the number of degrees of freedom
  • The impedance of the overlap zone, which depends on:
    – the wave frequency
    – the wave phase

(Figure: 1D test case with 400 atoms and 40 elements.)

SLIDE 25

Reproduction of the absorption results

(Figure: initial condition, then the wave after crossing the overlap with unadapted impedance vs. adapted impedance.)

The impedance depends on:

  • the overlap size
  • the element size
  • the settings of the projection
SLIDE 26

2D model: wave propagation

Some numbers:

  • Atoms: 53 743 (Lennard-Jones)
  • Finite elements: 2 749 nodes, 5 249 elements
  • Overlap zone:
    – 29 868 atoms (~55%)
    – 523 nodes (~19%), 891 elements (~17%)

Simulation:

  • 7 + 1 processors
  • 20 000 time steps
  • 358 ms / time step:
    – 62% atoms
    – 34% elasticity
    – 4% coupling

(Figure: simulation domain of 580 Å × 150 Å.)

SLIDE 27

2D example: crack propagation

Box of 600 nm × 800 nm. Numbers:

  • 91 556 atoms (Lennard-Jones)
  • 1 129 nodes, 2 082 elements

Overlapping zone:

  • 45 732 atoms
  • 912 nodes and 1 582 elements

Crack: ellipse of 50 Å × 1 Å

SLIDE 28

Computational time distribution for a 2D sequential simulation

(Pie chart: Atom Part, Elast Part, BuildRHS, Correct Surface Effect, Correcting, Solving constraint; the shares are 57.06%, 30.29%, 4.68%, 0.37%, 7.45% and 0.15%.)

Optimal processor split: 20 MD / 16 FE
SLIDE 29

Simulation times of the different tasks on 36 processors

(Plot: x-axis, number of processors assigned to molecular dynamics, from 4 to 32; y-axis, time in seconds for 100 simulation timesteps, from 0.5 to 4; curves Ts(c), Ts(a) and Tc.)

SLIDE 30

Domain decomposition issues

(Figures: comparison of the decompositions 4×5 vs 2×8, and 2×8 vs 2×10.)

SLIDE 31

Overhead time due to coupling on 36 processors during 100 timesteps

(Plot: x-axis, number of processors assigned to molecular dynamics, from 4 to 32; y-axis, time in seconds, from 0.25 to 3; curves Tc(1a,1b), Tc(2), Tc(3) and Tc(4a,4b).)

SLIDE 32

A real-size 3D case: under construction

  • 2 662 400 atoms
  • 36 450 elements
  • 332 800 atoms in the overlapping zone
  • 8 100 elements in the overlapping zone

SLIDE 33

Conclusion

SLIDE 34

Conclusion

First version of the simulator:

  • Better understanding of the multiscale problems
  • Wave reflection management

Results in 2D:

  • Waves
  • Crack

Going to 3D simulations. Enhancing the parallelism:

  • Molecular dynamics codes: a limitation for the 3D cases
  • Domain decomposition in boxes
  • Taking cost functions into account to map tasks to processors
SLIDE 35

Our simulator

  • T. Belytschko model
  • 1D, 2D and 3D tests
  • Parallel version based on the MPI communication paradigm
  • C++ code
  • Interfaced with:
    – finite elements: libMesh
    – molecular dynamics: Stamp (CEA), LAMMPS (Sandia)
    – visualization and steering: EPSN (ScAlApplix, INRIA)