Requirements for iterative linear equation solvers in problems of - - PowerPoint PPT Presentation

requirements for iterative linear equation solvers in
SMART_READER_LITE
LIVE PREVIEW

Requirements for iterative linear equation solvers in problems of - - PowerPoint PPT Presentation

Requirements for iterative linear equation solvers in problems of solid- and structural mechanics Reijo Kouhia Tampere University http://www.tut.fi/rakmek/personnel/kouhia/ MTU60: Lets make it sparse, Prague, Czech Republic, Jan 31 - Feb 2,


slide-1
SLIDE 1

Requirements for iterative linear equation solvers in problems of solid- and structural mechanics

Reijo Kouhia Tampere University http://www.tut.fi/rakmek/personnel/kouhia/ MTU60: Let’s make it sparse, Prague, Czech Republic, Jan 31 - Feb 2, 2019

slide-2
SLIDE 2

Happy birthday Mirek!

Ver¨ aj¨ am¨ aki May 27, 2000.

slide-3
SLIDE 3

Some history

◮ I became interested in iterative linear solvers around 1995. ◮ After hitting my head to the wall with X preconditioners I found the paper:

  • M. Benzi, C.D. Meyer, M. T˚

uma, A sparse approximate inverse preconditioner for the CG method, SIAM J. Sci. Comp., 17,1135-1149 (1996) and I felt fascinated about the idea.

◮ I sent a mail to Michele (autumn 1996?) and then everything started to evolve fast. ◮ Mirek, Michele and I wrote our first journal paper and it was published 1998 in Commun.

  • Numer. Methods Eng.

◮ I met Mirek first time face to face in Prague May 1999 and Michele in Finland July 2004. ◮ Mirek visited Helsinki in May 2000.

slide-4
SLIDE 4

It started with shells

Bending dominated loading, 30 × 30 uniform mesh, stabilized quadrilateral MITC4 shell elements 5489 unknowns. Convergence of the PCG depends highly on the “thinness” of the shell, here t/R = 0.1, 0.01, 0.001 and convergence with the IC(0)-preconditioner shown below.

  • M. Benzi, RK, M. T˚

uma, An assessment of some preconditioning techniques in shell problems, Communications in Numerical Methods in Engineering, 14, 1998, 897-906

slide-5
SLIDE 5

In 1999 it continued with solids etc.

Ver¨ aj¨ am¨ aki May 27, 2000.

Block versions of the stabilized AINV Discussion about artificial rhs-vectors

  • M. Benzi, RK, M. T˚

uma, Stabilized and block approximate inverse preconditioners for problems in solid and structural mechanics, Computer Methods in Applied Mechanics and Engineering, 190, 2001, 6533-6554

slide-6
SLIDE 6

Why I am interested in shells? - Shells are prone to stability problems

R/L = 4 (a) R/t = 100 (b) R/t = 400

θ

L P R

θ

slide-7
SLIDE 7

Path-following methods

The non-linear mapping f defines the equilibrium path in the displacement q and load parameter λ space: f(q, λ) ≡ r(q) − λpr(q) = 0 (1) and constitutes the balance between internal- and external forces. To overcome limit points an additional equation is usually augmented to (1), thus we have to solve

  • f(q, λ) = r(q) − λpr(q) = 0

c(q, λ) = 0. (2) Newton’s linearisation step results in Aδx = −g, (3) where A = K pr cT e

  • , δx =
  • δq

δλ

  • , g =
  • f

c

  • , K = f ′ = ∂f

∂q , c = ∂c ∂q , e = ∂c ∂λ.

slide-8
SLIDE 8

Requirements for linear solvers in non-linear analysis

◮ Be able to handle indefinite matrices. ◮ Symmetry (if existing) of the Jacobian matrix of f should be exploitable. ◮ Information about definiteness (positive definite/indefinite) of the matrix should be

extractable.

slide-9
SLIDE 9

Stability of equilibrium state ?

Singularity test functions (STF)

◮ determinant ◮ smallest eigenvalue ◮ smallest pivot ◮ current stiffness parameter

Requirements for the STF

◮ mesh size independent ◮ “good predictor” ◮ indicates all kind of singular behaviour ◮ easy and cheap to evaluate

slide-10
SLIDE 10

Check for the positivity

  • f a matrix A when applying preconditioned Krylov subspace methods to solve Ax = b with

accuracy ri = b − Axi < εrb − Ax0 + εa.

◮ λi is an eigenvalue of A ◮ λ1 = min λi ◮ |λ1| < δ

Conjecture

If A is indefinite then there exists with probability 1 − η(δ, εr, εa) at least one iterate k such that dT

k Adk < 0.

To be usefull: η ≪ 1 and Adk should be available

  • 1. form M (or directly M −1),
  • 2. initialize r0 = b − Ax0, apply

preconditioner d0 = M −1r0, compute τ0 = rT

0 d0,

  • 3. iterate i = 0, 1, 2, ... until convergence:

3.1 compute: s = Adi, ai = τi/dT

i s,

3.2 update: xi+1 = xi + aidi, ri+1 = ri − ais, 3.3 apply preconditioner: z = M −1ri+1, 3.4 compute: τi+1 = rT

i+1z,

bi = τi+1/τi, 3.5 update: di+1 = z + bidi.

slide-11
SLIDE 11

It works with PCG but not with preconditioned Bi-CGSTAB - Why?

  • 1. form M (or directly M −1),
  • 2. compute r0 = b − Ax0, choose ˜

r, compute ρ0 = ˜ rT r0, set d0 = r0,

  • 3. iterate i = 0, 1, 2, ..., until converged:

3.1 apply preconditioner: z = M −1di, 3.2 compute: vi = Az, ai = ρi/˜ rT vi, s = ri − aivi, 3.3 apply preconditioner: ˜ s = M −1s, 3.4 compute: w = A˜ s, ωi = wT s/wT w, 3.5 update: xi+1 = xi + aiz + ωi˜ s, ri+1 = s − ωiw, 3.6 compute: ri+12, 3.7 compute: ρi+1 = ˜ rT ri+1 and bi+1 = (ρi+1/ρi)(ai/ωi), 3.8 update: di+1 = ri+1 + bi+1(di − ωivi).

slide-12
SLIDE 12

To the next topic - direct solution of singular points

My old favourite - stability analysis

M.C. Escher, Metamorphosis II, 1940.

Reijo Kouhia and Martti Mikk
  • la
r e z ; w x; u y ; v
  • p
2L ? 6 L
  • C
  • Q
  • P
P Figure 1: Simply supp
  • rted
plate. w a v e buc kle and the load b earing capacit y decreases. A t the load lev el
  • f
P = 7:84 2 D L 1 there exist t w
  • closely
spaced c hanges in the n um b er
  • f
negativ e eigen v alues, rst from
  • ne
to t w
  • and
then bac k from t w
  • to
  • ne,
after whic h the load starts to increase and the deformation mo de is fully dev elop ed three half-w a v e buc kle. After the load lev el
  • f
P = 9:27 2 D L 1 the three half-w a v e mo de branc h b ecomes stable, see g. 2. Th us, un- der pure load con trol the plate w
  • uld
exhibit a mo de jump from
  • ne
half-w a v e buc kle to three half-w a v e buc kle. Deformation plots are app ended in g. 5 and the mem brane stress resultan t distributions in g. 6. The initially t w
  • half-w
a v e branc h is stable up to load lev el P
  • 37
2 D L 1 , where
  • ne
unstable mo de app ears. F
  • r
the Huitfeldt's metho d dieren t p erturbation loads are tested. As already p
  • in
ted
  • ut,
the p erturbation load should cause deections in the direction
  • f
the branc h tangen ts. If not so, the BCC curv e can b e closed but some branc hes are missing. Another t yp e
  • f
problem exist with the lo cal p erturbation approac h. Ev en if the branc h connecting constrain t (6) denes a closed surface around the critical p
  • in
t, the BCC tracing seems to pro duce a close d curve
  • n
the surfac e
  • f
the c
  • nstr
aint, not p assing the p
  • int
  • f
dep artur e. These failures are demonstrated in detail b y Kouhia and Mikk
  • la.
40 The main computational lab
  • r
in the LSK-reduction tec hnique is the solution
  • f
the eigen v alue problem. Computing the second-order elds, co ecien ts
  • f
the reduced set
  • f
equations as w ell as solving the p
  • lynomial
system are
  • rders
  • f
magnitude less time consuming than solving the eigenproblem. F
  • r
Huitfeldt's lo cal p erturbation approac h the main w
  • rk
is to tra v erse the branc h connecting curv e, whic h is a standard path follo wing routine and th us exp ensiv e for large systems. Solution times c
  • f
the branc h switc hing pro cesses
  • f
this particular example are
  • f
the
  • rder
  • f
50 s for Huitfeldt's approac h c The CPU-times are
  • n
a Digital AlphaServ er 8400 at the Cen ter for Scien tic Computing, Esp
  • ,
Finland. 9 Reijo Kouhia and Martti Mikk
  • la
(a) (b) Figure 2: Out
  • f
plane deections vs. load curv es: (a) initially
  • ne
half-w a v e branc h, (b) initially t w
  • half-w
a v e branc h; solid line = w C dashed line = w Q . Load steps where c hange in signature
  • f
the tangen t stiness matrix
  • n
the p
  • st-buc
kling branc h
  • ccurs
are denoted b y . and
  • nly
2 s for the LSK-reduction based tec hnique. T
  • trace
  • ne
complete equilibrium curv e sho wn in g. 2 to
  • k
appro ximately 100 s and 55 s, resp ectiv ely (the t w
  • half-w
a v e path). In these computations the primary path is traced with 3 steps and 23 steps
  • n
the p
  • st-buc
kling path. An example
  • f
three mo de in teraction is a simply supp
  • rted
T-b eam loaded b y a transv erse force at midspan and in the direction that induced compression in the ange. This particular example has b een extensiv ely studied b y Menk en and his co-w
  • rk
ers at the Eindho v en Univ ersit y
  • f
T ec hnology. 41{ 43 The material
  • f
the b eam is aluminium (E =70 GP a,
  • =0.3)
and the dimensions
  • f
the cross-section are (see g. 3): ange: width 30 mm, thic kness 0.5 mm; w eb: heigh t 50 mm, thic kness 2 mm. Exp erimen ts w ere carried
  • ut
with sev eral lengths
  • f
the b eam. F
  • r
the shorter b eams lo cal ange buc kling w
  • uld
  • ccur
rst, while buc kling is initiated b y an
  • v
erall lateral-torsional mo de for the longer
  • nes.
The lo cal buc kling load is strongly inuenced b y the free width
  • f
the ange, therefore the
  • v
erlap b et w een ange and w eb is mo delled b y
  • rthotropic
elemen ts ha ving high rigidit y in transv erse direction. Ho w ev er, this will pro duce highly ill-conditioned stiness matrices, the sp ectral condition n um b er
  • f
the tangen t stiness matrice v aried within the range
  • f
10 15
  • 10
17
  • n
the computed paths. In earlier studies 34, 42, 43 the b eam has b een mo delled b y spline nite strips; all displace- men ts are in terp
  • lated
with B 3
  • splines
in the longitudinal direction and in transv erse direction linear in terp
  • lation
is used for mem brane displacemen ts and the cubic Hermi- tian p
  • lynomials
for the
  • ut
  • f
plane displacemen t and the kinematical mo del is based
  • n
the classical Kirc hho-Lo v e h yp
  • thesis.
In the ab
  • v
e men tioned references the mo del consists
  • f
22 strips divided in to 40 sections. In this study similar kind
  • f
a nite elemen t 10
slide-13
SLIDE 13

Stability eigenvalue problem

Definition for critical state: Find displacements qcr, critical load λcr and the corresponding eigenmode φ such, that

  • f ′(qcr, λcr)φ

= 0 f(qcr, λcr) = 0 (4) where f ′ = ∂f/∂q. System (4) is a non-linear eigenvalue problem, which is HARD TO SOLVE!

slide-14
SLIDE 14

Solution algorithm for non-linear eigenproblem

Extended system: g(x) = g(q, φ, λ) =      ˆ f(q, λ) ≡ f(q, λ) + f 0(q, λ) = 0 h(q, φ, λ) ≡ f ′(q, λ)φ + h0(φ, λ) = 0 c(q, φ, λ) = 0. Newton’s linearisation step results in Aδx = −g, where A =   Kf P Z Kh N Cq Cφ Cλ   , δx =    δq δφ δλ    , g =    ˆ f h c    . and Kf = ˆ f

′(q, λ) = K + f ′ 0,

Kh = ∂h/∂φ = K + ∂0/∂φ, Z = (f ′φ)′, P = ∂ ˆ f/∂λ, N = ∂h/∂λ, Cλ = ∂c/∂λ, Cq = ∂c/∂q, Cφ = ∂c/∂φ.

slide-15
SLIDE 15

Block elimination

The solution vector is partitioned as δq = qf + Qpδλ, δφ = φh + Φnδλ. Vectors qf, φh and the n × p matrices Qp, Φn solved from Kfqf = − ˆ f, KfQp = −P , Khφh = −h − Zqf, KhΦn = −N − ZQp, and for the control parameters δλ = −(Cλ + CqQp + CφΦn)−1(c + Cqqf + Cφφh). Suitable strategy if direct linear solver is used.

slide-16
SLIDE 16

Iterative solution of the Newton step

Use precontioned Bi-CGSTAB method (or some other applicable accelerator) for the full non-symmetric equation systems   Kf P Z Kh N Cq Cφ Cλ      δq δφ δλ    =    ˆ f h c    . I prefer monolithic solution schemes!

RK, M. T˚ uma, J. M¨ akinen, A. Fedoroff, H. Marjam¨ aki, Implementation of a direct procedure for critical point computations using preconditioned iterative solvers, Computers and Structures, 108-109, 2012, 110-117

slide-17
SLIDE 17

The crucial thing is the preconditioner

A =   Kf P Z Kh N Cq Cφ Cλ  , M =   M f Z M h Cq Cφ D   . M −1 =   M −1

f

−M −1

h ZM −1 f

M −1

h

−D−1(CqM −1

f

− CφM −1

h ZM −1 f )

−D−1CφM −1

h

D−1   . Preconditioning operation y = M −1s    y1 y2 y3    =    M −1

f s1

M −1

h (s2 − Zy1)

D−1 (s3 − Cφy2 − Cqy1)    .

slide-18
SLIDE 18

Error analysis

Accuracy: µ1(A) = A − MF Stability: µ2(A) = I − M −1AF If Kf = Kh = K, thus M f = M h = M 1, results in µ1(A) =

  • 2µ1(K)2 + P 2

F + N2 F + Cλ − D−12 F ,

where µ1(K) = K − M 1. µ2(A) ≤ η1µ2(K) + η2R13F + η3R23F , where R13 = M −1

1 P ,

R23 = M −1

1 N − M −1 1 ZM −1 1 P .

slide-19
SLIDE 19

Some examples - Shell type structure analysed as 3D-solid

RIC2S 10−4 RIC2S 10−3 RIC1 10−5 RIC1 10−4 r i2/r 02 iteraatio 2000 1500 1000 500 1 10−2 10−4 10−6 10−8 (a) (b)

slide-20
SLIDE 20

pohjustin ψ ψ2

  • prec. dens.

it p-time i-time p+i RIC1 10−4 3,3 2005 25,9 2360,4 2386,3 RIC1 10−5 8,1 824 95,7 1851,1 1946,8 RIC1 10−6 15,6 297 305,1 660,2 965,3 RIC2S 10−3 10−6 1,1 1864 59,4 779,6 839,0 RIC2S 2 · 10−4 10−6 2,7 435 167,5 308,2 475,7 RIC2S 10−4 10−6 3,9 229 273,1 217,5 490,6

slide-21
SLIDE 21

Stability analysis of a shallow shell

One layer with 27-node triquadratic 3D-solid elements. (a) R/t = 100, θ = 0.279 rad (b) R/t = 400, θ = 0.2 rad.

θ

L P R

θ

slide-22
SLIDE 22

Cylindrical shell, 64 × 64-mesh, solution times in seconds. BE denotes the block elimination strategy. The preconditioned iterations two values are given: the first one is the preconditioner setup time and the second one is the acceleration iteration time. preconditioner method R/t type ψ

  • av. it.
  • sol. time

BE-SKY 100

  • 10910

BE-SKYB

  • 1670

Bi-CGSTAB RIC2Si 10−3 229 360+3861 = 4221 RIC2S0 10−3 166 50+2558 = 2608 RIC2S0 10−4 35 875+1014 = 1889 SKYB0

  • 28

179+2754 =2933 BE-SKYB 400

  • 1840

Bi-CGSTAB RIC2Si 10−3 349 832+7389 = 8221 RIC2S0 10−4 62 991+2219 = 3210 SKYB0

  • 42

198+5571 = 5769 HP ProLiant DL785 G5 server with quad-core AMD Opteron 8360 SE (Barcelona) 2.5 GHz processor, 512 GB memory, at the CSC-IT Center for Science at Espoo, Finland.

slide-23
SLIDE 23

Coupled problems - salt migration known as Elder’s problem

Unknowns are pressure p and salt concentration C:

η

  • κf ∂p

∂t + ζc ∂C ∂t

  • + η (κf grad p + ζc grad C) · J f + div J f = 0,

η ∂C ∂t + grad C · J f + (κf grad p + ζc grad C) · J c + div J c = 0, with fluxes J f = − k µ [grad p − ((1 − C)¯ ρw + C ¯ ρc) g] , J c = −ηD [a1 grad C − a0C(1 − C)(¯ ρc − ¯ ρw) g] ,

✛ ✲

L = 600 m

❄ ✻

H = 150 m

✛ ✲

a = 150 m ✛

b = 300 m

✛ ✲

c = 150 m x y

s s

C(0 ≤ x ≤ L, 0) = 0 C(a ≤ x ≤ a + b, H) = Cin p(0, H) = p0 p(L, H) = p0

  • J. Hartikainen, RK, T. Wallroth. Permafrost simulations at Forsmark using a numerical 2D

thermo-hydro-chemical model. Technical Report SKB TR-09-17, 2010.

slide-24
SLIDE 24

Elder problem - evolution of salt concentration

After 1,2,3 and 4 years. 88-timesteps, maximum timestep 0.05 year.

0,1 0,2 0,3 0,4 0,5 0,2 0,4 0,6 0,8 1 0.1 . 5 . 9 x / L y / H 1 vuosi 0,1 0,2 0,3 0,4 0,5 0,2 0,4 0,6 0,8 1 . 1 0.5 0.9 x / L y / H 2 vuotta 0,1 0,2 0,3 0,4 0,5 0,2 0,4 0,6 0,8 1 0.1 0.1 0.5 0.5 0.9 x / L y / H 0,1 0,2 0,3 0,4 0,5 0,2 0,4 0,6 0,8 1 0.1 0.1 0.5 0.5 0.9 x / L y / H

slide-25
SLIDE 25

Elder’s problem -computational statistics

343643 unknowns, 88 time-steps, Newton-Raphson iteration for the non-linear system, ILUT-preconditioner with tolerance ψ

ψ

  • prec. dens.

ave it

  • max. it
  • total. it
  • ave. p-time
  • ave. i-time

total p+i 1 · 10−2 1,2 38 359 49608 0.7 18.6 28727 5 · 10−3 1,5 36 259 37209 0.9 22.0 23895 1 · 10−3 2,6 20 121 20367 2.0 19.2 21420 5 · 10−4 3,3 18 61 15529 2.7 14.9 15706 1 · 10−4 5,4 9 32 8617 6.5 12.9 18424 8 · 10−5 5,8 8 29 8060 7.4 11.7 18675 5 · 10−5 6,3 7 26 7247 9.0 10.1 20034 1 · 10−5 7,3 8 24 6427 12.3 12.9 20642 1 · 10−6 8,1 11 24 6171 16.7 18.8 20081

slide-26
SLIDE 26

Coupled problems - magnetoelasticity

Many ways to formulate the magnetic problem.

◮ Using magnetic vector potential

A, such that B = curl

  • A. Mixed type weak form

(curl ˆ A, H) + ( ˆ A, grad p) = ( ˆ A, J) − [ ˆ A, ¯ H], (5) (grad ˆ p, A) = 0, where the inner products are defined as ( B, H) =

H dV, and [ A, H] =

  • ∂Ω

H ds. In 3D curl-conforming interpolation for A should be used and standard C0-interpoation for p.

◮ Using the Lagrangian functional

L = 1 2(div B, div B) + ( p, curl H − J). Divergence conforming interpolation for B and curl-conforming for p.

slide-27
SLIDE 27

How to solve the curl-curl problem in 3D?

Saddle point problem (5) in the matrix form H C CT a p

  • =

h

  • The leading block H is singular!
slide-28
SLIDE 28

Monolithic magnetoelasticity

In strong form −div σ( u, B) = ρ b, curl H( B, u) = J, div B = 0, and using the vector potential approach for the magnetic part gives the following linearized discrete equation system   K L F H C CT      δu δa δp    = g What kind of preconditioner for such a system?

slide-29
SLIDE 29

Open source JuliaFEM

See www.juliafem.org! Julia programming language try to combine programming flexibility and high-performance computing. Eigenvalue analysis for natural frequences (10 lowest pairs), 12.6 millions unknowns. Solution time 3 h 4 min with 24 x Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60 GHz with 512 GB total memory.

Tero Frondelius, Jukka Aho. JuliaFEM – open source solver for both industrial and academia use. Rakenteiden Mekaniikka (Journal of Structural Mechanics). 50 (3) 2017, 229-233.

slide-30
SLIDE 30

Thank you!

Bohumil Kubiˇ sta, Taming snakes

I would like to express my sincere thanks to Mirek and Michele for fruitfull and enjoyable collaboration. I would also like to thank my numerous friends/colleaques with whom I have worked, especially Anouar Belahcen, Juha Hartikainen, Jari M¨ akinen, Paavo Rasilo and my teachers Professors Martti Mikkola and Markku Tuomala.