SLIDE 1

Adaptive Multilevel BDDC

Jan Mandel

Includes joint work with Clark Dohrmann, Bedřich Sousedík, Jakub Šístek, Pavel Burda, Marta Čertíková, and Jaroslav Novotný.

Center for Computational Mathematics and Department of Mathematical and Statistical Sciences, University of Colorado Denver

Supported by the National Science Foundation under grant DMS-0713876

DD19, August 2009

(Zhangjiajie, China) Adaptive Multilevel BDDC DD19, August 2009 1 / 49

SLIDE 2

Outline

  • Some (biased) history
  • BDDC formulation
  • Adaptive Multilevel BDDC
  • Implementation by global matrices and generalized change of variables
  • Implementation on top of frontal solver

SLIDE 3

Motivation

BDDC: Balancing Domain Decomposition by Constraints

From Dohrmann (2003): "Some remaining issues need to be addressed for improvement. First, it would be useful to have an effective method for selecting additional corners and edges to improve performance for very poorly conditioned problems. Second, the performance of the multilevel extension should be investigated further. Recall that the multilevel extension is obtained by recursive application of the preconditioner to coarse problem stiffness matrices. Such an extension would be beneficial for problems with very large numbers of substructures..."

SLIDE 4

Basic substructuring

  • assemble elements into substructures, eliminate interiors ⇒ reduced problem (Schur complement) on the interface (Dirichlet-to-Neumann operator $H^{1/2} \to H^{-1/2}$, a.k.a. Poincaré-Steklov operator)
  • for parallel solution: Schur complement matrix-vector multiply = solve substructure Dirichlet problem
  • only matrix data structures needed; condition number $O(1/h)$, better in practice than the $O(1/h^2)$ for the original problem (for small N)
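
The matrix-vector multiply can be illustrated with a small dense sketch. It is not taken from the talk: the function name `schur_matvec`, the 1D Laplacian, and the index sets are assumptions chosen for illustration, and a real code would reuse a sparse factorization of the interior block on each substructure.

```python
import numpy as np

def schur_matvec(A, interior, interface, x):
    """Apply S = A_BB - A_BI A_II^{-1} A_IB to x without forming S."""
    A_II = A[np.ix_(interior, interior)]
    A_IB = A[np.ix_(interior, interface)]
    A_BI = A[np.ix_(interface, interior)]
    A_BB = A[np.ix_(interface, interface)]
    # Dirichlet problem: interior values of the discrete harmonic extension of x
    u_I = np.linalg.solve(A_II, -A_IB @ x)
    return A_BB @ x + A_BI @ u_I

# Toy example (assumed): 1D Laplacian on 7 nodes, node 3 is the interface
# between two "substructures" of three interior nodes each.
n = 7
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
interior = [0, 1, 2, 4, 5, 6]
interface = [3]
x = np.array([1.0])
y = schur_matvec(A, interior, interface, x)
```

Only solves with the interior block appear, so the Schur complement itself never has to be assembled.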

SLIDE 5

Early Preconditioners

  • diagonal preconditioning: Gropp and Keyes 1987; "probing" the diagonal of the Schur complement: Chan and Mathew 1992 (because creating the diagonal of the Schur complement is expensive)...
  • preconditioning by solving substructure Neumann problems ($H^{-1/2} \to H^{1/2}$): Glowinski and Wheeler 1988, Le Tallec and De Roeck 1991 (a.k.a. the Neumann-Neumann method)
  • add coarse space, giving asymptotically optimal preconditioners: $H/h \to \log^2(1 + H/h)$ (Bramble, Pasciak, Schatz 1986+, Widlund 1987, Dryja 1988, Dryja-Widlund-Smith 1994...)
  • but all these methods require access to mesh details and depend on details of the finite element code, which makes them hard to interface with existing FE software

SLIDE 6

FETI and BDD

  • algebraic: need only substructure (1) solvers (Neumann, Dirichlet), (2) connectivity, (3) basis of nullspace (BDD: constant function for the Laplacian, FETI: actual nullspace)
  • both involve singular substructure problems (Neumann) and build the coarse problem from local substructure nullspaces (in different ways) to assure that the singular systems are consistent
  • Balancing Domain Decomposition (BDD, Mandel 1993): solve the system reduced to interfaces, interface degrees of freedom common (this is the Neumann-Neumann method with a particular coarse space and multiplicative coarse correction)
  • Finite Element Tearing and Interconnecting (FETI, Farhat and Roux 1991): enforce continuity across interfaces by Lagrange multipliers, solve the dual system for the multipliers

SLIDE 7

FETI and BDD Developments

  • Both require only matrix-level information available in FE software, so they became very popular and widely used.
  • The methods work well in 2D and 3D (solids), but the performance for plates/shells/biharmonic problems was not so good.
  • Reason: the condition numbers depend on the energy (trace norm) of functions with jumps across a substructure corner. In 2D, this is OK for $H^{1/2}$ traces of $H^1$ functions, but not for $H^{3/2}$ traces of $H^2$ functions (embedding theorem).
  • Fix: avoid this by increasing the coarse space and so restricting the space where the method runs, to make sure that nothing gets torn across the corners (BDD: Le Tallec, Mandel, Vidrascu 1998; FETI: Farhat, Mandel 1998; Farhat, Mandel, Tezaur 1998).
  • Drawback: complicated, expensive, a large coarse problem with custom basis functions.

SLIDE 8

The best so far: FETI-DP and BDDC

  • To assure that nothing gets torn across the corners, enforce identical values at corners from all neighboring substructures a priori ⇒ corner values are coarse degrees of freedom
  • Continuity elsewhere at the interfaces by Lagrange multipliers as in FETI ⇒ FETI-DP (Farhat et al. 2001)
  • Continuity elsewhere by common values as in BDD ⇒ BDDC (Dohrmann 2003; independently Cros 2003, Fragakis and Papadrakakis 2003, with corner coarse degrees of freedom only)
  • Additional coarse degrees of freedom (side/face averages) required in 3D for good conditioning: Farhat, Lesoinne, Pierson 2000 (algorithm only), Dryja, Widlund 2002 (with proofs)

SLIDE 9

Evolution of BDD/BDDC/FETI-DP

SLIDE 10

Substructuring for the two-level method (with H/h=4)

SLIDE 11

BDDC description - example of spaces

$W = \bigoplus_{i=1}^{N} W_i$: space of block vectors, one block per substructure

$U \subset \widetilde W \subset W$

                  U                          W̃                        W
continuity:       continuous across whole    continuous across        no continuity
                  substructure interfaces    corners only             required
global matrix:    assembled                  partially assembled      not assembled
substructures:    assembled                  assembled                assembled

Want to solve $u \in U: \ a(u, v) = \langle f, v \rangle \quad \forall v \in U.$

SLIDE 12

Abstract two-level BDDC:

Variational setting of the problem and algorithm components

$u \in U: \ a(u, v) = \langle f, v \rangle \quad \forall v \in U$

The form $a(\cdot, \cdot)$ is SPD on $U$ and positive semidefinite on $W \supset U$.

Example: $W = W_1 \times \cdots \times W_N$ (spaces on substructures), $U$ = functions continuous across interfaces.

Choose preconditioner components:

1. space $\widetilde W$, $U \subset \widetilde W \subset W$, such that $a(\cdot, \cdot)$ is positive definite on $\widetilde W$. Example: functions with continuous coarse dofs, such as values at substructure corners

2. projection $E: \widetilde W \to U$, range $E = U$. Example: averaging across substructure interfaces

SLIDE 13

Abstract BDDC preconditioner

Theorem. The abstract BDDC preconditioner $M: U \to U$ defined by

$$M: r \mapsto u = Ew, \qquad w \in \widetilde W: \ a(w, z) = \langle r, Ez \rangle \quad \forall z \in \widetilde W,$$

satisfies

$$\kappa = \frac{\lambda_{\max}(MA)}{\lambda_{\min}(MA)} \le \omega = \sup_{w \in \widetilde W} \frac{\|(I - E) w\|_a^2}{\|w\|_a^2}.$$

The space $\widetilde W$ is defined using so-called coarse degrees of freedom, as

$$\widetilde W = \{ w \in W: \ C (I - E) w = 0 \},$$

where $C$ holds the weights on the local coarse degrees of freedom and $E$ is the averaging ⇒ coarse degrees of freedom on adjacent substructures coincide.
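
The bound can be checked numerically on a small random instance. The sketch below is illustrative only: the sizes, the random SPD matrix standing in for $a(\cdot,\cdot)$ on $\widetilde W$, the embedding `R` of $U$, and the weighted averaging `E` are all assumptions, not the finite element spaces of the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 3                               # dim of W~ and of U (toy sizes, assumed)

G = rng.standard_normal((n, n))
A_tilde = G @ G.T + n * np.eye(n)         # a(.,.) on W~, SPD
R = rng.standard_normal((n, m))           # columns span U inside W~
D = np.diag(rng.uniform(0.5, 2.0, n))     # positive averaging weights

# E_hat maps W~ to coordinates in U; E = R E_hat is a projection onto U
E_hat = np.linalg.solve(R.T @ D @ R, R.T @ D)
E = R @ E_hat

A = R.T @ A_tilde @ R                     # the operator on U
# M r = E w, where w in W~ solves a(w, z) = <r, E z> for all z in W~
M = E_hat @ np.linalg.solve(A_tilde, E_hat.T)

# omega = sup ||(I - E) w||_a^2 / ||w||_a^2, as the largest eigenvalue of
# L^{-1} (I - E)^T A_tilde (I - E) L^{-T} with A_tilde = L L^T
L = np.linalg.cholesky(A_tilde)
IE = np.eye(n) - E
K = np.linalg.solve(L, IE.T @ A_tilde @ IE)
K = np.linalg.solve(L, K.T).T
omega = np.linalg.eigvalsh((K + K.T) / 2).max()

lam = np.linalg.eigvals(M @ A).real       # spectrum of the preconditioned operator
kappa = lam.max() / lam.min()
```

On this toy problem `kappa` stays below `omega`, and the smallest eigenvalue of `M @ A` is at least 1, as the theory predicts.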

SLIDE 14

Coarse degrees of freedom (in implementation)

We want to represent common values of coarse dofs as global coarse dofs; $Q_P$ is the global coarse degrees of freedom selection matrix.

Suppose there is a space $U_c$ and operators

$$Q_P^T: U \to U_c, \qquad R_c: U_c \to X, \qquad R: U \to W,$$

where $R_c$ and $R$ are mapping (0-1) matrices, related as $C R = R_c Q_P^T$.

In implementation, $\widetilde W$ is decomposed into

$$\widetilde W = \widetilde W_\Delta \oplus \widetilde W_\Pi$$

  • $\widetilde W_\Delta$ = functions with zero coarse dofs ⇒ local problems on substructures
  • $\widetilde W_\Pi$ = functions given by coarse dofs and energy minimal ⇒ global coarse problem.

SLIDE 15

The coarse problem

[Figure: a coarse basis function from $\widetilde W_\Pi$]

A basis function from $\widetilde W_\Pi$ is energy minimal subject to given values of coarse degrees of freedom on the substructure. The function is discontinuous across the interfaces between the substructures, but the values of coarse degrees of freedom on the different substructures coincide.

The coarse problem has the same structure as the original FE problem ⇒ solve it approximately by one iteration of BDDC ⇒ three-level BDDC. Apply recursively ⇒ Multilevel BDDC.

SLIDE 16

Substructuring for the three-level method

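[Figure: three-level substructuring, with labels for the element size $h$, the substructure sizes $H_1$ and $H_2$, and substructures indexed $1,i$ and $2,k$ on the two decomposition levels]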

SLIDE 17

Multilevel BDDC

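Coarse problem solved by the BDDC preconditioner recursively:

$U_{I1} \xleftarrow{P_1} U_1 \xleftarrow{E_1} \widetilde W_1 = \widetilde W_{\Pi 1} \oplus \widetilde W_{\Delta 1}$

$U_{I2} \xleftarrow{P_2} U_2 \xleftarrow{E_2} \widetilde W_2 = \widetilde W_{\Pi 2} \oplus \widetilde W_{\Delta 2}$

$\vdots$

$U_{I,L-1} \xleftarrow{P_{L-1}} U_{L-1} \xleftarrow{E_{L-1}} \widetilde W_{L-1} = \widetilde W_{\Pi,L-1} \oplus \widetilde W_{\Delta,L-1}$

Here $U = U_1$ at the top of the tree, and the coarse space $\widetilde W_{\Pi\ell}$ of each level serves as $U_{\ell+1}$ on the next level.

The leaves of the tree, the local problems and the coarsest problem, are the ones actually solved.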

SLIDE 18

Condition number bound

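Theorem (Mandel, Sousedík, Dohrmann (2008)). The condition number bound $\kappa \le \omega$ of Multilevel BDDC is given by

$$\omega = \prod_{\ell=1}^{L-1} \omega_\ell, \qquad \omega_\ell = \sup_{w_\ell \in (I - P_\ell) \widetilde W_\ell} \frac{\|(I - E_\ell) w_\ell\|_a^2}{\|w_\ell\|_a^2}.$$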

Generalizes 3-level bounds by Tu (2006, 2007) to many levels.

SLIDE 19

Adaptive method idea (Mandel, Sousedík, 2006)

The condition number bound is a Rayleigh quotient:

$$\omega_\ell \le \sup_{w_\ell \in (I - P_\ell) \widetilde W_\ell} \frac{\|(I - E_\ell) w_\ell\|_a^2}{\|w_\ell\|_a^2} = \sup_{w_\ell \in \widetilde W_\ell} \frac{w_\ell^T (I - E_\ell)^T S_\ell (I - E_\ell) w_\ell}{w_\ell^T S_\ell w_\ell}$$

Eigenvalues $\lambda_1 > \lambda_2 > \ldots$; stationary points are eigenvectors.

Optimal decrease of the Rayleigh quotient: make the space $\widetilde W_\ell = \ker C (I - E_\ell)$ smaller by orthogonality to the dominant eigenvector $u_1$: add $u_1^T$ to $C (I - E_\ell)$.

This reduces the condition bound to the second eigenvalue $\lambda_2$. Adding more eigenvectors eats more eigenvalues.
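
A small numerical check of this claim, as a sketch under stated assumptions: a random SPD matrix stands in for $S_\ell$, a toy orthogonal projection stands in for $E_\ell$, and the new constraint row is taken as $u_1^T (I - E)^T S (I - E)$, which is the form of the constraint weights used in the face-selection algorithm of this talk. The helper `gen_eigh` is a hypothetical name.

```python
import numpy as np

def gen_eigh(A, B):
    """Eigenpairs of A w = lam B w (A symmetric, B SPD), sorted descending."""
    L = np.linalg.cholesky(B)
    K = np.linalg.solve(L, np.linalg.solve(L, A).T).T    # L^{-1} A L^{-T}
    lam, Y = np.linalg.eigh((K + K.T) / 2)
    W = np.linalg.solve(L.T, Y)                           # back-transform eigenvectors
    order = np.argsort(lam)[::-1]
    return lam[order], W[:, order]

rng = np.random.default_rng(1)
n = 6
G = rng.standard_normal((n, n))
S = G @ G.T + n * np.eye(n)                # stand-in for the Schur complement (assumed)
P = rng.standard_normal((n, 2))
E = P @ np.linalg.solve(P.T @ P, P.T)      # toy projection standing in for E_l
IE = np.eye(n) - E
A = IE.T @ S @ IE

lam, W = gen_eigh(A, S)
c = W[:, 0] @ A                            # constraint row built from the dominant eigenvector

# Basis Z of the constrained space {w : c w = 0}, then the same Rayleigh quotient on it
_, _, Vt = np.linalg.svd(c[None, :])
Z = Vt[1:].T
lam_constrained, _ = gen_eigh(Z.T @ A @ Z, Z.T @ S @ Z)
```

The largest eigenvalue on the constrained space, `lam_constrained[0]`, coincides with `lam[1]`, the second eigenvalue of the unconstrained problem.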

SLIDE 20

The condition number bound as an eigenvalue problem

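On the level $\ell$, define $\lambda_\ell$ by

$$\Pi_\ell (I - E_\ell)^T S_\ell (I - E_\ell) \Pi_\ell w_\ell = \lambda_\ell \Pi_\ell S_\ell \Pi_\ell w_\ell,$$

where $S_\ell$ is the Schur complement operator and $\Pi_\ell$ is the orthogonal projection in $(I - P_\ell) W_\ell$ onto $(I - P_\ell) \widetilde W_\ell$. Then

$$\kappa \le \omega = \prod_{\ell=1}^{L-1} \max \lambda_\ell.$$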

SLIDE 21

Condition number bound: Heuristic indicator

Solving the global eigenvalue problems is expensive ... replace them by local versions formulated for all pairs of adjacent substructures.

Definition: a pair of substructures is adjacent if they share a face.

$$\omega_\ell^{st} = \max \lambda_\ell^{st}: \quad \Pi_\ell^{st} (I - E_\ell^{st})^T S_\ell^{st} (I - E_\ell^{st}) \Pi_\ell^{st} w_\ell^{st} = \lambda_\ell^{st} \Pi_\ell^{st} S_\ell^{st} \Pi_\ell^{st} w_\ell^{st}.$$

The heuristic indicator of the condition number bound is defined as

$$\widetilde\omega = \prod_{\ell=1}^{L-1} \max_{\{st\} \in \mathcal{A}_\ell} \omega_\ell^{st}.$$

The vectors $w_\ell^{st}$ are used to generate constraints in $C_\ell$ and $Q_{P,\ell}$ such that

$$\widetilde W_\ell^{st} = \{ w_\ell^{st} \in W_\ell^{st}: \ C_\ell^{st} (I - E_\ell^{st}) w_\ell^{st} = 0 \}.$$

SLIDE 22

Adaptive selection of face constraints

Algorithm: add coarse degrees of freedom to guarantee that the condition number indicator satisfies $\widetilde\omega \le \tau^{L-1}$ for a given target value $\tau$:

for levels $\ell = 1 : L - 1$
  for all faces $F_\ell$ on level $\ell$

    1. Compute the largest local eigenvalues and corresponding eigenvectors, until the first $m^{st}$ is found such that $\lambda_{m^{st}}^{st} \le \tau$; put $k = 1, \ldots, m^{st}$.

    2. Compute the constraint weights $c_k^{st} = [c_k^s \ c_k^t]$ as
       $$c_k^{st} = w_k^{stT} (I - E^{st})^T S^{st} (I - E^{st}).$$

    3. Take one block, e.g., $c_k^s$, and keep nonzero weights for the face $F_\ell$.

    4. Add to the global coarse dofs selection matrix $Q_{P,\ell}$ the $k$ columns $R_\ell^{sT} c_k^{sT}$.
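
Steps 1 and 2 for a single face can be sketched as follows. This is an illustration, not the authors' code: `select_constraints` is a hypothetical helper, it keeps one constraint for every eigenvalue strictly above $\tau$ (one reading of step 1), and identity matrices are used as placeholders for the eigenvectors and for $(I - E^{st})^T S^{st} (I - E^{st})$. The eigenvalue lists are the rows for the pairs (2, 50) and (1, 2) of the two-level local eigenvalue table in this talk.

```python
import numpy as np

def select_constraints(lam, W, A, tau):
    """Return constraint rows c_k = w_k^T A for every eigenvalue lam_k > tau.

    lam is sorted in descending order, W[:, k] is the k-th eigenvector, and
    A stands for (I - E)^T S (I - E) on one pair of adjacent substructures.
    """
    rows = []
    for k, lam_k in enumerate(lam):
        if lam_k <= tau:
            break                      # all remaining eigenvalues are already below the target
        rows.append(W[:, k] @ A)
    return np.array(rows).reshape(len(rows), A.shape[0])

lam_jagged = np.array([24.3, 18.4, 18.3, 16.7, 16.7, 14.7, 13.5, 13.1])  # pair (2, 50)
lam_smooth = np.array([3.8, 2.4, 1.4, 1.3, 1.2, 1.1, 1.1, 1.1])          # pair (1, 2)
I8 = np.eye(8)                         # placeholder eigenvectors and operator (assumed)

n_jagged = select_constraints(lam_jagged, I8, I8, tau=10.0).shape[0]
n_smooth = select_constraints(lam_smooth, I8, I8, tau=10.0).shape[0]
```

With $\tau = 10$, all eight listed eigenvalues of the jagged face exceed the target, while the regular face needs no extra constraint; this is why the adaptive method concentrates coarse dofs on a few difficult faces.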

SLIDE 23

Adaptive-Multilevel BDDC: Implementation remarks

  • Reduce the problem to interfaces on level $\ell = 1$; on levels $\ell > 1$ apply interior pre/post-corrections in iterations.
  • Use the adaptive method for faces, with initial constraints as: corners in 2D; corners and arithmetic averages over edges in 3D.
  • Treat substructures as (coarse) elements with: energy minimal basis functions, variable number of nodes per element, variable number of degrees of freedom per node.
  • On each decomposition level $\ell = 1, \ldots, L - 1$: create substructures with roughly the same number of dofs, minimize the number of "cuts" (Metis 4) between substructures, repeat until it is suitable to factor the coarse problem directly (level $L$).

SLIDE 24

Numerical results in 2D: compressible elasticity, λ = 1, µ = 2

Domain decomposition of the planar elasticity problem with 1182722 dof, 2304 subdomains on the second level and 9 subdomains on the third level.

The two-level decomposition (left) and the three-level decomposition (right):

The coarsening ratio on both decomposition levels is Hi/Hi−1 = 16.

SLIDE 25

Numerical results in 2D: Adaptive 2-level method

Non-adaptive method:

constraint   Nc      C      κ       it
c            4794    9.3    18.41   43
c+f          13818   26.9   18.43   32

Adaptive constraints:

τ         Nc      C      ω̃      κ       it
∞ (=c)    4794    9.3    -      18.41   43
10        4805    9.4    8.67   8.34    34
3         18110   35.3   2.67   2.44    15
2         18305   35.7   1.97   1.97    13

c, c+f: constraints at corners, or at corners and arithmetic averages over faces; Nc: number of constraints; C: relative size of the coarse problem; τ: condition number target; ω̃: condition number indicator; κ: approximate condition number estimate; it: number of iterations (tolerance $10^{-8}$).

SLIDE 26

Numerical results in 2D: Local eigenvalues (2-level method)

Eigenvalues of the local problems for pairs of subdomains i and j (the jagged face is between subdomains 2 and 50):

i    j    λij,1  λij,2  λij,3  λij,4  λij,5  λij,6  λij,7  λij,8
1    2    3.8    2.4    1.4    1.3    1.2    1.1    1.1    1.1
1    49   6.0    3.5    2.7    1.4    1.3    1.1    1.1    1.1
2    3    5.4    2.6    1.6    1.3    1.2    1.1    1.1    1.1
2    50   24.3   18.4   18.3   16.7   16.7   14.7   13.5   13.1
3    4    3.4    2.4    1.4    1.3    1.1    1.1    1.1    1.1
3    51   7.4    4.6    3.7    1.7    1.4    1.3    1.2    1.1
49   50   12.6   5.1    4.3    1.9    1.6    1.3    1.2    1.2
50   51   8.7    4.8    3.9    1.8    1.5    1.3    1.2    1.2
50   98   7.5    4.6    3.7    1.7    1.4    1.3    1.2    1.1

SLIDE 27

Numerical results in 2D: Local eigenvalues (3-level method)

Eigenvalues of the local problems for pairs of subdomains i and j (the jagged face is between subdomains 2 and 5):

i   j   λij,1  λij,2  λij,3  λij,4  λij,5  λij,6  λij,7  λij,8
1   2   16.5   9.0    5.4    2.6    2.1    1.4    1.3    1.3
1   4   6.5    4.7    1.9    1.7    1.3    1.2    1.2    1.1
2   3   23.1   9.4    4.6    3.2    2.1    1.6    1.4    1.3
2   5   84.3   61.4   61.4   55.9   55.8   49.3   48.0   46.9
3   6   13.7   8.8    4.4    2.2    1.9    1.4    1.3    1.2
4   7   6.5    4.7    1.9    1.7    1.3    1.2    1.2    1.1
5   6   18.9   13.1   11.3   3.8    2.6    2.1    1.9    1.5
5   8   17.3   12.9   10.8   3.6    2.3    2.0    1.8    1.4
8   9   13.7   8.8    4.4    2.2    1.9    1.4    1.3    1.2

SLIDE 28

Numerical results in 2D: Adaptive 3-level method

Non-adaptive method:

constraint   Nc           C     κ      it
c            4794 + 24    1.0   67.5   74
c+f          13818 + 48   3.0   97.7   70

Adaptive constraints:

τ         Nc            C     ω̃            κ       it
∞ (=c)    4794 + 24     1.0   -            67.5    74
10        4805 + 34     1.0   > (9.80)^2   37.42   60
3         18110 + 93    3.9   > (2.95)^2   3.11    19
2         18305 + 117   4.0   > (1.97)^2   2.28    15

c, c+f: constraints at corners, or at corners and arithmetic averages over faces; Nc: number of constraints; C: relative size of the coarse problem; τ: condition number target; ω̃: condition number indicator; κ: approximate condition number estimate; it: number of iterations (tolerance $10^{-8}$).

SLIDE 29

What next in adaptive multilevel BDDC?

In progress:

  • parallel implementation
  • further validation on practical engineering problems

Implementation issues:

  • The adaptive algorithm may give many additional face coarse degrees of freedom on a few substructure interfaces.
  • How to implement BDDC efficiently with a variable and sometimes large number of coarse degrees of freedom on substructure faces?

Two approaches:

  • Implementation by a generalized change of variables
  • Implementation on top of frontal solver

SLIDE 30

Change of variables on face or edge

Li and Widlund (2007): change of variables on a face/edge into

  • one constant over the face/edge
  • all others with zero average

In the new variables, the implementation can then treat averages just like corners. The subdomain matrix gets a little more dense.

$$\text{old} = \begin{bmatrix} 1 & -1 & \cdots & -1 \\ 1 & 1 & & \\ \vdots & & \ddots & \\ 1 & & & 1 \end{bmatrix} * \text{new}$$

One old variable needs to be chosen to be replaced by the constant function. Here it was the first one; any other one would be just as good.

But what to do when there are more averages, with general weights, not just 1s?
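
The transformation matrix shown above is easy to verify numerically. The sketch below is an illustration with an assumed helper name and arbitrary sample values; it builds the matrix for one face or edge with five variables and checks that the first new variable is the arithmetic average of the old ones.

```python
import numpy as np

def average_change_of_variables(n):
    """Matrix T with old = T @ new; new[0] is the constant (average) component."""
    T = np.eye(n)
    T[:, 0] = 1.0       # first column: the constant function on the face/edge
    T[0, 1:] = -1.0     # first old variable absorbs the zero-average condition
    return T

n = 5
T = average_change_of_variables(n)
old = np.array([3.0, 1.0, 4.0, 1.0, 5.0])   # arbitrary sample values (assumed)
new = np.linalg.solve(T, old)
```

Here `new[0]` equals `old.mean()`, and the remaining new variables are the deviations `old[1:] - old.mean()`, so constraining the average amounts to fixing a single variable.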

SLIDE 31

Implementation: Generalized change of variables

Adaptive BDDC in 3D, Mandel, Sousedík, Šístek (in preparation)

Permute variables so that those to be replaced by averages are the first $k$. Transform the variables by

$$\text{new} = \begin{bmatrix} B_{avg} \\ 0 \quad I \end{bmatrix} * \text{old} = \begin{bmatrix} v_1^T \\ \vdots \\ v_k^T \\ 0 \quad I \end{bmatrix} * \text{old} = \begin{bmatrix} U & V \\ 0 & I \end{bmatrix} * \text{old}$$

Rows of $B_{avg}$ are linearly independent. The inverse transformation needs to be stable, so $U$ must be well conditioned.

QR decomposition with pivoting permutes linearly independent columns of $B_{avg}$ first:

$$B_{avg} * \text{permutation} = QR = Q \, [\text{triangular} \ \ \text{rectangular}] = [U \ V]$$

Drop the averages where the diagonal entries of $R$ are small.
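
A minimal sketch of this transformation follows. It is not the authors' implementation: instead of a library QR with pivoting it uses a simple greedy column selection in the same spirit (the hypothetical helper `pivot_columns`), and the weights in `B_avg` are random placeholders.

```python
import numpy as np

def pivot_columns(B, tol=1e-10):
    """Greedy column pivoting: repeatedly pick the column with the largest
    remaining norm and project its direction out of all columns."""
    Rm = np.array(B, dtype=float)
    pivots = []
    for _ in range(Rm.shape[0]):
        norms = np.linalg.norm(Rm, axis=0)
        j = int(np.argmax(norms))
        if norms[j] <= tol:
            break                       # remaining averages are dependent: drop them
        pivots.append(j)
        q = Rm[:, j] / norms[j]
        Rm -= np.outer(q, q @ Rm)
    return pivots

rng = np.random.default_rng(2)
k, n = 2, 6
B_avg = rng.uniform(0.1, 1.0, (k, n))   # k averages with general weights (assumed data)
piv = pivot_columns(B_avg)
perm = piv + [j for j in range(n) if j not in piv]

# new = [[U, V], [0, I]] @ old, with the pivot columns permuted to the front
U, V = B_avg[:, perm[:k]], B_avg[:, perm[k:]]
T = np.block([[U, V], [np.zeros((n - k, k)), np.eye(n - k)]])

old = rng.standard_normal(n)
new = T @ old[perm]                     # first k entries are the weighted averages
old_back = np.empty(n)
old_back[perm] = np.linalg.solve(T, new)
```

The first `k` new variables are exactly `B_avg @ old`, so the averages can again be treated like corner values, and the inverse transformation only requires solves with the `k` by `k` block `U`.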

SLIDE 32

Implementation: Global matrices

All BDDC/FETI-DP operators implemented as matrices on a virtual disconnected mesh, assembled at corners only

SLIDE 33

Implementation: Global matrices (2)

  • All BDDC/FETI-DP operators implemented as matrices on a virtual disconnected mesh, assembled at corners only
  • Operations on disconnected mesh vectors by the usual matrix-vector algebra
      - in a real implementation: by substructures, take advantage of the special form for speedup
      - for testing: just use Matlab sparse matrices and write the formulas as usual
  • parallel implementation of $M^{-1} u$ by multifrontal method with ordering: MUMPS
  • averages by transformation of variables, or projection; sparsity is still OK

SLIDE 34

Bridge test problem

Finite element discretization and substructuring of the bridge construction, consisting of 157 356 degrees of freedom, 16 substructures, 250 corners, 30 edges and 43 faces.

SLIDE 35

Adaptive performance for the bridge test problem

Non-adaptive method:

constraint         Nc     κ        it
c                         2301.4   224
c+e                180    2252.4   220
c+e+f              309    653.6    160
c+e+f (3eigv)      309    177.8    103

Adaptive constraints:

τ            ω̃        Nc     κ        it
∞ (=c+e)     6500.5   180    2252.4   220
650          589.3    185    483.5    169
30           29.6     292    28.7     64
5            > 5      655    5.0      26
2            > 2      1301   2.0      14

τ: threshold for the condition indicator; ω̃: the condition number indicator achieved; Nc: number of coarse dofs; κ: actual condition number as estimated by PCG; it: number of iterations.

SLIDE 36

BDDC Implementation on Top of Frontal Solver

Šístek, Novotný, Mandel, Čertíková, Burda (in print)

Frontal solver: solves linear equations from finite elements with some variables free and some constrained. The matrix is given as a collection of local element matrices, never assembled whole.

In: $f_1$, $x_2$. Out: $x_1$, $Rea$.

$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} f_1 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ Rea \end{bmatrix}$$

Straightforward: use the frontal solver to

  • solve the coarse problem in BDDC (substructure treated as element)
  • solve the local substructure problems in BDDC in the case of corner constraints only

New: by specially crafted calls and a few extras, use the frontal solver to

  • solve the local substructure problems in BDDC in the case of general constraints (averages)
  • build the coarse basis functions

SLIDE 37

Frontal solver

  • B. M. Irons, 1970
  • direct solver for sparse matrices arising in FEM
  • number of flops $O(n \cdot n_{fron}^2)$, where $n_{fron} \ll n$ is the front width
  • memory demand: $n_{fron}^2$ if out-of-core
  • element-by-element approach: element matrices are read from file until a whole line is assembled, then immediately eliminated
  • basic scheme: block 1 holds the 'free' variables, block 2 the 'constrained' (also 'fixed') variables

$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} + \begin{bmatrix} 0 \\ Rea_2 \end{bmatrix} \qquad (1)$$

  • $x_2$, $f_1$, $f_2$: inputs; $x_1$, $Rea_2$ (reaction forces): outputs
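
The input/output convention of scheme (1) can be mimicked with dense linear algebra. This is only a sketch of the interface, under the assumption of a small assembled matrix; an actual frontal solver never assembles `A` and eliminates element by element. The function name and the 1D example are illustrative.

```python
import numpy as np

def solve_with_fixed(A, free, fixed, x2, f1, f2):
    """Given fixed values x2 and loads f1, f2, return x1 and the reactions Rea2
    such that A [x1; x2] = [f1; f2] + [0; Rea2]."""
    A11 = A[np.ix_(free, free)]
    A12 = A[np.ix_(free, fixed)]
    A21 = A[np.ix_(fixed, free)]
    A22 = A[np.ix_(fixed, fixed)]
    x1 = np.linalg.solve(A11, f1 - A12 @ x2)   # first block row of (1)
    rea2 = A21 @ x1 + A22 @ x2 - f2            # second block row of (1)
    return x1, rea2

# Toy example (assumed): 1D Laplacian on 5 nodes, end nodes fixed to 0 and 1
n = 5
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
free, fixed = [1, 2, 3], [0, 4]
x2 = np.array([0.0, 1.0])
f1 = np.zeros(3)
f2 = np.zeros(2)
x1, rea2 = solve_with_fixed(A, free, fixed, x2, f1, f2)
```

The free values interpolate linearly between the fixed end values, and `rea2` holds the forces needed to hold the fixed nodes in place; such reactions are the by-product that the BDDC implementation reuses.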

SLIDE 38

General constraints vs. frontal solver on subdomain

  • central idea: split matrix $C$ according to types of constraints
      - corners as Dirichlet boundary conditions, i.e. fixed variables
      - averages enforced by Lagrange multipliers: matrix $C_f$
  • coarse problem construction on subdomain (index $i$ omitted)

$$\begin{bmatrix} A_{ff} & A_{fc} & C_f^T \\ A_{cf} & A_{cc} & 0 \\ C_f & 0 & 0 \end{bmatrix} \begin{bmatrix} \Psi_f^{c} & \Psi_f^{avg} \\ I & 0 \\ \lambda^{c} & \lambda^{avg} \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ rea & rea \\ 0 & I \end{bmatrix}$$

  • get rid of indefiniteness by eliminating the first block

SLIDE 39

Preconditioner setup

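1. Forward step of the frontal solver with corners marked as fixed variables in matrix $A$.

2. Find $A_{ff}^{-1} C_f^T$ by a backward solve by the frontal solver:
   $$\begin{bmatrix} A_{ff} & A_{fc} \\ A_{cf} & A_{cc} \end{bmatrix} \begin{bmatrix} A_{ff}^{-1} C_f^T \\ 0 \end{bmatrix} = \begin{bmatrix} C_f^T \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ (C_f A_{ff}^{-1} A_{fc})^T \end{bmatrix}.$$

3. Construct $C_f A_{ff}^{-1} C_f^T$ and factor it by LAPACK.

4. Backward solve of the dual problem by LAPACK for $\lambda$ from
   $$C_f A_{ff}^{-1} C_f^T \lambda = - \begin{bmatrix} C_f A_{ff}^{-1} A_{fc} & I \end{bmatrix}.$$

5. Backward solve for $\Psi_f$ by the frontal solver:
   $$\begin{bmatrix} A_{ff} & A_{fc} \\ A_{cf} & A_{cc} \end{bmatrix} \begin{bmatrix} \Psi_f^{c} & \Psi_f^{avg} \\ I & 0 \end{bmatrix} = \begin{bmatrix} -C_f^T \lambda \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ Rea \end{bmatrix}.$$

6. Compute the local coarse matrix
   $$A_C = \Psi^T A \Psi = \Psi^T \begin{bmatrix} -C_f^T \lambda \\ Rea \end{bmatrix}.$$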

SLIDE 40

General constraints vs. frontal solver

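  • subdomain problem solution

$$\begin{bmatrix} A_{ff} & A_{fc} & C_f^T \\ A_{cf} & A_{cc} & 0 \\ C_f & 0 & 0 \end{bmatrix} \begin{bmatrix} u_f \\ 0 \\ \mu \end{bmatrix} = \begin{bmatrix} r \\ rea \\ 0 \end{bmatrix}$$

  • get rid of indefiniteness by eliminating the first block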

SLIDE 41

Algorithm of preconditioning action on subdomain

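1. Backward step of the frontal solver for $A_{ff}^{-1} r$:
   $$\begin{bmatrix} A_{ff} & A_{fc} \\ A_{cf} & A_{cc} \end{bmatrix} \begin{bmatrix} A_{ff}^{-1} r \\ 0 \end{bmatrix} = \begin{bmatrix} r \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ Rea \end{bmatrix}.$$

2. Backward step of LAPACK for $\mu$:
   $$C_f A_{ff}^{-1} C_f^T \mu = C_f A_{ff}^{-1} r.$$

3. Backward step of the frontal solver for $u_f$:
   $$\begin{bmatrix} A_{ff} & A_{fc} \\ A_{cf} & A_{cc} \end{bmatrix} \begin{bmatrix} u_f \\ 0 \end{bmatrix} = \begin{bmatrix} -C_f^T \mu + r \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ Rea \end{bmatrix}.$$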
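
The three steps amount to solving a saddle-point system by eliminating the first block. A dense sketch, under the assumption that the corner variables are already fixed to zero so only the `ff` block remains; the function name, the 1D Laplacian, and the single arithmetic average are illustrative choices, and dense solves stand in for the reused frontal and LAPACK factorizations.

```python
import numpy as np

def constrained_solve(A_ff, C_f, r):
    """Solve [A_ff C_f^T; C_f 0] [u_f; mu] = [r; 0] by eliminating the first block."""
    y = np.linalg.solve(A_ff, r)                  # step 1: A_ff^{-1} r
    X = np.linalg.solve(A_ff, C_f.T)              # A_ff^{-1} C_f^T (available from setup)
    mu = np.linalg.solve(C_f @ X, C_f @ y)        # step 2: small dual problem
    u_f = np.linalg.solve(A_ff, r - C_f.T @ mu)   # step 3: corrected right-hand side
    return u_f, mu

# Toy data (assumed): 1D Laplacian block and one arithmetic average constraint
n = 6
A_ff = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
C_f = np.full((1, n), 1.0 / n)
r = np.arange(1.0, n + 1)
u_f, mu = constrained_solve(A_ff, C_f, r)
```

The result satisfies the average constraint `C_f @ u_f = 0`, and only solves with the positive definite block `A_ff` and with the small dual matrix are needed, which is how the indefinite system is avoided.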

SLIDE 42

Implementation

  • subdomain problems: frontal solver + LAPACK
  • coarse problem: MUltifrontal Massively Parallel sparse direct Solver (MUMPS), http://mumps.enseeiht.fr
  • mainly Fortran 77 programming language, partly Fortran 90, MPI library
  • tested on SGI Altix 4700, CTU, Prague, CR: 72 processors Intel Itanium 2, OS Linux

SLIDE 43

Hip joint replacement

  • 33 186 quadratic elements, 544 734 unknowns
  • 16 subdomains: 35 corners, 12 edges, and 35 faces
  • 32 subdomains: 57 corners, 12 edges, and 66 faces
  • 16 processors of SGI Altix 4700

[Figure: decomposition into 32 subdomains]

SLIDE 44

Hip joint replacement

von Mises stresses in improved design

SLIDE 45

Hip joint replacement

[Plot: condition number estimate versus number of corners, for nsub = 16 and nsub = 32; SGI Altix 4700, 16 processors]

Condition number for adding corners

SLIDE 46

Hip joint replacement

[Plot: wall time in seconds versus number of corners (variable coarse problem), for nsub = 16 and nsub = 32; SGI Altix 4700, 16 processors]

Wall clock time for adding corners

SLIDE 47

Hip joint replacement, 16 subdomains

Adding averages to 335 corners:

coarse problem         C.     C.+E.   C.+F.   C.+E.+F.
iterations             35     34      26      26
cond. number est.      96     96      65      65
factorization (sec)    91     80      78      106
pcg iter (sec)         53     49      38      37
total (sec)            183    166     153     181

Coarse degrees of freedom:

  • C.: Corners only
  • C.+E.: Corners and averages on Edges
  • C.+F.: Corners and averages on Faces
  • C.+E.+F.: Corners and averages on Edges and Faces

SLIDE 48

Hip joint replacement, 32 subdomains

Adding averages to 557 corners:

coarse problem         C.     C.+E.   C.+F.   C.+E.+F.
iterations             35     32      30      27
cond. number est.      149    70      59      46
factorization (sec)    60     57      59      62
pcg iter (sec)         49     40      37      34
total (sec)            128    115     113     113

Coarse degrees of freedom:

  • C.: Corners only
  • C.+E.: Corners and averages on Edges
  • C.+F.: Corners and averages on Faces
  • C.+E.+F.: Corners and averages on Edges and Faces

SLIDE 49

Conclusion

  • distinguish between point constraints and averages for the frontal solver
  • many matrices needed in BDDC are just a simple side-product of the frontal solver (reactions)
  • a 'minimal' number of corners does not assure minimal solution time
  • constraints on edges and/or faces can considerably shorten the solution time
  • a more sophisticated (adaptive) way of selecting constraints: ongoing research
