IDR( s ) A family of simple and fast algorithms for solving large - - PowerPoint PPT Presentation

idr s
SMART_READER_LITE
LIVE PREVIEW

IDR( s ) A family of simple and fast algorithms for solving large - - PowerPoint PPT Presentation

IDR( s ) A family of simple and fast algorithms for solving large nonsymmetric systems of linear equations Harrachov 2007 Peter Sonneveld en Martin van Gijzen August 24, 2007 1 Delft University of Technology Outline Introduction The


slide-1
SLIDE 1

1

Delft University of Technology

IDR(s)

A family of simple and fast algorithms for solving large nonsymmetric systems of linear equations

Harrachov 2007 Peter Sonneveld en Martin van Gijzen August 24, 2007

slide-2
SLIDE 2

August 24, 2007 2

Outline

  • Introduction
  • The Induced Dimension Reduction Theorem
  • The IDR(s) algorithm
  • Numerical experiments
  • Conclusions
slide-3
SLIDE 3

August 24, 2007 3

Product Bi-CG methods

Bi-CG solves nonsymmetric linear systems using (short) CG recursions but needs extra matvec with AH. Idea of Sonneveld: use ’wasted’ matvec in a more useful way. Result: transpose-free methods:

  • CGS (Sonneveld, 1989)
  • Bi-CGSTAB (Van der Vorst, 1992)
  • BiCGSTAB2 (Gutknecht, 1993)
  • TFQMR (Freund, 1993)
  • BiCGstab(ℓ) (Sleijpen and Fokkema, 1994)
  • Many other variants
slide-4
SLIDE 4

August 24, 2007 4

Historical remarks

Sonneveld first developed IDR (1980). Analysis showed that IDR was Bi-CG combined with linear minimal residual steps. The fact that IDR is transpose free, combined with the relation with Bi-CG led to the development of a now famous algorithm: CGS. Later Van der Vorst proposed another famous method: Bi-CGSTAB, which is mathematicaly equivalent with IDR. As a result of these developments, the basic IDR idea was abandoned for the Bi-CG approach. IDR is now forgotten.

slide-5
SLIDE 5

August 24, 2007 5

The IDR idea

The IDR-idea is to generate a sequence of subspaces G0 · · · Gj with the following operations:

  • Intersect Gj−1 with fixed subspace S,
  • Compute Gj = (I − ωjA)(Gj−1 ∩ S).

The subspaces G0 · · · Gj are nested and of shrinking dimension.

slide-6
SLIDE 6

August 24, 2007 6

The IDR theorem

Theorem 1 (IDR) Let A be any matrix in CN×N, let v0 be any nonzero vector in CN, and let G0 be the complete Krylov space KN(A, v0). Let S denote any (proper) subspace of CN such that S and G0 do not share a nontrivial invariant subspace of A, and define the sequence Gj, j = 1, 2, . . . as Gj = (I − ωjA)(Gj−1 ∩ S) where the ωj’s are nonzero scalars. Then (i) Gj ⊂ Gj−1 for all j > 0. (ii) Gj = {0} for some j ≤ N.

slide-7
SLIDE 7

August 24, 2007 7

Making an IDR algorithm

The IDR theorem can be used to construct solution algorithms. This is done by constructing residuals rn ∈ Gj. According to the IDR theorem ultimately rn ∈ GM = {0}. In order to generate residuals and corresponding solution approximations we first look at the basic recursions.

slide-8
SLIDE 8

August 24, 2007 8

Krylov methods: basic recursion (1)

A Krylov-type solver produces iterates xn, for which the residuals rn = b − Axn are in the Krylov spaces Kn(A, r0) = r0 ⊕ Ar0 ⊕ A2r0 ⊕ · · · ⊕ Anr0 , The next residual rn+1 can be generated by rn+1 = −αArn +

b j

  • j=0

βjrn−j . The parameters α, βj determine the specific method and must be such that xn+1 can be computed.

slide-9
SLIDE 9

August 24, 2007 9

Krylov methods: basic recursion (2)

By using the difference vector ∆rk = rk+1 − rk = −A(xn+1 − xn), an explicit way to satisfy this requirement is rn+1 = rn − αArn −

b j

  • j=1

γj∆rn−j , which leads to the following update for the x estimate: xn+1 = xn + αrn −

b j

  • j=1

γj∆xn−j ,

slide-10
SLIDE 10

August 24, 2007 10

Computation of a new residual (1)

Residuals are computed that are forced to be in the subspaces Gj, by application of the IDR-theorem. The residual rn+1 is in Gj+1 if rn+1 = (I − ωj+1A)v with v ∈ Gj ∩ S . The main problem is to find v.

slide-11
SLIDE 11

August 24, 2007 11

Computation of a new residual (2)

slide-12
SLIDE 12

August 24, 2007 12

Computation of a new residual (3)

The vector v is a combination of the residuals rl in Gj. v = rn −

b j

  • j=1

γj∆rn−j . Let the space S be the left null space of some N × s matrix P: P = (p1 p2 . . . ps), S = N(PH) . Since v is also in S = N(PH), it must satisfy PHv = 0 . Combining these two yields an s × j linear system for the coefficients γj that (normally) is uniquely solvable if j = s.

slide-13
SLIDE 13

August 24, 2007 13

Computation of a new residual (4)

Hence with the residual rn, and a matrix ∆R consisting of the last s residual differences: ∆R = (∆rn−1 ∆rn−2 . . . ∆rn−s) a suitable v can be found by Solve s × s system (PH∆R)c = PHrn Calculate v = rn − ∆Rc

slide-14
SLIDE 14

August 24, 2007 14

Building Gj+1 (1)

Assume rn and all columns of ∆R are in Gj, and let rn+1 be calculated as rn+1 = v − ωj+1Av Then rn+1 ∈ Gj+1. Since Gj+1 ⊂ Gj (theorem) we automatically have rn+1 ∈ Gj Now the next ∆R is made by repeating the calculations. In this way we find s + 1 residuals in Gj+1

slide-15
SLIDE 15

August 24, 2007 15

Building Gj+1 (2)

slide-16
SLIDE 16

August 24, 2007 16

Building Gj+1 (3)

slide-17
SLIDE 17

August 24, 2007 17

Building Gj+1 (4)

slide-18
SLIDE 18

August 24, 2007 18

A few details

  • 1. The first s + 1 residuals, starting with r0 can be constructed

by any Krylov-based iteration, such as a local minimum residual method.

  • 2. In our actual implementation, all steps are identical.

However, in calculating the first residual in Gj+1, a new value ωj+1 may be chosen. We choose ωj+1 such that v − ωj+1Av is minimal.

slide-19
SLIDE 19

August 24, 2007 19

Basic IDR(s) algorithm.

while rn > TOL or n < MAXIT do for k = 0 to s do Solve c from PHdRnc = PHrn v = rn − dRnc; t = Av; if k = 0 then ω = (tHv)/(tHt); end if drn = −dRnc − ωt; dxn = −dXnc + ωv; rn+1 = rn + drn; xn+1 = xn + dxn; n = n + 1; dRn = (drn−1 · · · drn−s); dXn = (dxn−1 · · · dxn−s); end for end while

slide-20
SLIDE 20

August 24, 2007 20

Vector operations per MATVEC

Method DOT AXPY Memory Requirements IDR(1) 2 4 8 IDR(2) 22

3

55

6

11 IDR(4) 42

5

9 7

10

17 IDR(6) 62

7

13 9

14

23 Full GMRES

n+1 2 n+1 2

n + 2 BiCGSTAB 2 3 7

slide-21
SLIDE 21

August 24, 2007 21

Relation with other methods

Although the approach is different, IDR(s) is closely related to some Bi-CGSTAB methods:

  • IDR(1) and Bi-CGSTAB yield the same residuals at the even

steps.

  • ML(k)BiCGSTAB (Yeung and Chen, 1999) seems closely

related to IDR(s), BUT

  • IDR(s) is MUCH simpler (both conceptually and its

implementation)

  • Other, more natural extensions are possible, e.g. to

avoid breakdown.

slide-22
SLIDE 22

August 24, 2007 22

Performance of IDR(s)

The IDR theorem states that

  • it is possible to generate a sequence of nested subspace Gj
  • f shrinking dimension,
  • but does not say how fast the dimension shrinks

It can be proven that the dimension reduction is (normally) s, So dim(Gj+1) = dim(Gj) − s. IDR(s) requires at at most N + N

s matrix-vector multiplications to

compute the exact solution.

slide-23
SLIDE 23

August 24, 2007 23

Numerical experiments

We will present two typical numerical examples

  • A 2D Ocean Circulation Problem
  • A 3D Helmholtz Problem
slide-24
SLIDE 24

August 24, 2007 24

A 2D Ocean Circulation Problem

We compare IDR(s) with Full GMRES, restarted GMRES and Bi-CGSTAB. This ocean example is representative for a wide class of CFD problems. We will compare:

  • Rate of convergence
  • Stagnation level (of the true residual norm)
slide-25
SLIDE 25

August 24, 2007 25

Stommel’s model for ocean circulation

Balance between bottom friction, wind stress and Coriolis force. −r ∆ψ − β ∂ψ ∂x − = (∇ × F)z plus circulation condition around islands k

  • Γk

r ∂ψ ∂n ds = −

  • Γk

F · s ds.

  • ψ: streamfunction
  • r: bottom friction parameter
  • β: Coriolis parameter
  • F: Wind stress
slide-26
SLIDE 26

August 24, 2007 26

Discretization of the ocean problem

  • Discretization with linear finite elements
  • Results in nonsymmetric system of 42248
  • Eigenvalues are (almost) real

Solution parameters:

  • ILU(0) preconditioning
  • P: s − 1 random vectors plus r0 (for comparison with

Bi-CGSTAB)

slide-27
SLIDE 27

August 24, 2007 27

Solution of the ocean problem

50 100 150 200 250 300 350 −80 −60 −40 −20 20 40 60 80

slide-28
SLIDE 28

August 24, 2007 28

Convergence for the ocean problem

100 200 300 400 500 600 700 10

−20

10

−15

10

−10

10

−5

10 10

5

Number of MATVECS Scaled residual norm Convergence 2D ocean problem IDR1 IDR2 IDR4 IDR6 BICGSTAB GMRES

slide-29
SLIDE 29

August 24, 2007 29

Some observations

  • Required number of MATVECS decreases if s is increased.

IDR(4) and IDR(6) are close to the optimal convergence curve of full GMRES.

  • Convergence curves of IDR(1) and Bi-CGSTAB coincide.
  • Stagnation levels of IDR(s) comparable with Bi-CGSTAB.
slide-30
SLIDE 30

August 24, 2007 30

Required number of MATVECS

Method Number of MATVECS Vectors Full GMRES 265 268 GMRES(20) > 10000 23 GMRES(50) 4671 53 Bi-CGSTAB 411 7 IDR(1) 420 8 IDR(2) 339 11 IDR(4) 315 17 IDR(6) 307 23 Tolerance: b − Axn < 10−8b

slide-31
SLIDE 31

August 24, 2007 31

A 3D Helmholtz Problem

Example models sound propagation in a room of 4 × 4 × 4m3. A harmonic sound source gives acoustic pressure field p(x, t) = p(x)e2πift. The pressure function p can be determined from −(2πf)2 c2

  • p − ∆

p = δ(x − xs) in Ω. in which

  • c: the sound speed (340 m/s)
  • δ(x − xs): the harmonic point source, in the center of the

room.

slide-32
SLIDE 32

August 24, 2007 32

Boundary conditions

Five of the walls are reflecting, modeled by ∂ p ∂n = 0 , and the remaining wall is sound absorbing, ∂ p ∂n = −2πif c

  • p on Γ3.
slide-33
SLIDE 33

August 24, 2007 33

Discretization

Discretization with FEM yields linear system [−(2πf)2M + 2πifC + K]p = b

  • Frequency f = 100Hz.
  • System matrix complex, symmetric (but not Hermitian) and

indefinite: difficult for iterative methods

  • gridsize h = 8 cm: 132651 equations

Solution parameters:

  • ILU(0) preconditioning
  • P: Initial residual plus s − 1 real random vectors
  • Only comparison with BiCGstab(ℓ)
slide-34
SLIDE 34

August 24, 2007 34

Results Helmholtz Problem

Method Number of Elapsed time MATVECS [s] IDR(1) 1500 3322 IDR(2) 598 1329 IDR(4) 353 783 IDR(6) 310 698 BiCGstab(1) 1828 3712 BiCGstab(2) 1008 2045 BiCGstab(4) 656 1362 BiCGstab(8) 608 1337 Tolerance: b − Axn < 10−8b

slide-35
SLIDE 35

August 24, 2007 35

Conclusions

  • The IDR-theorem offers a new approach for the

development of iterative solution algorithms

  • The IDR(s) algorithm presented here is quite promising and

seems to outperform state-of-the-art Bi-CG-type methods for important classes of problems. More information: http://ta.twi.tudelft.nl/nw/users/gijzen/software.html

  • Report: IDR(s): a family of simple and fast algorithms for

solving large nonsymmetric linear systems, submitted

  • Matlab code, (includes preconditioning and deflation)