SLIDE 1

Challenges in Analysis of Algebraic Iterative Solvers

Jörg Liesen
Technical University of Berlin

and

Zdeněk Strakoš
Charles University in Prague and Czech Academy of Sciences
http://www.karlin.mff.cuni.cz/~strakos

Workshop in honor of K. R. Rajagopal, Prague, March 2012

SLIDE 2

Z. Strakoš

Cornelius Lanczos, March 9, 1947

“The reason why I am strongly drawn to such approximation mathematics problems is not the practical applicability of the solution, but rather the fact that a very “economical” solution is possible only when it is very “adequate”. To obtain a solution in very few steps means nearly always that one has found a way that does justice to the inner nature of the problem.”

SLIDE 3

Albert Einstein, March 18, 1947

“Your remark on the importance of adapted approximation methods makes very good sense to me, and I am convinced that this is a fruitful mathematical aspect, and not just a utilitarian one.”

SLIDE 4

Algebraic iterative computations

  • In iterative methods applied to linear algebraic problems, the computational cost of finding a sufficiently accurate approximation to the exact solution heavily depends on the particular data, i.e.,

❋ on the underlying real-world problem,
❋ on the mathematical model,
❋ on its discretisation.

  • Any evaluation of cost in iterative computations must take into account the effects of rounding errors.

  • In mathematical modeling of real-world phenomena, the accuracy of the computed approximation must be related to the underlying phenomena. Its evaluation cannot be restricted to algebra.
SLIDE 5

Is there any algebraic error worth consideration?

Knupp and Salari, 2003: “There may be incomplete iterative convergence (IICE) or round-off error that is polluting the results. If the code uses an iterative solver, then one must be sure that the iterative stopping criteria is sufficiently tight so that the numerical and discrete solutions are close to one another. Usually in order-verification tests, one sets the iterative stopping criterion to just above the level of machine precision to circumvent this possibility.”

Why do we care? Is not all this algebraic stuff linear and simple?

SLIDE 6

Conjugate Gradients

A HPD; given x0, set r0 = b − A x0, p0 = r0.

‖x − xn‖_A = min_{u ∈ x0 + Kn(A, r0)} ‖x − u‖_A ,
Kn(A, r0) ≡ span{r0, A r0, …, A^{n−1} r0} .

For n = 1, 2, …:

γ_{n−1} = (r_{n−1}, r_{n−1}) / (p_{n−1}, A p_{n−1})
x_n = x_{n−1} + γ_{n−1} p_{n−1}
r_n = r_{n−1} − γ_{n−1} A p_{n−1}
δ_n = (r_n, r_n) / (r_{n−1}, r_{n−1})
p_n = r_n + δ_n p_{n−1} .

Hestenes and Stiefel (1952), Lanczos (1950, 1952). This algebraic stuff is nothing but linear!
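The recurrences above translate directly into code. The following is a minimal illustrative sketch (my own Python, not from the talk): plain Hestenes-Stiefel CG for an HPD matrix, with no stopping criterion and no preconditioning.

```python
import numpy as np

def cg(A, b, x0, n_iters):
    """Conjugate gradients via the recurrences on this slide.

    Assumes A is symmetric (Hermitian) positive definite. Returns all
    iterates x_0, x_1, ..., x_{n_iters}. Purely illustrative sketch.
    """
    x = np.asarray(x0, dtype=float).copy()
    r = b - A @ x                  # r_0 = b - A x_0
    p = r.copy()                   # p_0 = r_0
    iterates = [x.copy()]
    rr = r @ r
    for _ in range(n_iters):
        Ap = A @ p
        gamma = rr / (p @ Ap)      # gamma_{n-1} = (r,r)/(p,Ap)
        x = x + gamma * p          # x_n
        r = r - gamma * Ap         # r_n
        rr_new = r @ r
        p = r + (rr_new / rr) * p  # p_n = r_n + delta_n p_{n-1}
        rr = rr_new
        iterates.append(x.copy())
    return iterates
```

In exact arithmetic the A-norm of the error is minimized over the shifted Krylov subspace at every step, so it decreases monotonically; the later slides discuss how finite precision changes the convergence behaviour.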

SLIDE 7

CG is the Gauss-Christoffel Quadrature

Ax = b , x0   ←→   ω(λ) ,  ∫_ζ^ξ f(λ) dω(λ)

T_n y_n = ‖r0‖ e1   ←→   ω^(n)(λ) ,  Σ_{i=1}^{n} ω_i^(n) f(θ_i^(n))

x_n = x0 + W_n y_n ,    ω^(n)(λ) −→ ω(λ) .

SLIDE 8

Distribution function ω(λ)

λ_i, s_i are the eigenpairs of A ,  ω_i = |(s_i, w_1)|² ,  w_1 = r_0/‖r_0‖ .

[Figure: nondecreasing step function ω(λ) on [L, U] with jumps of sizes ω_1, ω_2, …, ω_N at the eigenvalues λ_1 < λ_2 < … < λ_N, accumulating to 1.]

Hestenes and Stiefel (1952), Lanczos (1952, almost unknown)
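The correspondence on the last two slides can be checked numerically: n Lanczos steps build the Jacobi matrix T_n, whose eigenvalues θ_i^(n) and squared first eigenvector components ω_i^(n) give the n-point Gauss rule for ω(λ). Below is a sketch of my own (the diagonal test matrix, spectrum, and starting vector are arbitrary choices, not from the talk); Lanczos is run with full reorthogonalization to stay close to exact arithmetic.

```python
import numpy as np

def lanczos_jacobi(A, w1, n):
    """n steps of Lanczos with full reorthogonalization; returns T_n."""
    N = A.shape[0]
    W = np.zeros((N, n + 1))
    alpha, beta = np.zeros(n), np.zeros(n)
    W[:, 0] = w1 / np.linalg.norm(w1)
    for j in range(n):
        v = A @ W[:, j]
        alpha[j] = W[:, j] @ v
        v -= alpha[j] * W[:, j]
        if j > 0:
            v -= beta[j - 1] * W[:, j - 1]
        v -= W[:, :j + 1] @ (W[:, :j + 1].T @ v)   # reorthogonalize
        beta[j] = np.linalg.norm(v)
        if beta[j] > 0:
            W[:, j + 1] = v / beta[j]
    return np.diag(alpha) + np.diag(beta[:n - 1], 1) + np.diag(beta[:n - 1], -1)

def gauss_rule(T):
    """Nodes theta_i^(n) and weights omega_i^(n) from the Jacobi matrix T_n."""
    theta, S = np.linalg.eigh(T)
    return theta, S[0, :] ** 2

lam = np.linspace(1.0, 10.0, 8)       # eigenvalues of the test matrix A
A = np.diag(lam)
rng = np.random.default_rng(3)
w1 = rng.standard_normal(8)
w1 /= np.linalg.norm(w1)
omega = w1 ** 2                        # weights omega_i = |(s_i, w_1)|^2

f = lambda t: 1.0 / t
integral = omega @ f(lam)              # Riemann-Stieltjes integral of 1/lambda
quads = [gauss_rule(lanczos_jacobi(A, w1, n)) for n in (2, 4, 8)]
approx = [w @ f(th) for th, w in quads]
```

For this discrete ω with 8 points of increase, the 8-node rule reproduces the integral exactly (up to roundoff), and the quadrature error decreases with n, matching the CG error interpretation on the next slide.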

SLIDE 9

CG does model reduction matching 2n moments

∫_L^U λ^{−1} dω(λ) = Σ_{i=1}^{n} ω_i^(n) (θ_i^(n))^{−1} + R_n(f) ,

i.e.,

‖x − x0‖_A² / ‖r0‖² = n-th Gauss quadrature + ‖x − xn‖_A² / ‖r0‖² .

With x0 = 0 ,

b* A^{−1} b = Σ_{j=0}^{n−1} γ_j ‖r_j‖² + r_n* A^{−1} r_n .

Golub, Meurant, Reichel, Boley, Gutknecht, Saylor, Smolarski, ......... , Meurant and S (2006), Golub and Meurant (2010), S and Tichý (2011), Liesen, S, Krylov Subspace Methods, OUP (2012)
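With x0 = 0 the identity above can be verified directly in floating point: accumulate γ_j ‖r_j‖² along the CG iteration and compare, at every step n, against b^T A^{−1} b with the remainder r_n^T A^{−1} r_n added. A sketch on invented SPD test data (matrix, spectrum, and right-hand side are my own choices):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 30
Q, _ = np.linalg.qr(rng.standard_normal((N, N)))
A = Q @ np.diag(np.linspace(0.5, 50.0, N)) @ Q.T
b = rng.standard_normal(N)
lhs = b @ np.linalg.solve(A, b)     # b^T A^{-1} b

x = np.zeros(N); r = b.copy(); p = r.copy()
quad = 0.0                          # running sum_{j<n} gamma_j ||r_j||^2
checks = []
for n in range(1, 11):
    Ap = A @ p
    gamma = (r @ r) / (p @ Ap)
    quad += gamma * (r @ r)         # add gamma_{n-1} ||r_{n-1}||^2
    x += gamma * p
    r_new = r - gamma * Ap
    p = r_new + ((r_new @ r_new) / (r @ r)) * p
    r = r_new
    # remainder r_n^T A^{-1} r_n, i.e. ||x - x_n||_A^2
    checks.append(quad + r @ np.linalg.solve(A, r))
```

Each entry of `checks` should reproduce `lhs`: the n-term quadrature part plus the remainder is constant along the iteration.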

SLIDE 10

Outline

  • 1. CG convergence bounds based on Chebyshev polynomials
  • 2. Sensitivity of the Gauss-Christoffel quadrature
  • 3. PDE discretizations and matrix computations
SLIDE 11

1 Linear bounds for the nonlinear method?

‖x − xn‖_A = min_{p(0)=1, deg(p)≤n} ‖A^{1/2} p(A)(x − x0)‖
           = min_{p(0)=1, deg(p)≤n} ‖Y p(Λ) Y* A^{1/2}(x − x0)‖
           ≤ [ min_{p(0)=1, deg(p)≤n} max_{1≤j≤N} |p(λj)| ] ‖x − x0‖_A .

Using the shifted Chebyshev polynomials on the interval [λ1, λN] ,

‖x − xn‖_A ≤ 2 ( (√κ(A) − 1) / (√κ(A) + 1) )^n ‖x − x0‖_A .
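The Chebyshev-based bound is easy to evaluate against the actual CG error. Below is a sketch of my own (one SPD test matrix with an evenly spread spectrum and κ(A) = 400; the slide makes no claim about this particular example): in exact arithmetic the bound must majorize the A-norm error curve, usually with plenty of room to spare.

```python
import numpy as np

def cg_errors(A, b, n_iters):
    """A-norm errors ||x - x_n||_A of plain CG, n = 0..n_iters."""
    x_star = np.linalg.solve(A, b)
    x = np.zeros_like(b); r = b.copy(); p = r.copy()
    a_norm = lambda v: float(np.sqrt(v @ A @ v))
    errs = [a_norm(x_star - x)]
    for _ in range(n_iters):
        Ap = A @ p
        gamma = (r @ r) / (p @ Ap)
        x = x + gamma * p
        r_new = r - gamma * Ap
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
        errs.append(a_norm(x_star - x))
    return np.array(errs)

rng = np.random.default_rng(2)
N = 40
eigs = np.linspace(1.0, 400.0, N)          # kappa(A) = 400
Q, _ = np.linalg.qr(rng.standard_normal((N, N)))
A = Q @ np.diag(eigs) @ Q.T
b = rng.standard_normal(N)

errs = cg_errors(A, b, 25)
kappa = eigs[-1] / eigs[0]
q = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)
bounds = 2.0 * q ** np.arange(26) * errs[0]  # the slide's bound, n = 0..25
```

As the next slides stress, the bound being valid does not make it descriptive: it is really a bound for the Chebyshev method, and CG is typically much faster.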

SLIDE 12

1 Minimization property and the bound

This bound has a remarkably wiggling history:

  • Markov (1890)
  • Flanders and Shortley (1950)
  • Lanczos (1953), Kincaid (1947), Young (1954, ... )
  • Stiefel (1958), Rutishauser (1959)
  • Meinardus (1963), Kaniel (1966)
  • Daniel (1967a, 1967b)
  • Luenberger (1969)

It is relevant to the Chebyshev method!

SLIDE 13

1 Composite bounds considering large outliers?

This bound should not be used in connection with the behaviour of CG unless κ(A) = λN/λ1 is really small, or unless the (very special) distribution of eigenvalues makes it relevant. In particular, one should be very careful when using it as a part of a composite bound in the presence of large outlying eigenvalues:

min_{p(0)=1, deg(p)≤n−s} max_{1≤j≤N} |q_s(λj) p(λj)|
    ≤ max_{1≤j≤N} |q_s(λj)| |T_{n−s}(λj)| / |T_{n−s}(0)|
    < max_{1≤j≤N−s} |T_{n−s}(λj)| / |T_{n−s}(0)| .

This Chebyshev method bound on the interval [λ1, λ_{N−s}] is then valid after s initial steps.

SLIDE 14

1 Quote (2009, … ): the desired accuracy ε

Theorem. After

k = s + (1/2) √(λ_{N−s}/λ1) ln(2/ε)

iteration steps, CG will produce the approximate solution x_k satisfying

‖x − x_k‖_A ≤ ε ‖x − x0‖_A .

This recently republished and used statement is, in finite precision arithmetic, not true at all.
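To make the quoted iteration count concrete, here is a tiny helper of my own evaluating the formula (the spectrum values below are an invented example). The slide's point stands: this exact-arithmetic prediction can be badly wrong in finite precision when large outlying eigenvalues are present.

```python
import math

def predicted_iterations(lambda_1, lambda_N_minus_s, s, eps):
    """Iteration count from the quoted theorem (exact-arithmetic reasoning):
    k = s + (1/2) * sqrt(lambda_{N-s}/lambda_1) * ln(2/eps)."""
    return s + 0.5 * math.sqrt(lambda_N_minus_s / lambda_1) * math.log(2.0 / eps)

# hypothetical spectrum: effective interval [1, 100] after s = 3 outliers
k = predicted_iterations(lambda_1=1.0, lambda_N_minus_s=100.0, s=3, eps=1e-8)
```

With these numbers the theorem predicts roughly 99 steps to reach ε = 10⁻⁸; the finite precision experiments on the following slides show why such a prediction cannot be trusted.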

SLIDE 15

1 Axelsson (1976), Jennings (1977)

p. 72: “… it may be inferred that rounding errors … affects the convergence rate when large outlying eigenvalues are present.”

SLIDE 16

1 The composite bounds completely fail

[Figure: A-norm error curves, vertical scale 10^0 down to 10^−15, over 100 iterations. Composite bounds with varying number of outliers: exact CG (left) and finite precision CG (right), Gergelits (2011).]

SLIDE 17

2 CG and Gauss-Christoffel quadrature errors

∫_L^U λ^{−1} dω(λ) = Σ_{i=1}^{n} ω_i^(n) (θ_i^(n))^{−1} + R_n(f) ,

‖x − x0‖_A² / ‖r0‖² = n-th Gauss quadrature + ‖x − xn‖_A² / ‖r0‖² .

Consider two slightly different distribution functions, with

I_ω = ∫_L^U λ^{−1} dω(λ) ≈ I_ω^n ,    I_ω̃ = ∫_L^U λ^{−1} dω̃(λ) ≈ I_ω̃^n .

SLIDE 18

2 Sensitivity of the Gauss-Christoffel Q.

[Figure: two plots over iterations n = 1–20, vertical scale 10^−10 to 10^0. Left: quadrature error for the perturbed and for the original integral. Right: difference of the quadrature estimates versus difference of the integrals.]

SLIDE 19

2 The point goes back to 1814

1. Gauss-Christoffel quadrature for a small number of quadrature nodes can be highly sensitive to small changes in the distribution function that enlarge its support. In particular, the difference between the corresponding quadrature approximations (using the same number of quadrature nodes) can be many orders of magnitude larger than the difference between the integrals being approximated.

2. This sensitivity in Gauss-Christoffel quadrature can be observed for discontinuous, continuous, and even analytic distribution functions, and for analytic integrands uncorrelated with changes in the distribution functions, with no singularity close to the interval of integration.

SLIDE 20

2 Theorem - O’Leary, S, Tichý (2007)

Consider distribution functions ω(λ) and ω̃(λ) on [L, U] . Let

p_n(λ) = (λ − λ1) … (λ − λn)  and  p̃_n(λ) = (λ − λ̃1) … (λ − λ̃n)

be the nth orthogonal polynomials corresponding to ω and ω̃ respectively, with p̂_s(λ) = (λ − ξ1) … (λ − ξs) their least common multiple. If f″ is continuous on [L, U] , then the difference Δ^n_{ω,ω̃} between the approximation I_ω^n to I_ω and the approximation I_ω̃^n to I_ω̃ , obtained from the n-point Gauss-Christoffel quadrature, is bounded as

|Δ^n_{ω,ω̃}| ≤ | ∫_L^U p̂_s(λ) f[ξ1, …, ξs, λ] dω(λ) − ∫_L^U p̂_s(λ) f[ξ1, …, ξs, λ] dω̃(λ) |
            + | ∫_L^U f(λ) dω(λ) − ∫_L^U f(λ) dω̃(λ) | .
SLIDE 21

3 Take a very simple model boundary value problem

−Δu = 16 η1 η2 (1 − η1)(1 − η2)

on the unit square with zero Dirichlet boundary conditions. Galerkin finite element method (FEM) discretization with linear basis functions on the regular triangular grid with the mesh size h = 1/(m + 1), where m is the number of inner nodes in each direction. Discrete (piecewise linear) solution

u_h = Σ_{j=1}^{N} ζ_j φ_j(η1, η2) .

Computational error:

u − u_h^(n) (total error) = u − u_h (discretisation error) + u_h − u_h^(n) (algebraic error) .

SLIDE 22

3 Local discretization and global computation

Discrete (piecewise linear) solution u_h = Σ_{j=1}^{N} ζ_j φ_j(η1, η2) .

  • If ζ_j is known exactly, then u_h^(n) = u_h , and the global information is approximated as the linear combination of the local basis functions.

  • Apart from trivial cases, ζ_j , which supplies the global information, is not known exactly.

SLIDE 23

3 Local discretisation and global computation


SLIDE 24

3 Energy norm of the error

Theorem

Up to a small inaccuracy proportional to machine precision,

‖∇(u − u_h^(n))‖² = ‖∇(u − u_h)‖² + ‖∇(u_h − u_h^(n))‖²
                 = ‖∇(u − u_h)‖² + ‖x − xn‖_A² .

Using zero Dirichlet boundary conditions,

‖∇(u − u_h)‖² = ‖∇u‖² − ‖∇u_h‖² .
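The splitting in the theorem can be checked in a 1D analogy (a sketch I added; the talk treats the 2D Poisson problem): −u″ = 1 on (0,1) with u(0) = u(1) = 0, exact solution u(x) = x(1−x)/2 and ‖∇u‖² = 1/12. Linear FEM on a uniform mesh gives the tridiagonal stiffness matrix A and load vector b below; Galerkin orthogonality gives ∫ ∇u·∇u_h^(n) = bᵀx_n, so all three error norms are computable.

```python
import numpy as np

m = 63                                   # inner nodes, h = 1/64
h = 1.0 / (m + 1)
# 1D linear-FEM stiffness matrix (1/h) tridiag(-1, 2, -1) and load for f = 1
A = (np.diag(2.0 * np.ones(m)) - np.diag(np.ones(m - 1), 1)
     - np.diag(np.ones(m - 1), -1)) / h
b = h * np.ones(m)                       # b_j = int f phi_j = h

grad_u_sq = 1.0 / 12.0                   # ||grad u||^2 of the exact solution
zeta = np.linalg.solve(A, b)             # exact Galerkin coefficients
grad_uh_sq = zeta @ A @ zeta             # ||grad u_h||^2
disc_err_sq = grad_u_sq - grad_uh_sq     # ||grad(u - u_h)||^2 (Dirichlet BCs)

# an inexact algebraic approximation x_n: a few plain CG steps
x = np.zeros(m); r = b.copy(); p = r.copy()
for _ in range(10):
    Ap = A @ p
    gamma = (r @ r) / (p @ Ap)
    x += gamma * p
    r_new = r - gamma * Ap
    p = r_new + ((r_new @ r_new) / (r @ r)) * p
    r = r_new
alg_err_sq = (zeta - x) @ A @ (zeta - x)           # ||x - x_n||_A^2

# total error via Galerkin orthogonality: int grad u . grad u_h^(n) = b^T x_n
total_err_sq = grad_u_sq - 2.0 * (b @ x) + x @ A @ x
```

For this model problem the discretisation error in the energy norm is exactly h²/12, and the computed total error splits into the discretisation and algebraic parts as the theorem states.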

SLIDE 25

3 Solution and the discretization error

[Figure: exact solution u of the Poisson model problem (left) and the MATLAB trisurf plot of the discretization error u − u_h, of size about 10^−4 (right).]

SLIDE 26

3 Algebraic and total errors

[Figure: algebraic error u_h − u_h^(n) (left) and the MATLAB trisurf plot of the total error u − u_h^(n) (right), CG with c = 1 and α = 3, both of size about 10^−4.]

‖∇(u − u_h^(n))‖² = ‖∇(u − u_h)‖² + ‖x − xn‖_A² = 5.8444e−03 + 1.4503e−05 .

SLIDE 27

3 Algebraic and total errors

[Figure: algebraic error u_h − u_h^(n) (left, of size about 10^−5) and the MATLAB trisurf plot of the total error u − u_h^(n) (right, of size about 10^−4), CG with c = 0.5 and α = 3.]

‖∇(u − u_h^(n))‖² = ‖∇(u − u_h)‖² + ‖x − xn‖_A² = 5.8444e−03 + 5.6043e−07 .

SLIDE 28

3 One can see the 1D analogy

[Figure: the algebraic and the total error in a 1D problem, for algebraic normwise backward error 6.6554e−17 (left, the discretization error dominates) and 0.0020031 (right), Papež (2011).]

SLIDE 29

3 Why?

Krylov subspace methods represent matching moments model reduction!

SLIDE 30

3 Adaptivity?

We need a-posteriori error bounds which are:

  • locally efficient,
  • fully computable (no hidden constants),
  • and which allow one to compare the contribution of the discretization error and the algebraic error to the total error.

SLIDE 31

Conclusions

Patrick J. Roache’s book Validation and Verification in Computational Science, 2006, p. 387: “With the often noted tremendous increases in computer speed and memory, and with the less often acknowledged but equally powerful increases in algorithmic accuracy and efficiency, a natural question suggests itself. What are we doing with the new computer power? With the new GUI and other set-up advances? With the new algorithms? What should we do? … Get the right answer.”

SLIDE 32

Thank you for your work, help and friendship!