

SLIDE 1

A new algorithm for computing quadrature-based bounds in conjugate gradients

Petr Tichý

Czech Academy of Sciences Charles University in Prague

joint work with

Gérard Meurant and Zdeněk Strakoš

June 08–13, 2014 Householder Symposium XIX, Spa, Belgium

SLIDE 2

Problem formulation

Consider a system

Ax = b

where A ∈ R^{n×n} is symmetric positive definite. Without loss of generality, ‖b‖ = 1 and x_0 = 0.

SLIDE 3

The conjugate gradient method

input A, b
r_0 = b, p_0 = r_0
for k = 1, 2, . . . do
    γ_{k−1} = (r_{k−1}^T r_{k−1}) / (p_{k−1}^T A p_{k−1})
    x_k = x_{k−1} + γ_{k−1} p_{k−1}
    r_k = r_{k−1} − γ_{k−1} A p_{k−1}
    δ_k = (r_k^T r_k) / (r_{k−1}^T r_{k−1})
    p_k = r_k + δ_k p_{k−1}
    test quality of x_k
end for

The CG coefficients define the matrices

    D_k = diag( γ_0^{−1}, . . . , γ_{k−1}^{−1} ),

and L_k, the unit lower bidiagonal matrix with ones on the diagonal and √δ_1, . . . , √δ_{k−1} on the subdiagonal.
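The iteration above translates directly into NumPy. A minimal sketch (the function name, the simplified stopping test, and the returned coefficient lists are ours, added so that D_k and L_k can be assembled later):

```python
import numpy as np

def cg(A, b, tol=1e-12, max_iter=None):
    """Conjugate gradients as on the slide, with x0 = 0.
    Returns x_k and the coefficient lists (gamma_j), (delta_j)."""
    n = len(b)
    max_iter = max_iter if max_iter is not None else n
    x = np.zeros(n)                    # x0 = 0
    r = b.copy()                       # r0 = b
    p = r.copy()                       # p0 = r0
    gammas, deltas = [], []
    for _ in range(max_iter):
        rr = r @ r
        gamma = rr / (p @ (A @ p))     # gamma_{k-1}
        x = x + gamma * p
        r = r - gamma * (A @ p)
        delta = (r @ r) / rr           # delta_k
        p = r + delta * p
        gammas.append(gamma)
        deltas.append(delta)
        if np.sqrt(r @ r) <= tol:      # "test quality of x_k" (simplified)
            break
    return x, gammas, deltas
```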

SLIDE 4

How to measure quality of an approximation in CG?

A practically relevant question

using residual information: the normwise backward error, or the relative residual norm.

“Using of the residual vector rk as a measure of the ‘goodness’ of the estimate xk is not reliable” [Hestenes & Stiefel 1952]

using error estimates: the A-norm of the error, or the Euclidean norm of the error.

“The function (x − xk, A(x − xk)) can be used as a measure of the “goodness” of xk as an estimate of x.” [Hestenes & Stiefel 1952]

The A-norm of the error plays an important role in stopping criteria [Deuflhard 1994], [Arioli 2004], [Jiránek, Strakoš, Vohralík 2006].

SLIDE 5

The Lanczos algorithm

Let A be symmetric; compute an orthonormal basis of K_k(A, b).

input A, b
v_1 = b/‖b‖, β_0 = 0, v_0 = 0
for k = 1, 2, . . . do
    α_k = v_k^T A v_k
    w = A v_k − α_k v_k − β_{k−1} v_{k−1}
    β_k = ‖w‖
    v_{k+1} = w/β_k
end for

The recurrence coefficients form the Jacobi matrix T_k, tridiagonal with α_1, . . . , α_k on the diagonal and β_1, . . . , β_{k−1} on the off-diagonals.
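The recurrence above can be sketched in NumPy as follows (function name ours; no reorthogonalization is done, which is exactly why the finite precision slides later matter):

```python
import numpy as np

def lanczos(A, b, k):
    """Lanczos three-term recurrence; returns the Jacobi matrix T_k."""
    alphas, betas = [], []
    v = b / np.linalg.norm(b)          # v1 = b / ||b||
    v_prev = np.zeros_like(v)          # v0 = 0
    beta_prev = 0.0                    # beta0 = 0
    for _ in range(k):
        Av = A @ v
        alpha = v @ Av                 # alpha_k = v_k^T A v_k
        w = Av - alpha * v - beta_prev * v_prev
        beta = np.linalg.norm(w)       # beta_k = ||w||
        alphas.append(alpha)
        if beta == 0.0:                # invariant subspace found
            break
        betas.append(beta)
        v_prev, v, beta_prev = v, w / beta, beta
    m = len(alphas)
    # T_k: alphas on the diagonal, beta_1..beta_{m-1} on the off-diagonals
    return (np.diag(alphas)
            + np.diag(betas[:m - 1], 1)
            + np.diag(betas[:m - 1], -1))
```

For k = n the eigenvalues of T_n reproduce those of A (in exact arithmetic).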

SLIDE 6

CG versus Lanczos

Let A be symmetric positive definite.

Lanczos: T_k  ←→  CG: D_k, L_k

Both algorithms generate an orthogonal basis of K_k(A, b): Lanczos using a three-term recurrence → T_k; CG using a coupled two-term recurrence → D_k, L_k. The two are related by the factorization

    T_k = L_k D_k L_k^T .
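The factorization T_k = L_k D_k L_k^T is easy to check numerically: run k steps of CG and of Lanczos on the same A and b, build D_k and L_k from the CG coefficients, and compare with the Jacobi matrix. A self-contained sketch (helper names are ours):

```python
import numpy as np

def cg_coeffs(A, b, k):
    # k steps of CG (x0 = 0); return the gamma_j and delta_j coefficients
    r = b.copy(); p = r.copy()
    gammas, deltas = [], []
    for _ in range(k):
        rr = r @ r
        g = rr / (p @ (A @ p))
        r = r - g * (A @ p)
        d = (r @ r) / rr
        p = r + d * p
        gammas.append(g); deltas.append(d)
    return np.array(gammas), np.array(deltas)

def lanczos_T(A, b, k):
    # k steps of Lanczos; return the Jacobi matrix T_k
    v = b / np.linalg.norm(b)
    v_prev = np.zeros_like(v); beta_prev = 0.0
    al, be = [], []
    for _ in range(k):
        a = v @ (A @ v)
        w = A @ v - a * v - beta_prev * v_prev
        beta = np.linalg.norm(w)
        al.append(a); be.append(beta)
        v_prev, v, beta_prev = v, w / beta, beta
    return np.diag(al) + np.diag(be[:-1], 1) + np.diag(be[:-1], -1)

def ldlt_from_cg(gammas, deltas):
    # D_k = diag(1/gamma_j); L_k unit lower bidiagonal with sqrt(delta_j)
    k = len(gammas)
    D = np.diag(1.0 / gammas)
    L = np.eye(k) + np.diag(np.sqrt(deltas[:-1]), -1)
    return L @ D @ L.T
```

For a well-conditioned SPD matrix and modest k, the two products agree to rounding error.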

SLIDE 7

CG, Lanczos and Gauss quadrature

CG, Lanczos ←→ Gauss quadrature (orthogonal polynomials, Jacobi matrices). At any iteration step k, CG (implicitly) determines the weights and nodes of the k-point Gauss quadrature

    ∫_ζ^ξ f(λ) dω(λ) = Σ_{i=1}^{k} ω_i f(θ_i) + R_k[f] .

SLIDE 8

Gauss quadrature for f(λ) ≡ λ^{−1}

Gauss quadrature:

    ∫_ζ^ξ λ^{−1} dω(λ) = Σ_{i=1}^{k} ω_i / θ_i + R_k[λ^{−1}] .

Lanczos:

    (T_n^{−1})_{1,1} = (T_k^{−1})_{1,1} + R_k[λ^{−1}] .

CG:

    ‖x‖_A^2 = Σ_{j=0}^{k−1} γ_j ‖r_j‖^2 + ‖x − x_k‖_A^2 ,   with τ_k := Σ_{j=0}^{k−1} γ_j ‖r_j‖^2 .

Important: R_k[λ^{−1}] > 0 .
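The CG form of the identity is straightforward to verify numerically; a sketch with arbitrary SPD test data (the matrix and right-hand side are ours, not from the talk):

```python
import numpy as np

# Verify  ||x||_A^2 = sum_{j=0}^{k-1} gamma_j ||r_j||^2 + ||x - x_k||_A^2
A = np.diag([1.0, 2.0, 5.0, 10.0, 20.0])   # SPD test matrix
b = np.ones(5)
b /= np.linalg.norm(b)                      # ||b|| = 1, as on slide 2
x_exact = np.linalg.solve(A, b)
xAx = x_exact @ (A @ x_exact)               # ||x||_A^2

x = np.zeros(5); r = b.copy(); p = r.copy()
tau = 0.0                                   # tau_k = sum of gamma_j ||r_j||^2
for k in range(1, 6):
    rr = r @ r
    gamma = rr / (p @ (A @ p))
    tau += gamma * rr                       # add gamma_{k-1} ||r_{k-1}||^2
    x = x + gamma * p
    r = r - gamma * (A @ p)
    p = r + ((r @ r) / rr) * p
    e = x_exact - x
    # the Gauss remainder is exactly the squared A-norm of the error
    assert abs(xAx - tau - e @ (A @ e)) <= 1e-10
```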

SLIDE 9

Gauss-Radau quadrature for f(λ) = λ^{−1}

µ is prescribed:

    ∫_ζ^ξ f(λ) dω(λ) = Σ_{i=1}^{k} ω̃_i f(θ̃_i) + ω̃_{k+1} f(µ) + R̃_k[f] ,

where the quadrature part equals (T̃_{k+1}^{−1})_{1,1} ≡ τ̃_{k+1}, with

    T̃_{k+1} = tridiag with α_1, . . . , α_k, α̃_{k+1} on the diagonal and β_1, . . . , β_k on the off-diagonals,

the modified entry α̃_{k+1} being chosen so that µ is an eigenvalue of T̃_{k+1}.

Important: if 0 < µ ≤ λ_min, then R̃_k[λ^{−1}] < 0 .

SLIDE 10

Idea of estimating the A-norm of the error

[Golub & Strakoš 1994], [Golub & Meurant 1994, 1997]

Consider two quadrature rules at steps k and k + d, d > 0:

    ‖x‖_A^2 = τ_k + ‖x − x_k‖_A^2 ,
    ‖x‖_A^2 = τ̂_{k+d} + R̂_{k+d} .

Then

    ‖x − x_k‖_A^2 = τ̂_{k+d} − τ_k + R̂_{k+d} .

Gauss quadrature: R̂_{k+d} > 0 → lower bound.
Radau quadrature: R̂_{k+d} < 0 → upper bound.

How to compute τ̂_{k+d} − τ_k efficiently?

SLIDE 11

How to compute τ̂_{k+d} − τ_k efficiently?

    ‖x − x_k‖_A^2 = τ̂_{k+d} − τ_k + R̂_{k+d} .

For numerical reasons it is not convenient to compute τ_k and τ̂_{k+d} explicitly. Instead,

    τ̂_{k+d} − τ_k = Σ_{j=k}^{k+d−2} (τ_{j+1} − τ_j) + ( τ̂_{k+d} − τ_{k+d−1} ) ≡ Σ_{j=k}^{k+d−2} ∆_j + ∆̂_{k+d−1} ,

and update the ∆_j’s without subtractions. Recall τ_j = (T_j^{−1})_{1,1}.
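The Gauss (lower-bound) half of this strategy is simple to realize: with ∆_j = γ_j ‖r_j‖^2 (as in the CGQ algorithm on slide 15), the bound at step k is the sum of the d most recent ∆_j. A sketch (function name ours; the delay d trades tightness against a d-step reporting lag):

```python
import numpy as np

def cg_lower_bounds(A, b, d=4, max_iter=None):
    """Run CG and return, for each k, the Gauss-quadrature lower bound
       sum_{j=k}^{k+d-1} Delta_j  <=  ||x - x_k||_A^2,
    with Delta_j = gamma_j ||r_j||^2.  Returns the iterates as well."""
    n = len(b)
    max_iter = max_iter if max_iter is not None else n
    x = np.zeros(n); r = b.copy(); p = r.copy()
    xs, Delta = [x.copy()], []
    for _ in range(max_iter):
        rr = r @ r
        gamma = rr / (p @ (A @ p))
        x = x + gamma * p
        r = r - gamma * (A @ p)
        p = r + ((r @ r) / rr) * p
        xs.append(x.copy())
        Delta.append(gamma * rr)        # Delta_j = gamma_j ||r_j||^2
        if np.sqrt(r @ r) <= 1e-13:
            break
    m = len(Delta)
    # the bound for step k is available once Delta_{k+d-1} is known
    bounds = [(k, sum(Delta[k:k + d])) for k in range(max(m - d + 1, 0))]
    return xs, bounds
```

Note that no subtraction of nearly equal quantities occurs: the ∆_j are summed, never differenced.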

SLIDE 12

Golub and Meurant approach

[Golub & Meurant 1994, 1997]: use the tridiagonal matrices,

    CG → T_k → T_k − µI → T̃_k ,

and compute the ∆’s using updating strategies; there is no need to store the tridiagonal matrices. Use the formulas

    ‖x − x_k‖_A^2 = Σ_{j=k}^{k+d−1} ∆_j + ‖x − x_{k+d}‖_A^2 ,

    ‖x − x_k‖_A^2 = Σ_{j=k}^{k+d−2} ∆_j + ∆^{(µ)}_{k+d−1} + R^{(R)}_{k+d} .

SLIDE 13

CGQL (Conjugate Gradients and Quadrature via Lanczos)

input A, b, x_0, µ
r_0 = b − A x_0, p_0 = r_0
δ_0 = 0, γ_{−1} = 1, c_1 = 1, β_0 = 0, d_0 = 1, α̃^{(µ)}_1 = µ
for k = 1, . . . , until convergence do
    CG-iteration (k)
    α_k = 1/γ_{k−1} + δ_{k−1}/γ_{k−2}
    β_k^2 = δ_k / γ_{k−1}^2
    d_k = α_k − β_{k−1}^2 / d_{k−1}
    ∆_{k−1} = ‖r_0‖^2 c_k^2 / d_k
    α̃^{(µ)}_{k+1} = µ + β_k^2 / ( α_k − α̃^{(µ)}_k )
    ∆^{(µ)}_k = ‖r_0‖^2 β_k^2 c_k^2 / ( d_k ( α̃^{(µ)}_{k+1} d_k − β_k^2 ) )
    c_{k+1}^2 = β_k^2 c_k^2 / d_k^2
    Estimates(k, d)
end for
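A NumPy transcription of CGQL as we read it from this slide (a sketch: the CG iteration is inlined, the Estimates(k, d) step is omitted and the ∆'s are returned instead; variable names are ours):

```python
import numpy as np

def cgql(A, b, mu, max_iter):
    """CGQL sketch: run CG and, from its coefficients, update the Lanczos
    quantities alpha_k, beta_k^2 and the continued-fraction variables
    d_k, c_k^2 to produce the Gauss Delta_{k-1} and Radau Delta_k^{(mu)}."""
    n = len(b)
    x = np.zeros(n); r = b.copy(); p = r.copy()
    r0_norm2 = r @ r
    gamma_prev = 1.0        # gamma_{-1} = 1
    delta_prev = 0.0        # delta_0 = 0
    c2 = 1.0                # c_1^2 = 1
    beta2_prev = 0.0        # beta_0^2 = 0
    d_prev = 1.0            # d_0 = 1
    alpha_t = mu            # alpha~^{(mu)}_1 = mu
    Delta, Dmu = [], []
    for k in range(1, max_iter + 1):
        # --- CG iteration (k) ---
        rr = r @ r
        gamma = rr / (p @ (A @ p))
        x = x + gamma * p
        r = r - gamma * (A @ p)
        delta = (r @ r) / rr
        p = r + delta * p
        # --- quadrature updates via the Lanczos coefficients ---
        alpha = 1.0 / gamma + delta_prev / gamma_prev      # alpha_k
        beta2 = delta / gamma**2                           # beta_k^2
        dk = alpha - beta2_prev / d_prev                   # d_k
        Delta.append(r0_norm2 * c2 / dk)                   # Delta_{k-1}
        alpha_t_next = mu + beta2 / (alpha - alpha_t)      # alpha~^{(mu)}_{k+1}
        Dmu.append(r0_norm2 * beta2 * c2 /
                   (dk * (alpha_t_next * dk - beta2)))     # Delta^{(mu)}_k
        c2 = beta2 * c2 / dk**2                            # c_{k+1}^2
        gamma_prev, delta_prev = gamma, delta
        beta2_prev, d_prev, alpha_t = beta2, dk, alpha_t_next
    return x, Delta, Dmu
```

On A = diag(1, 2) with b = (1, 1)/√2 and µ = λ_min = 1, one can check by hand that ∆_0 = 2/3 and ∆^{(µ)}_1 = 1/12, the exact squared A-norm error at step 1 (the Radau rule with k + 1 = n nodes is exact).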

SLIDE 14

Our approach

[Meurant & T. 2013]: update the LDL^T factorizations of T_k and T̃_k,

    CG → L_k D_k L_k^T ,  L̃_k D̃_k L̃_k^T .

We use the tridiagonal matrices only implicitly, which yields very simple formulas for updating ∆_{k−1} and ∆^{(µ)}_k.

In [Meurant & T. 2013], this idea is also used for other types of quadrature rules (Gauss-Lobatto, anti-Gauss).

SLIDE 15

CGQ (Conjugate Gradients and Quadrature)

[Meurant & T. 2013]

input A, b, x_0, µ
r_0 = b − A x_0, p_0 = r_0, ∆^{(µ)}_0 = ‖r_0‖^2 / µ
for k = 1, . . . , until convergence do
    CG-iteration (k)
    ∆_{k−1} = γ_{k−1} ‖r_{k−1}‖^2
    ∆^{(µ)}_k = ‖r_k‖^2 ( ∆^{(µ)}_{k−1} − ∆_{k−1} ) / ( µ ( ∆^{(µ)}_{k−1} − ∆_{k−1} ) + ‖r_k‖^2 )
    Estimates(k, d)
end for
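CGQ in NumPy, with the Estimates(k, d) step realized via the formulas of slides 10-12 (function name and return convention are ours; a sketch, not the authors' code):

```python
import numpy as np

def cgq(A, b, mu, d=1, tol=1e-12, max_iter=None):
    """CGQ sketch: CG plus the quadrature updates of the slide.
    Returns the iterates x_0, x_1, ... and triples (k, lower, upper) with
        lower <= ||x - x_k||_A^2 <= upper,
    the upper bound being valid for 0 < mu <= lambda_min(A)."""
    n = len(b)
    max_iter = max_iter if max_iter is not None else n
    x = np.zeros(n); r = b.copy(); p = r.copy()
    xs = [x.copy()]
    Delta = []                      # Delta_{k-1} = gamma_{k-1} ||r_{k-1}||^2
    Dmu = [(r @ r) / mu]            # Delta^{(mu)}_0 = ||r_0||^2 / mu
    for _ in range(max_iter):
        rr_old = r @ r
        gamma = rr_old / (p @ (A @ p))
        x = x + gamma * p
        r = r - gamma * (A @ p)
        rr = r @ r
        p = r + (rr / rr_old) * p
        xs.append(x.copy())
        Delta.append(gamma * rr_old)
        t = Dmu[-1] - Delta[-1]     # Delta^{(mu)}_{k-1} - Delta_{k-1}
        Dmu.append(rr * t / (mu * t + rr))
        if np.sqrt(rr) <= tol:
            break
    m = len(Delta)
    # Estimates(k, d): Gauss (lower) and Gauss-Radau (upper) bounds
    bounds = [(k, sum(Delta[k:k + d]), sum(Delta[k:k + d - 1]) + Dmu[k + d - 1])
              for k in range(max(m - d + 1, 0))]
    return xs, bounds
```

With d = 1 the bounds at step k are simply ∆_k and ∆^{(µ)}_k; larger d tightens the lower bound at the price of a d-step delay.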

SLIDE 16

Conclusions (theoretical part)

Simple formulas for computing bounds on ‖x − x_k‖_A, almost for free. They also work well with preconditioning. How do they behave in finite precision arithmetic?

SLIDE 17

CG in finite precision arithmetic

Orthogonality is lost, convergence is delayed!

[Figure: convergence history on a model problem; iterations 20–120, y-axis 10^{−16} to 10^{−2}. Curves: ‖x − x_j‖_A in exact arithmetic (E), ‖x − x_j‖_A in finite precision (FP), and the loss of orthogonality ‖I − V_j^T V_j‖_F (FP).]

Identities need not hold in finite precision arithmetic!

SLIDE 18

Bounds in finite precision arithmetic

Observation: CGQL and CGQ give the same results (up to a small inaccuracy). Do the bounds correspond to ‖x − x_k‖_A? For the Gauss quadrature lower bound → yes [Strakoš & T. 2002]. What about the Gauss-Radau upper bound?

    ‖x − x_k‖_A^2 = ∆^{(µ)}_k + R^{(R)}_{k+1} ,

    ‖x − x_k‖_A ≤ √( ∆^{(µ)}_k ) .

SLIDE 19

Gauss-Radau upper bound, exact arithmetic

Strakoš matrix, n = 48, λ1 = 0.1, λn = 1000, ρ = 0.9, d = 1

[Figure: ‖x − x_k‖_A together with the Gauss-Radau upper bounds for µ = 0.5 λ_1 and for µ = λ_1 (CGQ, strakos48); iterations 5–50, y-axis 10^{−3} to 10^{1}.]

SLIDE 20

Gauss-Radau upper bound, finite precision arithmetic

Strakoš matrix, n = 48, λ1 = 0.1, λn = 1000, ρ = 0.9, d = 1

[Figure: ‖x − x_k‖_A together with the Gauss-Radau upper bounds for µ = 0.5 λ_1 and for µ = λ_1 (CGQ, strakos48); iterations 20–120, y-axis 10^{−16} to 10^{2}.]

SLIDE 21

Gauss-Radau upper bound, finite precision arithmetic

Strakoš matrix, n = 48, λ1 = 0.1, λn = 1000, ρ = 0.9, d = 1

µ = α λ1 , α ∈ (0, 1], α ≈ 1 (red), α ≈ 0 (magenta)

[Figure: the bound √(∆^{(µ)}_k) for the various values of α (CGQ, strakos48); iterations 80–120, y-axis 10^{−16} to 10^{−2}.]

SLIDE 22

Gauss-Radau upper bound, finite precision arithmetic

Strakoš matrix, n = 48, λ1 = 0.1, λn = 1000, ρ = 0.9, d = 1

µ = α λ1 , α ∈ (0, 1], α ≈ 1 (red), α ≈ 0 (magenta)

[Figure: the quantity √(α ∆^{(µ)}_k) for the various values of α (CGQ, strakos48); iterations 80–120, y-axis 10^{−16} to 10^{−2}.]

SLIDE 23

Conclusions (numerical observation)

Gauss-Radau upper bound

It seems that √ε is a limiting level for the accuracy of the Gauss-Radau upper bound. We cannot avoid subtractions in computing this bound, and if µ ≈ λ_1, then T_k − µI may be ill conditioned.

The simple formulas open the way to an investigation of the numerical behaviour. Understanding it can help in suggesting another approach, and in improving the Gauss quadrature lower bound (adaptive choice of d).

SLIDE 24

Related papers

• G. Meurant and P. Tichý, On computing quadrature-based bounds for the A-norm of the error in CG, Numer. Algorithms, 62 (2013), pp. 163–191.

• G. H. Golub and G. Meurant, Matrices, moments and quadrature with applications, Princeton University Press, USA, 2010.

• Z. Strakoš and P. Tichý, On error estimation in CG and why it works in finite precision computations, ETNA, 13 (2002), pp. 56–80.

• G. H. Golub and G. Meurant, Matrices, moments and quadrature II, BIT, 37 (1997), pp. 687–705.

• G. H. Golub and Z. Strakoš, Estimates in quadratic formulas, Numer. Algorithms, 8 (1994), pp. 241–268.

Thank you for your attention!