SLIDE 1

Modern Krylov subspace methods (and applications to parabolic control problems) Daniel B. Szyld Temple University, Philadelphia ————————— Scientific and Statistical Computing Seminar University of Chicago 25 October 2012

1

SLIDE 2

Collaborators
Xiuhong Du, Alfred University
Marcus Sarkis, Worcester Polytechnic Institute, and IMPA, Rio de Janeiro
Christian Schaerer, Universidad Nacional de Asunción
Valeria Simoncini, Università di Bologna

SLIDE 3

Plan for the talk

  • Introduction to truncated Krylov subspace methods
  • Inexact Krylov subspace methods: introduction and applications
  • Special case of parabolic control problems

SLIDE 4

General Problem Statement  Solve a system Hx = b, with H Hermitian or non-Hermitian, using Krylov subspace iterative methods.

SLIDE 5

Krylov subspace methods  Km(H, r0) = span{r0, Hr0, H²r0, . . . , H^{m−1}r0}. Subspaces are nested: Km ⊂ Km+1. Given x0, r0 = b − Hx0, find an approximation xm ∈ x0 + Km(H, r0) satisfying some property.

SLIDE 6

Krylov subspace methods (cont.)  Conditions:
Galerkin, e.g., FOM, CG: b − Hxm ⊥ Km(H, r0)
Petrov–Galerkin, e.g., GMRES, MINRES: b − Hxm ⊥ HKm(H, r0), or equivalently
xm = arg min ‖b − Hx‖₂ over x ∈ x0 + Km(H, r0)

SLIDE 7

Krylov subspace methods (cont.)  Some implementation issues

  • Methods work by suitably choosing a basis of Km(H, r0).
  • Let v1, v2, . . . , vm be such a basis, chosen to be orthonormal.
  • One can of course run Gram–Schmidt on {r0, Hr0, H²r0, . . . , H^{m−1}r0} (not advised). Instead, Arnoldi [1951] (for general H) said:
    v1 = r0, normalized
    v2 = Hv1 − ⟨v1, Hv1⟩v1, normalized; span{v1, v2} = span{r0, Hr0}
    v3 = Hv2 − ⟨v2, Hv2⟩v2 − ⟨v1, Hv2⟩v1, normalized; span{v1, v2, v3} = span{r0, Hr0, H²r0}
    etc.

SLIDE 8

Arnoldi method  Let β = ‖r0‖, and v1 = r0/β. For k = 1, . . . : compute Hvk, then
vk+1 hk+1,k = Hvk − Σ_{j=1}^{k} vj hjk,
where hjk = ⟨vj, Hvk⟩, j ≤ k, and hk+1,k is positive and such that ‖vk+1‖ = 1.
In practice: modified Gram–Schmidt or Householder orthogonalization.
With Vm = [v1, v2, . . . , vm], obtain the Arnoldi relation: H Vm = Vm+1 Hm+1,m, where Hm+1,m is the (m + 1) × m upper Hessenberg matrix of the hjk.

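A minimal NumPy sketch of the Arnoldi loop above, using modified Gram–Schmidt; the function name and the random test matrix are illustrative. The final check verifies the Arnoldi relation H Vm = Vm+1 Hm+1,m:

```python
import numpy as np

def arnoldi(H, r0, m):
    """m Arnoldi steps with modified Gram-Schmidt.
    Returns V (n x (m+1), orthonormal columns) and the (m+1) x m
    upper Hessenberg matrix Hm with H @ V[:, :m] == V @ Hm."""
    n = r0.size
    V = np.zeros((n, m + 1))
    Hm = np.zeros((m + 1, m))
    V[:, 0] = r0 / np.linalg.norm(r0)
    for k in range(m):
        w = H @ V[:, k]
        for j in range(k + 1):            # orthogonalize against v1..v_{k+1}
            Hm[j, k] = V[:, j] @ w
            w -= Hm[j, k] * V[:, j]
        Hm[k + 1, k] = np.linalg.norm(w)  # positive by construction
        V[:, k + 1] = w / Hm[k + 1, k]
    return V, Hm

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 50))
b = rng.standard_normal(50)
V, Hm = arnoldi(A, b, 10)
print(np.linalg.norm(A @ V[:, :10] - V @ Hm))   # Arnoldi relation residual
```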
SLIDE 9

Arnoldi relation (cont.)  Hm+1,m is upper Hessenberg: the entries hjk with j ≤ k + 1 (so h21, h32, . . . , hm+1,m on the subdiagonal) may be nonzero, all others are zero. In block form,
Hm+1,m = [ Hm ; hm+1,m e_m^T ],
where Hm is the leading m × m block. Then
H Vm = Vm+1 Hm+1,m = Vm Hm + hm+1,m vm+1 e_m^T

SLIDE 10

Krylov subspace methods (cont.)  More implementation issues
An element of Km(H, v1) is a linear combination of v1, v2, . . . , vm, i.e., of the form Vm y, y ∈ R^m. Each method finds y = ym, and we have xm = Vm ym.
For FOM, or CG, we have the Galerkin condition rm ⊥ Km, i.e.,
0 = Vm^T (b − H xm) = Vm^T b − Vm^T H Vm ym = β e1 − Hm ym

SLIDE 11

FOM: Full Orthogonalization Method [Saad, 1978]
Use the Arnoldi method to construct Vm, Hm; solve Hm ym = β e1; the approximation is xm = Vm ym.
Test for convergence: is ‖rm‖ < ε?
Cost: one matrix-vector product per step; at step m, orthogonalization (m inner products) and one solve with the small m × m upper Hessenberg matrix (LU with no fill).
(‖rm‖ is cheap or free to obtain - no details here.)
Storage: at step m, the m vectors v1, v2, . . . , vm.

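The whole method fits in a few lines: run Arnoldi, solve the small Hessenberg system Hm ym = β e1, and set xm = Vm ym. A NumPy sketch (the shifted random matrix is just a convenient well-behaved test problem, not from the talk):

```python
import numpy as np

def fom(A, b, m):
    """FOM sketch: m Arnoldi steps (modified Gram-Schmidt), then impose
    the Galerkin condition by solving Hm ym = beta * e1."""
    n = b.size
    beta = np.linalg.norm(b)
    V = np.zeros((n, m + 1))
    Hm = np.zeros((m + 1, m))
    V[:, 0] = b / beta
    for k in range(m):
        w = A @ V[:, k]
        for j in range(k + 1):
            Hm[j, k] = V[:, j] @ w
            w -= Hm[j, k] * V[:, j]
        Hm[k + 1, k] = np.linalg.norm(w)
        V[:, k + 1] = w / Hm[k + 1, k]
    e1 = np.zeros(m)
    e1[0] = beta
    ym = np.linalg.solve(Hm[:m, :], e1)   # small upper Hessenberg solve
    return V[:, :m] @ ym

rng = np.random.default_rng(2)
n = 60
A = rng.standard_normal((n, n)) / np.sqrt(n) + 3 * np.eye(n)  # eigenvalues near 3
b = rng.standard_normal(n)
x = fom(A, b, 40)
print(np.linalg.norm(b - A @ x) / np.linalg.norm(b))          # small
```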
SLIDE 12

GMRES implementation
We use the Arnoldi method to obtain Vm, and as before, an element of Km(H, v1) is of the form Vm y, y ∈ R^m. Recall: β = ‖r0‖, H Vm = Vm+1 Hm+1,m.
b − H xm = r0 − H Vm ym = β v1 − Vm+1 Hm+1,m ym = Vm+1 (β e1 − Hm+1,m ym)
‖rm‖ = min over x ∈ x0 + Km of ‖b − Hx‖ = min over y ∈ R^m of ‖β e1 − Hm+1,m y‖
QR factorization Hm+1,m = Qm+1 Rm+1,m, with Rm+1,m = [ Rm ; 0 ], gives
‖rm‖ = min over y ∈ R^m of ‖Qm+1^T β e1 − Rm+1,m y‖.

SLIDE 13

‖rm‖ = min over y ∈ R^m of ‖Qm+1^T β e1 − Rm+1,m y‖.
Write Qm+1^T β e1 = [ tm ; ρm+1 ]. Then ym = Rm^{−1} tm, and xm = Vm ym.
Furthermore, ‖rm‖ = ‖Qm+1^T β e1 − Rm+1,m ym‖ = |ρm+1|.
Use this computed residual for stopping (it may deviate from the true residual!).
Use Givens rotations for the QR factorization, saving the rotations from previous steps. Only two entries per step are needed.

SLIDE 14

GMRES  Cost: one matrix-vector product per step; at step m, orthogonalization (m inner products), rotations for the QR factorization, and the solution of an m × m triangular system. Storage: at step m, the m vectors v1, v2, . . . , vm. Residual norms are monotonically nonincreasing, but stagnation is possible. Superlinear convergence can be observed.

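A NumPy sketch of GMRES as described above; for brevity the small least-squares problem min ‖β e1 − Hm+1,m y‖ is solved once with `lstsq` instead of updating a Givens QR factorization step by step, as a production implementation would:

```python
import numpy as np

def gmres_sketch(A, b, m):
    """GMRES sketch: m Arnoldi steps, then minimize ||beta e1 - Hm+1,m y||.
    (A real implementation updates a QR factorization of Hm+1,m with one
    Givens rotation per step; here lstsq is called once for brevity.)"""
    n = b.size
    beta = np.linalg.norm(b)
    V = np.zeros((n, m + 1))
    Hb = np.zeros((m + 1, m))
    V[:, 0] = b / beta
    for k in range(m):
        w = A @ V[:, k]
        for j in range(k + 1):
            Hb[j, k] = V[:, j] @ w
            w -= Hb[j, k] * V[:, j]
        Hb[k + 1, k] = np.linalg.norm(w)
        V[:, k + 1] = w / Hb[k + 1, k]
    e1 = np.zeros(m + 1)
    e1[0] = beta
    y, *_ = np.linalg.lstsq(Hb, e1, rcond=None)
    return V[:, :m] @ y

rng = np.random.default_rng(7)
n = 60
A = rng.standard_normal((n, n)) / np.sqrt(n) + 3 * np.eye(n)
b = rng.standard_normal(n)
x = gmres_sketch(A, b, 40)
print(np.linalg.norm(b - A @ x) / np.linalg.norm(b))   # small
```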
SLIDE 15
  • Main costs:
    1. Matrix-vector product: Hvk
    2. Orthogonalization
    3. Storage (if there is no recursion)
  • One popular alternative: restarted methods. Hit and miss; little theory. [Saad, 1996], [Simoncini, 2000]
  • It may happen that a larger value of the restart parameter is less efficient! [Embree, 2003]

SLIDE 16

This Talk

  • Consider the case when one does not fully orthogonalize: truncated methods.
  • Reduce the cost of the matrix-vector product when H is either
    – not known exactly,
    – computationally expensive (e.g., Schur complement, reduced Hessian), or
    – preconditioned with a variable (i.e., iteration-dependent) matrix.
  • Apply all this to parabolic control problems
  • Use a Parareal-in-time approximation

SLIDE 17

Truncated Krylov subspace methods
For the same amount of storage (max ℓ vectors), instead of restarting: only orthogonalize with respect to the previous ℓ vectors. In Arnoldi we then have
vk+1 hk+1,k = Hvk − Σ_{j=max{1,k−ℓ+1}}^{k} vj hjk,
where hjk = ⟨vj, Hvk⟩, j ≤ k, and hk+1,k is positive and such that ‖vk+1‖ = 1. Only these previous ℓ vectors need to be stored.

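A sketch of the truncated recurrence (here all basis vectors are kept only so the result can be inspected; a real implementation discards vectors older than ℓ). One indexing convention is used below: orthogonalizing against the previous `ell` vectors leaves entries hjk with j < k − ell + 1 identically zero, so the Hessenberg matrix is banded; the slides count ℓ slightly differently. For an SPD matrix and small `ell` this is essentially the Lanczos/CG recurrence:

```python
import numpy as np

def truncated_arnoldi(A, r0, m, ell):
    """Truncated Arnoldi: orthogonalize A v_k only against the previous
    ell basis vectors, so Hm+1,m becomes banded."""
    n = r0.size
    V = np.zeros((n, m + 1))
    Hb = np.zeros((m + 1, m))
    V[:, 0] = r0 / np.linalg.norm(r0)
    for k in range(m):
        w = A @ V[:, k]
        for j in range(max(0, k - ell + 1), k + 1):
            Hb[j, k] = V[:, j] @ w
            w -= Hb[j, k] * V[:, j]
        Hb[k + 1, k] = np.linalg.norm(w)
        V[:, k + 1] = w / Hb[k + 1, k]
    return V, Hb

rng = np.random.default_rng(6)
M = rng.standard_normal((100, 100))
A = M @ M.T / 100 + np.eye(100)            # SPD test matrix
r0 = rng.standard_normal(100)
V, Hb = truncated_arnoldi(A, r0, 15, 2)    # ell = 2: tridiagonal Hb
print(np.count_nonzero(np.triu(Hb, 2)))    # prints 0: banded as claimed
```

For SPD A the truncated basis stays (nearly) orthogonal even though only two previous vectors are used, which is the "FOM reduces to CG" remark made on a later slide.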
SLIDE 18

The upper Hessenberg matrix is now banded (upper bandwidth ℓ): only the entries hjk with max{1, k − ℓ + 1} ≤ j ≤ k + 1 of Hm+1,m are nonzero.

SLIDE 19

Truncated Krylov subspace methods (cont.)

  • Truncated GMRES [Saad and Wu, 1996]; truncated FOM [Saad, 1981], [Jia, 1996]
  • The basis v1, v2, . . . , vm of Km(H, r0) (m > ℓ) is not orthogonal, but xm ∈ Km(H, r0), and the minimization (or Galerkin condition) is over the whole space.
  • Hm+1,m is banded with upper semiband ℓ − 2. The matrix Vm of basis vectors is not orthogonal. Can be implemented so that only O(ℓ) vectors are stored.
  • Extreme case, ℓ = 3: Hm+1,m tridiagonal. If H is SPD, FOM reduces to CG (and Vm is automatically orthogonal).
  • Theory for “non-optimal methods” [Simoncini and Szyld, 2005]

SLIDE 20

Example: L(u) = −uxx − uyy + 100(x + y)ux + 100(x + y)uy, on [0, 1]², Dirichlet b.c., centered 5-point discretization, n = 2500.

(Figures: residual norm vs. number of iterations, and residual norm vs. flops, for full GMRES and truncated GMRES with ℓ = 3.)

SLIDE 21

Inexact Krylov subspace methods

  • At the kth iteration of the Krylov subspace method, use (H + Dk)vk−1 instead of Hvk−1, where Dk can be monitored.
  • Two examples now:
    – Schur complement, where the inverse is approximated
    – Inexact preconditioning
  • [Bouras, Frayssé, and Giraud, CERFACS reports 2000, SIMAX 2005] show experimentally that as k progresses, ‖Dk‖ can be allowed to be larger; see also [Golub and Ye, 1999], [Notay, 1999], [Sleijpen and van den Eshof, 2004] and [Simoncini and Eldén, 2002]

SLIDE 22

Inexact Krylov (cont.)
We repeat: ‖Dk‖ small at first, ‖Dk‖ can be big later. Convergence is maintained!

  • Instead of H Vm = Vm+1 Hm+1,m, we now have
    [(H + D1)v1, (H + D2)v2, . . . , (H + Dm)vm] = Vm+1 Hm+1,m
  • The subspace spanned by v1, v2, . . . , vm is not a Krylov subspace, but Vm is orthogonal (in the full case)

SLIDE 23

Theorem for Inexact FOM  [Simoncini and Szyld, 2003]
True residual: rm = b − H xm = r0 − H Vm ym
Computed residual (e.g.): r̃m = r0 − Vm+1 Hm+1,m ym = r0 − Wm ym
Let ε > 0. If for every k ≤ m,
‖Dk‖ ≤ (σmin(Hm∗)/m∗) (1/‖r̃k−1‖) ε ≡ ℓ_m^F (1/‖r̃k−1‖) ε,
then ‖Vm^T rm‖ ≤ ε and ‖rm − r̃m‖ ≤ ε,
m∗ being the maximum number of iterations allowed.
Similar results hold for inexact GMRES; see also [Giraud, Gratton, Langou, 2007]

SLIDE 24

Theorem for Inexact Truncated FOM
‖Dk‖ ≤ (σmin(Hm∗) σmin(Vm)/m∗) (1/‖r̃k−1‖) ε ≡ ℓ_m^{TF} (1/‖r̃k−1‖) ε
implies ‖Vm^T rm‖ ≤ ε and δm = ‖rm − r̃m‖ ≤ ε.
Notes:

  • This result applies in particular to inexact CG
  • ℓ_m can be estimated from the problem, if information is available.

SLIDE 25

First Experiment
H = diag([10⁻⁴, 2, 3, · · · , 100]),  Dk = symm[αk randn(100, 100)],  b = randn(100, 1). We chose ε = 10⁻⁸.

  • Our condition (e.g., for FOM), ‖Dk‖ ≤ (σmin(H)/m∗)(1/‖r̃k−1‖) ε, is very conservative. In most cases it is too strict. However, σmin(H) does play a role.

SLIDE 26

CG: condition ‖Dk‖ ≤ (σmin(H)/m∗)(1/‖r̃k−1‖) ε
(Figure: magnitude vs. number of iterations of ‖Dk‖, ‖Vk^T rk‖, and ‖rk‖. Observe ‖Vm^T rm‖ ≪ ε.)

SLIDE 27

Applications: I. Schur complement systems

[ A  B ; B^T  0 ] [ w ; x ] = [ f ; 0 ],
so B^T A^{−1} B x = B^T A^{−1} f;  A w = f − B x,
i.e., H x = b, where A^{−1} is not applied exactly (use a Krylov method).

SLIDE 28

Applications: I. Schur complement systems (cont.)

  • A^{−1} not applied exactly (use a Krylov method).
  • Replace Hv with H̃v := B^T z_j^(k), where z_j^(k) is the approximation obtained at the jth (inner) iteration of the solution of the equation Az = Bv.
  • The question is then: how many inner iterations, i.e., at what value of j to stop? “Translate” the conditions on ‖Dk‖ into conditions on the norm of the inner residual. Let r_k^inner = A z_j^(k) − Bv be the inner residual, and take
    ‖r_k^inner‖ < (σm⋆(Hm⋆) / (‖B^T‖ ‖A^{−1}‖ m⋆)) (1/‖r̃_{k−1}^fom‖) ε ≡ ε_inner

SLIDE 29

(Figure: magnitude vs. number of iterations of δm, ‖rm‖, ‖r̃m‖, and ε_inner.)

  • Two-dimensional saddle point magnetostatic problem from [Perugia, Simoncini, Arioli, 1999]; A is 1272 × 1272
  • Inexact FOM, m⋆ = 120, ε = 10⁻⁴

SLIDE 30

Applications: II. Inexact Preconditioning
Hx = b → H P^{−1} x̄ = b, x = P^{−1} x̄.
P^{−1} is not applied exactly (use a Krylov method): H P^{−1} vk is replaced with H z̃k, where z̃k ≈ P^{−1} vk. The Arnoldi relation H P^{−1} Vm = Vm+1 Hm+1,m is transformed into H [z̃1, · · · , z̃m] = Vm+1 Hm+1,m. Use a flexible Krylov subspace method.
r_k^inner = vk − P z̃k is the inner residual; take
‖r_k^inner‖ ≤ (σm⋆(Hm⋆) / (‖H P^{−1}‖ m⋆)) (1/‖r̃_{k−1}^gm‖) ε ≡ ε_inner

SLIDE 31

(Figure: magnitude vs. number of iterations of ‖rm‖, ‖r̃m‖, ε_inner, and δm.)

For the same 2D saddle point problem, use P = [ I  B^T ; B  0 ]. Solve B^T B pk = rhs iteratively; m⋆ = 80, ε = 10⁻⁹, tolerance ε_inner.

SLIDE 32

Some CPU Times: Same Magnetostatic 2D Problem
Outer tolerance: ε = 10⁻⁸;  ‖r_k^inner‖ ≤ c0 (1/‖r_{m−1}^outer‖) ε ≡ ε_inner.
c0: constant estimated a priori; here we use 10⁻² and 10⁻⁴.
Elapsed CPU time in seconds on a Sun Enterprise 4500 (Fortran code; 4 CPUs at 400 MHz, 2 GB RAM); CG iterations in parentheses.

Problem Size | Fixed Inner Tol. ε_inner = 10⁻¹⁰ | Var. Inner Tol. 10⁻¹⁰/‖r‖ | Var. Inner Tol. 10⁻¹²/‖r‖
3810  | 17.0 (54)  | 11.4 (54)  | 14.7 (54)
9102  | 82.9 (58)  | 62.8 (58)  | 70.7 (58)
14880 | 198.4 (54) | 156.5 (54) | 170.1 (54)

SLIDE 33

Applications: III. Parabolic Control Problems
Inverse problem: recover the control v(x) based on the field (state) z(x), related by the forward problem
zt + Az = v,  x ∈ Ω;
z = g,  x ∈ ∂Ω;
z = z0,  x ∈ Ω \ ∂Ω, for t = 0,
with A elliptic, e.g., A = −Δ. This is a distributed control problem. Similar techniques apply to boundary control problems (control g).

SLIDE 34

Associated variational problem
J(z(v), v) := (α/2) ∫_{t0}^{tf} ‖z(v)(t, ·) − ỹ(t, ·)‖²_{L2(Ω)} dt + (β/2) ‖z(v)(tf, ·) − ỹ(tf, ·)‖²_{L2(Ω)} + (γ/2) ∫_{t0}^{tf} ‖v(t, ·)‖²_{L2(Ω)} dt,
discretized as
J_h^τ(z, v) = (1/2)(z − ỹ)^T K (z − ỹ) + (1/2) v^T G v + (z − ỹ)^T g.

SLIDE 35

Discretized forward problem (FD): Ez + Nv = f, where E is block lower bidiagonal with blocks F1 on the diagonal and −F0 on the subdiagonal, and N = blockdiag(B, B, . . . , B).

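The block structure of E (F1 on the diagonal, −F0 on the subdiagonal) and the block-diagonal N can be assembled with Kronecker products. The small blocks F0, F1, B below are arbitrary stand-ins; in the talk they come from the time-stepping scheme and the control-to-state map:

```python
import numpy as np

nt, nx = 4, 3                  # number of time steps, spatial dofs (toy sizes)
rng = np.random.default_rng(5)
F0 = np.eye(nx)
F1 = np.eye(nx) + 0.1 * rng.standard_normal((nx, nx))
B = rng.standard_normal((nx, nx))

# E: block lower bidiagonal, F1 on the diagonal, -F0 on the subdiagonal
E = np.kron(np.eye(nt), F1) - np.kron(np.eye(nt, k=-1), F0)
# N: block diagonal with B in every diagonal block
N = np.kron(np.eye(nt), B)
print(E.shape, N.shape)        # prints (12, 12) (12, 12)
```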
SLIDE 36

Optimization problem
min J_h^τ(z, v) subject to Ez + Nv = f.
Lagrangian: J_h^τ(z, v) + q^T (Ez + Nv − f)

SLIDE 37

Linearize to obtain
[ K  0  E^T ; 0  G  N^T ; E  N  0 ] [ z ; u ; p ] = [ g ; 0 ; f ].
After elimination one obtains Hu := (G + N^T E^{−T} K E^{−1} N) u = b, H being the SPD reduced Hessian.

SLIDE 38

Hu = (G + N^T E^{−T} K E^{−1} N) u = b
We use inexact FOM, approximating each of the systems with E and E^T by a Parareal method with varying (increasing) tolerance. Matrix-vector product Hv:

  • 1. Multiply Nv
  • 2. Solve Ez = Nv with an inner tolerance ε_in1
  • 3. Multiply Kz
  • 4. Solve E^T w = Kz with an inner tolerance ε_in2
  • 5. Compute N^T w

SLIDE 39

Approximate solutions of systems with E or E^T  Hu = (G + N^T E^{−T} K E^{−1} N) u = b. We prove that with ε_in1 = ε_in2, we have spectral equivalence of the form (v, Gv) ≤ (v, Hv) ≤ µ (v, Gv). We choose FOM, since it reduces to CG when we have full symmetry.

SLIDE 40

Sample Experiment: 2D heat equation, 15 × 15 grid, τ = 1/512, control u of order 115200. Same relative tolerance ǫ for both Parareal systems; outer ε = 10⁻⁶. Outer iterations (total inner iterations):

ǫ | IFOM | TIFOM mT = 2 | mT = 4 | mT = 8 | mT = 12
10⁻¹² | 15(576) | 16(610) | 16(608) | 15(576) | 15(576)
10⁻¹⁰ | 15(482) | 17(532) | 17(528) | 16(504) | 15(482)
10⁻⁸ | 15(388) | 17(426) | 17(426) | 17(420) | 15(388)
10⁻⁷ | 15(340) | 18(394) | 17(374) | 17(368) | 15(340)
10⁻⁶ | 15(288) | 19(340) | 19(338) | 18(320) | 16(298)
10⁻⁵ | 17(238) | 24(298) | 21(266) | 19(258) | 19(242)
10⁻⁴ | 17(180) | n.c. | 22(210) | 28(254) | 20(192)

SLIDE 41

One surface of the true and recovered model, and their difference; increasing ε_inner = 10⁻⁵/‖r̃k−1‖, mT = 8.
(Figure: three surface plots - true, computed, and their difference; error O(10⁻⁷).)

SLIDE 42

TIFOM Convergence curves
(Figure: residual norm vs. number of iterations for ICG, TIFOM(4), TIFOM(8), TIFOM(12), TIFOM(20); outer ε = 10⁻⁶, inner factor = 10⁻⁵.)

SLIDE 43

Similar results for more non-symmetric problems. Outer iterations (inner iterations):

ε_in1 | ε_in2 | IFOM | TIFOM mT = 2 | mT = 4 | mT = 8 | mT = 12
10⁻⁶ | 10⁻⁷ | 15(288) | 18(330) | 19(338) | 17(310) | 16(298)
10⁻⁶ | 10⁻⁶ | 15(288) | 19(340) | 19(338) | 18(320) | 16(298)
10⁻⁶ | 10⁻⁵ | 16(300) | 23(390) | 20(346) | 19(334) | 17(310)
10⁻⁶ | 10⁻⁴ | 16(300) | 51(710) | 37(520) | 20(352) | 17(310)
10⁻⁵ | 10⁻⁷ | 15(234) | 19(270) | 27(280) | 18(254) | 21(246)
10⁻⁵ | 10⁻⁶ | 15(234) | 19(270) | 29(284) | 18(254) | 19(242)
10⁻⁵ | 10⁻⁵ | 17(238) | 24(298) | 21(266) | 19(258) | 19(242)
10⁻⁵ | 10⁻⁴ | 17(240) | 30(334) | 31(356) | 20(270) | 19(244)

SLIDE 44

One surface of the true and recovered model, and their difference; outer tolerance ε = 10⁻⁶, ε_in1 = 10⁻⁶/‖r̃k−1‖, ε_in2 = 10⁻⁴/‖r̃k−1‖, mT = 8.
(Figure: three surface plots - true, computed, and their difference; error O(10⁻⁶).)

SLIDE 45

Conclusions

  • Inexact matrix-vector products (or inexact preconditioning) might be worth trying for your problem
  • Truncated methods might be worth trying for your problem
  • New work on inexact and truncated methods for parabolic control problems (with two inner criteria)

SLIDE 46

With Valeria Simoncini:
Theory of Inexact Krylov Subspace Methods and Applications to Scientific Computing, SIAM Journal on Scientific Computing, v. 25 (2003), pp. 454-477.
On the Occurrence of Superlinear Convergence of Exact and Inexact Krylov Subspace Methods, SIAM Review, v. 47 (2005), pp. 247-272.
The Effect of Non-Optimal Bases on the Convergence of Krylov Subspace Methods, Numerische Mathematik, v. 100 (2005), pp. 711-733.
Recent computational developments in Krylov Subspace Methods for linear systems, Numerical Linear Algebra with Applications, v. 14 (2007), pp. 1-59.
All available at: http://www.math.temple.edu/szyld

SLIDE 47

With Xiuhong Du, Marcus Sarkis, and Christian E. Schaerer: Inexact and truncated Parareal-in-time Krylov subspace methods for parabolic optimal control problems, Research Report 12-02-06, February 2012. Also available at: http://www.math.temple.edu/szyld

47