ECS 231 Subspace projection methods for LS
Part I. Basics

The landscape of solvers for linear systems of equations $Ax = b$:

                                      Direct     Iterative (only needs $u = Av$)
    Nonsymmetric $A$                  LU         GMRES
    Symmetric positive definite $A$   Cholesky   CG

Moving from direct to iterative methods trades robustness for lower storage; moving down from nonsymmetric to SPD $A$ trades generality for robustness.
Subspace projection methods

◮ The basic idea:
  ◮ extract an approximate solution from a low-dimensional subspace;
  ◮ a technique of dimension reduction.
◮ Mathematically, let $\mathcal{W}, \mathcal{V} \subseteq \mathbb{R}^n$ be two subspaces, and let $x_0$ be an initial guess of the solution. Find $\widetilde{x} \in x_0 + \mathcal{V}$ such that the residual satisfies the Petrov-Galerkin condition $b - A\widetilde{x} \perp \mathcal{W}$.
◮ Orthogonal projection: $\mathcal{W} = \mathcal{V}$. Oblique projection: $\mathcal{W} \neq \mathcal{V}$.
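As a concrete illustration (not from the slides), here is a minimal dense-matrix sketch of one projection step in NumPy; the function name and the explicit basis matrices V and W are my own choices:

    import numpy as np

    def projection_step(A, b, x0, V, W):
        # Find x in x0 + range(V) with residual b - A@x orthogonal to
        # range(W) (the Petrov-Galerkin condition). V and W are n-by-m
        # matrices whose columns form bases of the two subspaces.
        r0 = b - A @ x0
        # W^T (r0 - A V y) = 0  <=>  (W^T A V) y = W^T r0
        y = np.linalg.solve(W.T @ A @ V, W.T @ r0)
        return x0 + V @ y

Orthogonal projection corresponds to calling this with W = V; the methods below are instances of this step for particular choices of the subspaces.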
The steepest descent (SD) method

◮ For SPD $A$, the SD iteration is: with $r_k = b - Ax_k$,

    $x_{k+1} = x_k + \alpha_k r_k$, where $\alpha_k = \dfrac{r_k^T r_k}{r_k^T A r_k}$.

◮ Since $A$ is SPD, $r_k^T A r_k > 0$ unless $r_k = 0$.
◮ Let $x_* = A^{-1}b$; then the $k$-th step of the SD iteration minimizes

    $f(x) = \|x - x_*\|_A^2 = (x - x_*)^T A (x - x_*)$

  over the line $x = x_k + \alpha r_k$.
◮ Recall from calculus that the negative of the gradient is the direction of steepest local decrease; here $-\nabla f(x_k) = 2r_k$, which gives the method its name.
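A minimal NumPy sketch of the SD iteration (the stopping rule and iteration cap are my own additions):

    import numpy as np

    def steepest_descent(A, b, x0, tol=1e-8, maxit=10_000):
        # x_{k+1} = x_k + alpha_k r_k with alpha_k = (r.r)/(r.Ar); A SPD.
        x = x0.astype(float).copy()
        r = b - A @ x
        for _ in range(maxit):
            if np.linalg.norm(r) <= tol * np.linalg.norm(b):
                break
            Ar = A @ r
            alpha = (r @ r) / (r @ Ar)
            x += alpha * r
            r -= alpha * Ar          # r_{k+1} = r_k - alpha_k A r_k
        return x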
The minimal residual (MR) iteration

◮ The MR iteration is: with $r_k = b - Ax_k$,

    $x_{k+1} = x_k + \alpha_k r_k$, where $\alpha_k = \dfrac{r_k^T A^T r_k}{r_k^T A^T A r_k}$.

◮ The $k$-th step minimizes the residual norm $f(x) = \|b - Ax\|_2^2$ along the residual direction:

    $\alpha_k = \arg\min_\alpha \|b - A(x_k + \alpha r_k)\|_2$.
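The corresponding sketch differs from steepest descent only in the step length (again with a stopping rule of my own):

    import numpy as np

    def minimal_residual(A, b, x0, tol=1e-8, maxit=10_000):
        # alpha_k = (r^T A^T r)/(r^T A^T A r) = (Ar.r)/(Ar.Ar)
        x = x0.astype(float).copy()
        r = b - A @ x
        for _ in range(maxit):
            if np.linalg.norm(r) <= tol * np.linalg.norm(b):
                break
            Ar = A @ r
            alpha = (Ar @ r) / (Ar @ Ar)
            x += alpha * r
            r -= alpha * Ar
        return x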
Krylov subspace: $\mathcal{K}_m(A, r_0) = \mathrm{span}\{r_0, Ar_0, A^2 r_0, \ldots, A^{m-1} r_0\}$. Krylov subspace methods apply the projection framework with subspaces built from $\mathcal{K}_m$.
The Arnoldi procedure: given a unit vector $v_1$, for $j = 1, 2, \ldots, m$:
    $w = Av_j$
    for $i = 1, \ldots, j$: $h_{ij} = v_i^T w$; $w = w - h_{ij} v_i$
    $h_{j+1,j} = \|w\|_2$; $v_{j+1} = w / h_{j+1,j}$
It produces an orthonormal basis $\{v_1, \ldots, v_{m+1}\}$ of $\mathcal{K}_{m+1}(A, v_1)$.
Order-$m$ Arnoldi decomposition:

    $AV_m = V_{m+1}\widetilde{H}_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T$,   (2)

where $V_m = [v_1, \ldots, v_m]$, $\widetilde{H}_m = \begin{pmatrix} H_m \\ h_{m+1,m} e_m^T \end{pmatrix}$ is $(m+1) \times m$ upper Hessenberg, $V_m^T V_m = I_m$, $V_m^T v_{m+1} = 0$ and $\|v_{m+1}\|_2 = 1$.
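A NumPy sketch of the procedure (the array layout is my own); it returns $V_{m+1}$ and $\widetilde{H}_m$, so the decomposition (2) can be checked numerically:

    import numpy as np

    def arnoldi(A, v1, m):
        # Modified Gram-Schmidt Arnoldi: V is n x (m+1) with orthonormal
        # columns, Hbar is (m+1) x m upper Hessenberg, and in exact
        # arithmetic A @ V[:, :m] equals V @ Hbar.
        n = v1.shape[0]
        V = np.zeros((n, m + 1))
        Hbar = np.zeros((m + 1, m))
        V[:, 0] = v1 / np.linalg.norm(v1)
        for j in range(m):
            w = A @ V[:, j]
            for i in range(j + 1):
                Hbar[i, j] = V[:, i] @ w      # h_{ij} = v_i^T w
                w -= Hbar[i, j] * V[:, i]
            Hbar[j + 1, j] = np.linalg.norm(w)
            V[:, j + 1] = w / Hbar[j + 1, j]  # breakdown if the norm is 0
        return V, Hbar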
GMRES

◮ The Generalized Minimum Residual (GMRES) method1 is a generalization of the MR iteration.
◮ GMRES uses the following pair of Krylov subspaces as the pair of projection subspaces: $\mathcal{V} = \mathcal{K}_m(A, r_0)$ and $\mathcal{W} = A\mathcal{K}_m(A, r_0)$.

1Y. Saad and M. H. Schultz, GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 7:856-869, 1986.
◮ Let $x = x_0 + V_m y \in x_0 + \mathcal{K}_m$, where the columns of $V_m$ form the Arnoldi basis of $\mathcal{K}_m$. The Petrov-Galerkin condition $b - Ax \perp A\mathcal{K}_m$ reads

    $V_m^T A^T (b - Ax) = 0$.

◮ With $r_0 = b - Ax_0$, this is

    $V_m^T A^T (r_0 - AV_m y) = 0$, i.e., $V_m^T A^T A V_m y = V_m^T A^T r_0$.

◮ By the order-$m$ Arnoldi decomposition (2), with $\beta = \|r_0\|_2$ and $v_1 = r_0/\beta$, we have

    $\widetilde{H}_m^T \widetilde{H}_m y = \widetilde{H}_m^T V_{m+1}^T r_0 = \widetilde{H}_m^T (\beta e_1)$.

◮ This is equivalent to solving the LS problem

    $\min_y \|\beta e_1 - \widetilde{H}_m y\|_2$.
◮ A vector $x$ in $x_0 + \mathcal{K}_m$ can be written as $x = x_0 + V_m y$.
◮ Define $J(y) = \|b - Ax\|_2 = \|b - A(x_0 + V_m y)\|_2$.
◮ Using the Arnoldi decomposition (2), we have

    $b - Ax = r_0 - AV_m y = \beta v_1 - V_{m+1} \widetilde{H}_m y = V_{m+1}(\beta e_1 - \widetilde{H}_m y)$.

◮ Since the column vectors of $V_{m+1}$ are orthonormal,

    $J(y) = \|\beta e_1 - \widetilde{H}_m y\|_2$.

◮ Therefore, the GMRES approximation $x_m$ is the unique vector $x_m = x_0 + V_m y_m$ with $y_m = \arg\min_y \|\beta e_1 - \widetilde{H}_m y\|_2$.
◮ The LS problem is inexpensive to solve since $m$ is small.
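A sketch of (unrestarted) GMRES built on the arnoldi sketch above; production codes instead update a QR factorization of $\widetilde{H}_m$ with Givens rotations, but np.linalg.lstsq makes the LS step explicit:

    import numpy as np

    def gmres_sketch(A, b, x0, m):
        # Arnoldi from v1 = r0/beta, then solve min_y ||beta*e1 - Hbar y||_2.
        r0 = b - A @ x0
        beta = np.linalg.norm(r0)
        V, Hbar = arnoldi(A, r0 / beta, m)
        rhs = np.zeros(m + 1)
        rhs[0] = beta                    # beta * e1
        y = np.linalg.lstsq(Hbar, rhs, rcond=None)[0]
        return x0 + V[:, :m] @ y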
The Lanczos procedure

◮ The Lanczos procedure can be regarded as a simplification of Arnoldi's procedure when $A$ is symmetric.
◮ By an order-$m$ Arnoldi decomposition, we know that

    $H_m = V_m^T A V_m$,

  so when $A$ is symmetric, $H_m$ is both symmetric and upper Hessenberg, hence tridiagonal.
◮ This simple observation leads to the following procedure to compute the orthonormal basis $\{v_j\}$ with a three-term recurrence.
The Lanczos procedure2: given a unit vector $v_1$, set $\beta_1 = 0$ and $v_0 = 0$; for $j = 1, 2, \ldots, m$:
    $w = Av_j - \beta_j v_{j-1}$
    $\alpha_j = v_j^T w$
    $w = w - \alpha_j v_j$
    $\beta_{j+1} = \|w\|_2$
    $v_{j+1} = w / \beta_{j+1}$

2Note that we change the notation to $\alpha_j = h_{jj}$ and $\beta_{j+1} = h_{j+1,j}$, compared with the Arnoldi procedure.
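A NumPy sketch of the procedure (array layout is my own; alpha[j-1] and beta[j-1] store $\alpha_j$ and $\beta_{j+1}$):

    import numpy as np

    def lanczos(A, v1, m):
        # Three-term recurrence for symmetric A.
        n = v1.shape[0]
        V = np.zeros((n, m + 1))
        alpha = np.zeros(m)
        beta = np.zeros(m)
        V[:, 0] = v1 / np.linalg.norm(v1)
        for j in range(m):
            w = A @ V[:, j]
            if j > 0:
                w -= beta[j - 1] * V[:, j - 1]  # w = A v_j - beta_j v_{j-1}
            alpha[j] = V[:, j] @ w              # alpha_j = v_j^T w
            w -= alpha[j] * V[:, j]
            beta[j] = np.linalg.norm(w)         # beta_{j+1} = ||w||_2
            V[:, j + 1] = w / beta[j]
        return V, alpha, beta

One can assemble $T_m$ as np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1) and check that V[:, :m].T @ A @ V[:, :m] agrees with it up to rounding.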
◮ Only three vectors must be saved in the inner loop of the procedure.
◮ In exact arithmetic the vectors $v_j$ are orthonormal; in the presence of finite precision, the procedure could start losing such orthogonality, and some form of reorthogonalization may be needed3.

3An excellent reference on the subject is [B. N. Parlett, The Symmetric Eigenvalue Problem, SIAM, 1998].
Order-$m$ Lanczos decomposition:

    $AV_m = V_{m+1}\widetilde{T}_m = V_m T_m + \beta_{m+1} v_{m+1} e_m^T$,

where

    $T_m = \begin{pmatrix} \alpha_1 & \beta_2 & & & \\ \beta_2 & \alpha_2 & \beta_3 & & \\ & \beta_3 & \ddots & \ddots & \\ & & \ddots & \alpha_{m-1} & \beta_m \\ & & & \beta_m & \alpha_m \end{pmatrix}$ and $\widetilde{T}_m = \begin{pmatrix} T_m \\ \beta_{m+1} e_m^T \end{pmatrix}$.

Since $V_m^T V_m = I$ and $V_m^T v_{m+1} = 0$, we have

    $V_m^T A V_m = T_m$.
The conjugate gradient method

◮ The Conjugate Gradient (CG) method is the best known iterative method for solving SPD linear systems.
◮ There are several ways to derive the CG method. In terms of our subspace projection framework, we derive it from the Lanczos procedure.
◮ An alternative derivation is given by Shewchuk5.

5J. R. Shewchuk, An introduction to the conjugate gradient method without the agonizing pain, 1994.
◮ Before we derive the CG method, we first derive a so-called direct Lanczos (D-Lanczos) method.
◮ Using the subspace projection technique with an initial guess $x_0$ and $\mathcal{W} = \mathcal{V} = \mathcal{K}_m(A, r_0)$, the approximation is $x_m = x_0 + V_m y_m$, where the Galerkin condition $V_m^T(b - Ax_m) = 0$ together with $V_m^T A V_m = T_m$ yields the tridiagonal system

    $T_m y_m = \|r_0\|_2 e_1$.   (4)
◮ Now, let's try to solve the tridiagonal system (4) progressively, along with the Lanczos procedure.
◮ Let's write the LU factorization of $T_m$ as

    $T_m = L_m U_m = \begin{pmatrix} 1 & & & \\ \lambda_2 & 1 & & \\ & \ddots & \ddots & \\ & & \lambda_m & 1 \end{pmatrix} \begin{pmatrix} \eta_1 & \beta_2 & & \\ & \eta_2 & \ddots & \\ & & \ddots & \beta_m \\ & & & \eta_m \end{pmatrix}$,

  where, comparing entries, $\eta_1 = \alpha_1$, $\lambda_m = \beta_m / \eta_{m-1}$ and $\eta_m = \alpha_m - \lambda_m \beta_m$.
◮ Then $x_m$ is given by

    $x_m = x_0 + V_m T_m^{-1}(\|r_0\|_2 e_1) = x_0 + V_m U_m^{-1} L_m^{-1}(\|r_0\|_2 e_1) \equiv x_0 + P_m z_m$,

  where $P_m = V_m U_m^{-1}$ and $z_m = L_m^{-1}(\|r_0\|_2 e_1)$.
◮ The last column $p_m$ of $P_m = V_m U_m^{-1}$ can be updated from $P_{m-1}$:

    $P_m = V_m U_m^{-1} = \begin{pmatrix} V_{m-1} & v_m \end{pmatrix} \begin{pmatrix} U_{m-1} & \beta_m e_{m-1} \\ 0 & \eta_m \end{pmatrix}^{-1}$
        $= \begin{pmatrix} V_{m-1} U_{m-1}^{-1} & -V_{m-1} U_{m-1}^{-1} (\beta_m e_{m-1}) \eta_m^{-1} + v_m \eta_m^{-1} \end{pmatrix}$
        $= \begin{pmatrix} P_{m-1} & \eta_m^{-1}(v_m - \beta_m p_{m-1}) \end{pmatrix}$,

  since $P_{m-1} e_{m-1} = p_{m-1}$; i.e.,

    $p_m = \eta_m^{-1}(v_m - \beta_m p_{m-1})$.   (5)
◮ Similarly, $z_m = L_m^{-1}(\|r_0\|_2 e_1)$ can be updated from $z_{m-1}$:

    $z_m = \begin{pmatrix} L_{m-1} & 0 \\ \lambda_m e_{m-1}^T & 1 \end{pmatrix}^{-1} (\|r_0\|_2 e_1) = \begin{pmatrix} L_{m-1}^{-1}(\|r_0\|_2 e_1) \\ -\lambda_m e_{m-1}^T L_{m-1}^{-1}(\|r_0\|_2 e_1) \end{pmatrix} = \begin{pmatrix} z_{m-1} \\ \zeta_m \end{pmatrix}$,

  where $\zeta_1 = \|r_0\|_2$ and $\zeta_m = -\lambda_m \zeta_{m-1}$.
◮ Consequently, the approximation can be updated progressively:

    $x_m = x_0 + P_m z_m = x_0 + P_{m-1} z_{m-1} + \zeta_m p_m = x_{m-1} + \zeta_m p_m$.   (6)
The D-Lanczos method: given $x_0$, compute $r_0 = b - Ax_0$, $\zeta_1 = \|r_0\|_2$, $v_1 = r_0 / \zeta_1$, and set $\beta_1 = 0$, $p_0 = 0$; for $m = 1, 2, \ldots$ until convergence:
    run step $m$ of the Lanczos procedure: $w = Av_m - \beta_m v_{m-1}$, $\alpha_m = v_m^T w$, $\beta_{m+1} = \|w - \alpha_m v_m\|_2$, $v_{m+1}$
    $\lambda_m = \beta_m / \eta_{m-1}$, $\eta_m = \alpha_m - \lambda_m \beta_m$, and $\zeta_m = -\lambda_m \zeta_{m-1}$ for $m > 1$
    $p_m = \eta_m^{-1}(v_m - \beta_m p_{m-1})$
    $x_m = x_{m-1} + \zeta_m p_m$
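A NumPy sketch of D-Lanczos combining the pieces above (assuming no breakdown, i.e., $\eta_m \neq 0$; variable names are my own):

    import numpy as np

    def d_lanczos(A, b, x0, m):
        # Interleave the Lanczos recurrence with the progressive LU
        # solve: p_m = (v_m - beta_m p_{m-1})/eta_m, x_m = x_{m-1} + zeta_m p_m.
        x = x0.astype(float).copy()
        r0 = b - A @ x
        zeta = np.linalg.norm(r0)      # zeta_1 = ||r_0||_2
        v = r0 / zeta
        v_old = np.zeros_like(v)
        p_old = np.zeros_like(v)
        beta, eta_old = 0.0, 1.0       # beta_1 = 0; eta_0 unused (lambda_1 = 0)
        for k in range(m):
            w = A @ v - beta * v_old
            alpha = v @ w              # alpha_m = v_m^T w
            w -= alpha * v
            lam = beta / eta_old       # lambda_m = beta_m / eta_{m-1}
            eta = alpha - lam * beta   # eta_m = alpha_m - lambda_m beta_m
            if k > 0:
                zeta = -lam * zeta     # zeta_m = -lambda_m zeta_{m-1}
            p = (v - beta * p_old) / eta
            x += zeta * p              # x_m = x_{m-1} + zeta_m p_m
            beta_next = np.linalg.norm(w)
            v_old, v = v, w / beta_next
            p_old, eta_old = p, eta
            beta = beta_next
        return x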
◮ The residual vector $r_m$ satisfies

    $r_m = b - Ax_m = r_0 - AV_m y_m = r_0 - (V_m T_m + \beta_{m+1} v_{m+1} e_m^T) y_m = -\beta_{m+1}(e_m^T y_m) v_{m+1}$,

  since $V_m T_m y_m = \|r_0\|_2 V_m e_1 = r_0$ by (4).
◮ Since the $\{v_i\}$ are orthogonal, we conclude that the residual vectors $\{r_i\}$ are mutually orthogonal:

    $r_j^T r_i = 0$ for $i \neq j$.
◮ Note that $P_m^T A P_m$ is a diagonal matrix:

    $P_m^T A P_m = U_m^{-T} V_m^T A V_m U_m^{-1} = U_m^{-T} T_m U_m^{-1} = U_m^{-T} L_m U_m U_m^{-1} = U_m^{-T} L_m$.

  Since $U_m^{-T} L_m$ is lower triangular and $P_m^T A P_m$ is symmetric, $P_m^T A P_m$ must be a diagonal matrix.
◮ By the fact that $P_m^T A P_m$ is diagonal, we conclude that the vectors $\{p_i\}$ are A-conjugate:

    $p_j^T A p_i = 0$ for $i \neq j$.
These two properties, mutually orthogonal residuals and A-conjugate directions, are what turn the D-Lanczos recurrences into the CG method.
◮ By the relation (6), let us express the $(j+1)$-st approximate vector as

    $x_{j+1} = x_j + \alpha_j p_j$.

◮ Then the corresponding residual vector satisfies

    $r_{j+1} = b - Ax_{j+1} = r_j - \alpha_j A p_j$.

◮ Since the $r_j$'s are orthogonal, i.e., $r_j^T r_{j+1} = 0$, it gives

    $\alpha_j = \dfrac{r_j^T r_j}{r_j^T A p_j}$.
◮ By the relation (5), and noting that $v_{j+1}$ is in the direction of $r_j$, it follows that each $p_{j+1}$ is a linear combination of $r_{j+1}$ and $p_j$ (after a proper rescaling of the $p_i$'s).
◮ Therefore, we can write

    $p_{j+1} = r_{j+1} + \tau_j p_j$.

◮ First consequence: by the A-conjugacy of the $p_i$'s,

    $r_j^T A p_j = (p_j - \tau_{j-1} p_{j-1})^T A p_j = p_j^T A p_j$,

  so that

    $\alpha_j = \dfrac{r_j^T r_j}{p_j^T A p_j}$.
◮ Second consequence: by imposing the A-conjugacy $p_{j+1}^T A p_j = 0$, we have

    $0 = (r_{j+1} + \tau_j p_j)^T A p_j$, i.e., $\tau_j = -\dfrac{r_{j+1}^T A p_j}{p_j^T A p_j}$.

◮ Moreover, since $Ap_j = (r_j - r_{j+1})/\alpha_j$ and the $r_i$'s are orthogonal,

    $r_{j+1}^T A p_j = -\dfrac{r_{j+1}^T r_{j+1}}{\alpha_j}$ and $p_j^T A p_j = \dfrac{r_j^T r_j}{\alpha_j}$,

  so that

    $\tau_j = \dfrac{r_{j+1}^T r_{j+1}}{r_j^T r_j}$.
The CG algorithm: given $x_0$, compute $r_0 = b - Ax_0$ and set $p_0 = r_0$; for $j = 0, 1, 2, \ldots$ until convergence:
    $\alpha_j = r_j^T r_j / (p_j^T A p_j)$
    $x_{j+1} = x_j + \alpha_j p_j$
    $r_{j+1} = r_j - \alpha_j A p_j$
    $\tau_j = r_{j+1}^T r_{j+1} / (r_j^T r_j)$
    $p_{j+1} = r_{j+1} + \tau_j p_j$
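A NumPy sketch of the algorithm (the stopping rule and default iteration cap are my own additions):

    import numpy as np

    def conjugate_gradient(A, b, x0, tol=1e-10, maxit=None):
        # CG for SPD A: one matrix-vector product per iteration.
        x = x0.astype(float).copy()
        r = b - A @ x
        p = r.copy()
        rho = r @ r                      # r_j^T r_j
        maxit = b.shape[0] if maxit is None else maxit
        for _ in range(maxit):
            if np.sqrt(rho) <= tol * np.linalg.norm(b):
                break
            Ap = A @ p
            alpha = rho / (p @ Ap)       # alpha_j = r_j^T r_j / p_j^T A p_j
            x += alpha * p
            r -= alpha * Ap
            rho_next = r @ r
            tau = rho_next / rho         # tau_j = r_{j+1}^T r_{j+1} / r_j^T r_j
            p = r + tau * p
            rho = rho_next
        return x

For a quick test, one might take a random SPD matrix, e.g. M = np.random.randn(n, n), A = M.T @ M + n * np.eye(n), and compare conjugate_gradient(A, b, np.zeros(n)) against np.linalg.solve(A, b).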