AX = B A C n n x (1) , x (2) , . . . , x ( L ) b (1) , b (2) , - - PowerPoint PPT Presentation
AX = B A C n n x (1) , x (2) , . . . , x ( L ) b (1) , b (2) , - - PowerPoint PPT Presentation
AX = B A C n n x (1) , x (2) , . . . , x ( L ) b (1) , b (2) , . . . , b ( L ) X = , B = . A x = B x A , B R n n , V R n L S k = 1 z k ( zB A ) 1 BV d z 2 i
AX = B A ∈ Cn×n X =
- x(1), x(2), . . . , x(L)
, B =
- b(1), b(2), . . . , b(L)
.
Ax = λBx 一般化固有値問題: S k = 1 2πi
- Γ zk(zB − A)−1BVdz
˜ S k = 1 N
N−1
∑
j=0
ω j − γ ρ k+1 (ω jB − A)−1BV A, B ∈ Rn×n, V ∈ Rn×L ˜ S 0, ˜ S 1, . . . ω j.
10-14 10-11 10-8 10-5 10-2 101 500 1000 1500 2000 2500 True relative residual norm Iteration number, k 10-10 10-7 10-4 10-1 500 1000 1500 2000 2500 True relative residual norm Iteration number, k
B − AXkF/BF.
XF ≡
- n
∑
i=1 L
∑
j=1
|xij|2.
AX = B, A ∈ Cn×n, X, B ∈ Cn×L. Rk+1 = B − AXk+1, ≡ (Qk+1Rk+1)(A) ◦ R0 R0(z) = P0(z) = IL, Rk+1(z) = Rk(z) − zPk(z)αk, Pk+1(z) = Rk+1(z) + Pk(z)βk Q0(z) = 1, Qk+1(z) = (1 − ζkz)Qk(z) αk, βk ∈ CL×L, ζk ∈ C.
- Mk(A) ◦ V ≡
k
∑
j=0
AjVM j Mk(z) ≡
k
∑
j=0
zjM j M j ∈ CL×L, V ∈ Cn×L.
ζk = Tr
- (ATk)HTk
- /Tr
- (ATk)HATk
- Rk+1 = Tk − ζkATk,
Pk+1 = Rk+1 + (Pk − ζkAPk)βk, Tk = Rk − APkαk, Xk+1 = Xk + Pkαk + ζkTk ⎫ ⎪ ⎪ ⎬ ⎪ ⎪ ⎭ Rk+1F Rk+1 ≡ (Qk+1Rk+1)(A) ◦ R0 Pk+1 ≡ (Qk+1Pk+1)(A) ◦ R0 Tk ≡ (QkRk+1)(A) ◦ R0 αk = ( ˜ RH
0 APk)−1( ˜
RH
0 Rk),
βk = −( ˜ RH
0 APk)−1( ˜
RH
0 ATk)
X0 ∈ Cn×L is an initial guess, Compute R0 = B − AX0, Set P0 = R0, Choose ˜ R0 ∈ Cn×L, For k = 0, 1, . . . , until RkF ≤ εBF do:
Vk = APk, Solve ( ˜ RH
0 Vk)αk = ˜
RH
0 Rk for αk,
Tk = Rk − Vkαk, Zk = ATk, ζk = Tr
- ZH
k Tk
- /Tr
- ZH
k Zk
- ,
Xk+1 = Xk + Pkαk + ζkTk, Rk+1 = Tk − ζkZk, Solve ( ˜ RH
0 Vk)βk = − ˜
RH
0 Zk for βk,
Pk+1 = Rk+1 + (Pk − ζkVk)βk,
End
10-14 10-11 10-8 10-5 10-2 101 500 1000 1500 2000 2500 True relative residual norm Iteration number, k
B − AXk+1F/BF
Xk+1 = Xk + Pkαk + ζkTk, Rk+1 = Rk − (APk)αk − ζkATk αk αk ζk
Pjαj, (AP j)αj
- P jαj = P jαj + F j,
(AP j)αj = AP jαj + G j F j, G j
Xk+1 = Xk + Pkαk + ζkTk Xk+1 Xk+1 = Xk + Pkαk + ζkTk Xk+1 = Xk + Pkαk + ζkTk = X0 +
k
∑
j=0
P jαj +
k
∑
j=0
ζjT j = X0 +
k
∑
j=0
- Pjαj + ζjT j
- +
k
∑
j=0
F j Pjαj P jαj = P jαj + F j
(AP j)αj (AP j)αj = AP jαj + G j Rk+1 = Rk − (APk)αk − ζkATk Rk+1 = Rk − (APk)αk − ζkATk Rk+1 Rk+1 = Rk − (APk)αk + ζkATk = R0 −
k
∑
j=0
(AP j)αj +
k
∑
j=0
ζjAT j = R0 −
k
∑
j=0
- AP jαj + ζjAT j
- −
k
∑
j=0
G j
Xk+1 = X0 +
k
∑
j=0
- Pjαj + ζjT j
- +
k
∑
j=0
F j, Rk+1 = R0 −
k
∑
j=0
- AP jαj + ζjAT j
- −
k
∑
j=0
G j B − AXk+1 = R0 −
k
∑
j=0
- AP jαj + ζjAT j
- −
k
∑
j=0
AF j = Rk+1 +
k
∑
j=0
- G j − AF j
- B − AXk+1
k
∑
j=0
- G j − AF j
- 誤差項
が現れる! F j, G j
B − AXk+1 = Rk+1 +
k
∑
j=0
- G j − AF j
- F j = P jαj − P jαj,
G j = (AP j)αj − AP jαj E j ≡ G j − AF j = (AP j)αj − AP jαj Block BiCGSTAB 法の残差が になっても Rk+1 = O B − AXk+1F =
k
∑
j=0
E jF までしか減少しない!
10-14 10-11 10-8 10-5 10-2 101 10-14 10-11 10-8 10-5 10-2 101 10 20 30 40 (True) relative residual norm Iteration number, k
- k
∑
j=0
E jF/BF
- k
∑
j=0
E jF/BF RkF/BF B − AXkF/BF
真の相対残差が 緑のラインで停滞!
Xk+1 = X0 +
k
∑
j=0
- Pjαj + ζjT j
- +
k
∑
j=0
F j, Rk+1 = R0 −
k
∑
j=0
- AP jαj + ζjAT j
- −
k
∑
j=0
G j B − AXk+1 = Rk+1 +
k
∑
j=0
E j, E j = G j − AF j F j = G j = O G j = AF j A αk E j
Rk+1 = B − AXk+1 = (Qk+1Rk+1)(A) ◦ R0 Qk+1(z) Rk+1(z) G j = AF j
Rk+1 = (Qk+1Rk+1)(A) ◦ R0, Pk+1 = (Qk+1Pk+1)(A) ◦ R0, S k = (Qk+1Pk)(A) ◦ R0 Rk+1(z) Rk+1 = Rk − ζkARk − AUk, Pk+1 = Rk+1 + S kγk, S k = Pk − ζkAPk, Uk = S kαk, Xk+1 = Xk + ζkRk + Uk ⎫ ⎪ ⎪ ⎬ ⎪ ⎪ ⎭ αk = ( ˜ RH
0 APk)−1( ˜
RH
0 Rk),
γk = ( ˜ RH
0 Rk)−1( ˜
RH
0 Rk+1)/ζk
ζk = Tr
- (ARk)HRk
- /Tr
- (ARk)HARk
- Rk − ζkARkF
X0 ∈ Cn×L is an initial guess, Compute R0 = B − AX0, Set P0 = R0 and V0 = W0 = AR0, Choose ˜ R0 ∈ Cn×L, For k = 0, 1, . . . , until RkF ≤ εBF do:
Solve ( ˜ RH
0 Vk)αk = ˜
RH
0 Rk for αk,
ζk = Tr
- WH
k Rk
- /Tr
- WH
k Wk
- ,
S k = Pk − ζkVk, Uk = S kαk, Yk = AUk, Xk+1 = Xk + ζkRk + Uk, Rk+1 = Rk − ζkWk − Yk, Wk+1 = ARk+1, Solve ( ˜ RH
0 Rk)γk = ˜
RH
0 Rk+1/ζk for γk,
Pk+1 = Rk+1 + Ukγk, Vk+1 = Wk+1 + Ykγk,
End
αk
Xk+1 = Xk + ζkRk + S kαk, Rk+1 = Rk − ζkARk − A(S kαk) Xk+1 = Xk + ζkRk + S kαk, Rk+1 = Rk − ζkARk − AS kαk Xk+1 = X0 +
k
∑
j=0
- ζjRj + S jαj
- +
k
∑
j=0
Hj, Rk+1 = R0 −
k
∑
j=0
- ζjARj + AS jαj
- −
k
∑
j=0
AH j, = B − A ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣X0 +
k
∑
j=0
- ζjRj + S jαj
- +
k
∑
j=0
Hj ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ = B − AXk+1 S jαj S jαj = S jαj + H j
とすると F j = H j, G j = AH j E j = G j − AF j = O
CPU Intel Core 2 Duo 2.4 GHz Memory 4.0GBytes Compiler Intel Fortran ver. 10.1 Compile option
- O3 -xT -openmp
Initial solution X0 [0, 0, . . . , 0] Right hand side B [e1, e2, . . . , eL] Shadow residual ˜ R0 Random Block size L 1, 2, 4 Convergence criterion RkF/BF ≤ 1.0 × 10−14
Matrix name Size n nnz(A) PDE900 900 4, 380 JPWH991 991 6, 027 CONF5.4-00L8X8-1000 49, 152 1, 916, 928
10-14 10-12 10-10 10-8 10-6 10-4 10-2 100 10 20 30 40 50 60 True relative residual norm Iteration number, k
B − AXkF/BF
10-14 10-12 10-10 10-8 10-6 10-4 10-2 100 10 20 30 40 50 60 True relative residual norm Iteration number, k
L #Iter. Time/L [s] Res. True Res. 1 53 0.0096 4.8 × 10−15 4.8 × 10−15 2 46 0.0067 1.1 × 10−15 2.0 × 10−13 4 41 0.0031 4.8 × 10−15 1.8 × 10−12 L #Iter. Time/L [s] Res. True Res. 1 53 0.0107 3.2 × 10−15 3.3 × 10−15 2 46 0.0051 1.1 × 10−15 1.4 × 10−15 4 45 0.0031 5.5 × 10−15 5.6 × 10−15
10-14 10-11 10-8 10-5 10-2 101 10 20 30 40 50 60 True relative residual norm Iteration number, k
B − AXkF/BF
10-14 10-11 10-8 10-5 10-2 101 10 20 30 40 50 60 True relative residual norm Iteration number, k
L #Iter. Time/L [s] Res. True Res. 1 56 0.0159 5.7 × 10−15 1.2 × 10−14 2 49 0.0083 8.3 × 10−15 4.1 × 10−13 4 43 0.0034 6.3 × 10−15 5.9 × 10−12 L #Iter. Time/L [s] Res. True Res. 1 52 0.0134 8.4 × 10−15 1.3 × 10−14 2 51 0.0082 3.7 × 10−15 6.1 × 10−15 4 44 0.0035 1.5 × 10−15 2.3 × 10−15
10-14 10-11 10-8 10-5 10-2 101 100 200 300 400 500 600 True relative residual norm Iteration number, k
B − AXkF/BF
10-14 10-11 10-8 10-5 10-2 101 100 200 300 400 500 600 True relative residual norm Iteration number, k
L #Iter. Time/L [s] Res. True Res. 1 555 13.9408 8.9 × 10−15 9.5 × 10−15 2 452 7.5609 7.3 × 10−15 2.5 × 10−13 4 406 6.1544 8.7 × 10−15 2.8 × 10−13 L #Iter. Time/L [s] Res. True Res. 1 555 14.2714 7.4 × 10−15 8.5 × 10−15 2 456 8.1093 5.6 × 10−15 6.7 × 10−15 4 386 6.0348 7.4 × 10−15 8.6 × 10−15
行列サイズ:n = 1, 572, 864,非零要素数:80, 216, 064
格子QCD計算で現れる大規模行列
CPU AMD Opteron Quad Core 8356 2.3GHz × 4 Memory 32.0GBytes Compiler Intel Fortran ver. 10.1 Compile option
- O3 -xO -openmp
10-14 10-11 10-8 10-5 10-2 101 500 1000 1500 2000 2500 True relative residual norm Iteration number, k
B − AXkF/BF
10-14 10-11 10-8 10-5 10-2 101 500 1000 1500 2000 2500 True relative residual norm Iteration number, k