AX = B A C n n x (1) , x (2) , . . . , x ( L ) b (1) , b (2) , - - PowerPoint PPT Presentation

ax b
SMART_READER_LITE
LIVE PREVIEW

AX = B A C n n x (1) , x (2) , . . . , x ( L ) b (1) , b (2) , - - PowerPoint PPT Presentation

AX = B A C n n x (1) , x (2) , . . . , x ( L ) b (1) , b (2) , . . . , b ( L ) X = , B = . A x = B x A , B R n n , V R n L S k = 1 z k ( zB A ) 1 BV d z 2 i


slide-1
SLIDE 1
slide-2
SLIDE 2
slide-3
SLIDE 3
slide-4
SLIDE 4

AX = B A ∈ Cn×n X =

  • x(1), x(2), . . . , x(L)

, B =

  • b(1), b(2), . . . , b(L)

.

slide-5
SLIDE 5

Ax = λBx 一般化固有値問題: S k = 1 2πi

  • Γ zk(zB − A)−1BVdz

˜ S k = 1 N

N−1

j=0

ω j − γ ρ k+1 (ω jB − A)−1BV A, B ∈ Rn×n, V ∈ Rn×L ˜ S 0, ˜ S 1, . . . ω j.

slide-6
SLIDE 6
slide-7
SLIDE 7

10-14 10-11 10-8 10-5 10-2 101 500 1000 1500 2000 2500 True relative residual norm Iteration number, k 10-10 10-7 10-4 10-1 500 1000 1500 2000 2500 True relative residual norm Iteration number, k

B − AXkF/BF.

XF ≡

  • n

i=1 L

j=1

|xij|2.

slide-8
SLIDE 8
slide-9
SLIDE 9
slide-10
SLIDE 10

AX = B, A ∈ Cn×n, X, B ∈ Cn×L. Rk+1 = B − AXk+1, ≡ (Qk+1Rk+1)(A) ◦ R0 R0(z) = P0(z) = IL, Rk+1(z) = Rk(z) − zPk(z)αk, Pk+1(z) = Rk+1(z) + Pk(z)βk Q0(z) = 1, Qk+1(z) = (1 − ζkz)Qk(z) αk, βk ∈ CL×L, ζk ∈ C.

  • Mk(A) ◦ V ≡

k

j=0

AjVM j Mk(z) ≡

k

j=0

zjM j M j ∈ CL×L, V ∈ Cn×L.

slide-11
SLIDE 11

ζk = Tr

  • (ATk)HTk
  • /Tr
  • (ATk)HATk
  • Rk+1 = Tk − ζkATk,

Pk+1 = Rk+1 + (Pk − ζkAPk)βk, Tk = Rk − APkαk, Xk+1 = Xk + Pkαk + ζkTk ⎫ ⎪ ⎪ ⎬ ⎪ ⎪ ⎭ Rk+1F Rk+1 ≡ (Qk+1Rk+1)(A) ◦ R0 Pk+1 ≡ (Qk+1Pk+1)(A) ◦ R0 Tk ≡ (QkRk+1)(A) ◦ R0 αk = ( ˜ RH

0 APk)−1( ˜

RH

0 Rk),

βk = −( ˜ RH

0 APk)−1( ˜

RH

0 ATk)

slide-12
SLIDE 12

X0 ∈ Cn×L is an initial guess, Compute R0 = B − AX0, Set P0 = R0, Choose ˜ R0 ∈ Cn×L, For k = 0, 1, . . . , until RkF ≤ εBF do:

Vk = APk, Solve ( ˜ RH

0 Vk)αk = ˜

RH

0 Rk for αk,

Tk = Rk − Vkαk, Zk = ATk, ζk = Tr

  • ZH

k Tk

  • /Tr
  • ZH

k Zk

  • ,

Xk+1 = Xk + Pkαk + ζkTk, Rk+1 = Tk − ζkZk, Solve ( ˜ RH

0 Vk)βk = − ˜

RH

0 Zk for βk,

Pk+1 = Rk+1 + (Pk − ζkVk)βk,

End

10-14 10-11 10-8 10-5 10-2 101 500 1000 1500 2000 2500 True relative residual norm Iteration number, k

B − AXk+1F/BF

slide-13
SLIDE 13

Xk+1 = Xk + Pkαk + ζkTk, Rk+1 = Rk − (APk)αk − ζkATk αk αk ζk

slide-14
SLIDE 14
slide-15
SLIDE 15

Pjαj, (AP j)αj

  • P jαj = P jαj + F j,

(AP j)αj = AP jαj + G j F j, G j

slide-16
SLIDE 16

Xk+1 = Xk + Pkαk + ζkTk Xk+1 Xk+1 = Xk + Pkαk + ζkTk Xk+1 = Xk + Pkαk + ζkTk = X0 +

k

j=0

P jαj +

k

j=0

ζjT j = X0 +

k

j=0

  • Pjαj + ζjT j
  • +

k

j=0

F j Pjαj P jαj = P jαj + F j

slide-17
SLIDE 17

(AP j)αj (AP j)αj = AP jαj + G j Rk+1 = Rk − (APk)αk − ζkATk Rk+1 = Rk − (APk)αk − ζkATk Rk+1 Rk+1 = Rk − (APk)αk + ζkATk = R0 −

k

j=0

(AP j)αj +

k

j=0

ζjAT j = R0 −

k

j=0

  • AP jαj + ζjAT j

k

j=0

G j

slide-18
SLIDE 18

Xk+1 = X0 +

k

j=0

  • Pjαj + ζjT j
  • +

k

j=0

F j, Rk+1 = R0 −

k

j=0

  • AP jαj + ζjAT j

k

j=0

G j B − AXk+1 = R0 −

k

j=0

  • AP jαj + ζjAT j

k

j=0

AF j = Rk+1 +

k

j=0

  • G j − AF j
  • B − AXk+1

k

j=0

  • G j − AF j
  • 誤差項

が現れる! F j, G j

slide-19
SLIDE 19

B − AXk+1 = Rk+1 +

k

j=0

  • G j − AF j
  • F j = P jαj − P jαj,

G j = (AP j)αj − AP jαj E j ≡ G j − AF j = (AP j)αj − AP jαj Block BiCGSTAB 法の残差が になっても Rk+1 = O B − AXk+1F =

k

j=0

E jF までしか減少しない!

slide-20
SLIDE 20

10-14 10-11 10-8 10-5 10-2 101 10-14 10-11 10-8 10-5 10-2 101 10 20 30 40 (True) relative residual norm Iteration number, k

  • k

j=0

E jF/BF

  • k

j=0

E jF/BF RkF/BF B − AXkF/BF

真の相対残差が 緑のラインで停滞!

slide-21
SLIDE 21
slide-22
SLIDE 22

Xk+1 = X0 +

k

j=0

  • Pjαj + ζjT j
  • +

k

j=0

F j, Rk+1 = R0 −

k

j=0

  • AP jαj + ζjAT j

k

j=0

G j B − AXk+1 = Rk+1 +

k

j=0

E j, E j = G j − AF j F j = G j = O G j = AF j A αk E j

slide-23
SLIDE 23

Rk+1 = B − AXk+1 = (Qk+1Rk+1)(A) ◦ R0 Qk+1(z) Rk+1(z) G j = AF j

slide-24
SLIDE 24

Rk+1 = (Qk+1Rk+1)(A) ◦ R0, Pk+1 = (Qk+1Pk+1)(A) ◦ R0, S k = (Qk+1Pk)(A) ◦ R0 Rk+1(z) Rk+1 = Rk − ζkARk − AUk, Pk+1 = Rk+1 + S kγk, S k = Pk − ζkAPk, Uk = S kαk, Xk+1 = Xk + ζkRk + Uk ⎫ ⎪ ⎪ ⎬ ⎪ ⎪ ⎭ αk = ( ˜ RH

0 APk)−1( ˜

RH

0 Rk),

γk = ( ˜ RH

0 Rk)−1( ˜

RH

0 Rk+1)/ζk

ζk = Tr

  • (ARk)HRk
  • /Tr
  • (ARk)HARk
  • Rk − ζkARkF
slide-25
SLIDE 25

X0 ∈ Cn×L is an initial guess, Compute R0 = B − AX0, Set P0 = R0 and V0 = W0 = AR0, Choose ˜ R0 ∈ Cn×L, For k = 0, 1, . . . , until RkF ≤ εBF do:

Solve ( ˜ RH

0 Vk)αk = ˜

RH

0 Rk for αk,

ζk = Tr

  • WH

k Rk

  • /Tr
  • WH

k Wk

  • ,

S k = Pk − ζkVk, Uk = S kαk, Yk = AUk, Xk+1 = Xk + ζkRk + Uk, Rk+1 = Rk − ζkWk − Yk, Wk+1 = ARk+1, Solve ( ˜ RH

0 Rk)γk = ˜

RH

0 Rk+1/ζk for γk,

Pk+1 = Rk+1 + Ukγk, Vk+1 = Wk+1 + Ykγk,

End

αk

slide-26
SLIDE 26

Xk+1 = Xk + ζkRk + S kαk, Rk+1 = Rk − ζkARk − A(S kαk) Xk+1 = Xk + ζkRk + S kαk, Rk+1 = Rk − ζkARk − AS kαk Xk+1 = X0 +

k

j=0

  • ζjRj + S jαj
  • +

k

j=0

Hj, Rk+1 = R0 −

k

j=0

  • ζjARj + AS jαj

k

j=0

AH j, = B − A ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣X0 +

k

j=0

  • ζjRj + S jαj
  • +

k

j=0

Hj ⎤ ⎥ ⎥ ⎥ ⎥ ⎥  = B − AXk+1 S jαj S jαj = S jαj + H j

とすると F j = H j, G j = AH j E j = G j − AF j = O

slide-27
SLIDE 27
slide-28
SLIDE 28

CPU Intel Core 2 Duo 2.4 GHz Memory 4.0GBytes Compiler Intel Fortran ver. 10.1 Compile option

  • O3 -xT -openmp

Initial solution X0 [0, 0, . . . , 0] Right hand side B [e1, e2, . . . , eL] Shadow residual ˜ R0 Random Block size L 1, 2, 4 Convergence criterion RkF/BF ≤ 1.0 × 10−14

slide-29
SLIDE 29

Matrix name Size n nnz(A) PDE900 900 4, 380 JPWH991 991 6, 027 CONF5.4-00L8X8-1000 49, 152 1, 916, 928

slide-30
SLIDE 30

10-14 10-12 10-10 10-8 10-6 10-4 10-2 100 10 20 30 40 50 60 True relative residual norm Iteration number, k

B − AXkF/BF

10-14 10-12 10-10 10-8 10-6 10-4 10-2 100 10 20 30 40 50 60 True relative residual norm Iteration number, k

slide-31
SLIDE 31

L #Iter. Time/L [s] Res. True Res. 1 53 0.0096 4.8 × 10−15 4.8 × 10−15 2 46 0.0067 1.1 × 10−15 2.0 × 10−13 4 41 0.0031 4.8 × 10−15 1.8 × 10−12 L #Iter. Time/L [s] Res. True Res. 1 53 0.0107 3.2 × 10−15 3.3 × 10−15 2 46 0.0051 1.1 × 10−15 1.4 × 10−15 4 45 0.0031 5.5 × 10−15 5.6 × 10−15

slide-32
SLIDE 32

10-14 10-11 10-8 10-5 10-2 101 10 20 30 40 50 60 True relative residual norm Iteration number, k

B − AXkF/BF

10-14 10-11 10-8 10-5 10-2 101 10 20 30 40 50 60 True relative residual norm Iteration number, k

slide-33
SLIDE 33

L #Iter. Time/L [s] Res. True Res. 1 56 0.0159 5.7 × 10−15 1.2 × 10−14 2 49 0.0083 8.3 × 10−15 4.1 × 10−13 4 43 0.0034 6.3 × 10−15 5.9 × 10−12 L #Iter. Time/L [s] Res. True Res. 1 52 0.0134 8.4 × 10−15 1.3 × 10−14 2 51 0.0082 3.7 × 10−15 6.1 × 10−15 4 44 0.0035 1.5 × 10−15 2.3 × 10−15

slide-34
SLIDE 34

10-14 10-11 10-8 10-5 10-2 101 100 200 300 400 500 600 True relative residual norm Iteration number, k

B − AXkF/BF

10-14 10-11 10-8 10-5 10-2 101 100 200 300 400 500 600 True relative residual norm Iteration number, k

slide-35
SLIDE 35

L #Iter. Time/L [s] Res. True Res. 1 555 13.9408 8.9 × 10−15 9.5 × 10−15 2 452 7.5609 7.3 × 10−15 2.5 × 10−13 4 406 6.1544 8.7 × 10−15 2.8 × 10−13 L #Iter. Time/L [s] Res. True Res. 1 555 14.2714 7.4 × 10−15 8.5 × 10−15 2 456 8.1093 5.6 × 10−15 6.7 × 10−15 4 386 6.0348 7.4 × 10−15 8.6 × 10−15

slide-36
SLIDE 36

行列サイズ:n = 1, 572, 864,非零要素数:80, 216, 064

格子QCD計算で現れる大規模行列

CPU AMD Opteron Quad Core 8356 2.3GHz × 4 Memory 32.0GBytes Compiler Intel Fortran ver. 10.1 Compile option

  • O3 -xO -openmp
slide-37
SLIDE 37

10-14 10-11 10-8 10-5 10-2 101 500 1000 1500 2000 2500 True relative residual norm Iteration number, k

B − AXkF/BF

10-14 10-11 10-8 10-5 10-2 101 500 1000 1500 2000 2500 True relative residual norm Iteration number, k

slide-38
SLIDE 38

L #Iter. Time/L [s] Res. True Res. 1 2350 1361.6 3.4 × 10−15 7.5 × 10−15 2 1758 685.1 7.2 × 10−15 1.0 × 10−12 4 1229 456.2 9.5 × 10−15 1.4 × 10−11 L #Iter. Time/L [s] Res. True Res. 1 2412 1366.8 4.4 × 10−15 7.8 × 10−15 2 1704 650.3 6.2 × 10−15 8.5 × 10−15 4 1306 475.6 9.0 × 10−15 1.1 × 10−14

slide-39
SLIDE 39
slide-40
SLIDE 40