ax b

AX = B A C n n x (1) , x (2) , . . . , x ( L ) b (1) , b (2) , - PowerPoint PPT Presentation

AX = B A C n n x (1) , x (2) , . . . , x ( L ) b (1) , b (2) , . . . , b ( L ) X = , B = . A x = B x A , B R n n , V R n L S k = 1 z k ( zB A ) 1 BV d z 2 i


  1. AX = B A ∈ C n × n � x (1) , x (2) , . . . , x ( L ) � � b (1) , b (2) , . . . , b ( L ) � X = , B = .

  2. 一般化固有値問題: A x = λ B x A , B ∈ R n × n , V ∈ R n × L S k = 1 � Γ z k ( zB − A ) − 1 BV d z 2 π i ω j . � k + 1 N − 1 � ω j − γ S k = 1 ˜ ∑ ( ω j B − A ) − 1 BV N ρ j = 0 S 0 , ˜ ˜ S 1 , . . .

  3. True relative residual norm 10 1 True relative residual norm 10 -1 10 -2 10 -4 10 -5 10 -7 10 -8 10 -10 10 -11 0 500 1000 1500 2000 2500 Iteration number, k 10 -14 0 500 1000 1500 2000 2500 � � n L Iteration number, k ∑ ∑ | x ij | 2 . � X � F ≡ i = 1 j = 1 � B − AX k � F / � B � F .

  4. ◦ AX = B , A ∈ C n × n , X , B ∈ C n × L . k ∑ A j VM j M k ( A ) ◦ V ≡ j = 0 k ∑ z j M j M k ( z ) ≡ R k + 1 = B − AX k + 1 , j = 0 ≡ ( Q k + 1 R k + 1 )( A ) ◦ R 0 M j ∈ C L × L , V ∈ C n × L . R 0 ( z ) = P 0 ( z ) = I L , Q 0 ( z ) = 1 , R k + 1 ( z ) = R k ( z ) − z P k ( z ) α k , Q k + 1 ( z ) = (1 − ζ k z ) Q k ( z ) P k + 1 ( z ) = R k + 1 ( z ) + P k ( z ) β k α k , β k ∈ C L × L , ζ k ∈ C .

  5. R k + 1 ≡ ( Q k + 1 R k + 1 )( A ) ◦ R 0 R k + 1 = T k − ζ k AT k , P k + 1 ≡ ( Q k + 1 P k + 1 )( A ) ◦ R 0 P k + 1 = R k + 1 + ( P k − ζ k AP k ) β k , T k ≡ ( Q k R k + 1 )( A ) ◦ R 0 T k = R k − AP k α k , X k + 1 = X k + P k α k + ζ k T k ⎫ α k = ( ˜ R H 0 AP k ) − 1 ( ˜ R H 0 R k ) , ⎪ ⎪ ⎬ β k = − ( ˜ R H 0 AP k ) − 1 ( ˜ R H ⎪ 0 AT k ) ⎪ ⎭ � � � � ( AT k ) H T k ( AT k ) H AT k ζ k = Tr / Tr � R k + 1 � F

  6. X 0 ∈ C n × L is an initial guess, Compute R 0 = B − AX 0 , 10 1 True relative residual norm Set P 0 = R 0 , 10 -2 Choose ˜ R 0 ∈ C n × L , 10 -5 For k = 0 , 1 , . . . , until � R k � F ≤ ε � B � F do: V k = AP k , 10 -8 Solve ( ˜ 0 V k ) α k = ˜ R H R H 0 R k for α k , 10 -11 T k = R k − V k α k , 10 -14 0 500 1000 1500 2000 2500 Z k = AT k , Iteration number, k � � � � Z H Z H ζ k = Tr k T k / Tr k Z k , � B − AX k + 1 � F / � B � F X k + 1 = X k + P k α k + ζ k T k , R k + 1 = T k − ζ k Z k , Solve ( ˜ 0 V k ) β k = − ˜ R H R H 0 Z k for β k , P k + 1 = R k + 1 + ( P k − ζ k V k ) β k , End

  7. X k + 1 = X k + P k α k + ζ k T k , R k + 1 = R k − ( AP k ) α k − ζ k AT k ζ k α k α k

  8. P j α j , ( AP j ) α j � P j α j � = P j α j + F j , � ( AP j ) α j � = AP j α j + G j � � F j , G j

  9. X k + 1 P j α j X k + 1 = X k + � P k α k � + ζ k T k X k + 1 = X k + P k α k + ζ k T k � P j α j � = P j α j + F j X k + 1 = X k + � P k α k � + ζ k T k k k ∑ ∑ = X 0 + � P j α j � + ζ j T j j = 0 j = 0 k k � � ∑ ∑ = X 0 + P j α j + ζ j T j F j + j = 0 j = 0

  10. R k + 1 ( AP j ) α j R k + 1 = R k − � ( AP k ) α k � − ζ k AT k R k + 1 = R k − ( AP k ) α k − ζ k AT k � ( AP j ) α j � = AP j α j + G j R k + 1 = R k − � ( AP k ) α k � + ζ k AT k k k ∑ ∑ = R 0 − � ( AP j ) α j � + ζ j AT j j = 0 j = 0 k k � � ∑ ∑ = R 0 − AP j α j + ζ j AT j − G j j = 0 j = 0

  11. F j , G j 誤差項 k k � � ∑ ∑ X k + 1 = X 0 + P j α j + ζ j T j F j , + k � � ∑ j = 0 j = 0 G j − AF j k k j = 0 � � ∑ ∑ R k + 1 = R 0 − AP j α j + ζ j AT j G j − が現れる! j = 0 j = 0 B − AX k + 1 k k � � ∑ ∑ B − AX k + 1 = R 0 − AP j α j + ζ j AT j − AF j j = 0 j = 0 k � � ∑ = R k + 1 + G j − AF j j = 0

  12. k � � ∑ B − AX k + 1 = R k + 1 + G j − AF j j = 0 E j ≡ G j − AF j F j = � P j α j � − P j α j , G j = � ( AP j ) α j � − AP j α j = � ( AP j ) α j � − A � P j α j � Block BiCGSTAB 法の残差が になっても R k + 1 = O k ∑ � B − AX k + 1 � F = � E j � F j = 0 までしか減少しない!

  13. 10 1 10 1 (True) relative residual norm 10 -2 10 -2 � 真の相対残差が ∑ j = 0 k 緑のラインで停滞! 10 -5 10 -5 E j � F / � B � F 10 -8 10 -8 10 -11 10 -11 10 -14 10 -14 0 10 20 30 40 Iteration number, k k ∑ � E j � F / � B � F � B − AX k � F / � B � F � R k � F / � B � F j = 0

  14. k k k � � ∑ ∑ ∑ X k + 1 = X 0 + P j α j + ζ j T j F j , B − AX k + 1 = R k + 1 + E j , + j = 0 j = 0 j = 0 E j = G j − AF j k k � � ∑ ∑ R k + 1 = R 0 − AP j α j + ζ j AT j − G j j = 0 j = 0 E j F j = G j = O G j = AF j α k A

  15. R k + 1 = B − AX k + 1 = ( Q k + 1 R k + 1 )( A ) ◦ R 0 R k + 1 ( z ) Q k + 1 ( z ) G j = AF j

  16. R k + 1 ( z ) R k + 1 = R k − ζ k AR k − AU k , R k + 1 = ( Q k + 1 R k + 1 )( A ) ◦ R 0 , P k + 1 = R k + 1 + S k γ k , P k + 1 = ( Q k + 1 P k + 1 )( A ) ◦ R 0 , S k = P k − ζ k AP k , S k = ( Q k + 1 P k )( A ) ◦ R 0 U k = S k α k , X k + 1 = X k + ζ k R k + U k ⎫ α k = ( ˜ R H 0 AP k ) − 1 ( ˜ R H 0 R k ) , ⎪ ⎪ ⎬ γ k = ( ˜ R H 0 R k ) − 1 ( ˜ R H ⎪ 0 R k + 1 ) /ζ k ⎪ ⎭ � � � � ( AR k ) H R k ( AR k ) H AR k ζ k = Tr / Tr � R k − ζ k AR k � F

  17. X 0 ∈ C n × L is an initial guess, Compute R 0 = B − AX 0 , Set P 0 = R 0 and V 0 = W 0 = AR 0 , Choose ˜ R 0 ∈ C n × L , For k = 0 , 1 , . . . , until � R k � F ≤ ε � B � F do: Solve ( ˜ 0 V k ) α k = ˜ R H R H 0 R k for α k , � � � � W H W H ζ k = Tr k R k / Tr k W k , S k = P k − ζ k V k , U k = S k α k , α k Y k = AU k , X k + 1 = X k + ζ k R k + U k , R k + 1 = R k − ζ k W k − Y k , W k + 1 = AR k + 1 , Solve ( ˜ 0 R k ) γ k = ˜ R H R H 0 R k + 1 /ζ k for γ k , P k + 1 = R k + 1 + U k γ k , V k + 1 = W k + 1 + Y k γ k , End

  18. S j α j X k + 1 = X k + ζ k R k + � S k α k � , X k + 1 = X k + ζ k R k + S k α k , � S j α j � = S j α j + H j R k + 1 = R k − ζ k AR k − A � S k α k � R k + 1 = R k − ζ k AR k − A ( S k α k ) k k � � ∑ ∑ X k + 1 = X 0 + ζ j R j + S j α j H j , + とすると F j = H j , G j = AH j j = 0 j = 0 E j = G j − AF j = O k k � � ∑ ∑ R k + 1 = R 0 − ζ j AR j + AS j α j − AH j , j = 0 j = 0 ⎡ ⎤ k k � � ⎢ ⎥ ∑ ∑ = B − A ⎢ ⎣ X 0 + ζ j R j + S j α j H j ⎥  = B − AX k + 1 + ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ j = 0 j = 0

  19. CPU Intel Core 2 Duo 2.4 GHz Memory 4.0GBytes Compiler Intel Fortran ver. 10.1 Compile option -O3 -xT -openmp Initial solution X 0 [ 0 , 0 , . . . , 0 ] Right hand side B [ e 1 , e 2 , . . . , e L ] Shadow residual ˜ R 0 Random Block size L 1 , 2 , 4 � R k � F / � B � F ≤ 1 . 0 × 10 − 14 Convergence criterion

  20. Matrix name Size n nnz( A ) PDE900 900 4 , 380 JPWH991 991 6 , 027 CONF5.4-00L8X8-1000 49 , 152 1 , 916 , 928

  21. 10 0 10 0 True relative residual norm True relative residual norm 10 -2 10 -2 10 -4 10 -4 10 -6 10 -6 10 -8 10 -8 10 -10 10 -10 10 -12 10 -12 10 -14 10 -14 0 10 20 30 40 50 60 0 10 20 30 40 50 60 Iteration number, k Iteration number, k � B − AX k � F / � B � F

  22. L #Iter. Time/ L [s] Res. True Res. 4 . 8 × 10 − 15 4 . 8 × 10 − 15 1 53 0 . 0096 1 . 1 × 10 − 15 2 . 0 × 10 − 13 2 46 0 . 0067 4 . 8 × 10 − 15 1 . 8 × 10 − 12 4 41 0 . 0031 L #Iter. Time/ L [s] Res. True Res. 3 . 2 × 10 − 15 3 . 3 × 10 − 15 1 53 0 . 0107 1 . 1 × 10 − 15 1 . 4 × 10 − 15 2 46 0 . 0051 5 . 5 × 10 − 15 5 . 6 × 10 − 15 4 45 0 . 0031

  23. 10 1 10 1 True relative residual norm True relative residual norm 10 -2 10 -2 10 -5 10 -5 10 -8 10 -8 10 -11 10 -11 10 -14 10 -14 0 10 20 30 40 50 60 0 10 20 30 40 50 60 Iteration number, k Iteration number, k � B − AX k � F / � B � F

  24. L #Iter. Time/ L [s] Res. True Res. 5 . 7 × 10 − 15 1 . 2 × 10 − 14 1 56 0 . 0159 8 . 3 × 10 − 15 4 . 1 × 10 − 13 2 49 0 . 0083 6 . 3 × 10 − 15 5 . 9 × 10 − 12 4 43 0 . 0034 L #Iter. Time/ L [s] Res. True Res. 8 . 4 × 10 − 15 1 . 3 × 10 − 14 1 52 0 . 0134 3 . 7 × 10 − 15 6 . 1 × 10 − 15 2 51 0 . 0082 1 . 5 × 10 − 15 2 . 3 × 10 − 15 4 44 0 . 0035

  25. 10 1 10 1 True relative residual norm True relative residual norm 10 -2 10 -2 10 -5 10 -5 10 -8 10 -8 10 -11 10 -11 10 -14 10 -14 0 100 200 300 400 500 600 0 100 200 300 400 500 600 Iteration number, k Iteration number, k � B − AX k � F / � B � F

  26. L #Iter. Time/ L [s] Res. True Res. 8 . 9 × 10 − 15 9 . 5 × 10 − 15 1 555 13 . 9408 7 . 3 × 10 − 15 2 . 5 × 10 − 13 2 452 7 . 5609 8 . 7 × 10 − 15 2 . 8 × 10 − 13 4 406 6 . 1544 #Iter. Time/ L [s] Res. True Res. L 7 . 4 × 10 − 15 8 . 5 × 10 − 15 1 555 14 . 2714 5 . 6 × 10 − 15 6 . 7 × 10 − 15 2 456 8 . 1093 7 . 4 × 10 − 15 8 . 6 × 10 − 15 4 386 6 . 0348

  27. 格子 QCD 計算で現れる大規模行列 � 行列サイズ: n = 1, 572, 864 ,非零要素数: 80, 216, 064 CPU AMD Opteron Quad Core 8356 2.3GHz × 4 Memory 32.0GBytes Compiler Intel Fortran ver. 10.1 Compile option -O3 -xO -openmp

  28. 10 1 10 1 True relative residual norm True relative residual norm 10 -2 10 -2 10 -5 10 -5 10 -8 10 -8 10 -11 10 -11 10 -14 10 -14 0 500 1000 1500 2000 2500 0 500 1000 1500 2000 2500 Iteration number, k Iteration number, k � B − AX k � F / � B � F

Recommend


More recommend