

  1. Revisit of Monte Carlo Methods on Solving Large-Scale Linear Systems
  Yaohang Li, Ph.D., Department of Computer Science, Old Dominion University
  yaohang@cs.odu.edu, Nov. 4, 2014 @ NIST

  2. Agenda
   Numerical Linear Algebra in the Big Data Era
   Revisit of the Ulam-von Neumann Scheme
   Breakdown-Free BCG
   Randomized SVD
   BCG with Adaptive Deflation
   Results and Applications
   Summary

  3. Numerical Linear Algebra in the Big Data Era
   Characterized by large matrices
     Million × million to billion × billion
     Most likely sparse
   Operations
     Approximating matrix-matrix / matrix-vector multiplications
     Solving linear systems with large coefficient matrices
     Finding extreme eigenvalues/eigenvectors
       Top-k eigenvalues/eigenvectors
       Low-k eigenvalues/eigenvectors
     Estimating the determinant

  4. Characteristics of Numerical Linear Algebra in the Big Data Era (1)
   Matrices may not fit in the main memory
   Matrices may not exist explicitly (e.g., stored only as a product of large matrices)
   Accessing matrix elements becomes the new bottleneck

  5. Characteristics of Numerical Linear Algebra in the Big Data Era (2)
   Matrices may be incomplete
     Some elements may be missing
     Chance of memory errors increases

  6. Characteristics of Numerical Linear Algebra in the Big Data Era (3)
   Solution precision requirement is usually not very high
     10^-2 or even 10^-1 is sometimes enough
   Conversely, computational speed (response time) is usually more important

  7. Characteristics of Numerical Linear Algebra in the Big Data Era (4)
   Modern parallel/distributed computing paradigms/architectures
     Intel Xeon Phi
     Nvidia GPGPU
     Map/Reduce paradigm

  8. A Little Bit of History
   Conjugate Gradient (CG) methods
     Developed in the 1950s
     Not considered efficient compared to direct methods
   Started to gain attention in the 1970s
     Bigger matrices with sparsity
     Efficient matrix-vector multiplication
   Powerful iterative method for sparse linear systems
   Catalyst for subsequent work
     Krylov subspace methods
     Preconditioning
     Parallel computing
     etc.
  [Golub & O'Leary, SIAM Review, 1989]

  9. Can History Repeat Itself?
   Monte Carlo methods for solving linear systems
     Developed in the 1950s by Ulam and von Neumann
     Statistical sampling
     Not considered efficient compared to deterministic methods
   Can Monte Carlo methods be resurrected in the Big Data era?
     Sampling matrices (reduced visits to matrix elements)
     Fast computation at low relative accuracy (10^-1 or 10^-2 residual error)
     Naturally parallel

  10. Ulam-von Neumann Scheme
   Construct Markov chains to sample the Neumann series
   Consider a linear system $x = Hx + b$
   Neumann series: $I + H + H^2 + H^3 + \cdots$
   If $\rho(H) < 1$, then $I + H + H^2 + H^3 + \cdots = (I - H)^{-1}$
  [Hammersley & Handscomb, 1964]
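As a quick sanity check on the convergence claim above, the sketch below iterates $x_{k+1} = H x_k + b$, which accumulates the partial sums $(I + H + \cdots + H^k)\,b$ of the Neumann series, and compares the result against solving $(I - H)x = b$ directly. The 2x2 matrix and right-hand side are illustrative assumptions, not values from the talk.

```python
import numpy as np

# Illustrative matrix with spectral radius < 1 (assumed values, not from the talk).
H = np.array([[0.1, 0.3],
              [0.2, 0.1]])
b = np.array([1.0, 2.0])
assert np.max(np.abs(np.linalg.eigvals(H))) < 1   # convergence condition rho(H) < 1

# x_{k+1} = H x_k + b accumulates the partial sums (I + H + ... + H^k) b.
x = np.zeros_like(b)
for _ in range(100):
    x = H @ x + b

# Direct solution of (I - H) x = b for comparison.
x_direct = np.linalg.solve(np.eye(2) - H, b)
```

For this choice of `H` and `b`, both routes agree on x = (2, 8/3)ᵀ to machine precision.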

  11. Ulam-von Neumann Scheme (cont.)
   Transition matrix $P$: $P_{ij} \ge 0$; $\sum_j P_{ij} \le 1$; $P_{ij} = 0$ whenever $H_{ij} = 0$
   Termination probability: $T_i = 1 - \sum_j P_{ij}$
   Random walk $\gamma: r_0 \to r_1 \to r_2 \to \cdots \to r_k$
   Then
    $$X(\gamma) = \frac{H_{r_0 r_1} H_{r_1 r_2} \cdots H_{r_{k-1} r_k}\, b_{r_k}}{P_{r_0 r_1} P_{r_1 r_2} \cdots P_{r_{k-1} r_k}\, T_{r_k}}$$
    is an unbiased estimator of $x_{r_0}$

  12. An Improved Monte Carlo Algorithm
   Monte Carlo Almost Optimal (MAO)
   An alternative transition matrix:
    $$P_{ij} = \frac{|H_{ij}|}{\sum_k |H_{ik}|}$$
   Adaptive termination with weights:
    $$W_0 = 1; \quad W_k = W_{k-1}\,\frac{H_{r_{k-1} r_k}}{P_{r_{k-1} r_k}}$$
   An alternative estimator of $x_{r_0}$:
    $$X(\gamma) = \sum_k W_k\, b_{r_k}$$
   Better accuracy (smaller variance)
  [Dimov et al., 1998]

  13. Example: One MAO Trajectory  For x_1, ε = 0.002

      H = |  0.10   0.45   0.225 |        b = | 0.225 |
          | -0.15   0.10  -0.30  |            | 1.35  |
          | -0.18   0.36   0.10  |            | 0.72  |

      gsum = (0.775, 0.55, 0.64)    (row sums of |H|)

      P = | 0.129032  0.580645  0.290323 |
          | 0.272727  0.181818  0.545455 |
          | 0.281250  0.562500  0.156250 |

   Perform one trajectory:
      Set initial values X = 0, W = 1, index = i
      Calculate X = X + W · b[index]
      While (|W| > ε)
        Generate a uniformly distributed random number ξ ∈ [0, 1)
        Set j = 1
        While (ξ > Σ_{s=1..j} P[index, s])
          j = j + 1
        End
        W = W · sign(H[index, j]) · gsum[index]
        X = X + W · b[j]
        index = j
      End
      return X

   Stepwise trace of the first transition:

      ξ        index   W      X       j
      -        1       1      0       -
      -        1       1      0.225   -
      0.3937   1       1      0.225   1
      0.3937   1       1      0.225   2
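The trajectory procedure on the example slides above translates almost line for line into the following runnable sketch (Python with NumPy is an assumption; the talk does not specify a language, and the code uses 0-based indexing where the slides use 1-based). Averaging many independent trajectories estimates x_1; for this H and b the exact solution is x = (1, 1, 1)ᵀ.

```python
import numpy as np

rng = np.random.default_rng(42)

# Matrix, right-hand side, row sums, and transition matrix from the slides.
H = np.array([[ 0.10, 0.45,  0.225],
              [-0.15, 0.10, -0.30 ],
              [-0.18, 0.36,  0.10 ]])
b = np.array([0.225, 1.35, 0.72])
gsum = np.abs(H).sum(axis=1)       # (0.775, 0.55, 0.64) as on the slide
P = np.abs(H) / gsum[:, None]      # matches the slide's transition matrix

def trajectory(index, eps=0.002):
    """One MAO trajectory estimating x_{index}; stops when |W| <= eps."""
    W, X = 1.0, 0.0
    X += W * b[index]
    while abs(W) > eps:
        xi = rng.random()          # uniform in [0, 1)
        cum = np.cumsum(P[index])
        j = min(int(np.searchsorted(cum, xi, side="right")),
                len(b) - 1)        # min() guards against fp round-off in cum
        W *= np.sign(H[index, j]) * gsum[index]   # equals H_ij / P_ij for MAO
        X += W * b[j]
        index = j
    return X

x_true = np.linalg.solve(np.eye(3) - H, b)        # exact solution: (1, 1, 1)
est = np.mean([trajectory(0) for _ in range(50_000)])
```

Because every row sum of |H| is below 1, |W| shrinks geometrically and each trajectory terminates after a couple of dozen steps at most; truncating at ε = 0.002 introduces only a small bias relative to the Monte Carlo noise.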
