

Slide 1

Revisit of Monte Carlo Methods on Solving Large-Scale Linear Systems

Yaohang Li, Ph.D.
Department of Computer Science, Old Dominion University

yaohang@cs.odu.edu

Nov. 4, 2014 @ NIST
Slide 2

Agenda

 Numerical Linear Algebra in Big Data Era
 Revisit of Ulam-von Neumann Scheme
 Breakdown-Free BCG
 Randomized SVD
 BCG with Adaptive Deflation
 Results and Applications
 Summary

Slide 3

Numerical Linear Algebra for Big Data

 Characterized by Large Matrices
   Million × Million to Billion × Billion
   Most Likely Sparse
 Operations
   Approximating Matrix-Matrix/Matrix-Vector Multiplications (a sampling sketch follows below)
   Solving Linear Systems with Large Coefficient Matrices
   Finding Extreme Eigenvalues/Eigenvectors
     Top-k Eigenvalues/Eigenvectors
     Low-k Eigenvalues/Eigenvectors
   Estimating the Determinant
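To make the sampling flavor of these operations concrete, here is a minimal Python sketch (an added illustration, not from the original deck) of approximating a matrix-vector product by column sampling, in the style of the Monte Carlo matrix multiplication of [Drineas et al. 2006]; the function name and the choice of sampling probabilities are illustrative assumptions.

```python
import numpy as np

def sampled_matvec(A, x, c, rng=None):
    """Approximate A @ x by sampling c columns of A with probabilities
    proportional to |x_j| * ||A_j||, then importance-weighting (sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    n = A.shape[1]
    p = np.abs(x) * np.linalg.norm(A, axis=0)
    p /= p.sum()
    idx = rng.choice(n, size=c, p=p)
    # Unbiased estimator: average of sampled columns A_j * x_j / p_j.
    return (A[:, idx] * (x[idx] / (c * p[idx]))).sum(axis=1)
```

Only c columns of A are touched per product, which is the point when accessing matrix elements is the bottleneck.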

Slide 4

Characteristics of Numerical Linear Algebra in Big Data Era (1)

 Matrices may not fit in the main memory
 Matrices may not exist explicitly

[Diagram: a large matrix streamed from storage through core memory to the CPU]

Accessing matrix elements becomes the new bottleneck

Slide 5

Characteristics of Numerical Linear Algebra in Big Data Era (2)

 Matrices may be incomplete
   Some elements may be missing
   Chance of memory errors increases

Slide 6

Characteristics of Numerical Linear Algebra in Big Data Era (3)

 The required solution precision is usually not very high
   10^-2 or even 10^-1 is sometimes enough
 Conversely, computational speed (response time) is usually more important

Slide 7

Characteristics of Numerical Linear Algebra in Big Data Era (4)

 Modern Parallel/Distributed Computing Paradigms/Architectures
   Nvidia GPGPU
   Intel Xeon Phi
   Map/Reduce Paradigm

Slide 8

A Little Bit of History

 Conjugate Gradient (CG) Methods
   Developed in the 1950s
   Not considered efficient compared to direct methods
   Started to gain attention in the 1970s
     Bigger matrices with sparsity
     Efficient matrix-vector multiplication
   Powerful iterative method for sparse linear systems
   Catalyst for subsequent work
     Krylov subspace methods
     Preconditioning
     Parallel computing
     etc.

[Golub & O'Leary, SIAM Review, 1989]

Slide 9

Can History Repeat Itself?

 Monte Carlo Methods for Solving Linear Systems
   Developed in the 1950s by Ulam and von Neumann
   Statistical sampling
   Not considered efficient compared to deterministic methods
 Can Monte Carlo Methods be Resurrected in the Big Data Era?
   Sampling matrices (reduced visits to matrix elements)
   Fast computation at low relative accuracy (10^-1 or 10^-2 residual error)
   Naturally parallel

Slide 10

Ulam-von Neumann Scheme

 Construct Markov chains to sample the Neumann series
 Consider a linear system

x = Hx + b

 Neumann series

I + H + H^2 + H^3 + …

 If ρ(H) < 1,

I + H + H^2 + H^3 + … = (I − H)^-1

[Hammersley & Handscomb, 1964]
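A quick numerical check of this identity (added here, assuming NumPy; the matrix H is the 3×3 example used later in the worked trajectory, for which ρ(H) < 1):

```python
import numpy as np

H = np.array([[ 0.10, 0.45,  0.225],
              [-0.15, 0.10, -0.30 ],
              [-0.18, 0.36,  0.10 ]])
assert max(abs(np.linalg.eigvals(H))) < 1     # rho(H) < 1, series converges

S, term = np.eye(3), np.eye(3)
for _ in range(200):                          # partial sum I + H + H^2 + ...
    term = term @ H
    S += term
print(np.allclose(S, np.linalg.inv(np.eye(3) - H)))   # True
```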

Slide 11

Ulam-von Neumann Scheme (cont.)

 Transition matrix P: $P_{ij} \ge 0$, $\sum_j P_{ij} \le 1$, and $P_{ij} \ne 0$ whenever $H_{ij} \ne 0$
 Termination probability $T_i = 1 - \sum_j P_{ij}$
 Random walk $\gamma\colon\ r_0 \to r_1 \to r_2 \to \cdots \to r_k$
 Then

$$X(\gamma) = \frac{H_{r_0 r_1} H_{r_1 r_2} \cdots H_{r_{k-1} r_k}}{P_{r_0 r_1} P_{r_1 r_2} \cdots P_{r_{k-1} r_k}\, T_{r_k}}\, b_{r_k}$$

is an unbiased estimator of $x_{r_0}$.

[Hammersley & Handscomb, 1964]

Slide 12

An Improved Monte Carlo Algorithm

 Monte Carlo Almost Optimal (MAO)
   An alternative transition matrix: $P_{ij} = |H_{ij}| / \sum_k |H_{ik}|$
   Adaptive termination
   Weights: $W_0 = 1$; $W_k = W_{k-1}\, H_{r_{k-1} r_k} / P_{r_{k-1} r_k}$
   An alternative estimator of $x_{r_0}$:

$$X(\gamma) = \sum_k W_k\, b_{r_k}$$

   Better accuracy (smaller variance)

[Dimov et al., 1998]

Slides 13-27

Worked Example: One MAO Trajectory

Perform one trajectory:
  Set initial values X = 0, W = 1, index = i;
  Calculate X = X + W·b_index;
  While (|W| > ε)
      Generate a uniformly distributed random number ξ ∈ [0, 1);
      Set k = 1;
      While (ξ > Σ_{s=1..k} P_{index,s})
          k = k + 1;
      End
      W = W · sign(h_{index,k}) · hsum(index);
      X = X + W·b_k;
      index = k;
  End
  return X;

For x_1, ε = 0.002:

H = [  0.10   0.45   0.225
      −0.15   0.10  −0.30
      −0.18   0.36   0.10  ]

b = (0.225, 1.35, 0.72)^T

hsum = (0.775, 0.55, 0.64)^T    (row sums of |H|)

P = [ 0.129032  0.580645  0.290323
      0.272727  0.181818  0.545455
      0.28125   0.5625    0.15625  ]

  #   index     ξ        W       X      j (sampled next index)
  1     1     0.3937     1     0.225    2
  2     2     0.2765   0.775   1.271    2
  3     2     0.7866   0.426   1.846    3
  ⋮
 13     1       -      0.001   1.837    -

The random walk terminates after 13 moves (|W| = 0.001 < ε).
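The following is a minimal Python rendering of the trajectory above (an added sketch, not the authors' code; assumes NumPy, and code indices are 0-based so x_1 is component 0). Averaging many trajectories approximates a component of the solution of x = Hx + b:

```python
import numpy as np

H = np.array([[ 0.10, 0.45,  0.225],
              [-0.15, 0.10, -0.30 ],
              [-0.18, 0.36,  0.10 ]])
b = np.array([0.225, 1.35, 0.72])

hsum = np.abs(H).sum(axis=1)        # row sums of |H|: [0.775, 0.55, 0.64]
P = np.abs(H) / hsum[:, None]       # MAO transition matrix
Pcum = np.cumsum(P, axis=1)         # cumulative rows for inverse-CDF sampling

def trajectory(i, eps=0.002, rng=None):
    """One MAO random walk estimating component x_i of x = Hx + b."""
    rng = np.random.default_rng() if rng is None else rng
    X, W, index = 0.0, 1.0, i
    X += W * b[index]
    while abs(W) > eps:
        xi = rng.random()                               # xi ~ U[0, 1)
        k = int(np.searchsorted(Pcum[index], xi, side='right'))
        W *= np.sign(H[index, k]) * hsum[index]         # = W * H[index,k] / P[index,k]
        X += W * b[k]
        index = k
    return X

rng = np.random.default_rng(0)
est = np.mean([trajectory(0, rng=rng) for _ in range(100_000)])
print(est, np.linalg.solve(np.eye(3) - H, b)[0])        # estimate vs. exact x_1
```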

Slide 28

Convergence of Ulam-von Neumann Scheme

 Confusion in the literature
   "If ||H|| > 1, the Monte Carlo method breaks down."
     [Curtiss, 1956]
     [Hammersley & Handscomb, 1964]
   "If the underlying Neumann series converges, i.e., ρ(H) < 1"
     [Wang et al., 2008]
     [Srinivasan, 2010]
     [Estep, 2009]
     [Ginting, 2010]
     etc.
   Note ρ(H) ≤ ||H||, so the two conditions are not the same

Slide 29

Case Study 1

Case 1
H = [ 0.1  0.3 ; 0.3  −0.05 ],  P = [ 0.1  0.3 ; 0.3  0.05 ]
Conditions: ||H|| < 1, ρ(H) < 1
Converged? Yes
[Plot: Var[X(γ)]]

Slide 30

Case Study 2

Case 2
H = [ 0.1  0.3 ; 0.3  −0.05 ],  P = [ 0.009  0.891 ; 0.8  0.1 ]
Conditions: ||H|| < 1, ρ(H) < 1
Converged? No
[Plot: Var[X(γ)]]

Slide 31

Case Study 3

Case 3
H = [ 0.8  0.35 ; 0.1  −0.01 ],  P = [ 0.1  0.8 ; 0.7  0.2 ]
Conditions: ||H|| > 1, ρ(H) < 1
Converged? No
[Plot: Var[X(γ)]]

Slide 32

Case Study 4

Case 4
H = [ 0.8  0.35 ; 0.1  −0.01 ],  P = [ 0.8  0.1 ; 0.7  0.2 ]
Conditions: ||H|| > 1, ρ(H) < 1
Converged? Yes
[Plot: Var[X(γ)]]

Slide 33

Necessary and Sufficient Condition of Convergence

 Convergence depends not only on the coefficient matrix H, but also on the transition matrix P
 Necessary and sufficient condition

"Given an N×N nonsingular matrix H such that ρ(H) < 1, a nonzero vector b, and a transition matrix P, the necessary and sufficient condition for convergence of the Monte Carlo linear solver using the Ulam-von Neumann scheme is ρ(H*) < 1, where H* is the N×N matrix with H*_ij = H_ij^2 / P_ij."

[Ji, Mascagni, Li, SIAM J. Numer. Anal., 2013]
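A small added script (assuming NumPy) that checks this condition on Case 2 above, where ||H|| < 1 and ρ(H) < 1 and yet the sampler diverges:

```python
import numpy as np

def rho(M):
    """Spectral radius."""
    return max(abs(np.linalg.eigvals(M)))

# Case 2 from the case studies: rho(H) < 1, but the walk does not converge.
H = np.array([[0.1, 0.3], [0.3, -0.05]])
P = np.array([[0.009, 0.891], [0.8, 0.1]])

Hstar = H**2 / P                  # H*_ij = H_ij^2 / P_ij
print(rho(H))                     # ~0.33 -> Neumann series converges
print(rho(Hstar))                 # ~1.12 -> estimator variance blows up
```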

Slide 34

Finding the Transition Matrix

[Ji, Mascagni, & Li, SIAM J. NUMER. ANAL., 2013]

Slide 35

Pros and Cons of Markov Chain Monte Carlo Algorithms

 Advantages
   Matrix structure (symmetric or non-symmetric, dense or sparse) is not important
   No need to access all elements of the matrix at once
   Naturally parallel
   Can estimate a single element instead of the whole solution vector
   Robustness: a few errors would not hurt the computation
 Disadvantages
   Slow convergence: O(N^-1/2), where N is the number of samples
   Costly to provide solutions in high precision
   Limited applicability

[Li and Mascagni, IJHPCA, 2003]

Slide 36

Reduce Passes

 From CG to Block CG (BCG)
   Fewer passes over the coefficient matrix A
   Theoretical convergence steps
     CG: n
     BCG: n/s
   Better convergence
   More suitable for parallel/distributed computing paradigms

[Diagram: Ax = b with one right-hand side vs. AX = B with s right-hand sides]

[O'Leary, 1980]

Slide 37

Rank Deficiency

 Potential breakdowns of BCG:
   Two or more vector components in the initial block residual R_0 are linearly dependent
   One or more vector components in the block residual R_i converge
   Two or more vector components in the block residual R_i become linearly dependent at iteration i

[O'Leary, 1980]

Slide 38

Breakdown-Free BCG

Breakdown-Free BCG (BFBCG) Algorithm
Input: matrix A ∈ R^{n×n}, right-hand-side matrix B ∈ R^{n×s}, initial guess X_0 ∈ R^{n×s}, preconditioner M ∈ R^{n×n}, tolerance tol, maximum number of iterations maxit
Output: an approximate solution X_sol ∈ R^{n×s}

  R_0 = B − A X_0
  Z_0 = M R_0
  P_0 = orth(Z_0)
  For i = 0, …, maxit
      Q_i = A P_i
      α_i = (P_i^T Q_i)^-1 (P_i^T R_i)
      X_{i+1} = X_i + P_i α_i
      R_{i+1} = R_i − Q_i α_i
      If converged, then stop.
      Z_{i+1} = M R_{i+1}
      β_i = −(P_i^T Q_i)^-1 (Q_i^T Z_{i+1})
      P_{i+1} = orth(Z_{i+1} + P_i β_i)
  End
  X_sol = X_{i+1}

Original BCG Algorithm
Input/Output: as above

  R_0 = B − A X_0
  Z_0 = M R_0
  P_0 = Z_0 γ_0
  For i = 0, …, maxit
      α_i = (P_i^T A P_i)^-1 γ_i^T (Z_i^T R_i)
      X_{i+1} = X_i + P_i α_i
      R_{i+1} = R_i − A P_i α_i
      If converged, then stop.
      Z_{i+1} = M R_{i+1}
      β_i = γ_i^-1 (Z_i^T R_i)^-1 (Z_{i+1}^T R_{i+1})
      P_{i+1} = (Z_{i+1} + P_i β_i) γ_{i+1}
  End
  X_sol = X_{i+1}

(Here orth(·) returns an orthonormal basis of the column space, discarding linearly dependent columns; γ_i denotes the nonsingular scaling matrices of O'Leary's formulation.)

[Ji and Li, 2014]
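To make the pseudocode concrete, here is a compact NumPy sketch of BFBCG (an added illustration, not the authors' reference implementation; the SVD-based cutoff inside orth() is one reasonable way to drop dependent columns):

```python
import numpy as np

def orth(Z, rtol=1e-12):
    """Orthonormal basis of range(Z); near-dependent columns are dropped."""
    U, sv, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, sv > rtol * sv[0]]

def bfbcg(A, B, X0=None, M=None, tol=1e-10, maxit=200):
    """Breakdown-free block CG sketch for SPD A (solves AX = B)."""
    X = np.zeros_like(B) if X0 is None else X0.copy()
    R = B - A @ X
    P = orth(R if M is None else M @ R)
    for _ in range(maxit):
        Q = A @ P
        alpha = np.linalg.solve(P.T @ Q, P.T @ R)
        X += P @ alpha
        R -= Q @ alpha
        if np.linalg.norm(R) <= tol * np.linalg.norm(B):
            break
        Z = R if M is None else M @ R
        beta = -np.linalg.solve(P.T @ Q, Q.T @ Z)
        P = orth(Z + P @ beta)
    return X

# Demo (hypothetical data): the last right-hand side is a combination of the
# first three, so the initial block residual is rank deficient, yet BFBCG
# proceeds because orth() simply drops the dependent direction.
rng = np.random.default_rng(0)
G = rng.standard_normal((50, 50))
A = G @ G.T + 50 * np.eye(50)          # SPD test matrix
B = rng.standard_normal((50, 4))
B[:, 3] = B[:, :3].sum(axis=1)
print(np.linalg.norm(B - A @ bfbcg(A, B)))   # small residual
```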

Slide 39

Results of Breakdown-Free BCG

Test matrix (6×6, symmetric):

A = [ 15   5   4   3   2   1
       5  35   9   8   7   6
       4   9  46  12  11  10
       3   8  12  50  14  13
       2   7  11  14  19  15
       1   6  10  13  15  45 ]

Case 1: The block residual R_i without rank deficiency
  R_0 entries: 0.537266261211281, 0.043775211060964, 0.964458562037146, 0.622317517840541, 0.552735938776748, 0.023323943544997

Case 2: Columns in the initial block residual R_0 are linearly dependent
  R_0 entries: 10, 20, 30, 40, 50, 60

Case 3: Convergence of one or more but not all columns in the block residual R_i
  R_0 entries: −8.888614458250306, −10.999025290685955, −19.339674247091921, −10.289152668326622, 18.107579559267656, −8.930794511222629

Case 4: Columns in the block residual R_i become linearly dependent during iterations
  R_0 entries: 0.027212780358615, 0.117544343373396, 0.140184539179715, 0.605659566833592, 0.323269030695212, 0.590821508384101

[Plots for each case: rank of R_i and residual norms of the columns of R_i vs. iteration number]

[Ji and Li, 2014]

Slide 40

Randomized Singular Value Decomposition

 Statistically sampling a large matrix to approximate the top-k singular values/vectors

[Diagrams: column sampling, row sampling, and Gaussian sampling of a large matrix]
[Drineas et al. 2006] [Halko et al. 2011]

Slide 41

Gaussian Sampling

Two passes over A [Halko et al. 2011]:

  Y = A Ω          (pass 1: sample the range of A with a Gaussian matrix Ω)
  Y = Q R          (QR factorization: orthonormal basis Q)
  B = Q^T A        (pass 2: project A onto the basis)
  B = Ũ Σ V^T      (small SVD)
  U = Q Ũ          (so A ≈ U Σ V^T)
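These five steps translate directly into a short NumPy sketch (added here; the function name, target rank k, and oversampling parameter p are illustrative assumptions):

```python
import numpy as np

def randomized_svd(A, k, p=10, rng=None):
    """Gaussian-sampling randomized SVD, two passes over A,
    after Halko et al. 2011. k: target rank, p: oversampling."""
    rng = np.random.default_rng() if rng is None else rng
    Omega = rng.standard_normal((A.shape[1], k + p))  # Gaussian test matrix
    Y = A @ Omega                                     # pass 1: sample range(A)
    Q, _ = np.linalg.qr(Y)                            # orthonormal basis
    B = Q.T @ A                                       # pass 2: project A
    Ut, S, Vt = np.linalg.svd(B, full_matrices=False) # small SVD
    return (Q @ Ut)[:, :k], S[:k], Vt[:k]             # A ~= U S Vt
```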

Slide 42

Tall-and-Skinny Matrices

 For a tall-and-skinny H distributed by row blocks H_1, …, H_4:
   Map/shuffle/reduce: accumulate H^T H = Σ_i H_i^T H_i
   Cholesky decomposition: H^T H = R^T R
   Map/reduce: Q_i = H_i R^-1, giving the distributed Q of H = Q R
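A single-node NumPy sketch of this Cholesky-based tall-and-skinny QR (added; in the distributed version each worker would contribute its local H_i^T H_i to the reduce step, and apply R^-1 to its own block):

```python
import numpy as np

def cholesky_qr(H):
    """QR of a tall-and-skinny H via the normal equations (sketch)."""
    G = H.T @ H                      # Gram matrix (the reduce step)
    R = np.linalg.cholesky(G).T      # G = R^T R, R upper triangular
    Q = H @ np.linalg.inv(R)         # Q = H R^{-1}, blockwise in parallel
    return Q, R
```

Only the small s×s Gram matrix crosses the network, which is why the scheme fits the map/reduce paradigm.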

Slide 43

Inverse Randomized SVD

 Inverse process of randomized SVD
 Approximate the low-k singular values/vectors (eigenvalues/eigenvectors)

[Diagram: X A = Y → SVD of Y = U Σ V^T → Ω chosen by importance sampling]

Slide 44

BCG with Adaptive Deflation

 BCG with Adaptive Deflation
   BCG + inverse randomized SVD
   Deflation
     Approximated low-k eigenvectors
   Importance Sampling
     Refined approximated low-k eigenvectors
 Advantages
   Reduced passes
   Fixed memory bound
   Accelerated convergence
   No preconditioning is needed

[Diagram: solve Ax = b; sample with Ω to obtain the deflation block W; after k iterations refine W_k by importance sampling; pick new search directions P_{k+1} ⊥ W_k; repeat. A projection sketch follows below.]
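The deflation step itself is simple to sketch (an added illustration; the names are hypothetical and the block W is assumed to have orthonormal columns):

```python
import numpy as np

def deflate(Z, W):
    """Remove from the candidate directions Z their components along the
    approximate eigenvector block W, so new search directions satisfy
    P ⊥ W (sketch; W is assumed to have orthonormal columns)."""
    return Z - W @ (W.T @ Z)

# Inside a block CG iteration one would deflate before re-orthonormalizing:
#   P_next = orth(deflate(Z_next + P @ beta, W_k))
```

Projecting out the slowly converging eigendirections is what accelerates convergence without an explicit preconditioner.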

Slide 45

Performance of BCG with Adaptive Deflation

Slide 46

Number of Passes

Slide 47

BCG with Adaptive Deflation using GPU as Co-Processor

Block CG iteration (cf. Slide 38):

  Q_i = A P_i
  α_i = (P_i^T Q_i)^-1 (P_i^T R_i)
  X_{i+1} = X_i + P_i α_i
  R_{i+1} = R_i − Q_i α_i
  Z_{i+1} = M R_{i+1}
  β_i = −(P_i^T Q_i)^-1 (Q_i^T Z_{i+1})
  P_{i+1} = orth(Z_{i+1} + P_i β_i)

[Bar chart: elapsed time in seconds, CPU-only vs. GPU-only]

 2.63× speedup
 GPU (K20) peak performance = 1134 GFLOPS; CPU (Xeon E5) peak performance = 345 GFLOPS
 Data transfer time between system memory and GPU memory is hidden

[Ji, Sosonkina, Li, Co-HPC, 2014]

Slide 48

Finding the Smallest Eigenvalues/Eigenvectors

 Find the smallest k eigenvalues/eigenvectors using BCG with adaptive deflation
 Approximating the smallest k eigenvalues/eigenvectors
   Reduced access to the big matrix A
   Memory efficient
   Accelerated convergence with deflation

[Diagram: X A = Y → SVD of Y = U Σ V^T → Ω from importance sampling feeding the deflation]

Slide 49

Mars Image Compression

(a) Original image (7680×7671)
(b) Reproduced image after reconstruction (Err = 0.99%)

The overall volume of the compressed factors is less than 1/8 of the original image, with less than 1% error in the Frobenius norm (k = 472).
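As a quick check of the 1/8 figure (added here, assuming the rank-472 factors U (7680×472), V (7671×472), and the 472 singular values are what is stored): (7680 + 7671 + 1) × 472 = 7,246,144 stored values, versus 7680 × 7671 = 58,913,280 original pixels, a ratio of roughly 0.123, consistent with the claim.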

Slide 50

Application in ab initio Nuclear Physics Computation

 Expand the wave function in basis states
 Express the Hamiltonian matrix H in the basis
 Diagonalize the Hamiltonian matrix
 Complete basis → exact result
   Caveat: the complete basis is infinite-dimensional
 In practice
   Truncate the basis to obtain a many-body problem with 2-body interactions
   Study the behavior of observables as a function of the truncation
 Computational challenges
   Construct a large sparse symmetric real matrix H
   Get the lowest eigenvalues and eigenvectors

[Plots: subspace distance vs. number of iterations, for the smallest eigenspace and for the 10 smallest eigenspaces (rank 5); randomized block algorithm vs. Lanczos with full reorthogonalization]

[Collaboration with Dr. Sosonkina]

Slide 51

Image Segmentation

 Normalized Cut
   Graph cut: given a partition (A, B) of the vertex set V, Ncut(A, B) measures the similarity between the two groups:

$$\mathrm{cut}(A,B) = \sum_{u \in A,\, v \in B} w(u,v), \qquad \mathrm{assoc}(A,V) = \sum_{u \in A,\, t \in V} w(u,t)$$

$$\mathrm{Ncut}(A,B) = \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(A,V)} + \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(B,V)}$$

   Optimal cut (relaxed form):

$$\min \mathrm{Ncut}(G) = \min_y \frac{y^T (D - W)\, y}{y^T D\, y}$$

   Segmentation based on the eigenvectors of the smallest k eigenvalues
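A small added sketch of the relaxed problem (the function name and the use of SciPy are assumptions): the segmentation eigenvectors come from the generalized eigenproblem (D − W) y = λ D y.

```python
import numpy as np
from scipy.linalg import eigh

def ncut_vectors(W, k):
    """Relaxed normalized cut: return the eigenvectors of the k smallest
    eigenvalues of (D - W) y = lambda * D y (sketch; W is a symmetric
    affinity matrix with positive row sums)."""
    D = np.diag(W.sum(axis=1))
    vals, vecs = eigh(D - W, D)          # ascending eigenvalues
    return vals[:k], vecs[:, :k]         # cluster/threshold these to segment
```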

Slide 52

Discovery of Protein Fragment Motifs

Slide 53

Summary

 Revisit of Ulam-von Neumann Scheme

 Necessary and Sufficient Condition for Convergence

 Breakdown-Free BCG

 Avoid Potential Breakdowns due to Rank Deficiency

 BCG with Adaptive Deflation

 Inverse Randomized SVD for Adaptive Deflation

 Applications

 Image Compression
 Image Segmentation
 Nuclear Physics
 Protein Fragment Motif Discovery

Slide 54

Research Team @ ODU

  • Dr. Ashraf Yaseen
  • Dr. Yun Han
  • Hao Ji
  • Wessam Elhefhawy
  • Kyle Wessells
  • Akeem Edwards
  • Thomas Goldsmith
  • Maha Abdelaal
  • Adam Boudion

Slide 55

Acknowledgements

 NSF HRD-0450203
 NSF EMT CCF-0829382
 NSF CAREER CCF-1066471
 NCSA Faculty Fellowship, 2007
 Ralph E. Powe Young Faculty Award, 2005
 Summer Faculty Participation Program, 2006 and 2008
 ODU MSF Seed Grant, 2013
 ODU Undergraduate Research Program, 2011, 2012

Slide 56

References

  • H. Ji, M. Mascagni, Y. Li, Convergence analysis of Markov chain Monte Carlo linear solvers using Ulam-von Neumann algorithm, SIAM J. Numer. Anal., 51(4): 2107-2122, 2013.
  • H. Ji, Y. Li, Breakdown-free block conjugate gradient, SIAM J. Numer. Anal., under review, 2014.
  • J. M. Hammersley, D. C. Handscomb, Monte Carlo Methods, Chapman and Hall, London, 1964.
  • J. H. Curtiss, A theoretical comparison of the efficiencies of two classical methods and a Monte Carlo method for computing one component of the solution of a set of linear algebraic equations, in Proceedings of Symposium on Monte Carlo Methods, John Wiley and Sons, New York, pp. 191-233, 1956.
  • Q. Wang, D. Gleich, A. Saberi, N. Etemadi, P. Moin, A Monte Carlo method for solving unsteady adjoint equations, J. Comput. Phys., 227(12): 6184-6205, 2008.
  • A. Srinivasan, Monte Carlo linear solvers with non-diagonal splitting, Mathematics and Computers in Simulation, 80(6): 1133-1143, 2010.
  • D. Estep, A. Målqvist, S. Tavener, Nonparametric density estimation for randomly perturbed elliptic problems II: Applications and adaptive modeling, International Journal for Numerical Methods in Engineering, 80: 846-86, 2009.
  • V. Ginting, A. Målqvist, M. Presho, A novel method for solving multiscale elliptic problems with randomly perturbed data, SIAM Multiscale Model. Simul., 8(3): 977-996, 2010.
  • P. Drineas, R. Kannan, M. W. Mahoney, Fast Monte Carlo algorithms for matrices II: Computing a low-rank approximation to a matrix, SIAM J. Comput., 36(1), 2006.
  • N. Halko, P. G. Martinsson, J. A. Tropp, Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions, SIAM Rev., 53(2): 217-288, 2011.
  • H. Ji, M. Sosonkina, Y. Li, An implementation of block conjugate gradient algorithm on CPU-GPU processors, Proc. of Hardware-Software Co-Design for High Performance Computing (Co-HPC'14), 2014.
  • H. Ji, Y. Li, Block least square, in preparation, 2014.
  • H. Ji, Y. Li, Block conjugate gradient with adaptive deflation using inverse randomized singular value decomposition, in preparation, 2014.
  • D. P. O'Leary, The block conjugate gradient algorithm and related methods, Linear Algebra Appl., 29: 293-322, 1980.
  • J. Chen, A deflated version of the block conjugate gradient algorithm with an application to Gaussian process maximum likelihood estimation, Preprint ANL/MCS-P1927-0811, Argonne National Laboratory, Argonne, IL, 2011.
  • G. H. Golub, D. P. O'Leary, Some history of the conjugate gradient and Lanczos methods, SIAM Rev., 31(1): 50-102, 1989.
  • I. T. Dimov, T. T. Dimov, T. V. Gurov, A new iterative Monte Carlo approach for inverse matrix problem, J. Comput. Appl. Math., 92: 15-35, 1998.
  • Y. Li, M. Mascagni, Analysis of large-scale grid-based Monte Carlo applications, International Journal of High Performance Computing Applications (IJHPCA), 17(4): 369-382, 2003.