
Numerical Error Analysis for Statistical Software on Multi-Core Systems

Numerical Error Analysis for Statistical Software on Multi-Core Systems. Wenbin Li, Sven Simon (liwn@ipvs.uni-stuttgart.de), Institute of Parallel and Distributed Systems


  1. Numerical Error Analysis for Statistical Software on Multi-Core Systems
     Wenbin Li, Sven Simon (liwn@ipvs.uni-stuttgart.de)
     Institute of Parallel and Distributed Systems, University of Stuttgart, Germany

  2. Outline
     1. Numerical Errors and Accuracy Control
     2. A Method for Accuracy Control
        • e.g. Result: $9.676383250285714 \times 10^{3}$, of which the first 12 digits are accurate and the remaining digits are contaminated by rounding errors. Accuracy: 12 digits
     3. Applied to a Statistical Software: R
     4. Parallelization on Multi-Core Systems
     5. Conclusion

  3. Disasters Caused by Numerical Errors
     "Numerical precision is the very soul of science." -- D'Arcy Wentworth
     http://ta.twi.tudelft.nl/users/vuik/wi211/disasters.html
     • 28 persons killed!!!
     • $7 billion lost!!!
     • $700 million lost!!

  4. Schemes to Evaluate Numerical Quality
     • Compare the result with a high precision reference
        • The reference is obtained by repeating the computation in arithmetic of increasing precision, e.g. π = 3.14 → 3.14159265 → 3.141592653589793238462…
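The idea of repeating a computation in arithmetic of increasing precision can be sketched in Python with the standard `decimal` module. This is only an illustration, not the authors' tooling: the data set, the one-pass variance formula, and the helper names `one_pass_variance` and `high_precision_reference` are assumptions chosen to make the effect visible.

```python
from decimal import Decimal, localcontext

# Hypothetical data: large offset, small spread. The exact sample variance is 30.
DATA = [1000000004, 1000000007, 1000000013, 1000000016]

def one_pass_variance(data, number):
    """Textbook one-pass variance; `number` converts the inputs (float or Decimal)."""
    xs = [number(x) for x in data]
    n = number(len(xs))
    sum_sq = sum((x * x for x in xs), number(0))
    sum_x = sum(xs, number(0))
    return (sum_sq - sum_x * sum_x / n) / (n - number(1))

def variance_at_precision(data, prec):
    """Run the same formula in decimal arithmetic with `prec` significant digits."""
    with localcontext() as ctx:
        ctx.prec = prec
        return one_pass_variance(data, Decimal)

def high_precision_reference(data, digits, start_prec=10, max_prec=320):
    """Double the precision until two successive runs agree to `digits` digits.

    Agreement of two runs is a heuristic stopping rule, not a proof of accuracy.
    """
    tol = Decimal(10) ** (-digits)
    previous = variance_at_precision(data, start_prec)
    prec = start_prec * 2
    while prec <= max_prec:
        current = variance_at_precision(data, prec)
        if abs(current - previous) <= tol * abs(current):
            return current, prec
        previous, prec = current, prec * 2
    raise ArithmeticError("no agreement up to max_prec")

naive = one_pass_variance(DATA, float)            # double precision result
reference, used_prec = high_precision_reference(DATA, digits=12)
print("double precision:", naive)
print("reference:", reference, "reached at", used_prec, "digits")
```

The double precision result differs from the reference because the squared values exceed what a 53-bit significand can hold exactly; the reference only becomes trustworthy once the working precision covers all digits of the intermediate sums.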

  5. Schemes to Evaluate Numerical Quality
     • Compare the result with a high precision reference
        • How many digits are sufficient?
        • Low performance:
           • Single precision performance >> double precision performance (e.g. GPU, CellBE, etc.).
           • Arbitrary high precision is several orders of magnitude slower.
        • Hard to modify source code to generate a high precision reference:
           • Millions of lines of source code.
           • Unknown libraries: is the source code available (readable)?
           • Mismatch in memory:
              • Fortran: common
              • C/C++: union

  6. Schemes to Evaluate Numerical Quality
     • Compare the result with a high precision reference
        • The reference is obtained by repeating the computation in arithmetic of increasing precision, e.g. π = 3.14 → 3.14159265 → 3.141592653589793238462…
     • Interval arithmetic
        • e.g. "Your result is in $(-\infty, +\infty)$, guaranteed"
     • Probabilistic error analysis
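To illustrate the interval arithmetic bullet, here is a minimal sketch of an interval type with outward rounding. It assumes Python 3.9+ for `math.nextafter`; the class `Interval` and its methods are illustrative names, and widening each bound by one unit in the last place is a simple stand-in for true directed rounding.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    @staticmethod
    def around(x):
        """Enclose the real number that the float x approximates (one ulp each side)."""
        return Interval(math.nextafter(x, -math.inf), math.nextafter(x, math.inf))

    def __add__(self, other):
        # Round the lower bound down and the upper bound up after each operation.
        return Interval(math.nextafter(self.lo + other.lo, -math.inf),
                        math.nextafter(self.hi + other.hi, math.inf))

    def __sub__(self, other):
        return Interval(math.nextafter(self.lo - other.hi, -math.inf),
                        math.nextafter(self.hi - other.lo, math.inf))

    def width(self):
        return self.hi - self.lo

# Summing 0.1 ten times: the enclosure is guaranteed to contain the real value 1.
total = Interval(0.0, 0.0)
for _ in range(10):
    total = total + Interval.around(0.1)
print(total, total.width())

# Dependency problem: x - x is not recognised as 0, so the bounds grow.
x = Interval(1.0, 2.0)
diff = x - x
print(diff)
```

The second example hints at why the guaranteed enclosure can become useless in long computations: every operation treats its operands as independent, so the bounds can widen until they say little about the result.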

  7. Probability Distribution of the Rounding Error
     • The calculated result in a program: $\hat{R} = r + \varepsilon_{\text{rounding}}$
        • $r$: exact result
        • $\hat{R}$: approximation of $r$ due to rounding error
        • $\varepsilon_{\text{rounding}}$: rounding error
     • First order model ([1]): $\varepsilon_{\text{rounding}} = \sum_{i=1}^{n} g_i(d) \cdot 2^{-p} \cdot \alpha_i + O(2^{-2p})$
     • Assume a probability distribution for the rounding error of each elementary operation; by the central limit theorem, $\hat{R}$ then follows a Gaussian distribution.
     [Figure: pdf of $\hat{R}$ around $r$ after the elementary operations OP 1, OP 2, ..., OP N; the standard deviation $\sigma(\hat{R})$ is unknown.]

  8. Probabilistic Rounding Error Analysis (PREA)
     • Consider a sequence of computations providing an exact result $r$.
     • Perform the computations $N$ times with random rounding (randomly choosing rounding to $+\infty$ or $-\infty$); $N$ results $\hat{R}_i$, $i = 1, \dots, N$, are obtained.
     • The computed result is $\bar{R} = \frac{1}{N} \sum_{i=1}^{N} \hat{R}_i$
     • Instead of $\sigma(\hat{R})$, we have the sample variance: $s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (\hat{R}_i - \bar{R})^2$
     • Define $T = \frac{\bar{R} - r}{s / \sqrt{N}}$, a sample from Student's t-distribution.
     • Probability density function: $f(T = t) = \frac{\Gamma\left(\frac{\nu + 1}{2}\right)}{\sqrt{\nu \pi} \cdot \Gamma\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu + 1}{2}}$
     [Figure: the same input is fed to N copies of the user's program, each run with random rounding, producing $\hat{R}_1, \hat{R}_2, \dots, \hat{R}_N$.]
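The procedure on this slide can be sketched in pure Python. This is a simplified emulation under stated assumptions, not the authors' implementation: random rounding is imitated in software by computing each operation exactly with `fractions.Fraction` and then picking one of the two neighbouring doubles at random. The names `random_round`, `rr_add`, `user_program` and `prea_samples` are hypothetical, and `math.nextafter` requires Python 3.9+.

```python
import math
import random
import statistics
from fractions import Fraction

def random_round(exact, rng):
    """Round an exact rational to the double just below or just above it, at random."""
    nearest = float(exact)                  # a double adjacent to the exact value
    if Fraction(nearest) == exact:
        return nearest                      # representable: no rounding needed
    if Fraction(nearest) < exact:
        lo, hi = nearest, math.nextafter(nearest, math.inf)
    else:
        lo, hi = math.nextafter(nearest, -math.inf), nearest
    return hi if rng.random() < 0.5 else lo

def rr_add(a, b, rng):
    """Addition with emulated random rounding towards +inf or -inf."""
    return random_round(Fraction(a) + Fraction(b), rng)

def user_program(rng, terms=1000):
    """Stand-in for the user's code: accumulate 0.1 a thousand times."""
    total = 0.0
    for _ in range(terms):
        total = rr_add(total, 0.1, rng)
    return total

def prea_samples(program, n_runs, seed=0):
    """Run the program N times, each with its own random rounding sequence."""
    return [program(random.Random(seed + i)) for i in range(n_runs)]

samples = prea_samples(user_program, n_runs=5)
mean = statistics.fmean(samples)            # computed result R-bar
s = statistics.stdev(samples)               # sample standard deviation (N - 1 in the denominator)
print(samples)
print("mean =", mean, " s =", s)
```

The spread `s` of the N results is what replaces the unknown standard deviation of the rounding error in the statistic $T$.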

  9. Number of Significant Digits
     • Confidence interval: $\Pr\left(-\tau_\beta < \frac{\bar{R} - r}{s / \sqrt{N}} < \tau_\beta\right) = \beta$
     • With probability $\beta$, the number of significant digits (i.e. accurate digits) of $\bar{R}$ is $N_{\text{significant digits of } \bar{R}} \ge \log_{10}\left(\frac{\sqrt{N} \cdot |\bar{R}|}{\tau_\beta \cdot s}\right)$
     • Two hypotheses should hold ([1, 2]):
        • The random rounding error is Gaussian distributed. This is not critical because of the robustness of Student's t-test.
        • The first order approximation is legitimate. This is checked by self validation:
           • The operands of a multiplication must be significant.
           • The divisor of a division must be significant.
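The bound on the number of significant digits can be evaluated directly from the N results. The sketch below assumes β = 0.95 and uses a small table of two-sided Student's t quantiles; the function name `significant_digits` and the three sample values are made up for illustration.

```python
import math
import statistics

# Two-sided Student's t quantiles tau_beta for beta = 0.95,
# keyed by the number of runs N (degrees of freedom nu = N - 1).
TAU_95 = {2: 12.706, 3: 4.303, 4: 3.182, 5: 2.776, 10: 2.262}

def significant_digits(samples, tau):
    """Lower bound on the accurate decimal digits of the mean of `samples`."""
    n = len(samples)
    mean = statistics.fmean(samples)
    s = statistics.stdev(samples)
    if s == 0.0:
        return math.inf      # all runs agree: no rounding error was observed
    if mean == 0.0:
        return 0.0           # a zero mean carries no significant digit
    return math.log10(math.sqrt(n) * abs(mean) / (tau * s))

# Hypothetical results of N = 3 runs with random rounding.
runs = [9676.383250283, 9676.383250285, 9676.383250287]
digits = significant_digits(runs, TAU_95[len(runs)])
print("estimated significant digits:", math.floor(digits))
```

The bound follows from the confidence interval: $|\bar{R} - r| < \tau_\beta \cdot s / \sqrt{N}$ with probability $\beta$, and dividing by $|\bar{R}|$ and taking $-\log_{10}$ turns this relative error bound into a digit count.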

  10. Applied to Complex Statistical Software
      R (v2.10.1): a software for statistical computing and graphics.
      Benchmarks: NIST StRD (a collection of data sets and certified values)
      • Accuracy estimation using PREA: $\log_{10}\left(\frac{\sqrt{N} \cdot |\bar{R}|}{\tau_\beta \cdot s}\right)$
      • Reference (true accuracy), with $r$ the certified value from StRD: $\text{LRE} = -\log_{10}\left(\frac{|R - r|}{|r|}\right)$

      UNIV benchmark   Mean (mean)    Standard deviation (sd)   Autocorrelation (acf)
                       PREA  True     PREA  True                PREA  True
      Mavro            15    15       13    13                  13    14
      Numacc3          15    15       10    10                  11    10

  11. Applied to Complex Statistical Software
      R (v2.10.1): a software for statistical computing and graphics.
      Benchmarks: NIST StRD (a collection of data sets and certified values)
      • Accuracy estimation using PREA: $\log_{10}\left(\frac{\sqrt{N} \cdot |\bar{R}|}{\tau_\beta \cdot s}\right)$
      • Reference (true accuracy), with $r$ the certified value from StRD: $\text{LRE} = -\log_{10}\left(\frac{|R - r|}{|r|}\right)$

      Number of significant digits:
      UNIV benchmark   Mean (mean)    Standard deviation (sd)   Autocorrelation (acf)
                       PREA  True     PREA  True                PREA  True
      Mavro            15    15       13    13                  13    14
      Numacc3          15    15       10    10                  11    10
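The reference accuracy (LRE) used in the "True" columns is straightforward to compute once a certified value is available. A minimal sketch follows; the function name `lre` and the example values are illustrative, and the case of a certified value of zero is simply rejected here because the slide's formula does not cover it.

```python
import math

def lre(computed, certified):
    """Log relative error: roughly the number of correct significant digits."""
    if certified == 0.0:
        raise ValueError("LRE is undefined for a certified value of zero")
    if computed == certified:
        return math.inf
    return -math.log10(abs(computed - certified) / abs(certified))

# Example: a six-digit approximation of pi compared with the double precision value.
approx_digits = lre(3.14159, math.pi)
print(round(approx_digits, 2))
```

In practice the LRE is capped at the number of digits of the certified value, since agreement beyond that cannot be verified.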

  12. Applied to Complex Statistical Software
      Number of significant digits (PREA estimate and true value):

      ANOVA benchmark (aov)   SST            SSE            MSE            F-statistic
                              PREA  True     PREA  True     PREA  True     PREA  True
      SiRstv                  12    12       12    13       12    13       12    12
      SmLs08                  4     4        2     2        2     2        2     2

      LINR benchmark (lm)     R^2            Coefficient    RSD            F-statistic
                              PREA  True     PREA  True     PREA  True     PREA  True
      Norris                  12    13       14    14       15    15       13    14
      Wampler5                5     6        15    15       14    14       14    15

      NLINR benchmark (nls)   Coefficient i  Coefficient j  RSS            RSD
                              PREA  True     PREA  True     PREA  True     PREA  True
      Lanczos2                5     5        6     6        7     8        8     8
      Bennet5                 4     4        4     5        9     9        9     9

                              Exact estimation   Underestimation by 1 digit   Overestimation by 1 digit
      Relative frequency      67%                29%                          4%

      If underestimation by 1 digit is tolerable: reliable estimation in 96% of the cases.

  13. Parallelization for Acceleration
      • Why do we need parallel computing?
         • Running the code multiple times is computationally expensive.
         • Multi-core processors are the mainstream hardware architecture.
         • To harvest the performance potential of multi-core processors.
      • Parallel execution
         • To get N random rounding results, the user's code can be executed concurrently on different CPU cores.
         • Communication between threads is necessary to:
            • make a unified decision for instructions such as "IF() THEN".
            • perform Self Validation (SV).
            • perform Numerical Instabilities Checking (NIC), i.e. detect sudden accuracy loss due to cancellation in '+' or '-'.
      • How to minimize the communication and synchronization overhead?
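The need for a unified branch decision can be sketched with Python threads. This is an assumption-laden illustration rather than the authors' design: the class `BranchCoordinator` is hypothetical, the decision rule (compare the mean of the N values) is one plausible choice, and `perturb` is a crude stand-in for random rounding that moves a result by one unit in the last place. Because of CPython's global interpreter lock the threads here show the synchronization pattern only, not a speed-up.

```python
import math
import random
import statistics
import threading

class BranchCoordinator:
    """Lets N copies of a program agree on one outcome for a comparison."""

    def __init__(self, n_copies):
        self.values = [0.0] * n_copies
        self.threshold = 0.0
        self.decision = False
        # The action runs in exactly one thread once all copies have arrived.
        self.barrier = threading.Barrier(n_copies, action=self._decide)

    def _decide(self):
        self.decision = statistics.fmean(self.values) < self.threshold

    def less_than(self, copy_id, value, threshold):
        """Synchronised replacement for `value < threshold`."""
        self.values[copy_id] = value
        self.threshold = threshold          # identical in every copy
        self.barrier.wait()
        return self.decision

def perturb(x, rng):
    """Crude stand-in for random rounding: step one ulp up or down."""
    direction = math.inf if rng.random() < 0.5 else -math.inf
    return math.nextafter(x, direction)

def user_program(copy_id, coord, steps_out):
    rng = random.Random(copy_id)
    total, steps = 0.0, 0
    # Every copy takes the same branch, even if its own total differs slightly.
    while coord.less_than(copy_id, total, 1.0):
        total = perturb(total + 0.1, rng)
        steps += 1
    steps_out[copy_id] = steps

N = 4
coord = BranchCoordinator(N)
steps = [0] * N
threads = [threading.Thread(target=user_program, args=(i, coord, steps)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(steps)
```

Without the shared decision, copies whose totals land on different sides of the threshold would execute a different number of iterations, and their results would no longer be samples of the same computation. The barrier at every branch is exactly the kind of synchronization cost the last bullet asks about.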

  14. Parallelization with Asynchronous SV & NIC
      [Figure: a random number generator feeds the computation with random rounding arithmetic. Copies 1 to N of the code run from Start to End as exe_thread 1 ... exe_thread N and exchange branch decisions. Each copy writes operands and intermediate results for SV & NIC into its buffers, buf_SV,NIC(1,1), buf_SV,NIC(1,2), ..., buf_SV,NIC(N,1), buf_SV,NIC(N,2), which are multiplexed to the self validation and numerical instability checking (SN) threads SN_thread 1 ... SN_thread M.]
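The asynchronous checking scheme in the figure resembles a producer/consumer pattern, which can be sketched with `queue.Queue`. Everything below is an assumption for illustration: the names `exe_thread` and `sn_thread` mirror the figure labels, one queue stands in for each buffer, the cancellation test (decimal digits lost in an addition) is one simple way to detect instability, and random rounding is omitted to keep the sketch short.

```python
import math
import queue
import threading

def cancellation_digits(a, b, result):
    """Decimal digits lost when a + b cancels (0 if there is no cancellation)."""
    if result == 0.0:
        return math.inf if (a != 0.0 or b != 0.0) else 0.0
    return max(0.0, math.log10(max(abs(a), abs(b)) / abs(result)))

def exe_thread(copy_id, data, buf):
    """One copy of the user's code: computes and hands operands to a checker."""
    total = 0.0
    for x in data:
        new_total = total + x
        buf.put((copy_id, total, x, new_total))   # enqueue without waiting for the check
        total = new_total
    buf.put(None)                                 # signal that this copy has finished

def sn_thread(buf, n_producers, flagged, limit=8.0):
    """Checker: consumes records and flags sudden accuracy loss."""
    finished = 0
    while finished < n_producers:
        item = buf.get()
        if item is None:
            finished += 1
            continue
        copy_id, a, b, result = item
        if cancellation_digits(a, b, result) >= limit:
            flagged.append((copy_id, a, b, result))

N, M = 4, 2                                       # N computing copies, M checker threads
buffers = [queue.Queue() for _ in range(M)]
flagged = []
data = [1.0e10, 1.0, -1.0e10]                     # the last addition cancels 10 digits
checkers = [threading.Thread(target=sn_thread, args=(buffers[j], N // M, flagged))
            for j in range(M)]
workers = [threading.Thread(target=exe_thread, args=(i, data, buffers[i % M]))
           for i in range(N)]
for t in checkers + workers:
    t.start()
for t in checkers + workers:
    t.join()
print(sorted(flagged))
```

Because the computing threads only enqueue records and continue, the checks run off the critical path; the computing copies never block on the checkers.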
