
Efficient implementation of elementary functions in the medium-precision range

Fredrik Johansson

(LFANT, INRIA Bordeaux)

22nd IEEE Symposium on Computer Arithmetic (ARITH 22), Lyon, France, June 2015

1 / 27

Elementary functions

Functions: exp, log, sin, cos, atan

My goal is to do interval arithmetic with arbitrary precision.

Input: floating-point number x = a · 2^b, precision p ≥ 2
Output: m, r with f(x) ∈ [m − r, m + r] and r ≈ 2^−p |f(x)|

2 / 27

Precision ranges

Hardware precision (n ≈ 53 bits)
◮ Extensively studied - elsewhere!

Medium precision (n ≈ 100-10 000 bits)
◮ Multiplication costs M(n) = O(n^2) or O(n^1.6)
◮ Argument reduction + rectangular splitting: O(n^(1/3) M(n))
◮ In the lower range, software overhead is significant

Very high precision (n ≫ 10 000 bits)
◮ Multiplication costs M(n) = O(n log n log log n)
◮ Asymptotically fast algorithms: binary splitting, arithmetic-geometric mean (AGM) iteration: O(M(n) log(n))

3 / 27

Improvements in this work

1. The low-level mpn layer of GMP is used exclusively
◮ Most libraries (e.g. MPFR) use more convenient types, e.g. mpz and mpfr, to implement elementary functions

2. Argument reduction based on lookup tables
◮ Old idea, not well explored in high precision

3. Faster evaluation of Taylor series
◮ Optimized version of Smith's rectangular splitting algorithm
◮ Takes advantage of mpn level functions

4 / 27

Recipe for elementary functions

exp(x)    sin(x), cos(x)    log(1 + x)    atan(x)
    ↓ Domain reduction using π and log(2)
x ∈ [0, log(2))    x ∈ [0, π/4)    x ∈ [0, 1)    x ∈ [0, 1)
    ↓ Argument-halving r ≈ 8 times:
      exp(x) = [exp(x/2)]^2,   log(1 + x) = 2 log(√(1 + x))
x ∈ [0, 2^−r)
    ↓ Taylor series

5 / 27

Better recipe at medium precision

exp(x)    sin(x), cos(x)    log(1 + x)    atan(x)
    ↓ Domain reduction using π and log(2)
x ∈ [0, log(2))    x ∈ [0, π/4)    x ∈ [0, 1)    x ∈ [0, 1)
    ↓ Lookup table with 2^r ≈ 2^8 entries:
      exp(t + x) = exp(t) exp(x)
      log(1 + t + x) = log(1 + t) + log(1 + x/(1 + t))
x ∈ [0, 2^−r)
    ↓ Taylor series

6 / 27

Argument reduction formulas

What we want to compute: f(x), x ∈ [0, 1)
Table size: q = 2^r
Precomputed value: f(t), t = i/q, i = ⌊2^r x⌋
Remaining value to compute: f(y), y ∈ [0, 2^−r)

exp(x) = exp(t) exp(y),                     y = x − i/q
sin(x) = sin(t) cos(y) + cos(t) sin(y),     y = x − i/q
cos(x) = cos(t) cos(y) − sin(t) sin(y),     y = x − i/q
log(1 + x) = log(1 + t) + log(1 + y),       y = (qx − i)/(i + q)
atan(x) = atan(t) + atan(y),                y = (qx − i)/(ix + q)

7 / 27
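For illustration, here is the log reduction in plain double precision: a minimal sketch with a hypothetical 256-entry table (LOG_TAB and log1p_reduced are made-up names), using the atanh form of the series that appears later in the talk. This is not the paper's fixed-point code.

#include <math.h>

#define R 8                  /* table index bits, q = 2^r */
#define Q (1 << R)

static double LOG_TAB[Q];    /* hypothetical table: LOG_TAB[i] = log(1 + i/Q) */

static void log_tab_init(void)
{
    int i;
    for (i = 0; i < Q; i++)
        LOG_TAB[i] = log(1.0 + (double) i / Q);
}

static double log1p_reduced(double x)   /* log(1+x) for x in [0, 1) */
{
    int i = (int) (x * Q);              /* i = floor(2^r x), t = i/q */
    double y = (Q * x - i) / (i + Q);   /* y = (qx - i)/(i + q), so 0 <= y < 2^-r */

    /* log(1 + y) = 2 atanh(y/(y + 2)); three terms suffice in double precision */
    double z = y / (y + 2);
    double s = 2 * z * (1 + z * z * (1.0 / 3 + z * z * (1.0 / 5)));

    return LOG_TAB[i] + s;              /* log(1+x) = log(1+t) + log(1+y) */
}

The same shape, with table entries stored to full precision and the series evaluated in fixed point, is what the table-based reduction buys at higher precision.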

Optimizing lookup tables

m = 2 tables with 2^5 + 2^5 entries gives the same reduction as m = 1 table with 2^10 entries

Function  Precision  m  r  Entries   Size (KiB)
exp       ≤ 512      1  8  178       11.125
exp       ≤ 4608     2  5  23+32     30.9375
sin       ≤ 512      1  8  203       12.6875
sin       ≤ 4608     2  5  26+32     32.625
cos       ≤ 512      1  8  203       12.6875
cos       ≤ 4608     2  5  26+32     32.625
log       ≤ 512      2  7  128+128   16
log       ≤ 4608     2  5  32+32     36
atan      ≤ 512      1  8  256       16
atan      ≤ 4608     2  5  32+32     36
Total                                236.6875

8 / 27
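To see why m = 2 small tables give the same reduction as one large table, here is a double-precision sketch for exp with two hypothetical 32-entry tables (TAB1, TAB2 and exp_two_level are illustrative names, not Arb code): 32 + 32 stored values reduce the argument below 2^-10, just as a single 1024-entry table would.

#include <math.h>

static double TAB1[32], TAB2[32];    /* hypothetical tables, illustration only */

static void tabs_init(void)
{
    int k;
    for (k = 0; k < 32; k++) {
        TAB1[k] = exp(k / 32.0);     /* first level:  exp(i/2^5)  */
        TAB2[k] = exp(k / 1024.0);   /* second level: exp(j/2^10) */
    }
}

static double exp_two_level(double x)    /* exp(x) for x in [0, 1) */
{
    int i = (int) (x * 32);              /* t1 = i/32 */
    double y = x - i / 32.0;             /* y in [0, 2^-5) */
    int j = (int) (y * 1024);            /* t2 = j/1024 */
    y -= j / 1024.0;                     /* y in [0, 2^-10) */

    /* a few Taylor terms are enough in double precision since y < 2^-10 */
    double s = 1 + y * (1 + y * (0.5 + y * (1.0 / 6 + y * (1.0 / 24))));

    return TAB1[i] * TAB2[j] * s;        /* exp(x) = exp(t1) exp(t2) exp(y) */
}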

Taylor series

Logarithmic series:
  atan(x) = x − (1/3)x^3 + (1/5)x^5 − (1/7)x^7 + ...
  log(1 + x) = 2 atanh(x/(x + 2))
  With x < 2^−10, need 230 terms for 4600-bit precision

Exponential series:
  exp(x) = 1 + x + (1/2!)x^2 + (1/3!)x^3 + (1/4!)x^4 + ...
  sin(x) = x − (1/3!)x^3 + (1/5!)x^5 − ...
  cos(x) = 1 − (1/2!)x^2 + (1/4!)x^4 − ...
  With x < 2^−10, need 280 terms for 4600-bit precision

Above 300 bits: cos(x) = √(1 − sin^2(x))
Above 800 bits: exp(x) = sinh(x) + √(1 + sinh^2(x))

9 / 27
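The quoted term counts follow from requiring the first omitted term to be below 2^−p. A rough sketch for the exponential series (illustrative helper, assuming |x| < 2^−r):

#include <math.h>
#include <stdio.h>

/* Smallest N with x^N/N! < 2^-p when |x| < 2^-r (rough estimate, exp series) */
static int exp_terms_needed(int r, int p)
{
    double log2_term = 0.0;    /* log2 of x^N/N! with |x| = 2^-r */
    int N = 0;
    while (log2_term > -p) {
        N++;
        log2_term += -r - log2((double) N);
    }
    return N;
}

int main(void)
{
    /* prints a value near the ~280 terms quoted above for r = 10, p = 4600 */
    printf("%d\n", exp_terms_needed(10, 4600));
    return 0;
}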

Evaluating Taylor series using rectangular splitting

Paterson and Stockmeyer, 1973: ∑_{i=0}^{n} x^i in O(n) cheap steps + O(n^(1/2)) expensive steps

(1 + x + x^2 + x^3) + (1 + x + x^2 + x^3) x^4 + (1 + x + x^2 + x^3) x^8 + (1 + x + x^2 + x^3) x^12

◮ Smith, 1989: elementary and hypergeometric functions
◮ Brent & Zimmermann, 2010: improvements to Smith
◮ FJ, 2014: generalization to D-finite functions
◮ New: optimized algorithm for elementary functions

10 / 27
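The grouping above is the whole idea. As a plain double-precision sketch (the paper does this with mpn fixed-point arithmetic; poly_rect_split is an illustrative name):

#include <math.h>

/* Evaluate sum_{i=0}^{n-1} c[i] x^i by rectangular splitting:
   about sqrt(n) "expensive" multiplications by x^m, everything else cheap. */
static double poly_rect_split(const double *c, int n, double x)
{
    int m = (int) ceil(sqrt((double) n));    /* row length */
    double xpow[64];                         /* assumes m < 64 for this sketch */
    int i, j, row;

    xpow[0] = 1.0;                           /* precompute x^0 .. x^m once */
    for (i = 1; i <= m; i++)
        xpow[i] = xpow[i - 1] * x;

    double s = 0.0;
    for (row = (n - 1) / m; row >= 0; row--) {
        double t = 0.0;
        for (j = m - 1; j >= 0; j--) {       /* cheap steps: scalar mul-adds */
            i = row * m + j;
            if (i < n)
                t += c[i] * xpow[j];
        }
        s = s * xpow[m] + t;                 /* one expensive multiplication per row */
    }
    return s;
}

In the paper's setting the coefficients are single words, so each cheap step is an mpn_addmul_1, while the expensive steps are full n-word multiplications by the precomputed powers of x.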

Logarithmic series

Rectangular splitting:
  x + (1/2)x^2 + x^3 ( 1/3 + (1/4)x + (1/5)x^2 + x^3 ( 1/6 + (1/7)x + (1/8)x^2 ) )

Improved algorithm with fewer divisions:
  x + (1/60) ( 30x^2 + x^3 ( 20 + 15x + 12x^2 + x^3 ( 10 + (1/56) · 60 (8x + 7x^2) ) ) )

11 / 27

Exponential series

Rectangular splitting:
  1 + x + (1/2) ( x^2 + (1/3) x^3 ( 1 + (1/4) ( x + (1/5) ( x^2 + (1/6) x^3 ( 1 + (1/7) ( x + (1/8) x^2 ) ) ) ) ) )

Improved algorithm with fewer divisions:
  1 + x + (1/24) ( 12x^2 + x^3 ( 4 + x + (1/30) ( 6x^2 + x^3 ( 1 + (1/56) ( 8x + x^2 ) ) ) ) )

12 / 27
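The "fewer divisions" nesting, as transcribed above, can be sanity-checked against the plain Taylor polynomial. A tiny double-precision check, illustration only:

#include <stdio.h>

int main(void)
{
    double x = 0.001953125;   /* 2^-9, a typical reduced argument */

    /* the nested form from this slide */
    double nested = 1 + x + (1.0 / 24) * (12 * x * x + x * x * x *
                    (4 + x + (1.0 / 30) * (6 * x * x + x * x * x *
                    (1 + (1.0 / 56) * (8 * x + x * x)))));

    /* direct truncated series 1 + x + x^2/2! + ... + x^8/8! */
    double direct = 0.0, term = 1.0, fact = 1.0;
    int k;
    for (k = 0; k <= 8; k++) {
        direct += term / fact;
        term *= x;
        fact *= (k + 1);
    }

    printf("%.17g\n%.17g\n", nested, direct);   /* the two values agree */
    return 0;
}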


Coefficients for exp series (on a 64-bit machine)

Numerators 0-20, denominator 20!/0! = 2432902008176640000

2432902008176640000 2432902008176640000 1216451004088320000 405483668029440000 101370917007360000 20274183401472000 3379030566912000 482718652416000 60339831552000 6704425728000 670442572800 60949324800 5079110400 390700800 27907200 1860480 116280 6840 380 20 1

Numerators 21-33, denominator 33!/20! = 3569119343741952000

169958063987712000 7725366544896000 335885501952000 13995229248000 559809169920 21531121920 797448960 28480320 982080 32736 1056 33 1

Numerators 288-294, denominator 294!/287! = 176676635229534720

613460538991440 2122700826960 7319658024 25153464 86142 294 1

13 / 27
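A small sketch (hypothetical helper, not code from the paper) of how one such block of single-word numerators and its denominator can be generated: for the block k = a..b, the numerator of x^k is b!/k!, and the final running product is the block's denominator (b!/(a−1)! for a > 0, or b! when a = 0, matching 20!/0!, 33!/20!, 294!/287! above).

#include <stdio.h>
#include <stdint.h>

static void exp_block(int a, int b)      /* assumes the block fits in 64 bits */
{
    uint64_t num[64], N = 1;             /* N ends up as the denominator */
    int k;

    for (k = b; k >= a; k--) {
        num[k - a] = N;                  /* at this point N = b!/k! */
        if (k > 0)
            N *= (uint64_t) k;
    }

    printf("block %d..%d, denominator %llu, numerators:\n", a, b,
           (unsigned long long) N);
    for (k = a; k <= b; k++)
        printf("%llu ", (unsigned long long) num[k - a]);
    printf("\n");
}

int main(void)
{
    exp_block(0, 20);    /* reproduces the "Numerators 0-20" row above */
    exp_block(21, 33);   /* reproduces the "Numerators 21-33" row */
    return 0;
}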

Taylor series evaluation using mpn arithmetic

We use n-word fixed-point numbers (ulp = 2^(−64n))
Negative numbers implicitly or using two's complement!

Example:
// sum = sum + term * coeff
sum[n] += mpn_addmul_1(sum, term, n, coeff)

◮ term is n words: real number in [0, 1)
◮ sum is n + 1 words: real number in [0, 2^64)
◮ coeff is 1 word: integer in [0, 2^64)

14 / 27

Taylor series summation

c0 + c1 x + c2 x^2 + c3 x^3 + x^4 (c4 + c5 x + c6 x^2 + c7 x^3)

/* inner block: c4 + c5 x + c6 x^2 + c7 x^3 */
sum[n] += mpn_addmul_1(sum, xpowers[3], n, c[7])
sum[n] += mpn_addmul_1(sum, xpowers[2], n, c[6])
sum[n] += mpn_addmul_1(sum, xpowers[1], n, c[5])
sum[n] += c[4]

/* multiply the running sum by x^4, keeping the top n+1 words */
mpn_mul(tmp, sum, n+1, xpowers[4], n)
mpn_copyi(sum, tmp+n, n+1)

/* outer block: add c0 + c1 x + c2 x^2 + c3 x^3 */
sum[n] += mpn_addmul_1(sum, xpowers[3], n, c[3])
sum[n] += mpn_addmul_1(sum, xpowers[2], n, c[2])
sum[n] += mpn_addmul_1(sum, xpowers[1], n, c[1])
sum[n] += c[0]

15 / 27
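A self-contained version of this scheme (assuming 64-bit limbs and GMP ≥ 5 for mpn_copyi; the coefficients and the value of x are made-up test data, not from the paper). It can be compiled against GMP and checked against a double-precision Horner evaluation:

#include <stdio.h>
#include <gmp.h>

#define N 4   /* number of limbs: 256-bit fixed point, ulp = 2^(-64*N) */

int main(void)
{
    mp_limb_t x[N], xpowers[5][N], sum[N + 1], tmp[2 * N + 1];
    mp_limb_t c[8] = {1, 2, 3, 4, 5, 6, 7, 8};   /* made-up coefficients */
    double xd = 0.00390625;                      /* x = 2^-8 */
    int i, k;

    /* x = 2^-8 as an N-limb fixed-point number (only the top limb is nonzero) */
    for (i = 0; i < N; i++) x[i] = 0;
    x[N - 1] = (mp_limb_t) 1 << 56;

    /* xpowers[k] = x^k for k = 1..4, keeping the top N limbs of each product */
    for (i = 0; i < N; i++) xpowers[1][i] = x[i];
    for (k = 2; k <= 4; k++) {
        mpn_mul_n(tmp, xpowers[k - 1], x, N);
        for (i = 0; i < N; i++) xpowers[k][i] = tmp[N + i];
    }

    for (i = 0; i <= N; i++) sum[i] = 0;

    /* inner block c4 + c5 x + c6 x^2 + c7 x^3 */
    sum[N] += mpn_addmul_1(sum, xpowers[3], N, c[7]);
    sum[N] += mpn_addmul_1(sum, xpowers[2], N, c[6]);
    sum[N] += mpn_addmul_1(sum, xpowers[1], N, c[5]);
    sum[N] += c[4];

    /* multiply the running sum by x^4, keeping the top N+1 limbs */
    mpn_mul(tmp, sum, N + 1, xpowers[4], N);
    mpn_copyi(sum, tmp + N, N + 1);

    /* outer block c0 + c1 x + c2 x^2 + c3 x^3 */
    sum[N] += mpn_addmul_1(sum, xpowers[3], N, c[3]);
    sum[N] += mpn_addmul_1(sum, xpowers[2], N, c[2]);
    sum[N] += mpn_addmul_1(sum, xpowers[1], N, c[1]);
    sum[N] += c[0];

    /* compare integer part + top fractional limb against a double Horner sum */
    {
        double approx = (double) sum[N] + (double) sum[N - 1] / 18446744073709551616.0;
        double exact = 0.0;
        for (i = 7; i >= 0; i--) exact = exact * xd + (double) c[i];
        printf("fixed point: %.15f\nreference:   %.15f\n", approx, exact);
    }
    return 0;
}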

Alternating signs

c0 − c1 x + c2 x^2 − c3 x^3 + x^4 (c4 − c5 x + c6 x^2 − c7 x^3)

/* inner block: subtract the odd terms with mpn_submul_1 */
sum[n] -= mpn_submul_1(sum, xpowers[3], n, c[7])
sum[n] += mpn_addmul_1(sum, xpowers[2], n, c[6])
sum[n] -= mpn_submul_1(sum, xpowers[1], n, c[5])
sum[n] += c[4]

/* multiply the running sum by x^4 */
mpn_mul(tmp, sum, n+1, xpowers[4], n)
mpn_copyi(sum, tmp+n, n+1)

/* outer block */
sum[n] -= mpn_submul_1(sum, xpowers[3], n, c[3])
sum[n] += mpn_addmul_1(sum, xpowers[2], n, c[2])
sum[n] -= mpn_submul_1(sum, xpowers[1], n, c[1])
sum[n] += c[0]

16 / 27

Including divisions (exponential series)

(1/q0) ( c0 + c1 x + c2 x^2 + c3 x^3 + (1/q4) ( c4 x^4 + c5 x^5 ) )

/* top block: c4 x^4 + c5 x^5 */
sum[n] += mpn_addmul_1(sum, xpowers[5], n, c[5])
sum[n] += mpn_addmul_1(sum, xpowers[4], n, c[4])

/* divide the block by q4 */
mpn_divrem_1(sum, 0, sum, n+1, q[4])

/* add the low block c0 + c1 x + c2 x^2 + c3 x^3 */
sum[n] += mpn_addmul_1(sum, xpowers[3], n, c[3])
sum[n] += mpn_addmul_1(sum, xpowers[2], n, c[2])
sum[n] += mpn_addmul_1(sum, xpowers[1], n, c[1])
sum[n] += c[0]

/* final division by q0 */
mpn_divrem_1(sum, 0, sum, n+1, q[0])

17 / 27

Including divisions (logarithmic series)

(1/q0) ( c0 + c1 x + c2 x^2 + c3 x^3 ) + (1/q4) ( c4 x^4 + c5 x^5 )

Rewritten so the divisions can be chained:

(1/q0) ( c0 + c1 x + c2 x^2 + c3 x^3 + (q0/q4) ( c4 x^4 + c5 x^5 ) )

/* top block: c4 x^4 + c5 x^5 */
sum[n] += mpn_addmul_1(sum, xpowers[5], n, c[5])
sum[n] += mpn_addmul_1(sum, xpowers[4], n, c[4])

/* multiply by q0, then divide by q4 */
sum[n+1] = mpn_mul_1(sum, sum, n+1, q[0])
mpn_divrem_1(sum, 0, sum, n+2, q[4])

/* add the low block */
sum[n] += mpn_addmul_1(sum, xpowers[3], n, c[3])
sum[n] += mpn_addmul_1(sum, xpowers[2], n, c[2])
sum[n] += mpn_addmul_1(sum, xpowers[1], n, c[1])
sum[n] += c[0]

/* final division by q0 */
mpn_divrem_1(sum, 0, sum, n+1, q[0])

18 / 27

Error bounds and correctness

The algorithm evaluates each truncated Taylor series with ≤ 2 fixed-point ulp error

How does that work?
◮ mpn_addmul_1 type operations are exact
◮ Clearing denominators gives up to 64 guard bits
◮ Evaluation is done backwards: multiplications and divisions remove error

Must also show: no overflow, correct handling of signs
Proof depends on the precise sequence of numerators and denominators used.
Proof by exhaustive side computation!

19 / 27

Benchmarks

The code is part of the Arb library: http://fredrikj.net/arb
Open source (GPL version 2 or later)

20 / 27
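For context, a minimal usage sketch of the library's ball arithmetic (assuming the current Arb API: arb_exp, arb_set_str, arb_printn; exact names in the 2015 release may differ), linked against Arb and FLINT:

#include "arb.h"

int main(void)
{
    arb_t x, y;
    arb_init(x);
    arb_init(y);

    arb_set_str(x, "1.25", 256);   /* x as a 256-bit ball */
    arb_exp(y, x, 256);            /* y contains exp(x) with a rigorous radius */

    arb_printn(y, 50, 0);          /* prints midpoint and error radius */
    flint_printf("\n");

    arb_clear(x);
    arb_clear(y);
    flint_cleanup();
    return 0;
}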

Timings (microseconds / function evaluation)

Bits    exp     sin     cos     log     atan
32      0.26    0.35    0.35    0.21    0.20
53      0.27    0.39    0.38    0.26    0.30
64      0.33    0.47    0.47    0.30    0.34
128     0.48    0.59    0.59    0.42    0.47
256     0.83    1.05    1.08    0.66    0.73
512     2.06    2.88    2.76    1.69    2.20
1024    6.79    7.92    7.84    5.84    6.97
2048    22.70   25.50   25.60   22.80   25.90
4096    82.90   97.00   98.00   99.00   104.00

Measurements done on an Intel i7-2600S CPU.

21 / 27

Speedup vs MPFR

Bits    exp     sin     cos     log     atan
32      7.9     8.2     3.6     11.8    29.7
53      9.1     8.2     3.9     10.9    25.9
64      7.6     6.9     3.2     9.3     23.7
128     6.9     6.9     3.6     10.4    30.6
256     5.6     5.4     2.9     10.7    31.3
512     3.7     3.2     2.1     6.9     14.5
1024    2.7     2.2     1.8     3.6     8.8
2048    1.9     1.6     1.4     2.0     4.9
4096    1.7     1.5     1.3     1.3     3.1

Measurements done on an Intel i7-2600S CPU.

22 / 27

Comparison to MPFR

[Figure: two log-log plots over precisions 10^2 to 10^5 bits for exp(x), log(x), sin(x), cos(x), atan(x). Left panel: speedup of Arb vs MPFR (roughly 1x to 30x). Right panel: time per evaluation in seconds, with MPFR shown dashed.]

Measurements done on an Intel i7-2600S CPU.

23 / 27

Double (53 bits) precision, microseconds/eval, Intel i7-2600S:

              exp     sin     cos     log     atan
EGLIBC        0.045   0.056   0.058   0.061   0.072
MPFR          2.45    3.19    1.48    2.83    7.77
Arb           0.27    0.39    0.38    0.26    0.30

Quad (113 bits) precision, microseconds/eval, Intel T4400:

              exp     sin     cos     log     atan
MPFR          5.76    7.29    3.42    8.01    21.30
libquadmath   4.51    4.71    4.57    5.39    4.32
QD            0.73    0.69    0.69    0.82    1.08
Arb           0.65    0.81    0.79    0.61    0.68

Quad-double (212 bits) precision, microseconds/eval, Intel T4400:

              exp     sin     cos     log     atan
MPFR          7.87    9.23    5.06    12.60   33.00
QD            6.09    5.77    5.76    20.10   24.90
Arb           1.29    1.49    1.49    1.26    1.23

24 / 27

Summary

◮ Elementary functions with error bounds
◮ Variable precision up to 4600 bits
◮ mpn arithmetic + 256 KB of lookup tables + efficient algorithm to evaluate Taylor series (rectangular splitting, optimized denominator sequence)
◮ Similar algorithm for all functions (no Newton iteration, etc.)
◮ Improvement over MPFR: up to 3-4x for cos, 8-10x for sin/exp/log, 30x for atan
◮ Gap to double precision LIBM (EGLIBC): 4-7x

25 / 27

Future work

◮ Optimizations
  ◮ Gradually change precision in Taylor series summation
  ◮ Use short multiplications (no GMP support)
  ◮ Use precomputed inverses (no GMP support)
  ◮ Assembly for low precision (1-3 words?)
  ◮ Further tuning of lookup tables
◮ What if floating-point expansions or carry-save bignums are used instead of GMP/64-bit full words? Vectorization?
◮ Formally verified implementation?
◮ Port to other libraries (e.g. MPFR)

26 / 27


Thank you!

27 / 27