SLIDE 1

Reliable multiprecision arithmetic for number theory

Fredrik Johansson

LFANT seminar, IMB / INRIA

2014-09-16

SLIDE 2

Where I come from

Malung (population ≈ 5000) in Dalarna, Sweden

SLIDE 3

Things I’ve done previously

Software

◮ Since 2007: mpmath, a Python library for arbitrary-precision floating-point arithmetic
◮ Since 2010: FLINT, a C library for number theory (coauthor)
◮ Since 2012: Arb, a C library for arbitrary-precision ball arithmetic

Education

◮ 2004-2010: MSc in engineering physics, Chalmers University of Technology, Gothenburg, Sweden
◮ 2010-2014: PhD in symbolic computation, RISC, Linz, Austria

SLIDE 4

My current interest

Fast arithmetic mainly in R, C, R[x], C[x], R[[x]], C[[x]], with rigorous error bounds.

◮ Fast in precision: ideally O(p^(1+ε))
◮ Fast in polynomial degree: ideally O(n^(1+ε))
◮ Fast in both: ideally O((np)^(1+ε))
◮ Fast in practice: low implementation overhead, different algorithms depending on size

SLIDE 5

Representing real numbers

◮ Floating-point: 3.141592653589793 (MPFR, MPC, many others ...)
◮ Interval (inf-sup): [3.141592653589793, 3.141592653589794] (MPFI, ...)
◮ Ball (mid-rad): 3.141592653589793 ± 10^(-15) (iRRAM, Mathemagix, Arb)

SLIDE 6

A benchmark: power series arithmetic

Computing the Taylor expansion of exp(exp(1 + x)) to order x^n at a precision of n decimal digits. Time in seconds:

    n       Pari/GP     Arb        Faster
    10      0.000014    0.0000113  1.2×
    30      0.000066    0.0000583  1.1×
    100     0.00105     0.000859   1.2×
    300     0.0273      0.0116     2.3×
    1000    1.53        0.180      8.5×
    3000    69          2.313      29×
    10000               29.81
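At machine precision, the benchmarked operation reduces to the classical power series recurrence for the exponential: if g = exp(f), then g′ = f′g, so n·g_n = Σ_{k=1}^n k f_k g_{n−k}. A self-contained sketch with plain Python floats (illustrative only; Pari/GP and Arb do the same with bignum coefficients):

```python
# Machine-precision sketch of the benchmark: Taylor coefficients of
# exp(exp(1 + x)) via the recurrence g' = f' g for g = exp(f).
import math

def series_exp(f):
    """Coefficients of exp(f) for a truncated power series f."""
    n = len(f)
    g = [math.exp(f[0])] + [0.0] * (n - 1)
    for m in range(1, n):
        g[m] = sum(k * f[k] * g[m - k] for k in range(1, m + 1)) / m
    return g

n = 8
f = [1.0, 1.0] + [0.0] * (n - 2)          # the series 1 + x
g = series_exp(series_exp(f))             # exp(exp(1 + x))
# d/dx exp(exp(1+x)) at x = 0 equals e * e^e, so g[1] should match:
print(abs(g[1] - math.e * math.exp(math.e)) < 1e-9)  # True
```

Each coefficient costs O(n) multiplications here; the fast implementations replace this O(n^2) loop with FFT-based series arithmetic.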

SLIDE 7

Open problem: semantics for ball arithmetic

What should the output of sin(10 ± 3) be?

◮ −0.544021110889370 ± 3.001 (“best midpoint, generic radius”)
◮ −0.544021110889370 ± 1.545 (“best midpoint, best radius”)
◮ −0.544 ± 3.002 (“trimmed best midpoint, generic radius”)
◮ −0.544 ± 1.545 (“trimmed best midpoint, best radius”)
◮ 0 ± 1 (“best ball”)

These results are quite different, and one can think of applications where any one of them is superior to the others.
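For intuition, here is a toy mid-rad model of the first option ("best midpoint, generic radius"): since |sin′| ≤ 1, sin(m ± r) ⊆ sin(m) ± r, up to the rounding error of the midpoint operation. This is only a sketch of the semantics, not Arb's implementation:

```python
# Toy mid-rad ball arithmetic illustrating "best midpoint, generic radius":
# sin is 1-Lipschitz, so sin(m ± r) is contained in sin(m) ± (r + rounding).
import math

def ball_sin(mid, rad):
    m = math.sin(mid)
    # Lipschitz term plus a crude one-ulp bound for the rounding of math.sin
    err = rad + abs(m) * 2 ** -52
    return (m, err)

m, r = ball_sin(10.0, 3.0)
print(m, r)  # midpoint ≈ -0.544, radius ≥ 3
```

The "best radius" and "best ball" options would instead require bounding the range of sin over the whole input ball, which is more work but gives tighter output.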

SLIDE 8

Open problem: transcendental functions

Typical function: f(z) = Σ_{k=0}^∞ c_k(z) = Σ_{k=0}^N c_k(z) + ε(N, z)

◮ Bounding the error in evaluating Σ_{k=0}^N c_k(z)
  ◮ Trivial in principle with ball arithmetic
  ◮ If z is inexact, may need to be more clever for good output
◮ Bounding the remainder ε(N, z)
  ◮ Bounds in the literature are often not general enough, lack explicit constants, are computationally ineffective, or are simply not available ...
◮ Efficiency
  ◮ Choosing the right algorithm on each subdomain
  ◮ Asymptotic complexity, practical efficiency

SLIDE 9

The Hurwitz zeta function

ζ(s, a) = Σ_{k=0}^∞ 1/(k + a)^s,  s, a ∈ C

Special cases: Riemann zeta (a = 1), Dirichlet L-functions, polylogarithms, ...

Goal: compute ζ(s, a) and derivatives with respect to s, to arbitrary precision, with rigorous error bounds

SLIDE 10

The Euler-Maclaurin formula

Σ_{k=N}^U f(k) = I + T + R

I = ∫_N^U f(t) dt

T = (1/2)(f(N) + f(U)) + Σ_{k=1}^M [B_{2k}/(2k)!] (f^(2k−1)(U) − f^(2k−1)(N))

R = −∫_N^U [B̃_{2M}(t)/(2M)!] f^(2M)(t) dt

SLIDE 11

Computing ζ(s, a) using Euler-Maclaurin

ζ(s, a) = Σ_{k=0}^{N−1} f(k) + Σ_{k=N}^∞ f(k),  f(k) = 1/(a + k)^s

where the first sum is S and the second is I + T + R.

For derivatives, substitute s → s + x ∈ C[[x]]:

f(k) = 1/(a + k)^(s+x) = Σ_{i=0}^∞ [(−1)^i log^i(a + k) / (i! (a + k)^s)] x^i ∈ C[[x]]

SLIDE 12

Parts to evaluate

S = Σ_{k=0}^{N−1} 1/(a + k)^(s+x)

I = ∫_N^∞ dt/(a + t)^(s+x) = (a + N)^(1−(s+x)) / ((s + x) − 1)

T = [1/(a + N)^(s+x)] (1/2 + Σ_{k=1}^M [B_{2k}/(2k)!] (s + x)_{2k−1}/(a + N)^(2k−1))

R = −∫_N^∞ [B̃_{2M}(t)/(2M)!] (s + x)_{2M}/(a + t)^((s+x)+2M) dt  (bound)

Here (s + x)_m denotes the rising factorial (s + x)(s + x + 1) ··· (s + x + m − 1).

SLIDE 13

Bounding the remainder

|R| = |∫_N^∞ [B̃_{2M}(t)/(2M)!] (s + x)_{2M}/(a + t)^(s+x+2M) dt|

    ≤ ∫_N^∞ |[B̃_{2M}(t)/(2M)!] (s + x)_{2M}/(a + t)^(s+x+2M)| dt

    ≤ [4 |(s + x)_{2M}| / (2π)^(2M)] ∫_N^∞ |dt/(a + t)^(s+x+2M)| ∈ R[[x]]

where

∫_N^∞ |dt/(a + t)^(s+x+2M)| = Σ_{k=0}^∞ [(1/k!) ∫_N^∞ log^k(a + t)/(a + t)^(σ+2M) dt] x^k
SLIDE 14

A sequence of integrals

For k ∈ N, A > 0, B > 1, C ≥ 0,

J_k(A, B, C) ≡ ∫_A^∞ t^(−B) (C + log t)^k dt = L_k / ((B − 1)^(k+1) A^(B−1))

where L_0 = 1, L_k = k L_{k−1} + D^k, and D = (B − 1)(C + log A).
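The closed form and recurrence are easy to sanity-check against direct numerical quadrature (illustrative code; the substitution t = A·e^u makes the truncated integral well behaved):

```python
# Check J_k(A,B,C) = L_k / ((B-1)^(k+1) A^(B-1)), L_0 = 1, L_k = k L_{k-1} + D^k,
# against trapezoidal integration of t^(-B) (C + log t)^k over [A, oo).
import math

def J_recurrence(k, A, B, C):
    D = (B - 1) * (C + math.log(A))
    L = 1.0
    for j in range(1, k + 1):
        L = j * L + D ** j
    return L / ((B - 1) ** (k + 1) * A ** (B - 1))

def J_quadrature(k, A, B, C, steps=200000, cutoff=40.0):
    # substitute t = A e^u: the integral becomes
    # A^(1-B) * int_0^oo e^(-(B-1)u) (C + log A + u)^k du
    h = cutoff / steps
    total = 0.0
    for i in range(steps + 1):
        u = i * h
        w = 0.5 if i in (0, steps) else 1.0   # trapezoid weights
        total += w * math.exp(-(B - 1) * u) * (C + math.log(A) + u) ** k
    return A ** (1 - B) * h * total

print(abs(J_recurrence(2, 2.0, 3.0, 0.5) - J_quadrature(2, 2.0, 3.0, 0.5)))
```

The recurrence costs O(k) exact operations, which is what makes the error bound on the next slide cheap to evaluate for all coefficients at once.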

SLIDE 15

Error bound

Theorem: for complex numbers s = σ + τi, a = α + βi and positive integers N, M such that α + N > 1 and σ + 2M > 1,

|R| ≤ [4 |(s + x)_{2M}| / (2π)^(2M)] Σ_{k=0}^∞ R_k x^k ∈ R[[x]]

where R_k ≤ (K/k!) J_k(N + α, σ + 2M, C), and K and C are certain numbers given explicitly in terms of s, a, N, M.

SLIDE 16

Evaluation steps

To evaluate ζ(s, a) with an error of 2−p:

1. Choose N, M = O(p), bound the error term R
2. Compute the power sum S
3. Compute the integral I
4. Compute the Bernoulli numbers
5. Compute the tail T

Observation: the first n derivatives of ζ(s, a) can be simultaneously computed to n digits of precision in O(n2+ε) time (quasi-optimal).
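Minus the rigorous error bound, the whole pipeline fits in a few lines at machine precision. A sketch of steps 2-5 with plain floats and exact Bernoulli numbers (Arb does the same with balls and the bound from the previous slides; parameters N, M here are just fixed rather than chosen from p):

```python
# Sketch of the Euler-Maclaurin evaluation of zeta(s, a), real s > 1,
# machine precision only, remainder R ignored.
from fractions import Fraction
import math

def bernoulli_numbers(m):
    """B_0..B_m via sum_{j=0}^{n} C(n+1, j) B_j = 0 (B_1 = -1/2 convention)."""
    B = [Fraction(0)] * (m + 1)
    B[0] = Fraction(1)
    for n in range(1, m + 1):
        B[n] = -sum(math.comb(n + 1, j) * B[j] for j in range(n)) / (n + 1)
    return B

def hurwitz_zeta(s, a=1.0, N=20, M=10):
    """zeta(s, a) = S + I + T by Euler-Maclaurin."""
    S = sum((a + k) ** (-s) for k in range(N))     # power sum
    I = (a + N) ** (1 - s) / (s - 1)               # integral term
    B = bernoulli_numbers(2 * M)
    T = 0.5
    rising = s                                     # rising factorial (s)_{2k-1}
    for k in range(1, M + 1):
        T += float(B[2 * k]) / math.factorial(2 * k) * rising / (a + N) ** (2 * k - 1)
        rising *= (s + 2 * k - 1) * (s + 2 * k)
    return S + I + T / (a + N) ** s

print(hurwitz_zeta(2.0))  # close to pi^2/6
```

The derivative version replaces the float `s` by a truncated power series s + x, which is exactly where the fast series arithmetic of the next slides comes in.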

SLIDE 17

Fast power series power sum

V = [ 1   log(a)        ···   log^n(a)
      1   log(a + 1)    ···   log^n(a + 1)
      ...
      1   log(a + n)    ···   log^n(a + n) ],   Y = [a^(−s), (a + 1)^(−s), ..., (a + n)^(−s)]^T

VY is multipoint evaluation. We want V^T Y. An algorithm with O(n^(2+ε)) time complexity exists by the transposition principle (not yet implemented). My implementation: O(n^(3+ε)), but supports parallelization (≈ 16× speedup on 16 cores).

SLIDE 18

Fast power series tail

T = Σ_{k=1}^M B_{2k} t(k) ∈ C[[x]]

t(k + 1)/t(k) is a rational function of k and x:

t(k + 1) = r(k) t(k),  s(k + 1) = s(k) + b(k) t(k)

[t(M)]   [r(M−1)  0]       [r(0)  0] [t(0)]
[s(M)] = [b(M−1)  1]  ···  [b(0)  1] [s(0)]

The matrix product is computed using binary splitting + fast polynomial multiplication.
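Binary splitting itself is easiest to see on a scalar toy case: summing Σ_{k<N} 1/k! for e with exact integers, merging half-ranges the same way the 2×2 matrix products above are merged (illustrative sketch, not Arb's code):

```python
# Toy binary splitting: e ≈ sum_{k=0}^{N-1} 1/k! as an exact fraction,
# where the range [a, b) returns (p, q) with sum_{k=a}^{b-1} 1/k! = p/(a! * q).
import math

def bsplit(a, b):
    if b - a == 1:
        return 1, 1                    # 1/a! = 1/(a! * 1)
    m = (a + b) // 2
    p1, q1 = bsplit(a, m)              # left half, denominator a! * q1
    p2, q2 = bsplit(m, b)              # right half, denominator m! * q2
    # combine over the common denominator a! * q1 * m * q2   (m! = a! * q1 * m)
    return p1 * m * q2 + p2, q1 * m * q2

p, q = bsplit(0, 30)
print(abs(p / q - math.e) < 1e-15)  # True
```

The point is that the expensive multiplications happen on balanced operands near the root of the recursion tree, where fast (FFT-based) multiplication pays off.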

SLIDE 19

Some computational results

SLIDE 20

The Keiper-Li coefficients

Define {λ_n}_{n=1}^∞ by

log ξ(1/(1 − x)) = log ξ(x/(x − 1)) = −log 2 + Σ_{n=1}^∞ λ_n x^n

where ξ(s) = (1/2) s(s − 1) π^(−s/2) Γ(s/2) ζ(s).

Keiper (1992): Riemann hypothesis ⇒ ∀n: λ_n > 0
Li (1997): Riemann hypothesis ⇐ ∀n: λ_n > 0

Keiper conjectured 2λ_n ≈ (log n − log(2π) + γ − 1).

SLIDE 21

Evaluating the Keiper-Li coefficients

Ingredients:

1. The series expansion of ζ(s) at s = 0
2. A series logarithm: log f(x) = ∫ f′(x)/f(x) dx
3. The series expansion of log Γ(s) at s = 1, essentially γ, ζ(2), ζ(3), ζ(4), ...
4. Right-composing by x/(x − 1)

The need for high precision: a working precision of ≈ n bits is needed to get an accurate value of λ_n.

Complexity: O(n^(3+ε)) (implemented), O(n^(2+ε)) (in theory)

SLIDE 22

Fast composition

The binomial transform of f = Σ_{k=0}^∞ a_k x^k is

T[f] = [1/(1 − x)] f(x/(x − 1)) = Σ_{n=0}^∞ (Σ_{k=0}^n (−1)^k C(n, k) a_k) x^n

and the Borel transform is

B[f] = Σ_{k=0}^∞ (a_k/k!) x^k.

T[f(x)] = B^(−1)[e^x B[f(−x)]], so we get f(x/(x − 1)) by a single power series multiplication!
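A rational-arithmetic sketch of this identity (illustrative, not Arb code): build B[f(−x)], multiply by the series of e^x, undo the Borel transform, and compare with the defining double sum:

```python
# Binomial transform T[f] = 1/(1-x) f(x/(x-1)) via the Borel transform trick,
# checked against the defining double sum, with exact rationals.
from fractions import Fraction
from math import comb, factorial

def binomial_transform(a):
    """Coefficients of T[f] for f = sum a_k x^k, via B^{-1}[e^x B[f(-x)]]."""
    n = len(a)
    b = [Fraction((-1) ** k * a[k], factorial(k)) for k in range(n)]  # B[f(-x)]
    # multiply by e^x = sum x^j / j!   (the single series multiplication)
    c = [sum(b[k] / factorial(j - k) for k in range(j + 1)) for j in range(n)]
    return [c[j] * factorial(j) for j in range(n)]  # inverse Borel transform

def binomial_transform_direct(a):
    """The same coefficients by the defining double sum, for comparison."""
    return [sum((-1) ** k * comb(j, k) * a[k] for k in range(j + 1))
            for j in range(len(a))]

print(binomial_transform([1, 2, 3, 4, 5]) ==
      binomial_transform_direct([1, 2, 3, 4, 5]))  # True
```

The direct sum costs O(n^2); the Borel route is one series multiplication, hence O(n^(1+ε)) with fast multiplication.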

SLIDE 23

Values of Keiper-Li coefficients

I have computed all λ_n up to n = 100000 using 110000 bits of precision. In particular,

λ_100000 = 4.62580782406902231409416038 ...

plus about 2900 more accurate digits. Keiper’s approximation suggests λ_100000 ≈ 4.626132.

SLIDE 24

Comparison with approximation formula

Plot of n (λ_n − (log n − log(2π) + γ − 1)/2).

SLIDE 25

Computation time (seconds) for Keiper-Li coefficients

                              n = 1000   n = 10000   n = 100000
Error bound                   0.017      1.0         97
Power sum, 16 threads         0.048      47          65402
(Power sum, CPU time)         (0.65)     (693)       (1042210)
Bernoulli numbers             0.0020     0.19        59
Tail (binary splitting)       0.058      11          1972
Logarithm of power series     0.047      8.5         1126
log Γ(1 + x) power series     0.019      3.0         1610
Power series composition      0.022      4.1         593
Total wall time               0.23       84          71051
Memory                        8 MB       730 MB      48700 MB

SLIDE 26

Stieltjes constants

The Stieltjes constants are the coefficients in the Laurent series

ζ(s, a) = 1/(s − 1) + Σ_{n=0}^∞ [(−1)^n/n!] γ_n(a) (s − 1)^n.

Special case: γ_n(1) ≡ γ_n

γ_0 ≈ +0.577216      γ_10 ≈ +0.000205
γ_1 ≈ −0.072816      γ_100 ≈ −4.25340 × 10^17
γ_2 ≈ −0.009690      γ_1000 ≈ −1.57095 × 10^486
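The n = 0 case is the Euler-Mascheroni constant, which already shows why naive limits are hopeless at high precision: the elementary approximation below gains digits only polynomially in the number of terms (toy sketch; the Euler-Maclaurin machinery above is what actually scales):

```python
# gamma_0 = Euler-Mascheroni constant as the limit of H_n - log n.
import math

def gamma0_approx(n):
    """H_n - log n - 1/(2n); the error of this corrected form is O(1/n^2)."""
    H = sum(1.0 / k for k in range(1, n + 1))
    return H - math.log(n) - 1.0 / (2 * n)

print(gamma0_approx(10 ** 5))  # close to 0.577216
```

Getting p digits this way needs roughly 10^(p/2) terms even with the 1/(2n) correction, whereas Euler-Maclaurin with O(p) terms reaches precision p.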

SLIDE 27

Asymptotics of Stieltjes constants

Open problem: precise asymptotic bounds/series for γ_n

Matsuoka (1985): |γ_n| < 0.0001 e^(n log log n), n ≥ 10

Knessl and Coffey (2011): asymptotic approximation formula (without explicit bound)

◮ Predicts sign oscillations
◮ Appears accurate even for small n
◮ Correct sign except for n = 137?

SLIDE 28

Knessl-Coffey approximation

γ_n ∼ (B/√n) e^(nA) cos(an + b)

A = (1/2) log(u^2 + v^2) − u/(u^2 + v^2),   B = 2√(2π) √(u^2 + v^2) / [(u + 1)^2 + v^2]^(1/4)

a = tan^(−1)(v/u) + v/(u^2 + v^2),   b = tan^(−1)(v/u) − v/(2(u + 1))

where u = v tan v, and v is the unique solution of 2π exp(v tan v) = (n/v) cos(v), 0 < v < π/2.

Similar formula for γ_n(a), a ≠ 1.

SLIDE 29

Numerical values

Computed value of γ_100000 (>10860 digits):
1.99192730631254109565 ... × 10^83432
Knessl-Coffey approximation: 1.9919333 × 10^83432
Matsuoka bound: 3.71 × 10^106114

Computed value of γ_50000(1 + i):
(1.032502087431 ... − 1.441962552840 ... i) × 10^39732
Knessl-Coffey approximation: (1.0324943 − 1.4419586i) × 10^39732

SLIDE 30

Relative error of Knessl-Coffey formula

SLIDE 31

Nontrivial zeta zeros

I have computed the first nontrivial zero 0.5 + 14.13472514173 ... i of ζ(s) to over 300,000 digits (current world record).

Yuri Matiyasevich and Gleb Beliakov have computed the first 40,000 zeros of ζ(s) to 40,000 digits using Arb. They are currently computing data for Dirichlet L-functions.

SLIDE 32

The partition function

SLIDE 33

Hungarian pengő (1 P, 1926)

SLIDE 34

10^2 pengő (April 1945): 100 pengő

SLIDE 35

10^3 pengő (July 1945): 1000 pengő

SLIDE 36

10^4 pengő (July 1945): 10,000 pengő

SLIDE 37

10^5 pengő (October 1945): 100,000 pengő

SLIDE 38

10^6 pengő (November 1945): 1,000,000 pengő

SLIDE 39

10^7 pengő (November 1945): 10,000,000 pengő

SLIDE 40

10^8 pengő (March 1946): 100,000,000 pengő

SLIDE 41

10^9 pengő (March 1946): 1,000,000,000 pengő

SLIDE 42

10^10 pengő (April 1946): 10,000 milpengő = 10,000,000,000 pengő

SLIDE 43

10^11 pengő (April 1946): 100,000 milpengő = 100,000,000,000 pengő

SLIDE 44

10^12 pengő (May 1946): 1,000,000 milpengő = 1,000,000,000,000 pengő

SLIDE 45

10^13 pengő (May 1946): 10,000,000 milpengő = 10,000,000,000,000 pengő

SLIDE 46

10^14 pengő (June 1946): 100,000,000 milpengő = 100,000,000,000,000 pengő

SLIDE 47

10^15 pengő (June 1946): 1,000,000,000 milpengő = 1,000,000,000,000,000 pengő

SLIDE 48

10^16 pengő (June 1946): 10,000 b.pengő = 10,000,000,000,000,000 pengő

SLIDE 49

10^17 pengő (June 1946): 100,000 b.pengő = 100,000,000,000,000,000 pengő

SLIDE 50

10^18 pengő (June 1946): 1 million b.pengő = 1,000,000,000,000,000,000 (1 quintillion) pengő

SLIDE 51

10^19 pengő (June 1946): 10 million b.pengő = 10,000,000,000,000,000,000 (10 quintillion) pengő

SLIDE 52

10^20 pengő (June 1946): 100 million b.pengő = 100,000,000,000,000,000,000 (100 quintillion) pengő

SLIDE 53

August 1946

SLIDE 54

Making change (partitions)

10^20 = 1 + 3 + 3 + 99999999999999999993

Number of ways to make change: p(10^20)

Σ_{n=0}^∞ p(n) x^n = Π_{k=1}^∞ 1/(1 − x^k)

SLIDE 55

Growth of p(n)

p(n) ∼ [1/(4n√3)] e^(π√(2n/3)),  so p(n) has ∼ n^(1/2) digits

p(10) = 42
p(100) = 190569292
p(1000) ≈ 2.4 × 10^31
p(10000) ≈ 3.6 × 10^106
p(100000) ≈ 2.7 × 10^346
p(1000000) ≈ 1.5 × 10^1107

Theorem (FJ, 2014). There are exactly
1838176508344882 ... 231756788091448 (11,140,086,260 digits)
different ways to make change for a 10^20-pengő banknote using stacks of 1-pengő coins.
SLIDE 56

Euler’s method (1748) to compute p(n)

Σ_{n=0}^∞ p(n) x^n = Π_{k=1}^∞ 1/(1 − x^k) = [Σ_{k=−∞}^∞ (−1)^k x^(k(3k−1)/2)]^(−1)

p(n) = Σ_{k=1}^n (−1)^(k+1) [p(n − k(3k−1)/2) + p(n − k(3k+1)/2)]

Using the recurrence: O(n^2) time

Using fast power series arithmetic: O(n^(1.5+ε)) time (quasi-optimal for computing all values simultaneously)
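The recurrence translates directly into code (the quadratic-time method; fine for small n, since terms with negative argument simply drop out):

```python
# Euler's pentagonal-number recurrence for the partition function p(n).
def partition_numbers(n):
    """Return [p(0), ..., p(n)] using Euler's recurrence."""
    p = [0] * (n + 1)
    p[0] = 1
    for m in range(1, n + 1):
        total, k = 0, 1
        while True:
            g1 = k * (3 * k - 1) // 2      # generalized pentagonal numbers
            g2 = k * (3 * k + 1) // 2
            if g1 > m:
                break
            sign = -1 if k % 2 == 0 else 1  # (-1)^(k+1)
            total += sign * p[m - g1]
            if g2 <= m:
                total += sign * p[m - g2]
            k += 1
        p[m] = total
    return p

p = partition_numbers(100)
print(p[10], p[100])  # 42 190569292
```

Since p(n) has about n^(1/2) digits, each addition costs n^(1/2) bit operations, which is why fast power series arithmetic (and, for isolated values, the formula on the next slide) wins for large n.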

SLIDE 57

The Hardy-Ramanujan-Rademacher formula

p(n) = [1/(π√2)] Σ_{k=1}^∞ √k A_k(n) (d/dn) [sinh((π/k)√((2/3)(n − 1/24))) / √(n − 1/24)]

A_k(n) = Σ_{0 ≤ h < k, gcd(h,k)=1} e^(πi [s(h,k) − 2nh/k])

s(h, k) = Σ_{i=1}^{k−1} (i/k) (hi/k − ⌊hi/k⌋ − 1/2)

Hardy and Ramanujan (1917) and Rademacher (1936). Explicit error bound by Rademacher.

SLIDE 58

Quasi-optimal isolated computation of p(n)

Theorem (FJ, 2011): p(n) can be computed in time n^(1/2) log^(4+o(1)) n.

The idea: p(n) ≈ Σ_{k=0}^{n^(1/2)} T_k, with log_2 |T_k| = O(n^(1/2)/k).

[Plot: log_2 |T_k| against k, both axes of size O(n^(1/2)); total area O(n^(1/2) log n)]

The tricky part: need to approximate T_k in quasi-optimal time.

SLIDE 59

Evaluating exponential sums

A_k(n) = Σ_{0 ≤ h < k, gcd(h,k)=1} e^(πi [s(h,k) − 2nh/k])

s(h, k) = Σ_{i=1}^{k−1} (i/k) (hi/k − ⌊hi/k⌋ − 1/2)

Naively:
◮ O(k^2) arithmetic operations for A_k(n)
◮ O(n^(3/2)) arithmetic operations for p(n)

SLIDE 60

Fast computation of Dedekind sums

Let 0 < h < k and let k = r_0, r_1, ..., r_{m+1} = 1 be the sequence of remainders in the Euclidean algorithm for gcd(h, k). Then

s(h, k) = [(−1)^(m+1) − 1]/8 + (1/12) Σ_{j=1}^{m+1} (−1)^(j+1) (r_j^2 + r_{j−1}^2 + 1)/(r_j r_{j−1}).

◮ O(log k) integer ops to evaluate s(h, k)
◮ O(k log k) integer ops to evaluate A_k(n)
◮ O(n log n) integer ops to evaluate p(n)

Still not good enough!
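The two formulas can be checked against each other with exact rationals (illustrative sketch; the naive sum is O(k), the Euclidean form O(log k)):

```python
# Dedekind sum s(h, k): defining sum vs. the Euclidean-remainder formula.
from fractions import Fraction
from math import gcd

def dedekind_sum_naive(h, k):
    """s(h,k) = sum_{i=1}^{k-1} (i/k)((hi/k)), ((x)) = x - floor(x) - 1/2."""
    total = Fraction(0)
    for i in range(1, k):
        total += Fraction(i, k) * (Fraction(h * i, k) - (h * i) // k - Fraction(1, 2))
    return total

def dedekind_sum_fast(h, k):
    """O(log k) evaluation via the remainders r_0 = k, r_1 = h, ..., r_{m+1} = 1."""
    r_prev, r = k, h
    total, sign, m1 = Fraction(0), 1, 0   # m1 counts the terms, i.e. m+1
    while r != 0:
        total += sign * Fraction(r * r + r_prev * r_prev + 1, r * r_prev)
        sign = -sign
        m1 += 1
        r_prev, r = r, r_prev % r
    return Fraction((-1) ** m1 - 1, 8) + total / 12

for k in range(2, 40):
    for h in range(1, k):
        if gcd(h, k) == 1:
            assert dedekind_sum_naive(h, k) == dedekind_sum_fast(h, k)
print("ok")
```

For example s(1, 3) = 1/18 under both formulas; the exhaustive check above covers all coprime pairs with k < 40.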

SLIDE 61

Ak(n) using prime factorization of k

Whiteman (1956) expresses A_{p_1^(e_1) ··· p_i^(e_i)}(n) as a product of cosines cos(πr_1/(24k_1)) ··· cos(πr_i/(24k_i)), up to a simple algebraic prefactor. Requires solving quadratic equations modulo divisors of k (careful bit complexity analysis). Only O(log k) cosines, so the numerical evaluation becomes fast enough!

SLIDE 62

My implementation of p(n)

2011 version:

◮ 500 times faster than the runners-up (Mathematica, Sage)
◮ p(10^k) computed up to k = 19
◮ Tabulated 22 billion new Ramanujan-type congruences using Weaver’s algorithm. Example:
  p(28995244292486005245947069k + 28995221336976431135321047) ≡ 0 mod 29

2014 version:

◮ Rigorous numerical evaluation (ball arithmetic)
◮ 3 times faster, 3 times more memory-efficient
◮ p(10^k) computed up to k = 20

SLIDE 63

Getting it right

SLIDE 64

Time breakdown for p(n) (hours)

n       Memory   Pi    Exp   T1    T2    Wall   CPU
10^17   4.5 GB   0.2   1.2   1.7   2.1   2.1    3.8
10^18   13 GB    0.8   4.5   6.5   5.8   6.5    12
10^19   38 GB    3     17    24    23    24     47
10^20   128 GB   12    66    94    111   111    205