Correct rounding of transcendental functions: an approach via - - PowerPoint PPT Presentation

correct rounding of transcendental functions an approach
SMART_READER_LITE
LIVE PREVIEW

Correct rounding of transcendental functions: an approach via - - PowerPoint PPT Presentation

Correct rounding of transcendental functions: an approach via Euclidean lattices and approximation theory Nicolas Brisebarre (C.N.R.S.) and Guillaume Hanrot (.N.S. Lyon) CUNY Graduate Center-Courant Seminar in Symbolic-Numeric Computing -


slide-1
SLIDE 1

Correct rounding of transcendental functions: an approach via Euclidean lattices and approximation theory

Nicolas Brisebarre (C.N.R.S.) and Guillaume Hanrot (É.N.S. Lyon) CUNY Graduate Center-Courant Seminar in Symbolic-Numeric Computing - December 5, 2019

  • 1-
slide-2
SLIDE 2

(Binary) Floating Point (FP) Arithmetic

Given

  • a precision

p 1, a set of exponents Emin, · · · , Emax. A finite FP number x is represented by 2 integers: integer significand M, 2p−1 |M| 2p − 1, exponent E, Emin E Emax such that x = M 2p−1 × 2E.

  • 2-
slide-3
SLIDE 3

IEEE Precisions

IEEE 754 standard (1984 then 2008). See http://en.wikipedia.org/wiki/IEEE_floating_point or (older)

http://babbage.cs.qc.edu/courses/cs341/IEEE-754references.html.

precision p

  • min. exponent

maximal exponent Emin Emax binary32 (single) 24 −126 127 binary64 (double) 53 −1022 1023 extended double 64 −16382 16383 binary128 (quadruple) 113 −16382 16383 We have x =

M 2p−1 × 2E with 2p−1 |M| 2p − 1

and Emin E Emax.

  • 3-
slide-4
SLIDE 4

Rounding modes

In the IEEE 754 standard, the user defines an active rounding mode. In this talk, we use: round to nearest (default). If x ∈ R, RN(x) is the floating-point number that is the closest to x. In case of a tie, value whose integral significand is even. A breakpoint is a point where the rounding function changes.

  • 4-
slide-5
SLIDE 5

Rounding modes

In the IEEE 754 standard, the user defines an active rounding mode. In this talk, we use: round to nearest (default). If x ∈ R, RN(x) is the floating-point number that is the closest to x. In case of a tie, value whose integral significand is even. A breakpoint is a point where the rounding function changes. Here, breakpoint = the middle of two consecutive FP numbers.

  • 4-
slide-6
SLIDE 6

Correct rounding

A correctly-rounded operation whose entries are FP numbers must return what we would get by infinitely precise operation followed by rounding. It brings various benefits. In particular: accuracy and portability are improved; FP arithmetic becomes a structure in itself, that can be studied.

  • 5-
slide-7
SLIDE 7

Correct rounding

A correctly-rounded operation whose entries are FP numbers must return what we would get by infinitely precise operation followed by rounding. It brings various benefits. In particular: accuracy and portability are improved; FP arithmetic becomes a structure in itself, that can be studied. IEEE-754 (1985): Correct rounding for +, −, ×, ÷, √ and some conversions. IEEE-754 (2008): suggests correct rounding for some elementary functions.

  • 5-
slide-8
SLIDE 8

The Table Maker’s Dilemma

x ∈ [1, 2), x = j 2p−1 , j ∈ Z, 2p−1 j 2p − 1, f(x) ∈ [1, 2)

  • 6-
slide-9
SLIDE 9

The Table Maker’s Dilemma

x ∈ [1, 2), x = j 2p−1 , j ∈ Z, 2p−1 j 2p − 1, f(x) ∈ [1, 2) f(x)

  • 6-
slide-10
SLIDE 10

The Table Maker’s Dilemma

x ∈ [1, 2), x = j 2p−1 , j ∈ Z, 2p−1 j 2p − 1, f(x) ∈ [1, 2) f(x) RN(f(x))

  • 6-
slide-11
SLIDE 11

The Table Maker’s Dilemma

x ∈ [1, 2), x = j 2p−1 , j ∈ Z, 2p−1 j 2p − 1, f(x) ∈ [1, 2) f(x) RN(f(x)) Given ε > 0, the computed value f(x) satisfies |f(x) − f(x)|< ε.

  • 6-
slide-12
SLIDE 12

The Table Maker’s Dilemma

x ∈ [1, 2), x = j 2p−1 , j ∈ Z, 2p−1 j 2p − 1, f(x) ∈ [1, 2) f(x)

  • f(x)

ε RN(f(x)) Given ε > 0, the computed value f(x) satisfies |f(x) − f(x)|< ε.

  • 6-
slide-13
SLIDE 13

The Table Maker’s Dilemma

x ∈ [1, 2), x = j 2p−1 , j ∈ Z, 2p−1 j 2p − 1, f(x) ∈ [1, 2) f(x)

  • f(x)

ε = RN( f(x)) RN(f(x)) Given ε > 0, the computed value f(x) satisfies |f(x) − f(x)|< ε.

  • 6-
slide-14
SLIDE 14

The Table Maker’s Dilemma

x ∈ [1, 2), x = j 2p−1 , j ∈ Z, 2p−1 j 2p − 1, f(x) ∈ [1, 2) f(x) ε

  • f(x)

RN(f(x)) Given ε > 0, the computed value f(x) satisfies |f(x) − f(x)|< ε.

  • 6-
slide-15
SLIDE 15

The Table Maker’s Dilemma

x ∈ [1, 2), x = j 2p−1 , j ∈ Z, 2p−1 j 2p − 1, f(x) ∈ [1, 2) f(x) ε

  • f(x)

RN( f(x)) RN(f(x)) Given ε > 0, the computed value f(x) satisfies |f(x) − f(x)|< ε.

  • 6-
slide-16
SLIDE 16

The Table Maker’s Dilemma

x ∈ [1, 2), x = j 2p−1 , j ∈ Z, 2p−1 j 2p − 1, f(x) ∈ [1, 2) f(x) ε/2 RN(f(x)) Given ε > 0, the computed value f(x) satisfies |f(x) − f(x)|< ε.

  • 6-
slide-17
SLIDE 17

The Table Maker’s Dilemma

x ∈ [1, 2), x = j 2p−1 , j ∈ Z, 2p−1 j 2p − 1, f(x) ∈ [1, 2) f(x)

  • f(x)

RN(f(x)) ε/2 Given ε > 0, the computed value f(x) satisfies |f(x) − f(x)|< ε.

  • 6-
slide-18
SLIDE 18

The Table Maker’s Dilemma

x ∈ [1, 2), x = j 2p−1 , j ∈ Z, 2p−1 j 2p − 1, f(x) ∈ [1, 2) f(x) RN(f(x))= RN( f(x))

  • f(x)

ε/2 Given ε > 0, the computed value f(x) satisfies |f(x) − f(x)|< ε.

  • 6-
slide-19
SLIDE 19

The Table Maker’s Dilemma

x ∈ [1, 2), x = j 2p−1 , j ∈ Z, 2p−1 j 2p − 1, f(x) ∈ [1, 2) f(x) RN(f(x))= RN( f(x))

  • f(x)

ε/2 Given ε > 0, the computed value f(x) satisfies |f(x) − f(x)|< ε. Potential issues: What if f(x) is a breakpoint? What about the number of subdivisions? ε should be uniform!

  • 6-
slide-20
SLIDE 20

The Table Maker’s Dilemma

x ∈ [1, 2), x = j 2p−1 , j ∈ Z, 2p−1 j 2p − 1, f(x) ∈ [1, 2) f(x) RN(f(x))= RN( f(x))

  • f(x)

ε/2 Given ε > 0, the computed value f(x) satisfies |f(x) − f(x)|< ε. Potential issues: What if f(x) is a breakpoint? What about the number of subdivisions? ε should be uniform! And as large as possible!

  • 6-
slide-21
SLIDE 21

The Table Maker’s Dilemma

x ∈ [1, 2), x = j 2p−1 , j ∈ Z, 2p−1 j 2p − 1, f(x) ∈ [1, 2) f(x) 2−m RN(f(x))

  • 7-
slide-22
SLIDE 22

The Table Maker’s Dilemma

x ∈ [1, 2), x = j 2p−1 , j ∈ Z, 2p−1 j 2p − 1, f(x) ∈ [1, 2) f(x) 2−m (ℓ + 1)/2p−1 (2ℓ + 1)/2p ℓ/2p−1

  • 7-
slide-23
SLIDE 23

The Table Maker’s Dilemma

x ∈ [1, 2), x = j 2p−1 , j ∈ Z, 2p−1 j 2p − 1, f(x) ∈ [1, 2) f(x) 2−m (ℓ + 1)/2p−1 (2ℓ + 1)/2p ℓ/2p−1 We want to find m ∈ N s.t. either there exists ℓ ∈ 2p−1, 2p − 1 s.t. f(x) = (2ℓ + 1)/2p,

  • 7-
slide-24
SLIDE 24

The Table Maker’s Dilemma

x ∈ [1, 2), x = j 2p−1 , j ∈ Z, 2p−1 j 2p − 1, f(x) ∈ [1, 2) f(x) 2−m (ℓ + 1)/2p−1 (2ℓ + 1)/2p ℓ/2p−1 We want to find m ∈ N, as small as possible, s.t. for all FP x: either there exists ℓ ∈ 2p−1, 2p − 1 s.t. f(x) = (2ℓ + 1)/2p,

  • r

for all k ∈ 2p−1, 2p − 1,

  • f(x) − 2k + 1

2p

  • 2−m.
  • 7-
slide-25
SLIDE 25

The Table Maker’s Dilemma: Diophantine Problems

Assume, w.l.o.g, that x and f(x) ∈ [1, 2).

  • Q. (TMD) We want to determine m ∈ N, as small as possible, s.t. for all

j ∈ 2p−1, 2p − 1, either there exists ℓ ∈ 2p−1, 2p − 1 s.t. f

  • j

2p−1

  • = 2ℓ+1

2p ,

  • r

for all 2p−1 ℓ 2p − 1,

  • f
  • j

2p−1

  • − 2ℓ + 1

2p

  • 1

2m .

  • 8-
slide-26
SLIDE 26

Some insight (Warning: Hand-waving!). . .

the infinitely precise significand y of f(x) has the form:

y = y0.y1y2 · · · yp−1 01111111 · · · 11

  • k bits

xxxxx · · ·

  • r

y = y0.y1y2 · · · yp−1

k bits

  • 10000000 · · · 00 xxxxx · · ·

with k 1.

  • 9-
slide-27
SLIDE 27

Some insight (Warning: Hand-waving!). . .

the infinitely precise significand y of f(x) has the form:

y = y0.y1y2 · · · yp−1 01111111 · · · 11

  • k bits

xxxxx · · ·

  • r

y = y0.y1y2 · · · yp−1

k bits

  • 10000000 · · · 00 xxxxx · · ·

with k 1. Assuming that after the kth position the “1” and “0” are equally likely, the “probability” of having k k0 is 21−k0;

  • 9-
slide-28
SLIDE 28

Some insight (Warning: Hand-waving!). . .

the infinitely precise significand y of f(x) has the form:

y = y0.y1y2 · · · yp−1 01111111 · · · 11

  • k bits

xxxxx · · ·

  • r

y = y0.y1y2 · · · yp−1

k bits

  • 10000000 · · · 00 xxxxx · · ·

with k 1. Assuming that after the kth position the “1” and “0” are equally likely, the “probability” of having k k0 is 21−k0; if we consider 2p−1 input FP numbers, around 2p−1 × 21−k0 = 2p−k0 values for which k k0 ; → roughly, ”mopt ∼ 2p” (Q).

  • 9-
slide-29
SLIDE 29

The Table Maker’s Dilemma: an Example

Consider the function 2x and the binary64 FP number (base 2, p = 53) x = 8520761231538509 262 We have

252+x = 4509371038706515.4999999999999999994026 · · · (decimal) = 1 · · ·

  • 53 bits

.01 · · · · · · · · · · · · 1

  • 60 consecutive 1′s

0 · · · (binary).

  • 10-
slide-30
SLIDE 30

The Table Maker’s Dilemma: an Example

Consider the function 2x and the binary64 FP number (base 2, p = 53) x = 8520761231538509 262 We have

252+x = 4509371038706515.4999999999999999994026 · · · (decimal) = 1 · · ·

  • 53 bits

.01 · · · · · · · · · · · · 1

  • 60 consecutive 1′s

0 · · · (binary).

Hardest-to-round (HR) case for function 2x and binary64 FP numbers. Lefèvre et al.: the value of m is 113 (∼ 2p, p = 53) here.

  • 10-
slide-31
SLIDE 31

The Table Maker’s Dilemma: an Example

Consider the function 2x and the binary64 FP number (base 2, p = 53) x = 8520761231538509 262 We have

252+x = 4509371038706515.4999999999999999994026 · · · (decimal) = 1 · · ·

  • 53 bits

.01 · · · · · · · · · · · · 1

  • 60 consecutive 1′s

0 · · · (binary).

Hardest-to-round (HR) case for function 2x and binary64 FP numbers. Lefèvre et al.: the value of m is 113 (∼ 2p, p = 53) here. Function f:

n

√ , sin, cos, arcsin, arccos, tan, arctan, exp, log, sinh, cosh...

  • 10-
slide-32
SLIDE 32

The Table Maker’s Dilemma: Diophantine Problems

Assume, w.l.o.g, that x and f(x) ∈ [1, 2).

  • Q. (TMD) We want to determine m ∈ N, as small as possible, s.t. for all

j ∈ 2p−1, 2p − 1, either there exists ℓ ∈ 2p−1, 2p − 1 s.t. f

  • j

2p−1

  • = 2ℓ + 1

2p ,

  • r

for all 2p−1 ℓ 2p − 1,

  • f
  • j

2p−1

  • − 2ℓ + 1

2p

  • 2−m.
  • 11-
slide-33
SLIDE 33

The Table Maker’s Dilemma: First Challenge

A breakpoint is a point where the rounding function changes. In this talk, it is the middle of two consecutive FP numbers. First challenge: Determine the set BPf of all the FP numbers x ∈ [1, 2) such that f(x) is a breakpoint. In other words, determine all j, ℓ ∈ 2p−1, 2p − 1 s.t. f

  • j

2p−1

  • = 2ℓ + 1

2p .

  • 12-
slide-34
SLIDE 34

State of the Art

A function f is algebraic if there exists P ∈ Z[X, Y ] \ {0} s.t. P(x, f(x)) = 0. Otherwise, f is transcendental.

  • 13-
slide-35
SLIDE 35

State of the Art

Algebraic functions. Works by Jeannerod, Louvet, Muller and Panhaleux.

  • 14-
slide-36
SLIDE 36

State of the Art

Algebraic functions. Works by Jeannerod, Louvet, Muller and Panhaleux. Transcendental elementary Functions sin, cos, arcsin, arccos, tan, arctan, exp, log, sinh, cosh. Hermite-Lindemann’s theorem: α = 0 algebraic ⇒ eα transcendental.

  • 14-
slide-37
SLIDE 37

State of the Art

Algebraic functions. Works by Jeannerod, Louvet, Muller and Panhaleux. Transcendental elementary Functions sin, cos, arcsin, arccos, tan, arctan, exp, log, sinh, cosh. Hermite-Lindemann’s theorem: α = 0 algebraic ⇒ eα transcendental. Therefore, let x a FP number, f(x) is not a breakpoint, except for straightforward cases (e0, ln(1), sin(0), . . . ).

  • 14-
slide-38
SLIDE 38

State of the Art

Algebraic functions. Works by Jeannerod, Louvet, Muller and Panhaleux. Transcendental elementary Functions sin, cos, arcsin, arccos, tan, arctan, exp, log, sinh, cosh. Hermite-Lindemann’s theorem: α = 0 algebraic ⇒ eα transcendental. Therefore, let x a FP number, f(x) is not a breakpoint, except for straightforward cases (e0, ln(1), sin(0), . . . ). What about the Euler Gamma function? For Re(z) > 0, Γ(z) = +∞ tz−1e−tdt. Very little is known: Γ(1/2), Γ(1/3), Γ(1/4), Γ(1/6), Γ(2/3), Γ(3/4), Γ(5/6) are transcendental and there are some partial irrationality results.

  • 14-
slide-39
SLIDE 39

Our setting

Let f : [1, 2) → [1, 2), f is transcendental and analytic in the neighborhood of [1, 2).

  • 15-
slide-40
SLIDE 40

Our setting

Let f : [1, 2) → [1, 2), f is transcendental and analytic in the neighborhood of [1, 2). Let g ∈ C([a, b]), g∞,[a,b] := max

x∈[a,b] |g(x)|.

  • 15-
slide-41
SLIDE 41

Our Approach: Polynomial Interpolation and Lattice Basis Reduction

We want to find all 2p−1 i, j 2p − 1 s.t. f

  • i

2p−1

  • = 2j + 1

2p ,

  • 16-
slide-42
SLIDE 42

Our Approach: Polynomial Interpolation and Lattice Basis Reduction

We want to find all 2p−1 i, j 2p − 1 s.t. f

  • i

2p−1

  • = 2j + 1

2p , i.e. 2j + 1 = 2pf

  • i

2p−1

  • .
  • 16-
slide-43
SLIDE 43

Our Approach: Polynomial Interpolation and Lattice Basis Reduction

We want to find all 2p−1 i, j 2p − 1 s.t. f

  • i

2p−1

  • = 2j + 1

2p , i.e. 2j + 1 = 2pf

  • i

2p−1

  • .

We build a trap! We find a partition of [1, 2) = ∪ℓIℓ such that, for all ℓ, we can compute Pℓ,1, Pℓ,2 ∈ Z[X, Y ] \ {0} with

  • Pℓ,k(2p−1u, 2pf(u))
  • < 1, k = 1, 2,

for all u ∈ Iℓ.

  • 16-
slide-44
SLIDE 44

Our Approach: Polynomial Interpolation and Lattice Basis Reduction

We build a trap! We find a partition of [1, 2) = ∪ℓIℓ such that, for all ℓ, we can compute Pℓ,1, Pℓ,2 ∈ Z[X, Y ] \ {0} with

  • Pℓ,k(2p−1u, 2pf(u))
  • < 1, k = 1, 2 for all u ∈ Iℓ.
  • 17-
slide-45
SLIDE 45

Our Approach: Polynomial Interpolation and Lattice Basis Reduction

We build a trap! We find a partition of [1, 2) = ∪ℓIℓ such that, for all ℓ, we can compute Pℓ,1, Pℓ,2 ∈ Z[X, Y ] \ {0} with

  • Pℓ,k(2p−1u, 2pf(u))
  • < 1, k = 1, 2 for all u ∈ Iℓ.

For all ℓ, if there exist 2p−1 i, j 2p − 1 s.t. i/2p−1 ∈ Iℓ and f

  • i

2p−1

  • = 2j + 1

2p ,

  • 17-
slide-46
SLIDE 46

Our Approach: Polynomial Interpolation and Lattice Basis Reduction

We build a trap! We find a partition of [1, 2) = ∪ℓIℓ such that, for all ℓ, we can compute Pℓ,1, Pℓ,2 ∈ Z[X, Y ] \ {0} with

  • Pℓ,k(2p−1u, 2pf(u))
  • < 1, k = 1, 2 for all u ∈ Iℓ.

For all ℓ, if there exist 2p−1 i, j 2p − 1 s.t. i/2p−1 ∈ Iℓ and f

  • i

2p−1

  • u

= 2j + 1 2p

v

, then we have, for k = 1, 2, Pℓ,k(i, 2j + 1) ∈ Z and |Pℓ,k(i, 2j + 1)| < 1!

  • 17-
slide-47
SLIDE 47

Our Approach: Polynomial Interpolation and Lattice Basis Reduction

We build a trap! We find a partition of [1, 2) = ∪ℓIℓ such that, for all ℓ, we can compute Pℓ,1, Pℓ,2 ∈ Z[X, Y ] \ {0} with

  • Pℓ,k(2p−1u, 2pf(u))
  • < 1, k = 1, 2 for all u ∈ Iℓ.

For all ℓ, if there exist 2p−1 i, j 2p − 1 s.t. i/2p−1 ∈ Iℓ and f

  • i

2p−1

  • u

= 2j + 1 2p

v

, then we have, for k = 1, 2, Pℓ,k(i, 2j + 1) ∈ Z and |Pℓ,k(i, 2j + 1)| < 1! ⇒ Pℓ,k(i, 2j + 1) = 0.

  • 17-
slide-48
SLIDE 48

Our Approach: Polynomial Interpolation and Lattice Basis Reduction

We build a trap! We find a partition of [1, 2) = ∪ℓIℓ such that, for all ℓ, we can compute Pℓ,1, Pℓ,2 ∈ Z[X, Y ] \ {0} with

  • Pℓ,k(2p−1u, 2pf(u))
  • < 1, k = 1, 2 for all u ∈ Iℓ.

For all ℓ, if there exist 2p−1 i, j 2p − 1 s.t. i/2p−1 ∈ Iℓ and f

  • i

2p−1

  • u

= 2j + 1 2p

v

, then we have, for k = 1, 2, Pℓ,k(i, 2j + 1) ∈ Z and |Pℓ,k(i, 2j + 1)| < 1! ⇒ Pℓ,k(i, 2j + 1) = 0. We eliminate (heuristic assumption!) one of the two variables and we get i and j, if they exist (Coppersmith; Boneh & Durfee; Stehlé).

  • 17-
slide-49
SLIDE 49

Interpolation at Chebyshev Nodes and Uniform Approximation

Definition Let n ∈ N, the Chebyshev nodes of the first kind of order n are the points µk = cos

  • (2k+1)π

2(n+1)

  • , k = 0, . . . , n.
  • 18-
slide-50
SLIDE 50

Interpolation at Chebyshev Nodes and Uniform Approximation

Definition Let n ∈ N, the Chebyshev nodes of the first kind of order n are the points µk = cos

  • (2k+1)π

2(n+1)

  • , k = 0, . . . , n.

Let pn ∈ Rn[X] that interpolates ϕ ∈ C([−1, 1]) at the (µk)k=0,..,n, then we have ϕ − pn∞,[−1,1] 2 1 π log(n + 1) + 1

  • min

Q∈Rn[x]ϕ − Q∞,[−1,1].

  • 18-
slide-51
SLIDE 51

Interpolation at Chebyshev Nodes and Uniform Approximation

Definition Let n ∈ N, the Chebyshev nodes of the first kind of order n are the points µk = cos

  • (2k+1)π

2(n+1)

  • , k = 0, . . . , n.

Let pn ∈ Rn[X] that interpolates ϕ ∈ Cn+1([a, b]) at the (µk)k=0,..,n, then for any x ∈ [−1, 1], there exists ξx ∈ (−1, 1) s.t. ϕ(x) − pn(x) = ϕ(n+1)(ξx) 2n(n + 1)! .

  • 18-
slide-52
SLIDE 52

Bernstein Ellipse

Let ρ > 1, let Eρ := ρeiθ + ρ−1e−iθ 2 , θ ∈ [0, 2π]

  • .
  • 19-
slide-53
SLIDE 53

Bernstein Ellipse

Let ρ > 1, let Eρ := ρeiθ + ρ−1e−iθ 2 , θ ∈ [0, 2π]

  • .

Bernstein ellipses for ρ = 1.05, 1.25, 1.45, 1.65, 1.85.

  • 19-
slide-54
SLIDE 54

Interpolation at Chebyshev Nodes and Uniform Approximation

Definition Let n ∈ N, the Chebyshev nodes of the first kind of order n are the points µk = cos

  • (2k+1)π

2(n+1)

  • , k = 0, . . . , n.

Let ρ > 1, let ϕ be a function analytic in a neighborhood of Eρ. Let pn ∈ Rn[X] that interpolates ϕ at the (µk)k=0,..,n, we have ϕ − pn∞,[−1,1] 4Mρ(ϕ) ρn(ρ − 1). where Mρ(ϕ) = maxz∈Eρ |ϕ(z)|.

  • 20-
slide-55
SLIDE 55

Interpolation at Chebyshev Nodes and Uniform Approximation

Definition Let n ∈ N, the Chebyshev nodes of the first kind of order n are the points µk = cos

  • (2k+1)π

2(n+1)

  • , k = 0, . . . , n.

Let P ∈ Rn[X], we have max

x∈[−1,1] |P(x)|

2 π log(n + 1) + 1

  • max

k=0,...,n |P(µk)|.

  • 20-
slide-56
SLIDE 56

Interpolation at Chebyshev Nodes and Uniform Approximation

Definition Let n ∈ N, the Chebyshev nodes of the first kind of order n are the points µk = cos

  • (2k+1)π

2(n+1)

  • , k = 0, . . . , n.

Let ϕ ∈ Cn+1, let pn ∈ Rn[X] that interpolates ϕ at the (µk)k=0,..,n, let rn = ϕ − pn∞,[−1,1],

  • 20-
slide-57
SLIDE 57

Interpolation at Chebyshev Nodes and Uniform Approximation

Definition Let n ∈ N, the Chebyshev nodes of the first kind of order n are the points µk = cos

  • (2k+1)π

2(n+1)

  • , k = 0, . . . , n.

Let ϕ ∈ Cn+1([−1, 1]), let pn ∈ Rn[X] that interpolates ϕ at the (µk)k=0,..,n, let rn = ϕ − pn∞,[−1,1], we have ϕ∞ pn∞ + rn 2 π log(n + 1) + 1

  • max

k=0,...,n |pn(µk)| + rn

  • 2

π log(n + 1) + 1

  • max

k=0,...,n |ϕ(µk)| + rn.

  • 20-
slide-58
SLIDE 58

Interpolation at Chebyshev Nodes and Uniform Approximation

Definition Let n ∈ N, the Chebyshev nodes of the first kind of order n are the points µk = cos

  • (2k+1)π

2(n+1)

  • , k = 0, . . . , n.

Let I = [a, b], one can define scaled Chebyshev nodes of the first kind of

  • rder n: µk,[a,b] = b−a

2 cos

  • (2k+1)π

2(n+1)

  • + a+b

2 , k = 0, . . . , n.

  • 20-
slide-59
SLIDE 59

Lattice Basis Reduction

  • 21-
slide-60
SLIDE 60

An Approach based on Lattice Basis Reduction

Definition Let L be a nonempty subset of Rd, L is a lattice iff there exists a set of vectors b1, . . . , bk R-linearly independent such that L = Z.b1 ⊕ · · · ⊕ Z.bk. (b1, . . . , bk) is a basis of the lattice L.

  • Examples. Zd, every subgroup of Zd.
  • 22-
slide-61
SLIDE 61

Example: The Lattice Z(2, 0) ⊕ Z(1, 2)

(0, 0) (2, 0) (1, 2)

  • 23-
slide-62
SLIDE 62

Example: The Lattice Z(2, 0) ⊕ Z(1, 2)

(0, 0) (2, 0) (1, 2) u v −3u + v 2u − v

  • 24-
slide-63
SLIDE 63

Example: The Lattice Z(2, 0) ⊕ Z(1, 2)

(0, 0) (2, 0) (1, 2) u v −3u + v 2u − v

SVP (Shortest Vector Problem)

  • 24-
slide-64
SLIDE 64

Example: The Lattice Z(2, 0) ⊕ Z(1, 2)

(0, 0) (2, 0) (1, 2) u v −3u + v 2u − v

SVP (Shortest Vector Problem) is NP-hard.

  • 24-
slide-65
SLIDE 65

Lenstra-Lenstra-Lovász Algorithm

SVP (Shortest Vector Problem) is NP-hard. Factoring Polynomials with Rational Coefficients, A. K. Lenstra, H. W. Lenstra and L. Lovász, Math. Annalen 261, 515-534, 1982. The LLL algorithm gives an approximate solution to SVP in polynomial time.

  • 25-
slide-66
SLIDE 66

Lenstra-Lenstra-Lovász Algorithm

Theorem Let L a lattice of rank k. LLL provides a basis (b1, . . . , bk) made of “pretty” short vectors. We have ||b1|| 2(k−1)/2λ1(L) where λ1(L) denotes the norm of a shortest nonzero vector of L. It terminates in at most O(k6 ln3 B) operations with B ||bi||2 for all i.

  • 26-
slide-67
SLIDE 67

Lenstra-Lenstra-Lovász Algorithm

Theorem Let L a lattice of rank k. LLL provides a basis (b1, . . . , bk) made of “pretty” short vectors. We have ||b1|| 2(k−1)/2λ1(L) where λ1(L) denotes the norm of a shortest nonzero vector of L. It terminates in at most O(k6 ln3 B) operations with B ||bi||2 for all i. Let (b1, . . . , bk) be an LLL-reduced basis L then ||b1|| 2k/2(vol L)1/k and ||b2|| 2k/2(vol L)

1 k−1 .

  • 26-
slide-68
SLIDE 68

Example: The Lattice Z(2, 0) ⊕ Z(1, 2)

(0, 0) (2, 0) (1, 2) u v −3u + v 2u − v

  • 27-
slide-69
SLIDE 69

Example: The Lattice Z(2, 0) ⊕ Z(1, 2)

  • 28-
slide-70
SLIDE 70

Lenstra-Lenstra-Lovász Algorithm

Theorem Let L a lattice of rank k. LLL provides a basis (b1, . . . , bk) made of “pretty” short vectors. We have ||b1|| 2(k−1)/2λ1(L) where λ1(L) denotes the norm of a shortest nonzero vector of L. It terminates in at most O(k6 ln3 B) operations with B ||bi||2 for all i. Let (b1, . . . , bk) be an LLL-reduced basis L then ||b1|| 2k/2(vol L)1/k and ||b2|| 2k/2(vol L)

1 k−1 .

  • 29-
slide-71
SLIDE 71

How do we compute P1 and P2?

Let P1 =

u,v αu,vXuY v and P2 = u,v βu,vXuY v ∈ Z[X, Y ]. We

want to have

  • Pk(2p−1x, 2pf(x))
  • < 1,

k = 1, 2, (1) for all x ∈ I = [a, b].

  • 30-
slide-72
SLIDE 72

How do we compute P1 and P2?

Let P1 =

u,v αu,vXuY v and P2 = u,v βu,vXuY v ∈ Z[X, Y ]. We

want to have

  • Pk(2p−1x, 2pf(x))
  • < 1,

k = 1, 2, (1) for all x ∈ I = [a, b]. To do so, discretize (1): plug (scaled) Chebyshev nodes xk. Rough (but actually very reasonable) intuition: If P1 and P2 are small at these points, then they are small on the whole domain.

  • 30-
slide-73
SLIDE 73

How do we compute P1 and P2?

We want P1(2p−1xk, 2pf(xk)) and P2(2p−1xk, 2pf(xk)) small, i.e. the vectors

  • u,v

αu,v(2p−1xk)u(2pf(xk))v

  • k

=

  • u,v

αu,v

  • (2p−1xk)u(2pf(xk))v

k

  • eu,v

and

  • u,v

βu,v(2p−1xk)u(2pf(xk))v

  • k

=

  • u,v

βu,v

  • (2p−1xk)u(2pf(xk))v

k

  • eu,v

should be small.

  • 31-
slide-74
SLIDE 74

How do we compute P1 and P2?

We want P1(2p−1xk, 2pf(xk)) and P2(2p−1xk, 2pf(xk)) small, i.e. the vectors

  • u,v

αu,v(2p−1xk)u(2pf(xk))v

  • k

=

  • u,v

αu,v

  • (2p−1xk)u(2pf(xk))v

k

  • eu,v

and

  • u,v

βu,v(2p−1xk)u(2pf(xk))v

  • k

=

  • u,v

βu,v

  • (2p−1xk)u(2pf(xk))v

k

  • eu,v

should be small. This is a shortest vector problem in the lattice

  • u,v

Zeu,v!

  • 31-
slide-75
SLIDE 75

How do we compute P1 and P2?

We want P1(2p−1xk, 2pf(xk)) and P2(2p−1xk, 2pf(xk)) small, i.e. the vectors

  • u,v

αu,v(2p−1xk)u(2pf(xk))v

  • k

=

  • u,v

αu,v

  • (2p−1xk)u(2pf(xk))v

k

  • eu,v

and

  • u,v

βu,v(2p−1xk)u(2pf(xk))v

  • k

=

  • u,v

βu,v

  • (2p−1xk)u(2pf(xk))v

k

  • eu,v

should be small. This is a shortest vector problem in the lattice

  • u,v

Zeu,v! LLL yields two fairly short vectors, corresponding to the small values P1(xk, f(xk)) and P2(xk, f(xk)) we wish to obtain.

  • 31-
slide-76
SLIDE 76

How do we compute P1 and P2? Dimension of the Lattice

Let d ∈ N \ {0} and N = (d + 1)(d + 2)/2. We introduce fkℓ(x) = (2p−1x)ℓ(2pf(x))k for 0 k d, 0 ℓ d − k.

  • 32-
slide-77
SLIDE 77

How do we compute P1 and P2? Dimension of the Lattice

Let d ∈ N \ {0} and N = (d + 1)(d + 2)/2. We introduce fkℓ(x) = (2p−1x)ℓ(2pf(x))k for 0 k d, 0 ℓ d − k. Let P1 =

0kd 0ℓd−k

αk,ℓXℓY k and P2 =

0kd 0ℓd−k

βk,ℓXℓY k ∈ Z[X, Y ]. We want to have

  • Pk(2p−1x, 2pf(x))
  • < 1, k = 1, 2,

(2) for all x ∈ I = [a, b].

  • 32-
slide-78
SLIDE 78

How do we compute P1 and P2? Dimension of the Lattice

Let d ∈ N \ {0} and N = (d + 1)(d + 2)/2. We introduce fkℓ(x) = (2p−1x)ℓ(2pf(x))k for 0 k d, 0 ℓ d − k. Let P1 =

0kd 0ℓd−k

αk,ℓXℓY k and P2 =

0kd 0ℓd−k

βk,ℓXℓY k ∈ Z[X, Y ]. We want to have

  • Pk(2p−1x, 2pf(x))
  • < 1, k = 1, 2,

(2) for all x ∈ I = [a, b]. To do so, discretize (2): plug (scaled) Chebyshev nodes xk. If P1 and P2 are small at these points, then they are small on the whole domain.

  • 32-
slide-79
SLIDE 79

How do we compute P1 and P2? Volume of the Lattice

Let d ∈ N \ {0} and N = (d + 1)(d + 2)/2. We introduce fkℓ(x) = (2p−1x)ℓ(2pf(x))k for 0 k d, 0 ℓ d − k. Let (xj)0jN−1 denote Chebyshev nodes for the interval I = [a, b], we want P1(2p−1xj, 2pf(xj)) and P2(2p−1xj, 2pf(xj)) to be small.

  • 33-
slide-80
SLIDE 80

How do we compute P1 and P2? Volume of the Lattice

Let d ∈ N \ {0} and N = (d + 1)(d + 2)/2. We introduce fkℓ(x) = (2p−1x)ℓ(2pf(x))k for 0 k d, 0 ℓ d − k. Let (xj)0jN−1 denote Chebyshev nodes for the interval I = [a, b], we want P1(2p−1xj, 2pf(xj)) and P2(2p−1xj, 2pf(xj)) to be small. It is related to the smallness of the volume of the lattice

  • 0kd

0ℓd−k

Z (fkℓ(xj−1))1jN i.e.

  • det (fkℓ(xj−1)) 0kd

0ℓd−k,1jN

1/N .

  • 33-
slide-81
SLIDE 81

How do we compute P1 and P2? Volume of the Lattice

Let d ∈ N \ {0} and N = (d + 1)(d + 2)/2. We introduce fkℓ(x) = (2p−1x)ℓ(2pf(x))k for 0 k d, 0 ℓ d − k. Let (xj)0jN−1 denote Chebyshev nodes for the interval I = [a, b]. We have, for ρ > 1,

  • det (fkℓ(xj−1)) 0kd

0ℓd−k,1jN

1/N

  • “negligible”(2p2p−1)d/3

ρ(N−1)/2

  • b − a

2 ρ + ρ−1 2

  • + b + a

2

  • d/3

Mρ,a,b(f)d/3. where Mρ,a,b(f) = maxz∈Eρ,a,b |f(z)| and Eρ,a,b =

  • b−a

2 ρeiθ+ρ−1e−iθ 2

+ a+b

2 , θ ∈ [0, 2π]

  • .
  • 34-
slide-82
SLIDE 82

How do we compute P1 and P2? Volume of the Lattice

We have, for ρ > 1,

  • det (fkℓ(xj−1)) 0kd

0ℓd−k,1jN

1/N

  • “negligible”(2p2p−1)d/3

ρ(N−1)/2

  • b − a

2 ρ + ρ−1 2

  • + b + a

2

  • d/3

Mρ,a,b(f(z))d/3. where Mρ,a,b(f(z)) = maxz∈Eρ,a,b |f(z)| and Eρ,a,b =

  • b−a

2 ρeiθ+ρ−1e−iθ 2

+ a+b

2 , θ ∈ [0, 2π]

  • .
  • 35-
slide-83
SLIDE 83

How do we compute P1 and P2? Volume of the Lattice

We have, for ρ > 1,

  • det (fkℓ(xj−1)) 0kd

0ℓd−k,1jN

1/N

  • “negligible”(2p2p−1)d/3

ρ(N−1)/2

  • b − a

2 ρ + ρ−1 2

  • + b + a

2

  • d/3

Mρ,a,b(f(z))d/3. where Mρ,a,b(f(z)) = maxz∈Eρ,a,b |f(z)| and Eρ,a,b =

  • b−a

2 ρeiθ+ρ−1e−iθ 2

+ a+b

2 , θ ∈ [0, 2π]

  • .

For Gamma, d = O(p) is enough to tackle the whole [a, b]

  • 35-
slide-84
SLIDE 84

How do we compute P1 and P2? Volume of the Lattice

We have, for ρ > 1,

  • det (fkℓ(xj−1)) 0kd

0ℓd−k,1jN

1/N

  • “negligible”(2p2p−1)d/3

ρ(N−1)/2

  • b − a

2 ρ + ρ−1 2

  • + b + a

2

  • d/3

Mρ,a,b(f(z))d/3. where Mρ,a,b(f(z)) = maxz∈Eρ,a,b |f(z)| and Eρ,a,b =

  • b−a

2 ρeiθ+ρ−1e−iθ 2

+ a+b

2 , θ ∈ [0, 2π]

  • .

For Gamma, d = O(p) is enough to tackle the whole [a, b] (ρ = 2/(b − a)).

  • 35-
slide-85
SLIDE 85

How do we compute P1 and P2? Volume of the Lattice

We have, for ρ > 1,

  • det (fkℓ(xj−1)) 0kd

0ℓd−k,1jN

1/N

  • “negligible”(2p2p−1)d/3

ρ(N−1)/2

  • b − a

2 ρ + ρ−1 2

  • + b + a

2

  • d/3

Mρ,a,b(f(z))d/3. where Mρ,a,b(f(z)) = maxz∈Eρ,a,b |f(z)| and Eρ,a,b =

  • b−a

2 ρeiθ+ρ−1e−iθ 2

+ a+b

2 , θ ∈ [0, 2π]

  • .

For Gamma, d = O(p) is enough to tackle the whole [a, b] (ρ = 2/(b − a)). If [a, b] = [1, 2], less than one hour for p = 53 and 1 CPU year for p = 113.

  • 35-
slide-86
SLIDE 86

The Table Maker’s Dilemma

A breakpoint is a point where the rounding function changes. In this talk, it is the middle of two consecutive FP numbers. Two-step challenge: Determine the set BPf of all the FP numbers x such that f(x) is a breakpoint;

  • 36-
slide-87
SLIDE 87

The Table Maker’s Dilemma

A breakpoint is a point where the rounding function changes. In this talk, it is the middle of two consecutive FP numbers. Two-step challenge: Determine the set BPf of all the FP numbers x such that f(x) is a breakpoint; Find m ∈ N, as small as possible, s.t. for all FP number y / ∈ BPf, if we use an internal precision of m bits to evaluate the f(y)’s, then we will always get RN(f(y)).

  • 36-
slide-88
SLIDE 88

The Table Maker’s Dilemma

A breakpoint is a point where the rounding function changes. In this talk, it is the middle of two consecutive FP numbers. Two-step challenge: Determine the set BPf of all the FP numbers x such that f(x) is a breakpoint; Find m ∈ N, as small as possible, s.t. for all FP number y / ∈ BPf, if we use an internal precision of m bits to evaluate the f(y)’s, then we will always get RN(f(y)). In other words, find m ∈ N, as small as possible, such that for all j, ℓ ∈ 2p−1, 2p − 1 s.t. j/2p−1 / ∈ BPf and

  • f
  • j

2p−1

  • − 2ℓ + 1

2p

  • 2−m.
  • 36-
slide-89
SLIDE 89

Finding m beyond which there is no problem ?

Function f: sin, cos, arcsin, arccos, tan, arctan, exp, log, sinh, cosh.

  • 37-
slide-90
SLIDE 90

Finding m beyond which there is no problem ?

Function f: sin, cos, arcsin, arccos, tan, arctan, exp, log, sinh, cosh. Hermite-Lindemann’s theorem (α = 0 algebraic ⇒ eα transcendental). If x is a FP number, there exists an mx, s.t. rounding the mx-bit approximation ⇔ rounding f(x);

  • 37-
slide-91
SLIDE 91

Finding m beyond which there is no problem ?

Function f: sin, cos, arcsin, arccos, tan, arctan, exp, log, sinh, cosh. Hermite-Lindemann’s theorem (α = 0 algebraic ⇒ eα transcendental). If x is a FP number, there exists an mx, s.t. rounding the mx-bit approximation ⇔ rounding f(x); finite number of FP numbers → ∃mmax = maxx(mx) s.t. ∀x, rounding the mmax-bit approximation to f(x) is equivalent to rounding f(x);

  • 37-
slide-92
SLIDE 92

Finding m beyond which there is no problem ?

Function f: sin, cos, arcsin, arccos, tan, arctan, exp, log, sinh, cosh. Hermite-Lindemann’s theorem (α = 0 algebraic ⇒ eα transcendental). If x is a FP number, there exists an mx, s.t. rounding the mx-bit approximation ⇔ rounding f(x); finite number of FP numbers → ∃mmax = maxx(mx) s.t. ∀x, rounding the mmax-bit approximation to f(x) is equivalent to rounding f(x); Questions: how large can m be? How to determine it?

  • 37-
slide-93
SLIDE 93

TMD: What can Diophantine Approximation do for us?

Algebraic functions. J.-M. Muller (CNRS, ENS Lyon) used an old and simple (but fundamental) result by J. Liouville to give non trivial estimates for

d

√ : roughly, m dp. The exp function. Khemira and Voutier (2011): for binary64 (p = 53) format, m 16, 490

  • 38-
slide-94
SLIDE 94

TMD: What can Diophantine Approximation do for us?

Algebraic functions. J.-M. Muller (CNRS, ENS Lyon) used an old and simple (but fundamental) result by J. Liouville to give non trivial estimates for

d

√ : roughly, m dp. The exp function. Khemira and Voutier (2011): for binary64 (p = 53) format, m 16, 490(∼ 311p), whereas the hand-waving argument suggests that precision 120 should be sufficient, binary128 (p = 113) format, m 45, 928

  • 38-
slide-95
SLIDE 95

TMD: What can Diophantine Approximation do for us?

Algebraic functions. J.-M. Muller (CNRS, ENS Lyon) used an old and simple (but fundamental) result by J. Liouville to give non trivial estimates for

d

√ : roughly, m dp. The exp function. Khemira and Voutier (2011): for binary64 (p = 53) format, m 16, 490(∼ 311p), whereas the hand-waving argument suggests that precision 120 should be sufficient, binary128 (p = 113) format, m 45, 928(∼ 406p), whereas the hand-waving argument suggests that precision 250 should be sufficient.

  • 38-
slide-96
SLIDE 96

TMD: What can Diophantine Approximation do for us?

Algebraic functions. J.-M. Muller (CNRS, ENS Lyon) used an old and simple (but fundamental) result by J. Liouville to give non trivial estimates for

d

√ : roughly, m dp. The exp function. Khemira and Voutier (2011): for binary64 (p = 53) format, m 16, 490(∼ 311p), whereas the hand-waving argument suggests that precision 120 should be sufficient, binary128 (p = 113) format, m 45, 928(∼ 406p), whereas the hand-waving argument suggests that precision 250 should be sufficient. Need for an algorithmic approach!

  • 38-
slide-97
SLIDE 97

TMD: Algorithmic Approaches

Reminder: p = 24, 53, 64, 113. Find all 2p−1 j, ℓ 2p − 1 s.t.

  • f
  • j

2p−1

  • − 2ℓ + 1

2p

  • <

1 22p . Two algorithmic approaches:

Lefèvre (extension of Euclid’s algorithm) in O(22p/3) : TMD solved for p = 53. Stehlé, Lefèvre & Zimmermann (Coppersmith method, based on LLL) in O(2p/2): TMD solved for p = 64.

  • 39-
slide-98
SLIDE 98

TMD: Algorithmic Approaches

Reminder: p = 24, 53, 64, 113. Find all 2p−1 j, ℓ 2p − 1 s.t.

  • f
  • j

2p−1

  • − 2ℓ + 1

2p

  • <

1 22p . Two algorithmic approaches:

Lefèvre (extension of Euclid’s algorithm) in O(22p/3) : TMD solved for p = 53. Stehlé, Lefèvre & Zimmermann (Coppersmith method, based on LLL) in O(2p/2): TMD solved for p = 64.

We need another approach to tackle the case of the quadruple precision (p = 113)!

  • 39-
slide-99
SLIDE 99

Relaxed TMD

What about solving

  • f
  • j

2p−1

  • − 2ℓ + 1

2p

  • < 1

2m , 2p−1 j, ℓ 2p − 1, when m = 6p, 8p, 10p instead of 2p?

  • 40-
slide-100
SLIDE 100

Relaxed TMD

What about solving

  • f
  • j

2p−1

  • − 2ℓ + 1

2p

  • < 1

2m , 2p−1 j, ℓ 2p − 1, when m = 6p, 8p, 10p instead of 2p? Lefèvre’s algorithm and Stehlé’s improvement of SLZ algorithm get unpractical for p = 113 (quadruple precision).

  • 40-
slide-101
SLIDE 101

Our Approach: Polynomial Interpolation and Lattice Basis Reduction

Remember that x and f(x) ∈ [1, 2). Given m ∈ N, we want to find all 2p−1 i, j 2p − 1 s.t.

  • f
  • i

2p−1

  • − 2j + 1

2p

  • < 1

2m ,

  • 41-
slide-102
SLIDE 102

Our Approach: Polynomial Interpolation and Lattice Basis Reduction

Remember that x and f(x) ∈ [1, 2). Given m ∈ N, we want to find all 2p−1 i, j 2p − 1 s.t.

  • f
  • i

2p−1

  • − 2j + 1

2p

  • < 1

2m , i.e.

  • 2p−1f
  • i

2p−1

  • − 1

2 − j

  • <

1 2m−p+1 .

  • 41-
slide-103
SLIDE 103

Our Approach: Polynomial Interpolation and Lattice Basis Reduction

Remember that x and f(x) ∈ [1, 2). Given m ∈ N, we want to find all 2p−1 i, j 2p − 1 s.t.

  • f
  • i

2p−1

  • − 2j + 1

2p

  • < 1

2m , i.e.

  • 2p−1f
  • i

2p−1

  • − 1

2

  • g(i)

− j

  • ∈Z
  • <

1 2m−p+1 . We have to find “integer points” close to a curve.

  • 41-
slide-104
SLIDE 104

Our Approach: Polynomial Interpolation and Lattice Basis Reduction

Remember that x and f(x) ∈ [1, 2). Given m ∈ N, we want to find all 2p−1 i, j 2p − 1 s.t.

  • f
  • i

2p−1

  • − 2j + 1

2p

  • < 1

2m ,

  • 42-
slide-105
SLIDE 105

Our Approach: Polynomial Interpolation and Lattice Basis Reduction

Remember that x and f(x) ∈ [1, 2). Given m ∈ N, we want to find all 2p−1 i, j 2p − 1 s.t.

  • f
  • i

2p−1

  • − 2j + 1

2p

  • < 1

2m , i.e. 2j + 1 ∈

  • 2p
  • f
  • i

2p−1

  • − 1

2m

  • , 2p
  • f
  • i

2p−1

  • + 1

2m

  • .
  • 42-
slide-106
SLIDE 106

Our Approach: Polynomial Interpolation and Lattice Basis Reduction

Remember that x and f(x) ∈ [1, 2). Given m ∈ N, we want to find all 2p−1 i, j 2p − 1 s.t.

  • f
  • i

2p−1

  • − 2j + 1

2p

  • < 1

2m , i.e. 2j + 1 ∈

  • 2p
  • f
  • i

2p−1

  • − 1

2m

  • , 2p
  • f
  • i

2p−1

  • + 1

2m

  • .

Here again, we build a trap! We find a partition of [1, 2) = ∪ℓIℓ such that, for all ℓ, we can compute Pℓ,1, Pℓ,2 ∈ Z[X, Y ] \ {0} with

  • Pℓ,k(2p−1u, 2pv)
  • < 1, k = 1, 2,

for all u ∈ Iℓ and for all v ∈

  • f (u) −

1 2m , f (u) + 1 2m

  • .
  • 42-
slide-107
SLIDE 107

Our Approach: Polynomial Interpolation and Lattice Basis Reduction

We build a trap! We find a partition of [1, 2) = ∪ℓIℓ such that, for all ℓ, we can compute Pℓ,1, Pℓ,2 ∈ Z[X, Y ] \ {0} with

  • Pℓ,k(2p−1u, 2pv)
  • < 1, k = 1, 2,

for all u ∈ Iℓ and for all v ∈

  • f (u) −

1 2m , f (u) + 1 2m

  • .
  • 43-
slide-108
SLIDE 108

Our Approach: Polynomial Interpolation and Lattice Basis Reduction

We build a trap! We find a partition of [1, 2) = ∪ℓIℓ such that, for all ℓ, we can compute Pℓ,1, Pℓ,2 ∈ Z[X, Y ] \ {0} with

  • Pℓ,k(2p−1u, 2pv)
  • < 1, k = 1, 2,

for all u ∈ Iℓ and for all v ∈

  • f (u) −

1 2m , f (u) + 1 2m

  • .

For all ℓ, if there exist 2p−1 i, j 2p − 1 s.t. i/2p−1 ∈ Iℓ and

  • f
  • i

2p−1

  • − 2j + 1

2p

  • < 1

2m , then we have, for k = 1, 2, Pℓ,k(i, 2j + 1) ∈ Z and |Pℓ,k(i, 2j + 1)| < 1!

  • 43-
slide-109
SLIDE 109

Our Approach: Polynomial Interpolation and Lattice Basis Reduction

We build a trap! We find a partition of [1, 2) = ∪ℓIℓ such that, for all ℓ, we can compute Pℓ,1, Pℓ,2 ∈ Z[X, Y ] \ {0} with

  • Pℓ,k(2p−1u, 2pv)
  • < 1, k = 1, 2,

for all u ∈ Iℓ and for all v ∈

  • f (u) −

1 2m , f (u) + 1 2m

  • .

For all ℓ, if there exist 2p−1 i, j 2p − 1 s.t. i/2p−1 ∈ Iℓ and

  • f
  • i

2p−1

  • u

− 2j + 1 2p

v

  • < 1

2m , then we have, for k = 1, 2, Pℓ,k(i, 2j + 1) ∈ Z and |Pℓ,k(i, 2j + 1)| < 1! We eliminate (heuristic assumption!) one of the two variables and we get i and j, if they exist (Coppersmith; Boneh & Durfee; Stehlé).

  • 43-
slide-110
SLIDE 110

How do we compute P1 and P2? The Lattice

Let d ∈ N \ {0} and N = (d + 1)(d + 2)/2. We introduce fkℓ(x, y) = (2p−1x)ℓ(2p(f(x) + y))k for 0 k d, 0 ℓ d − k. Let (xj1)0j1N−1 denote Chebyshev nodes for I = [a, b] and (yj2)0j2N−1 denote Chebyshev nodes for I = [−2−m, 2−m]. We want P1(2p−1xj1, 2p(f(xj1) + yj2) and P2(2p−1xj1, 2p(f(xj1) + yj2)) to be small.

  • 44-
slide-111
SLIDE 111

How do we compute P1 and P2? The Lattice

Let d ∈ N \ {0} and N = (d + 1)(d + 2)/2. We introduce fkℓ(x, y) = (2p−1x)ℓ(2p(f(x) + y))k for 0 k d, 0 ℓ d − k. Let (xj1)0j1N−1 denote Chebyshev nodes for I = [a, b] and (yj2)0j2N−1 denote Chebyshev nodes for I = [−2−m, 2−m]. We want P1(2p−1xj1, 2p(f(xj1) + yj2) and P2(2p−1xj1, 2p(f(xj1) + yj2)) to be small. It is related to the smallness of the volume of the lattice

  • 0kd

0ℓd−k

Z (fkℓ(xj1, yj2))0j1N−1

0j2d−1

i.e.

  • det (fkℓ(xj1, yj2)) 0kd

0ℓd−k,0j1N−1 0j2d−1

1/N .

  • 44-
slide-112
SLIDE 112

Table’s Maker Dilemma: Some Bad News

The complexity of the algorithm, for (uniformly) solving

  • f(x) − 2j + 1

2p

  • <

1 22p , x ∈ [1, 2), is essentially: O(22p/3) for Lefèvre’s algorithm; O(2p/2) for Stehlé’s improvement of Stehlé-Lefèvre-Zimmermann algorithm. Reminder: the target is p = 113.

  • 45-
slide-113
SLIDE 113

Table’s Maker Dilemma: Some Bad News

The complexity of the algorithm, for (uniformly) solving

  • f(x) − 2j + 1

2p

  • <

1 22p , x ∈ [1, 2), is essentially: O(22p/3) for Lefèvre’s algorithm; O(2p/2) for Stehlé’s improvement of Stehlé-Lefèvre-Zimmermann algorithm. Reminder: the target is p = 113. Alas, our work does not bring any new result for m ∼ 2p.

  • 45-
slide-114
SLIDE 114

Relaxed TMD: Some Good News

What about the complexity of the algorithms for (uniformly) solving

  • f(x) − 2j + 1

2p

  • < 1

2m , x ∈ [1, 2), when m = 6p, 8p, 10p, ≫ p?

  • 46-
slide-115
SLIDE 115

Relaxed TMD: Some Good News

What about the complexity of the algorithms for (uniformly) solving

  • f(x) − 2j + 1

2p

  • < 1

2m , x ∈ [1, 2), when m = 6p, 8p, 10p, ≫ p? Lefèvre’s algorithm gets unpractical the complexity for Stehlé’s improvement of Stehlé-Lefèvre-Zimmermann algorithm gets polynomial when m ≈ p2 but it is also unpractical.

  • 46-
slide-116
SLIDE 116

Relaxed TMD: Some Good News

What about the complexity of the algorithms for (uniformly) solving

  • f(x) − 2j + 1

2p

  • < 1

2m , x ∈ [1, 2), when m = 6p, 8p, 10p, ≫ p? Lefèvre’s algorithm gets unpractical the complexity for Stehlé’s improvement of Stehlé-Lefèvre-Zimmermann algorithm gets polynomial when m ≈ p2 but it is also unpractical. S. Torres’ implementation targets m = 6p, 8p, 10p and is also unpractical, as if today.

  • 46-
slide-117
SLIDE 117

Relaxed TMD: Some Good News

What about the complexity of the algorithms for (uniformly) solving

  • f(x) − 2j + 1

2p

  • < 1

2m , x ∈ [1, 2), when m = 6p, 8p, 10p, ≫ p? Lefèvre’s algorithm gets unpractical the complexity for Stehlé’s improvement of Stehlé-Lefèvre-Zimmermann algorithm gets polynomial when m ≈ p2 but it is also unpractical. S. Torres’ implementation targets m = 6p, 8p, 10p and is also unpractical, as if today. The determination of the theoretical complexity of our approach is work in progress. For exp and p = 113, our experiments suggest: we can address the case m = 6p

  • 46-
slide-118
SLIDE 118

Relaxed TMD: Some Good News

What about the complexity of the algorithms for (uniformly) solving

  • f(x) − 2j + 1

2p

  • < 1

2m , x ∈ [1, 2), when m = 6p, 8p, 10p, ≫ p? Lefèvre’s algorithm gets unpractical the complexity for Stehlé’s improvement of Stehlé-Lefèvre-Zimmermann algorithm gets polynomial when m ≈ p2 but it is also unpractical. S. Torres’ implementation targets m = 6p, 8p, 10p and is also unpractical, as if today. The determination of the theoretical complexity of our approach is work in progress. For exp and p = 113, our experiments suggest: we can address the case m = 6p(= 678 bits). We hope this work will pave the way for correctly rounded elementary functions in IEEE binary128/quadruple precision.

  • 46-