 
              On the Comparison of Numbers: An excursion into non-classical corners of computational mathematics Philippe Flajolet, INRIA, Rocquencourt http://algo.inria.fr/flajolet Based on joint work with Brigitte Vall´ ee, Caen 1
A very simple problem Let X, Y be uniformly distributed over [ 0, 1 ] . Cost of deciding X < > Y ? A very simple solution If X, Y are given in some base, e.g., B = 2 — scan digits; — stop when discrepant digits/bits are found. Expected cost (number of comparisons) is E [ C ] = 2 . Probability distribution is geometric : P ( C = k ) = 2 − k . T HE E ND !? 2
What if X and Y are given as fractions? X = a Y = c ? b, d E.g. Compare: 36 113 97.8 307.1 < < 355 ; ? > > 113 307.1 964.9 Still mathematically trivial sign ( ad − bc ) . . . • Requires double precision • Unsafe in fixed precision (floats) 3
Efficiency in symbolic manipulation systems and number-theory packages: O ( N 2 ) boolean complexity. Robustness: Problems with floats. Independent solutions: • HAKMEM (Gosper 1972) • Knuth’s Metafont (late 1970s); • computational geometry ≡ orientation [Boissonnat, et al.] 4
Introduction. HAKMEM 1972 Item 101A (Gosper): Numerical comparison of continued fractions is slightly harder than in decimal, but much easier than with rationals -- just invert the decision as to which is larger whenever the first discrepant terms are even-numbered. Contrast this with the problem of comparing the rationals 113/36 and 355/113. α = 113 1 β = 355 1 36 = 3 + , 113 = 3 + , 7 + 1 1 7 + 16 5 = two “lazy” parallel executions of the continued fraction algorithm. 5
Simulations ( 10 6 ): HAKMEM X < > Y for X, Y ∈ [ 0, 1 ] , uniform. Basic CF-sign Centered CF-sign P ( L = 1 ) 0.710050 0.918003 P ( L = 2 ) 0.241275 0.075710 P ( L = 3 ) 0.038339 0.000422 P ( L = 4 ) 0.008424 0.000035 P ( L = 5 ) 0.001608 0.000005 E ( L ) 1.351612 1.088791 Decay roughly geometrically with k : 5 − k for BCF; 13 − k for CCF . The expectations seem to be constants: Anticipate E ( L ) = 1.352 ± 0.001 and E ( � L ) = 1.089 ± 0.001 . 6
1 Expanding maps & continued fractions A piecewise monotonic function U ( x ) (shift) from some interval I that is expanding: | U ′ ( x ) | > 1 . Branches are indexed by M = digits. A coding of a real x : m ( x ) , m ( U ( x )) , m ( U 2 ( x )) , . . . 7
procedure ( I , U, m ) -expansion ( x : I ) for k := 1 to + ∞ do m k := m ( x ) ; x := U ( x ) ; Set H 1 of branches of U (− 1 ) . Run backwards: x 0 = h m ( x k ) where h m ( y ) = h m 1 ◦ h m 2 ◦ · · · ◦ h m k ( y ) . The set H of all compositions h m is set of inverse branches . Equivalence principle. — the stochastic behaviour of a numbering system — the dynamics of U ( x ) on the interval I — the dynamics of the semigroup of contractions H . 8
Basic continued fraction expansions. � 1 � � 1 � I = [ 0, 1 ] , U ( x ) = , m ( x ) = . x x 1 Inverse branches of depth 1: H 1 := { h ( z ) = m + z m ≥ 1 } . 1 where x 0 = h m ( x k ) h ( y ) = . 1 m 1 + 1 m 2 + ... m k + y Centered continued fraction expansions. � � � � � 1 1 � H 1 := h ( z ) = m + z m ≥ 2 h ( z ) = m − z m ≥ 3 . 9
Sign algorithms. Take two random points; follow their trajectories under the shift U . Stop on discrepancy. procedure ( I , U, m ) -sign ( x, x ′ : I ) for k := 1 to + ∞ do k := m ( x ) ; x := U ( x ) ; x ′ = U ( x ′ ) ; m k := m ( x ) ; m ′ if m k � = m ′ k then exit . 10
2 Fundamental intervals & signs Uniform probability model (Lebesgue) on I . The probability that x ∈ I belongs to the fundamental interval determined by the branch h is, with | I | = 1 , u h = | h ( 0 ) − h ( 1 ) | . 11
The square I × I . � Event [ L ≥ k + 1 ] = h ( I ) × h ( I ) . (1) | h | = k � � u 2 | h ( 0 ) − h ( 1 ) | 2 . P ([ L ≥ k + 1 ]) = h = (2) | h | = k | h | = k 12
The average number of iterations is obtained by summation: � � u 2 | h ( 0 ) − h ( 1 ) | 2 . E [ L ] = h = (3) h h More generally, we define the “moment sums” � � ρ � ℓ � := u ℓ | h ( 0 ) − h ( 1 ) | ℓ h = h h which are central in the n –sorting algorithm ( ❀ later). 13
Continued fraction sign-algorithms. Theorem 1. Expected costs of BCF/CCF-sign are lattice sums , E ( � E ( L ) = ρ � 2 � , ρ � 2 � , L ) = � where, the “moments” of index ℓ are 1 + 1 2 � 1 ρ � ℓ � = 2 ℓ + c ℓ d ℓ ζ ( 2ℓ ) ( d,c ) ∈C ( 1,2 ) √ 2 ℓ � 1 ρ � ℓ � � = c ℓ d ℓ , ( φ = ( 1 + 5 ) /2 ) , ζ ( 2ℓ ) ( d,c ) ∈C ( φ,φ 2 ) C ( β, γ ) := { ( m, n ) ∈ N 2 | mβ < n < mγ } . w/ lattice sum : 14
Proof (elementary!): — The set H of all possible LFT’s used by the BCF –algorithm is � az + b � cz + d | ( a, b, c, d ) ∈ N 4 , | ad − bc | = 1, c ≤ d, a ≤ c, b ≤ d H := . = unimodular h + relatively prime coeffs. + inequalities. — The set e H of all possible LFT’s used by the CCF –algorithm is � az + b � cz + d | ( b,d ) ∈ N 2 , ( a,c ) ∈ Z 2 , ac ≥ 0, | ad − bc | = 1, − d φ 2 < c < d φ, | a | ≤ | c | 2 , b ≤ d e H := . 2 15
3 The sorting algorithm Given X := { x 1 , x 2 , . . . , x n } with x j ∈ I . Build digital tree , trie ( X ) : ( R 1 ) If X = { x } then trie ( X ) is single leaf node x . ( R 2 ) If card ( X ) ≥ 2 , then trie ( X ) is built recursively trie ( X ) = � o, trie ( X 1 ) , trie ( X 2 ) , . . . , trie ( X r ) � , where X i = { U ( x ) | m ( x ) = b i , x ∈ X } . Follow n random trajectories and stop each when it “deviates”. 16
The binary trie is another saga . . . 17
= / 1, 1, 1 , 1, 1, 1, ... / φ − 1 = / 1, 1, 2 , 1, 2, 1, ... / γ = / 1, 2, 1 , 1, 4, 1, ... / exp ( 1 ) − 2 = / 1, 2, 3 , 1, 6, 3, ... / log 2 √ = / 1, 1333462407511 , 1, 8, 1, 1, ... / { exp ( π 163 ) } 2 1/3 − 1 = / 3 , 1, 5, 1, 1, , 4, ... / = / 7 , 15, 1, 292, 1, 1, ... / π − 3 Typical shape?? 18
Theorem 2. The expectation of the number of digit inspections � n − 1 � n � (− 1 ) ℓ ρ � ℓ � , P ( n ) = n ℓ − 1 ℓ = 2 where ρ � ℓ � is a moment sum. [Similarly for CCF.] = finite differences of moment sums! P ( 2 ) = 2 ρ ( 2 ) , P ( 3 ) = 6 ρ ( 2 ) − 3 ρ ( 3 ) , P ( 4 ) = 12 ρ ( 2 ) − 12 ρ ( 3 ) + 4 ρ ( 4 ) . Proof: ( i ) Poissonize and get independence; ( ii ) de-Poissonize “algebraically”. Cf. Cl´ ement-Fl-Vall´ ee, Algorithmica 2001. 19
4 Multiple zeta values [Euler–Zagier-Borwein 2 ] With H n = 1 + 1 2 + · · · + 1 n : ( H n − 1 ) 4 ∞ � = 185 8 ζ ( 7 ) − 43 2 ζ ( 3 ) ζ ( 4 ) + 5ζ ( 2 ) ζ ( 5 ) n 3 n = 1 ∞ ∞ � 1 � (− 1 ) = ( 2 1 − s − 1 ) ζ ( s ) , ζ + ( s ) ≡ ζ ( s ) = ζ − ( s ) = n s , n s n = 1 n = 1 n − 1 (− 1 ) n H ( t ) ∞ ∞ (− 1 ) n � � � ζ −+ ( s, t )) n − 1 = n s q t = . n s n = 1 q = 1 n = 1 A complete evaluation means a reduction to a polynomial form in zeta values ζ ( 2 ) , ζ ( 3 ) , . . . , ζ ( 1 ) = − log 2 , and possibly other constants. ———- [Zagier 1994], [Borwein ⋆ ], [Fl-Salvy, Exp. Math. 1998] . . . 20
1 m + z 2 2 m + z 3 ∞ z n z � Li m ( z ) = 3 m + · · · = n m . (4) n = 1 Theorem 3. The mean number ρ � 2 � of comparisons in BCF –Sign can be expressed in terms of double zeta values as d d ∞ ∞ (− 1 ) d (− 1 ) d ρ � 2 � = 17 4 + 360 � � c 2 = 15 1 2 − 720 � � 1 c, π 4 d 2 π 4 d 3 d = 1 c = 1 d = 1 c = 1 or with ζ ( 3 ) and the tetralogarithm Li 4 ( 1 2 ) , � � ρ � 2 � = − 60 24 Li 4 ( 1 2 ) − π 2 ( log 2 ) 2 + 21ζ ( 3 ) log 2 + ( log 2 ) 4 + 17. π 4 (5) Thus, ρ � 2 � is in class P = polynomial time computable: ρ � 2 � = 1.35113 15744 91659 00179 38680 05256 46466 84404 78970 85087 ± 10 − 50 . 21
Moment sums for BCF ♠ Original sum are very slowly convergent . ♥ Following Zeilberger, define f ( z ) to be holonomic if its satisfies a linear differential equation with coefficients in Q ( z ) . A constant that is the value f ( z 0 ) of a holonomic function at an algebraic point z 0 is called a holonomic constant . Such holonomic constants are polynomial-time computable ( P ). ♥ Usual series-acceleration (Cohen-Villegas-Zagier; Pari, Maple) works just fine in practice! BCF–Sorting: Get P ( n ) to 16D for n = 0, . . . , 200 via ρ ( ℓ ) to 200D. 22
What about Centred Continued Fractions (CCF)? √ ρ � 2 � = 2 2 � 1 � ( ϕ = ( 1 + 5 ) /2 ) c 2 d 2 , ζ ( 4 ) cϕ<d<cϕ 2 Quiz: � 1 — Can you compute to 100D : ⌊ nφ ⌋ 2 ? n ≥ 1 (= 1.2910603681143874895047876... ) — Do you believe this? [Borwein 2 AMM 1992] √ ⌊ ne π 163 ⌋ ∞ � . with error < 10 500,000,000 . = 1280640 2 n n = 1 23
Theorem 4. The expectation of the cost of the CCF-sign algo- rithm is in P : ρ ( 2 ) = 1.08922 14740 95380 . . . . � 24
Recommend
More recommend