Draft 1 On the Lattice Structure of MIXMAX Random Number - - PowerPoint PPT Presentation

draft
SMART_READER_LITE
LIVE PREVIEW

Draft 1 On the Lattice Structure of MIXMAX Random Number - - PowerPoint PPT Presentation

Draft 1 On the Lattice Structure of MIXMAX Random Number Generators Pierre LEcuyer Joint work with Paul Wambergue, Erwan Bourceret, and Marc-Antoine Savard MCQMC, Rennes, July 2018 Draft 2 MIXMAX Generators [Akopov, Savvidy, et al.


slide-1
SLIDE 1

Draft

1

On the Lattice Structure of MIXMAX Random Number Generators

Pierre L’Ecuyer

Joint work with Paul Wambergue, Erwan Bourceret, and Marc-Antoine Savard

MCQMC, Rennes, July 2018

slide-2
SLIDE 2

Draft

2

MIXMAX Generators

[Akopov, Savvidy, et al. (1991)] Output vector ui ∈ Rk follow the recurrence ui = Aui−1 mod 1 where u0 ∈ [0, 1)k and A is a k × k matrix such that: (1) det(A) = 1, so the linear transformation preserves volume; (2) the eigenvalues λ1, . . . , λk of A are away from the unit circle. Then, for appropriate A, distance between trajectories that start from very close will diverge exponentially fast, as Θ(ehi) in i steps, where h is the entropy: h =

k
  • j=1

I[|λj| > 1] · log |λj|. For almost all (irrational) u0, the trajectory is aperiodic and space-filling in [0, 1)k.

slide-3
SLIDE 3

Draft

3

Original MIXMAX proposal:

A = A(k, d) =          1 1 1 1 · · · 1 1 1 2 1 1 · · · 1 1 1 3 + d 2 1 · · · 1 1 1 4 3 2 · · · 1 1 . . . ... 1 k k − 1 k − 2 · · · 3 2          . Savvidy (2015) provides lower bounds on h that depend only on k for A = A(k, d), and shows that h is large, e.g., much larger than for the AWC and SWB generators of Marsaglia and Zaman (1991). However, all of this is for irrational state vectors ui, which cannot be implemented exactly.
slide-4
SLIDE 4

Draft

4 The authors proposed to approximate the irrational recurrence by a rational one as follows. Take a large prime m and define xi = Axi−1 mod m in which xi = (xi,0, . . . , xi,k−1)t ∈ Zk m and ui = (ui,0, . . . , ui,k−1) = xi/m ∈ [0, 1)k. This is a matrix LCG (Niederreiter 1986); gives a periodic recurrence! If det(A) = 1, the maximal period is (mk − 1)/(m − 1), i.e., m − 1 times shorter than general matrix LCGs. Savvidy provides parameters (k, d) giving period ρ = (mk − 1)/(m − 1) for m = 261 − 1 with k from 8 to 3150. Even for k = 8, this already gives ρ ≈ 2427. Long enough period! He also provides an efficient implementation that uses only 2k additions and one multiplication by d to compute the next vector xi at each step. Fast! MIXMAX became popular recently. Part of the ROOT library used at CERN, Geneva. In July 2016, I was at CERN, in a workshop devoted to the MIXMAX. Fred James was telling us: “We have a proof that this RNG passes all statistical tests; this is fantastic.” I was not fully convinced.
slide-5
SLIDE 5

Draft

5

Other variants, with more parameters and flexibility

MIXMAX-(m, k, d, c) and MIXMAX-(m, k, d, c, b):

A = A(k, d, c) =           1 1 1 1 · · · 1 1 1 2 1 1 · · · 1 1 1 c + 2 + d 2 1 · · · 1 1 1 2c + 2 c + 2 2 · · · 1 1 1 3c + 2 2c + 2 c + 2 · · · 1 1 · · · 1 (k − 2)c + 2 (k − 3)c + 2 (k − 4)c + 2 · · · c + 2 2           A = A(k, d, c, b) =           1 1 1 1 · · · 1 1 1 2 1 1 · · · 1 1 1 3c + d + b 2 1 · · · 1 1 1 4c + b 3c + b 2 · · · 1 1 1 5c + b 4c + b 3c + b · · · 1 1 · · · 1 kc + b (k − 1)c + b (k − 2)c + b · · · 3c + b 2          
slide-6
SLIDE 6

Draft

6

Lattice structure of matrix LCGs

Define successive output values of matrix LCGs as follows: u0 = (u0, u1, . . . , uk−1) u1 = (uk, uk+1, . . . , u2k−1) u2 = (u2k, u2k+1, . . . , u3k−1), etc. For any fixed s > 0, let Ψs = {(u0, u1, . . . , us−1) ∈ [0, 1)s | x0 ∈ Zk

m},

the set of vectors of s successive values obtained from all possible initial states. We want this set to cover [0, 1)s very evenly. More general: for any finite set I = {i1, . . . , is} where 0 ≤ i1 < · · · < is, let Ψs(I) = {(ui1, . . . , uis) ∈ [0, 1)s | x0 ∈ Zk

m}.

Want this set to cover [0, 1)s very evenly for all I in some large family, e.g., is ≤ i∗ and s ≤ s∗.

slide-7
SLIDE 7

Draft

7

Lattice structure

It is known that for each I, Ψs(I) = Ls(I) ∩ [0, 1)s where Ls(I) is a lattice in Rs. It means that all the points are in equidistant hyperplanes at distance ds(I) = 1/ℓs(I) apart, where ℓs(I) = length of shortest vector in dual lattice. Let d∗

s (n) be the best (smallest) distance for a known lattice with density n = min(mk, ms)

in s dimensions. Normalized measure: Ss(I) = d∗

s (n)/ds(I) ∈ (0, 1]. Want it close to 1.

Example of worst-case figure of merit: For s′ > 0, let Ms′ = min

I⊆{1,...,s′} Ss(I).

If s′ is very large, there are so many subsets that some of them must be bad. But for s′ not too large, most MRGs and matrix LCGs have a good Ms′.

slide-8
SLIDE 8

Draft

8

Example: MRG of order k

Multiple recursive generators (MRG): xi = (a1xi−1 + · · · + akxi−k) mod m, ui = xi/m. Equivalent to matrix LCG with A =      1 · · · . . . ... . . . · · · 1 ak ak−1 · · · a1     

k

For k = 7 and m = 263 − 52425 (a prime), we searched random parameters a = (a1, . . . , ak) that gave full period ρ = mk − 1 ≈ 2441, and computed M10 for each. Within our limited time budget, we obtained 7041 such vectors and the values of M10 ranged from 0.53858 (the best) to 0.14970 (the worst). Thus, values of M10 smaller than say 0.1 are very rare.

slide-9
SLIDE 9

Draft

9

How to compute ℓs(I)?

V =        I/m At/m · · · (Aν−1)t/m ([Aν]r)t/m I ... I I        and W =        m · I −A I . . . ... −Aν−1 I −[Aν]r I        The rows of V form a basis of Ls and the rows of W form a basis of the dual lattice L∗ s . For a basis of Ls(I), select the appropriate colums of V, and invert this basis modulo 1 to get a basis for L∗ s (I). Then compute the length ℓs(I) of a shortest nonzero vector in L∗ s (I), using BB.
slide-10
SLIDE 10

Draft

10

Lattice structure of MIXMAX (m, k, d)

W =                           m · I · · · −1 −1 −1 · · · −1 −1 −2 −1 · · · −1 −1 −3 − d −2 · · · −1 −1 −4 −3 · · · −1 I · · · · · · −1 −k −(k − 1) · · · −2 −A2 I . . . . . . ... −[Aν]r I                           . The other MIXMAX are similar, with a slightly different −A.
slide-11
SLIDE 11

Draft

11

Any integer linear combination of rows of W belongs to the dual lattice L∗

s.

In particular, if k ≥ 2, for all the MIXMAX variants, we have wk+1 = (−1, −1, −1, . . . , −1,1, 0, . . . , 0) wk+2 = (−1, −2, −1, . . . , −1,0, 1, . . . , 0)

↑ coordinate k + 1

so w = wk+1 − wk+2 = (0, 1, 0, . . . , 0,1, −1, 0, . . . , 0) ∈ L∗

s for s ≥ k + 2. ↑ coordinate k + 1

This vector has Euclidean length √ 3. Its presence implies that the successive output values satisfy (u1 + uk − uk+1) mod 1 = 0. That is, u1 + uk − uk+1 = 0 or 1, because 0 ≤ ui < 1 for all i. So when I := {1, k, k + 1} ⊆ I ′, Ψ3(I ′) is covered by only two planes, 1/ √ 3 apart.

slide-12
SLIDE 12

Draft

12

An example with k = 8

MIXMAX-(m, k, d, c) from Savvidy (2017), with m = 261 − 1, k = 8, d = 0, and c = 253 + 1. Then, whenever {1, 8, 9} = I ⊆ I ′, the points of Ψ3(I ′) are in only two parallel planes. This gives Ss(I) = 6.69 × 10−19 for this I, and then Ms′ ≤ 6.69 × 10−19 for all s′ ≥ k + 2. We applied the following empirical statistical tests to this generator. Collision test: Generate n points in s dimensions, using all coordinates of output vectors. Partition the cube [0, 1)s in ds cubic cells. Count the number C of collisions (a point falling in an occupied cell). C should be approx. Poisson with mean λ = n2/(2ds). Repeat N times and count the total number of collisions, then the p-value. With s = 16, d = 8, n = 4 × 107, N = 10, we get p ≈ 6 × 10−206. We expect about 28 collisions and we get 314. If we take only the first three values of each block and s = 6, with d = 128, same n and N, we get p < 10−300. We expect about 1,818 collisions and we get 116,218. Birthday spacings test: With s = 16, d = 16, n = 3 × 107, N = 10, we get p < 10−300.

slide-13
SLIDE 13

Draft

13

Skipping coordinates

To get rid of the bad structure, one can skip some coordinates of each vector ui. L¨ uscher (1994) already proposed this for the AWC/SWB generators. For example, if we skip the second coordinate of each vector ui, we no longer have the relationship u1 + uk − uk+1 = 0 or 1. But other relationships can be found between the remaining coordinates! For example, for the MIXMAX-(m, k, d), if k ≥ 6, then w = wk+4 − 2wk+5 + wk+6 = (0, . . . , 0, −1, 0, . . . , 0, 1, −2, 1, 0, . . . )

↑ coord. 6

from which we find −u5 + uk+3 − 2uk+4 + uk+5 = q for q ∈ {−2, −1, 0, 1}. It follows that whenever {5, k + 3, k + 4, k + 5} ⊆ I, Ψs(I) is contained in at most 4 equidistant parallel hyperplanes at distance 1/ √ 7 apart. Even if we skip the first three coordinates of each block ui, we still have this structure.

slide-14
SLIDE 14

Draft

14

MIXMAX-(m, k, d, c)

A = A(k, d, c) =           1 1 1 1 · · · 1 1 1 2 1 1 · · · 1 1 1 c + 2 + d 2 1 · · · 1 1 1 2c + 2 c + 2 2 · · · 1 1 1 3c + 2 2c + 2 c + 2 · · · 1 1 · · · 1 (k − 2)c + 2 (k − 3)c + 2 (k − 4)c + 2 · · · c + 2 2          

slide-15
SLIDE 15

Draft

15

Skipping coordinates for MIXMAX-(m, k, d, c) with c ≥ 1. Suppose m = qc + r where q > 0 and |r| are small integers. Then w = q(wk+4 − 2wk+5 + wk+6) mod m = q(0, . . . , 0, 1 − c, −1, 0, . . . , 0, 1, −2, 1, 0, . . . ) mod m = (0, . . . , 0, q + r, −q, 0, . . . , 0, q, −2q, q, 0, . . . ), in which q + r is at position 5 and −2q is at position k + 5. We thus have ((q + r)u4 − qu5 + quk+3 − 2quk+4 + quk+5) mod 1 = 0. So if {4, 5, k + 3, k + 4, k + 5} ⊆ I, Ψs(I) is covered by at most 5q + |q + r| − 1 equidistant parallel hyperplanes, at distance 1/ℓ apart, where ℓ2 = 7q2 + (q + r)2.

slide-16
SLIDE 16

Draft

16

Example with k = 8 again

MIXMAX-(m, k, d, c) from Savvidy (2017), with m = 261 − 1, k = 8, d = 0, and c = 253 + 1. Here, m = 256c − 257, so q = 256 and r = −257. Therefore, if {4, 5, 11, 12, 13} ⊆ I, then Ψs(I) is covered by at most 5q + |q + r| − 1 = 1280 equidistant parallel hyperplanes at distance 1/ℓ = 1/

  • 7q2 + (q + r)2 ≈ 1/677.3, for s = 5.

This ℓ is in fact the exact ℓs(I) and it gives Ss(I) ≈ 2.3859 × 10−16. So even by skipping the first three coordinates of each output vector, we still have a very bad structure. By searching at random for bad MRGs with a similar m and k, it is extremely rare to find such a bad one! Birthday spacings test: Skip first 3 values of each block and keep the next 5. Apply the test with t = 10, d = 64, n = 107, N = 10. We get p < 10−300. We have E[C] = 2168 and get C = 4220.

slide-17
SLIDE 17

Draft

17

Conclusion

◮ The MIXMAX is a fast generator with a very large period. It may look very good and safe at first sight. It passed the Crush batteries of TestU01. But we showed here that it has significant weaknesses that show up in statistical tests. ◮ We have seen this type of story several times before, with other very fast generators with large periods. ◮ Bottom line: It is important to analyze and understand (theoretically) the structure of the points produced by RNGs.

slide-18
SLIDE 18

Draft

17

Some references Afflerbach, L., and H. Grothe. 1988. “The Lattice Structure of Pseudo-Random Vectors Generated by Matrix Generators”. Journal of Computational and Applied Mathematics 23:127–131. Couture, R., and P. L’Ecuyer. 1994. “On the Lattice Structure of Certain Linear Congruential Sequences Related to AWC/SWB Generators”. Mathematics of Computation 62 (206): 798–808. L’Ecuyer, P. 1997. “Bad Lattice Structures for Vectors of Non-Successive Values Produced by Some Linear Recurrences”. INFORMS Journal on Computing 9 (1): 57–60. L’Ecuyer, P. 1999. “Good Parameters and Implementations for Combined Multiple Recursive Random Number Generators”. Operations Research 47 (1): 159–164. L’Ecuyer, P. 2017. “History of Uniform Random Number Generation”. In Proceedings of the 2017 Winter Simulation Conference, 202–230: IEEE Press. L’Ecuyer, P., and R. Couture. 1997. “An Implementation of the Lattice and Spectral Tests for Multiple Recursive Linear Random Number Generators”. INFORMS Journal on Computing 9 (2): 206–217.

slide-19
SLIDE 19

Draft

17

L’Ecuyer, P., and R. Simard. 2007, August. “TestU01: A C Library for Empirical Testing of Random Number Generators”. ACM Transactions on Mathematical Software 33 (4): Article 22. L’Ecuyer, P., and R. Simard. 2014. “On the Lattice Structure of a Special Class of Multiple Recursive Random Number Generators”. INFORMS Journal on Computing 26 (2): 449–460. L¨ uscher, M. 1994. “A Portable High-Quality Random Number Generator for Lattice Field Theory Simulations”. Computer Physics Communications 79:100–110. Marsaglia, G., and A. Zaman. 1991. “A New Class of Random Number Generators”. The Annals of Applied Probability 1:462–480. Savvidy, K. G. 2015. “The MIXMAX Random Number Generator”. Computer Physics Communications 196:161–165. Savvidy, K. G. 2017. MIXMAX manual. see https://www.hepforge.org/archive/mixmax/MANUAL.pdf. Tezuka, S., P. L’Ecuyer, and R. Couture. 1993. “On the Add-with-Carry and Subtract-with-Borrow Random Number Generators”. ACM Transactions of Modeling and Computer Simulation 3 (4): 315–331.