PRNG Requirements and the New MIXMAX PRNG




  1. PRNG REQUIREMENTS AND THE NEW MIXMAX PRNG, J. Apostolakis

  2. Overview
     • What do we need from a PRNG?
     • What makes a good PRNG?
     • What Dynamical Systems tell us
     • The MIXMAX generator
     • Properties, speed
     • Availability

  3. Particle Transport
     • Each interaction can create new, secondary particles (cascade = tree of particles)
     • Many ‘decisions’ depend on a “random” number
       • Deciding which interaction occurs (e.g. absorption or scattering?)
       • Generating a secondary particle depends on sampling a value from a probability distribution function
     • A good pseudorandom number generator must provide numbers whose behaviour cannot be distinguished from that of numbers sampled from a truly random source
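
The two kinds of decision listed above can be sketched in a few lines. This is a minimal illustration, not Geant4 code: the function names, the cross-section values and the use of Python's built-in generator are all assumptions made for the example.

```python
import math
import random

def choose_interaction(cross_sections, u):
    """Pick a process with probability proportional to its cross-section.

    `u` is a uniform random number in [0, 1).
    """
    total = sum(cross_sections.values())
    threshold = u * total
    running = 0.0
    for name, sigma in cross_sections.items():
        running += sigma
        if threshold < running:
            return name
    return name  # guard against rounding when u is very close to 1

def sample_free_path(mean_free_path, u):
    """Inverse-transform sample from an exponential distribution."""
    return -mean_free_path * math.log(1.0 - u)

rng = random.Random(12345)                        # stand-in generator
sigmas = {"absorption": 1.0, "scattering": 3.0}   # illustrative values only
process = choose_interaction(sigmas, rng.random())
step = sample_free_path(2.0, rng.random())
```

Every step of every track consumes numbers in this way, which is why both the statistical quality and the speed of the generator matter.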

  4. PRNG requirements
     • “Separate” sub-sequences using seeds
     • Excellent* statistical properties - same as truly random numbers
       • ‘no’/‘perfect’ correlation of numbers within a stream
       • ‘no’ correlation between streams
     • Fast implementation
     • Efficient seeding from a large integer (128-bit+ ?)
     • Same sequence on all hardware & OSes for the same seed
     • Size of the state is not a critical factor for Geant4 MT - remains < 3KB
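
The seeding requirements above amount to a bookkeeping pattern: one stream per event, derived from a large integer, giving the same numbers wherever the event is processed. A minimal sketch follows, using Python's built-in generator as a stand-in. The way the two integers are combined is a hypothetical scheme chosen for the example, and the sketch says nothing about whether the resulting streams are actually uncorrelated; that is the property the generator itself has to supply.

```python
import random

def event_rng(master_seed, event_id):
    """Return a generator dedicated to one event.

    Hypothetical scheme: pack both integers into one large seed.
    Assumes 0 <= event_id < 2**64.
    """
    seed = (master_seed << 64) | event_id  # 128-bit-style integer seed
    return random.Random(seed)

def simulate_event(master_seed, event_id, n=5):
    """Draw the first n numbers of the event's stream."""
    rng = event_rng(master_seed, event_id)
    return [rng.random() for _ in range(n)]

# The same event gives the same numbers regardless of processing order.
first_pass = [simulate_event(2015, e) for e in (1, 2, 3)]
second_pass = [simulate_event(2015, e) for e in (3, 2, 1)]
```

Because each event owns its stream, events can be handed to different threads or jobs without changing the results.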

  5. Needs of Particle Transport
     • Good statistical properties for the values
     • Streams of reliable, portable random numbers are critical
     • Large period - 30 * 10^6 steps/event * 10^10 events/year
     • Low correlation for the full sub-sequence of a stream
     • No correlations between streams (used for different events)
     • Computing performance - random number generation is 3-10% of CPU time
       • but RANLUX @ Luxury Level=5 can be > 10%
       • it matters - so we should seek to make it < 2-3%, if possible
     • Reproducibility/portability between operating systems & CPU architectures

  6. Parallelism
     • Clusters: using from O(100) to O(1000) cores on a site
     • Grids - taking part of the time of many more cores, O(100,000), for one application or one experiment
     • Inside one system - SIMD instructions in the CPU, multiple cores (desktop or accelerator)
     • Parallelism can be used at many levels of granularity
       • Job - different CPUs using batch processing or grid
       • Event parallelism - the choice for multi-threading
       • Track parallelism - for primary tracks (or finer ?)

  7. Status in HEP & challenge
     • Provably good PRNG: RANLUX @ luxury = 5
       • slow (can take >10% CPU)
       • unclear if the overhead is needed
     • Empirical PRNGs - fast, but without mathematical guarantees
       • Mersenne Twister - state of 624 words, enormous period
     • Can we resolve the dilemma - obtain a provably good PRNG which is fast?

  8. A theory for PRNGs
     • Martin Lüscher’s RANLUX, brought to us by Fred James, remedies the most troubling issue: it provides a mathematical underpinning (Savvidy 1990)
     • The theory of dynamical systems provides a guarantee of decorrelation between nearby initial states
       • “The Lyapunov exponents of the latter set the time scales for statistical correlations in the generated random sequences. In particular, subsequences of numbers separated by several correlation times are highly decorrelated and thus provide a much better source of (pseudo) random numbers.” (Ref. 1a)
     • After 16 iterations of its core RCARRY algorithm, the initial correlation is completely lost
     • Based on Kolmogorov-Anosov mixing - paper of G. Savvidy et al. (1990)

  9. Turning theory into a PRNG
     • Many PRNGs use a recursion relation to obtain a new set of values x’ from a previous one x
     • MIXMAX uses a specific class of matrices A which are realizations of chaotic dynamical matrix-recursive systems
       • Proposed in 1990 (Akopov et al.)
       • Now replacing an entry of A with (3+s), and optimising ‘s’ to obtain good period & eigenvalues
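
For orientation, a matrix recursion of this kind is commonly written as below. The notation is an assumption made for this note (the formula on the original slide was not recovered): the state is a vector of N numbers in the unit interval, and A is an integer matrix with unit determinant, so that the map sends the unit hypercube onto itself.

```latex
x'_i \;=\; \sum_{j=1}^{N} A_{ij}\, x_j \pmod{1},
\qquad i = 1, \dots, N,
\qquad x \in [0,1)^N,\quad A_{ij} \in \mathbb{Z},\quad \det A = 1 .
```

The quality of the resulting sequence then depends on the eigenvalue spectrum of A, which is what the choice of N and s is tuned for.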

  10. Entropy and de-correlation (Kostas Savvidy, 3 July 2015)
      ❖ One must study the spectrum of eigenvalues; none of the eigenvalues should be on the unit circle
      ❖ Entropy is equal to $h = \sum_{|\lambda_i| > 1} \ln |\lambda_i|$
      ❖ Decay of correlations is governed by entropy: $\tau_0 \le 1/h$
      [Figure: eigenvalue moduli $|\lambda_i|$, $i = 1 \dots N$; axis ticks 64, 16, 4, 1/4]
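
A small worked example may help fix the definitions. It takes the entropy as the sum of $\ln|\lambda_i|$ over the eigenvalues with $|\lambda_i| > 1$, and uses a 2×2 toy matrix (the Arnold cat map) that is not one of the MIXMAX matrices; it is chosen only because its eigenvalues have a closed form.

```latex
A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix},
\qquad
\lambda_\pm = \frac{3 \pm \sqrt{5}}{2} \approx 2.618,\; 0.382
\quad (\text{neither lies on the unit circle}),
\\[4pt]
h = \ln \lambda_+ \approx 0.962,
\qquad
\tau_0 \le \frac{1}{h} \approx 1.04 .
```

For this toy map, correlations are bounded to decay within roughly one iteration; larger eigenvalues mean larger entropy and a shorter decorrelation time.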

  11. Other generators (Kostas Savvidy, 3 July 2015)
      ❖ RCARRY - the eigenvalue closest to the unit circle has |λ| ≈ 1.0085, and the farthest has |λ| ≈ 1.043
      ❖ RCARRY, at the core of RANLUX, has a simple multiplier matrix and so gives small eigenvalues - so the decorrelation is slow. RANLUX therefore needs to iterate many times to get good decorrelation (luxury level=5)
      ❖ The Mersenne Twister is worse - also due to its sparse matrix and polynomial. Mersenne Twister eigenvalues have |λ| < 1.002
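
To get a feel for these numbers, one can convert each quoted |λ| into a rough time scale 1/ln|λ|, the number of iterations a single mode needs to grow by a factor e. This single-eigenvalue estimate is a simplification introduced here for illustration, not the full entropy calculation; the function name is hypothetical.

```python
import math

def time_scale(abs_lambda):
    """Iterations for a mode growing as |lambda|**n to grow by a factor e."""
    return 1.0 / math.log(abs_lambda)

quoted = {
    "RCARRY, closest to the unit circle": 1.0085,
    "RCARRY, farthest from the unit circle": 1.043,
    "Mersenne Twister (upper bound on |lambda|)": 1.002,
}
for label, lam in quoted.items():
    print(f"{label}: about {time_scale(lam):.0f} iterations")
```

On this crude measure the slowest RCARRY mode needs over a hundred iterations and the Mersenne Twister at least several hundred, consistent with the statement that their decorrelation is slow.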

  12. Speed, Period of MIXMAX
      • Full matrix-vector multiplication with a dense matrix is too slow
      • Need a fast algorithm for the next state - done!
      • Must calculate the period for trial values of N
      • Advances from Kostas Savvidy:
        • created fast code for the special structure of matrix ‘A’
        • calculated the period for some N & “magic” number ‘s’
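
How a special matrix structure can turn an O(N²) update into an O(N) one is easiest to see on a concrete case. The sketch below assumes a particular matrix form (spelled out in `build_matrix`) and a modulus of 2^61 - 1; both are assumptions made for this illustration and are not taken from the slides, so the actual generator should be checked against Ref. 3. The point is only that, when consecutive rows differ by a running sum, the next state follows from partial sums.

```python
import random

P = 2**61 - 1  # modulus; assumed here for illustration

def build_matrix(n, s):
    """Dense matrix of the assumed form (1-based indices):
    first row and first column are 1, entry (i, j) is i - j + 2 for
    2 <= j <= i, all other entries are 1, and entry (3, 2) is 3 + s."""
    a = [[1] * n for _ in range(n)]
    for i in range(2, n + 1):
        for j in range(2, i + 1):
            a[i - 1][j - 1] = i - j + 2
    if n >= 3:
        a[2][1] = 3 + s
    return a

def next_state_naive(a, x):
    """O(N^2) reference: full matrix-vector product modulo P."""
    return [sum(aij * xj for aij, xj in zip(row, x)) % P for row in a]

def next_state_fast(x, s):
    """O(N) update: row i minus row i-1 is a partial sum of x."""
    n = len(x)
    y = [0] * n
    y[0] = sum(x) % P              # first row is all ones
    partial = 0                    # x[1] + ... + x[i] (0-based)
    for i in range(1, n):
        partial = (partial + x[i]) % P
        y[i] = (y[i - 1] + partial) % P
    if n >= 3:
        y[2] = (y[2] + s * x[1]) % P   # extra term from the (3 + s) entry
    return y

rng = random.Random(1)
state = [rng.randrange(P) for _ in range(8)]
same = next_state_fast(state, -1) == next_state_naive(build_matrix(8, -1), state)
```

The correction for `s` is applied after the loop so that the modified third row does not leak into the rows computed from it.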

  13. Particular realisations of MIXMAX

      Size   Magic   Entropy          Period                 q is
      N      s       (lower bound)    τ/q     ≈ log10(q)     fully factored   BigCrush
      10     −1      6.2              1/4     165            Yes              33
      16     6       9.9              1/32    275            Yes              > 13
      40     1       24.6             1/4     716            Yes              3
      44     0       27.1             1/4     789            No               4
      60     4       37.0             1       1083           Yes              2
      64     6       39.4             1/8     1156           No               1 (?)
      88     1       54.2             1/2     1597           No               Pass
      256    −1      157.7            1       4682           No               Pass
      508    5       313.0            1       9309           No               Pass
      720    1       443.6            1       13202          No               Pass
      1000   0       616.1            1/20    18344          No               Pass
      1260   15      776.3            1/2     23118          No               Pass
      3150   −11     1940.8           1/12    57824          No               Pass

      New: N = 256, s = 487013230256099064, with good period and large eigenvalue (3000) for fast decorrelation, is the current choice
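
The log10(q) column can be cross-checked with exact integer arithmetic. The sketch below assumes a modulus p = 2^61 - 1 and q = (p^N - 1)/(p - 1); neither definition appears on the slide, so both should be read as assumptions made for this check (see Ref. 3 for the actual definitions).

```python
import math

P = 2**61 - 1  # assumed modulus (not stated on the slide)

def log10_q(n):
    """floor(log10(q)) for q = (P**n - 1) / (P - 1), an assumed definition."""
    q = (P**n - 1) // (P - 1)      # exact: P - 1 divides P**n - 1
    return math.floor(math.log10(q))

# A few rows of the table: size N -> quoted log10(q)
quoted = {10: 165, 16: 275, 64: 1156, 256: 4682, 3150: 57824}
for n, value in quoted.items():
    print(f"N={n}: quoted {value}, computed {log10_q(n)}")
```

Under these assumptions the computed values should agree with the quoted column, which suggests the column is the number of decimal digits of q, minus one.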

  14. Towards production use
      • MIXMAX can provide the mathematical guarantees we obtained from RANLUX
      • Passes the empirical BigCrush tests
      • A skipping algorithm is available, using pre-calculated data
      • It remains only to define what is the acceptable level of decorrelation => obtain a ‘perfect’ RNG using decimation
      • Used ‘raw’ (not decimated), its implementation is faster than CLHEP’s Mersenne Twister
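
Decimation, as mentioned above, means delivering some numbers and then discarding others so that correlations have time to decay before the next delivery. A minimal generic wrapper is sketched below; the class name and the `keep`/`skip` parameters are hypothetical, and the values used are for illustration only, not recommended settings for any generator.

```python
import random

class Decimated:
    """Deliver `keep` values from `source`, then discard `skip` values."""

    def __init__(self, source, keep, skip):
        self.source = source      # any callable returning the next number
        self.keep = keep
        self.skip = skip
        self.delivered = 0

    def __call__(self):
        if self.delivered == self.keep:
            for _ in range(self.skip):
                self.source()     # thrown away so that correlations decay
            self.delivered = 0
        self.delivered += 1
        return self.source()

raw = random.Random(42)                       # stand-in for a raw generator
decimated = Decimated(raw.random, keep=16, skip=48)
values = [decimated() for _ in range(100)]
```

The cost is proportional to (keep + skip) / keep raw numbers per delivered number, which is why the acceptable level of decorrelation has to be defined before choosing how much to discard.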

  15. Using it in Geant4
      • A CLHEP interface to MIXMAX has been prepared
        • Submitted to the CLHEP editors for inclusion in the next release
        • Available already in Geant4’s ‘internal’ CLHEP
      • Its use should be validated and its computing performance measured in simulations

  16. References
      1. RANLUX
         a) http://cern.ch/luscher/ranlux/index.html
         b) M. Lüscher, Comp. Phys. Comm. 79 (1994) 100
      2. MIXMAX proposal
         • N. Akopov, G. Savvidy, and N. Ter-Arutyunyan-Savvidy, J. Comput. Phys., pp. 573–579, Dec. 1991
      3. MIXMAX: efficient implementation & period
         a) K. G. Savvidy, Comp. Phys. Comm., Vol. 196, November 2015, pp. 161-165, http://dx.doi.org/10.1016/j.cpc.2015.06.003
         b) http://mixmax.hepforge.org/
      4. Many slides were from the recent “MIXMAX Network meeting”, 3 July 2015, at https://indico.cern.ch/event/404547/

  17. BACKUP


  20. Implementation of MIXMAX (Kostas Savvidy, 3 July 2015)
