Core_bench: micro-benchmarking for OCaml
Christopher S. Hardin and Roshan P. James
Jane Street
September 24, 2013, OUD Workshop

Micro-benchmarking
Precise measurement is essential for writing performance-sensitive code.
Objective: measure the execution cost of functions that are relatively cheap, with execution times on the order of nanoseconds to tens or hundreds of milliseconds.
A 3.4 GHz CPU runs several simple instructions per nanosecond.

Micro-benchmarking: Timing

    (* f is the function under test; report is a placeholder for output. *)
    let t1 = Time.now () in
    f ();
    let t2 = Time.now () in
    report (Time.diff t2 t1)

Time.now is often too imprecise (about 1 microsecond).
Asking for the current time also takes time.

Micro-benchmarking: Batch sizes

    (* Time batch_size calls together so the timer is read only twice. *)
    let t1 = Time.now () in
    for _i = 1 to batch_size do
      f ()
    done;
    let t2 = Time.now () in
    report batch_size (Time.diff t2 t1)

Compute a batch size large enough to account for the timer's cost and imprecision.
Criterion for Haskell takes this approach.
Use the mean and standard deviation to account for system noise.
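
The batching idea above can be sketched with the standard library alone. This is an illustration under stated assumptions, not Core_bench code: `Sys.time` (processor time in seconds) stands in for a wall-clock timer, and `time_batch`, `per_run`, and `work` are hypothetical names.

```ocaml
(* Run [f] [batch_size] times and return the total elapsed time in seconds.
   Sys.time is an assumption here; any timer with the same shape would do. *)
let time_batch (f : unit -> unit) (batch_size : int) : float =
  let t1 = Sys.time () in
  for _i = 1 to batch_size do
    f ()
  done;
  let t2 = Sys.time () in
  t2 -. t1

(* Average cost of one call, in seconds, estimated from a single batch. *)
let per_run (f : unit -> unit) (batch_size : int) : float =
  time_batch f batch_size /. float_of_int batch_size

let () =
  (* Hypothetical workload: build a small list and discard it. *)
  let work () = ignore (Sys.opaque_identity (List.init 100 (fun i -> i))) in
  Printf.printf "approx. %.1f ns per run\n" (per_run work 10_000 *. 1e9)
```

A single batch still folds the timer's cost and any GC work into one number, which is the weakness the following slides address.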

Micro-benchmarking: Noise
System noise from other processes and OS activity.
More importantly, there are delayed costs due to GC.
Variance in execution times is influenced by batch size.
[Figure: runtime (ms) versus batch size]

Core_bench: Linear regression
Core_bench treats micro-benchmarking as a linear regression problem.
Simple case: fit execution time to batch size.
Data from larger batch sizes have smaller percentage error.
Batch sizes are sampled geometrically to get a better linear fit.
[Figure: runtime (ms) versus batch size]
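
As a rough illustration of the simple case, the sketch below generates geometrically spaced batch sizes and fits time against batch size by ordinary least squares. It uses only the standard library; `geometric_batch_sizes` and `linear_fit` are hypothetical helpers, the timings are synthetic, and nothing here is taken from the Core_bench implementation.

```ocaml
(* Batch sizes growing geometrically by [scale], from [start] up to [limit].
   Each size is at least one larger than the previous one. *)
let geometric_batch_sizes ~start ~scale ~limit =
  let rec go n acc =
    if n > limit then List.rev acc
    else
      let next = max (n + 1) (int_of_float (float_of_int n *. scale)) in
      go next (n :: acc)
  in
  go start []

(* Ordinary least squares for y = intercept + slope * x.
   Returns (intercept, slope). Needs at least two distinct x values. *)
let linear_fit (points : (float * float) list) : float * float =
  let n = float_of_int (List.length points) in
  let sum f = List.fold_left (fun acc p -> acc +. f p) 0.0 points in
  let sx = sum fst and sy = sum snd in
  let sxx = sum (fun (x, _) -> x *. x) in
  let sxy = sum (fun (x, y) -> x *. y) in
  let slope = ((n *. sxy) -. (sx *. sy)) /. ((n *. sxx) -. (sx *. sx)) in
  let intercept = (sy -. (slope *. sx)) /. n in
  (intercept, slope)

let () =
  (* Synthetic timings: 50 ns of fixed overhead plus 3 ns per run. *)
  let sizes = geometric_batch_sizes ~start:1 ~scale:1.5 ~limit:1000 in
  let points =
    List.map (fun b -> (float_of_int b, 50.0 +. (3.0 *. float_of_int b))) sizes
  in
  let intercept, slope = linear_fit points in
  Printf.printf "overhead = %.2f ns, cost per run = %.2f ns\n" intercept slope
```

The slope is the per-run estimate; the intercept absorbs whatever cost does not scale with the batch size.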

Core_bench: Linear regression
No need to estimate the clock and other constant errors: constant overheads are accounted for in the y-intercept.
Predict other costs in the same way:
  Estimate memory allocations and promotions using batch size.
  Estimate garbage collection using batch size.
The user specifies how much sampling time is allowed; more data allows better estimates.
Error estimation and goodness of fit: bootstrapping, R^2.
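
For reference, R^2 for a one-predictor fit can be computed as below. This is a generic textbook formula written as a hypothetical helper (`r_squared`), not the Core_bench code, and it assumes the observed y values are not all identical.

```ocaml
(* Coefficient of determination for the model y = intercept + slope * x. *)
let r_squared ~intercept ~slope (points : (float * float) list) : float =
  let n = float_of_int (List.length points) in
  let mean_y = List.fold_left (fun acc (_, y) -> acc +. y) 0.0 points /. n in
  (* Residual sum of squares: what the model fails to explain. *)
  let ss_res =
    List.fold_left
      (fun acc (x, y) ->
        let r = y -. (intercept +. (slope *. x)) in
        acc +. (r *. r))
      0.0 points
  in
  (* Total sum of squares: variation of y around its mean. *)
  let ss_tot =
    List.fold_left
      (fun acc (_, y) -> acc +. ((y -. mean_y) *. (y -. mean_y)))
      0.0 points
  in
  1.0 -. (ss_res /. ss_tot)
```

A value near 1 means batch size explains almost all of the variation in the measurements; a noticeably lower value suggests another predictor is at work.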

Example source (basic)

    open Core.Std
    open Core_bench.Std

    let t1 = Bench.Test.create ~name:"id"
      (fun () -> ())

    let t2 = Bench.Test.create ~name:"Time.now"
      (fun () -> ignore (Time.now ()))

    let t3 = Bench.Test.create ~name:"Array.create300"
      (fun () -> ignore (Array.create ~len:300 0))

    let () = Command.run (Bench.make_command [t1; t2; t3])

Output

    Name                Time/Run    Minor    Major
    -----------------   ----------  -------  -------
    id                        3.08
    Time.now                   843     2.00
    Array.create300          3_971               301

Some functions have strange execution times

    let benchmark = Bench.Test.create ~name:"List.init"
      (fun () -> ignore (List.init 100_000 ~f:id))

[Figure: runtime (ms) versus batch size; series: observed, 1-predictor model]

Multiple predictors
[Figure: milliseconds versus batch size; series: observed runtime, runs, promoted words, compactions]

Multiple predictors: fit
Using runs, compactions, and promoted words as predictors.
[Figure: runtime (ms) versus batch size; series: observed, 1-predictor model, 3-predictor model]

Runtime cost decomposition example
$X$ = [batch size $x$, minor GCs, compactions], $y$ = runtime (ns).
Solve $X\beta = y$ and $x\gamma = X$. Suppose we get
\[
\beta = \begin{pmatrix} 1.06 \times 10^{4} \\ 1.04 \times 10^{6} \\ 2.25 \times 10^{6} \end{pmatrix},
\qquad
\gamma = \begin{pmatrix} 1 & 0.00299 & 0.00149 \end{pmatrix}
\]
Then the (predicted) runtime per run is
\[
\gamma\beta
= \underbrace{(1.06 \times 10^{4})(1)}_{\text{nominal}}
+ \underbrace{\overbrace{(1.04 \times 10^{6})}^{\text{ns/mGC}}\,\overbrace{(0.00299)}^{\text{mGCs/run}}}_{\text{minor GC cost}}
+ \underbrace{\overbrace{(2.25 \times 10^{6})}^{\text{ns/cmp}}\,\overbrace{(0.00149)}^{\text{cmps/run}}}_{\text{compaction cost}}
= 10.6\,\mu\text{s} + 3.1\,\mu\text{s} + 3.4\,\mu\text{s} \approx 17.1\,\mu\text{s}
\]
(Note: Just solving $xm = y$ gives 17.4 µs.)
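
The decomposition is just a dot product, which the following sketch spells out using the rounded coefficients printed on the slide. The array names and unit labels are illustrative assumptions. With these rounded values the three terms come to roughly 10.6, 3.1, and 3.4 µs, for a total of about 17.1 µs.

```ocaml
(* Coefficients copied from the slide (rounded values). *)
let beta = [| 1.06e4; 1.04e6; 2.25e6 |]  (* ns per run, per minor GC, per compaction *)
let gamma = [| 1.0; 0.00299; 0.00149 |]  (* runs, minor GCs, compactions per run *)

(* Contribution of each predictor to the runtime of one run, in nanoseconds. *)
let contributions = Array.map2 (fun b g -> b *. g) beta gamma

(* Predicted runtime of one run, in nanoseconds. *)
let total_ns = Array.fold_left ( +. ) 0.0 contributions

let () =
  Array.iteri
    (fun i c -> Printf.printf "term %d: %.1f us\n" i (c /. 1000.0))
    contributions;
  Printf.printf "total: %.1f us\n" (total_ns /. 1000.0)
```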

Conclusion and Future Work
opam install core_bench
Expose more predictors:
  Measure the effect of live words on performance.
  Counters for major collection work per minor GC.
Accuracy of results:
  Ordinary least squares is susceptible to outliers.
  Incorporate the fact that measurement error is heavy-tailed (on the positive side).
Automatically select execution time based on error.
Automatically pick predictors from a set.

Thank you.
