Neil Mitchell www.cs.york.ac.uk/~ndm/ The Problem Count the - PowerPoint PPT Presentation

Fastest Lambda First λ Neil Mitchell www.cs.york.ac.uk/~ndm/

The Problem
• Count the number of lines in a file
  – "" = 0
  – "test" = 1
  – "test\n" = 1
  – "test\ntest" = 2
• Read from the console
  – Using getchar only
  – No buffering

The Haskell
    main = print . length . lines =<< getContents
• getContents :: IO String
• lines :: String → [String]
• length :: [a] → Int
• print :: Show a ⇒ a → IO ()

The C (thanks to Andrew Wilkinson)
    #include <stdio.h>

    int main() {
        int count = 0, last_newline = 1, c;
        while ((c = getchar()) != EOF) {
            if (last_newline) count++;
            last_newline = (c == '\n');
        }
        printf("%i\n", count);
        return 0;
    }
    /* Is this correct? */
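One way to answer "Is this correct?" is to restate the C loop as a pure function and compare it with the Haskell one-liner on the four cases from the problem slide. The sketch below does that; the names step, countLinesC and countLinesH are hypothetical and not part of the talk.

```haskell
import Data.List (foldl')

-- State of the C loop: (count, last_newline).
-- Mirrors: if (last_newline) count++; last_newline = (c == '\n');
step :: (Int, Bool) -> Char -> (Int, Bool)
step (count, lastNewline) c =
  (if lastNewline then count + 1 else count, c == '\n')

-- Hypothetical name: counts lines the way the C program does.
countLinesC :: String -> Int
countLinesC = fst . foldl' step (0, True)

-- The Haskell one-liner without the IO wrapper.
countLinesH :: String -> Int
countLinesH = length . lines
```

Loading this in GHCi and evaluating both functions on "", "test", "test\n" and "test\ntest" should show that they agree on the specification.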

The Results
[Figure: bar chart comparing C, Supero and GHC; vertical axis from 0 to 10]

Disclaimer Slide
• Uses GHC as a backend
  – GHC does some really cool optimisation
  – Inlining, strictness, unboxing
• Only one benchmark presented
  – Promising results on others, but not enough yet

Other Benchmarks
• Three results
  – wc -c: 13% faster than GHC, 3% slower than C
  – wc -l: 47% faster than GHC, 2% slower than C
  – wc -w: 70% faster than GHC, 20% slower than C
• All very similar programs…

Overview
• Different approach
• First order code
• First order code without data
• Termination
• What could be improved
• Conclusion

Whole program analysis
• Look at all the code at once
• Done by a few compilers (MLton, JHC)
• Usually compilation is really slow
• Linking is whole-program
• Mine is quite quick

Bullets versus a nuclear bomb
• Most (all?) optimising compilers use "bullets"
  – Small, targeted transformations
  – Hit programs with a hail of bullets
• I use one single optimisation
  – No issues of "enabling transformations"
  – No optimisation "dials"
  – No "swings and roundabouts"

Alpha Renaming
• Some optimisers rely on special names
  – foldr/build
  – stream/unstream
• Achieves good practical results
  – Limits what can be optimised well
  – Requires functions to be defined unnaturally
  – They tend to go wrong (take in GHC 6.6)

First Order Haskell
• Remove all lambda abstractions (lambda lift)
• Leaving only partial application/currying
    odd = (.) not even
    (.) f g x = f (g x)
• Generate templates (specialised fragments)

Oversaturation
f x y z, where arity(f) < 3
    main = odd 12
    <odd _> x = (.) not even x
    main = <odd _> 12

Undersaturation
f x (g y) z, where arity(g) > 1
    <odd _> x = (.) not even x
    <(.) not even _> x = not (even x)
    <odd _> x = <(.) not even _> x
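Written out as ordinary Haskell, the two templates above become plain first-order functions. This is a hand-written sketch of the result, not output from the optimiser; oddT and composeNotEven are hypothetical names standing in for <odd _> and <(.) not even _>.

```haskell
import Prelude hiding (odd, (.))

-- Higher-order original from the slides.
(.) :: (b -> c) -> (a -> b) -> a -> c
(.) f g x = f (g x)

odd :: Int -> Bool
odd = (.) not even

-- Template <odd _>: odd given its missing argument (oversaturation).
oddT :: Int -> Bool
oddT x = composeNotEven x

-- Template <(.) not even _>: (.) specialised to its two functional
-- arguments, so no function is passed as a value any more.
composeNotEven :: Int -> Bool
composeNotEven x = not (even x)
```

Both odd and oddT should give the same answer on every Int, but only the second is first order.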

Special Rules
• let z = f x y, where arity(f) > 2 (let-under)
  – inline z, after sharing x and y
• d = Ctor (f x) y, where arity(f) > 1 (ctor-under)
  – inline d
  – The "dictionary" rule

Standard Rules
• let x = (let y = z in q) in … (let/let)
• case (let x = y in z) of … (case/let)
• case (case x of …) of … (case/case)
• (case x of …) y z (app/case)
• case C x of … (case/ctor)
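As an illustration of how two of these rules compose, the sketch below applies case/let and then case/ctor by hand to a small expression. The function names are hypothetical and chosen only to label each stage.

```haskell
-- Starting point: a case whose scrutinee is a let around a constructor.
before :: Int -> Int
before y =
  case (let x = y + 1 in Just x) of
    Nothing -> 0
    Just v  -> v * 2

-- After case/let: the let floats outside the case.
afterCaseLet :: Int -> Int
afterCaseLet y =
  let x = y + 1
  in case Just x of
       Nothing -> 0
       Just v  -> v * 2

-- After case/ctor: the scrutinee is a known constructor,
-- so the matching branch is selected at compile time.
afterCaseCtor :: Int -> Int
afterCaseCtor y =
  let x = y + 1
  in x * 2
```

Each stage should compute the same value; the final one allocates no Maybe at all.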

Removing functions
[Diagram: Application and Closure; fragments shown: \x → f x, head x, head x]

Removing data
• Consumption: case x of …
• Production: x : xs …

Church Encoding
Efficient Interpretation by Transforming Data Types and Patterns to Functions, TFP 2006

Data version:
    data List a = Nil
                | Cons a (List a)
    len x = case x of
        Nil → 0
        Cons y ys → 1 + len ys

Encoded version:
    nil = \n c → n
    cons x xs = \n c → c x xs
    len x = x 0 (\y ys → 1 + len ys)
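The encoded version can be typed in GHC with a rank-2 newtype wrapper. The following is a minimal sketch under that assumption; CList, runCList and lenC are hypothetical names. Because the cons handler receives the unprocessed tail, this form is often called a Scott encoding elsewhere; the slides call it Church encoding.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Ordinary data version.
data List a = Nil | Cons a (List a)

len :: List a -> Int
len x = case x of
  Nil       -> 0
  Cons _ ys -> 1 + len ys

-- Encoded version: a list is a function taking one handler per constructor.
newtype CList a = CList
  { runCList :: forall r. r -> (a -> CList a -> r) -> r }

nil :: CList a
nil = CList (\n _ -> n)

cons :: a -> CList a -> CList a
cons x xs = CList (\_ c -> c x xs)

-- The case expression becomes an application of the list itself.
lenC :: CList a -> Int
lenC x = runCList x 0 (\_ ys -> 1 + lenC ys)
```

For example, lenC (cons 'a' (cons 'b' nil)) and len (Cons 'a' (Cons 'b' Nil)) should both evaluate to 2.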

Optimisation Algorithm
1. Remove higher-order functions
2. Church encode
3. Remove higher-order functions

Proof: It doesn't work
• A program has no data, and no functions
• Implies it's not Turing complete!
• Linear Bounded Turing Machine
• Therefore, removing HO cannot be perfect

Failing Example
    showPosInt x = f x ""
    f 0 acc = acc
    f i acc = f (i `div` 10) (c : acc)
        where c = chr (ord '0' + i `mod` 10)
• Requires a buffer of size O(log₁₀ n)
• Cannot be removed automatically

Failing pleasantly
• Keep running
• At some point, stop
  – 1000 new functions created
  – 100 based on a particular function
  – Some particular name recurring
• Leaves higher-order functions around

Failing Church Encoding (thanks to Tom Shackell)
• Church encoding requires rank-2 types
  – Cannot be inferred automatically
  – Makes some things more complex
• Why not merely "pretend" to Church encode?
  – Failure is now left-over data
  – Much more pleasant
Pretend we are Church encoding

Summing the Integers
    main n = sum (range 0 n)

    sum xs = case xs of
        [] → 0
        (y:ys) → y + sum ys

    range i n = if i > n then [] else i : range (i+1) n

Undersaturation of Data
• A constructor is higher-order
    main n = sum (range 0 n)
    <sum (range#2)> i n = case range i n of …
    main n = <sum (range#2)> 0 n

Oversaturation of Data
• A case is an application
    case range i n of {[] → 0; (y:ys) → y + sum ys}
    <case range#2 {[] → 0; (y:ys) → y + sum ys}> i n =
        if i > n then 0 else i + sum (range (i+1) n)

Final Result
    main n = sum' 0 n
    sum' i n = range' i n
    range' i n = if i > n then 0 else i + sum' (i+1) n
• All constructors have disappeared
• First-order with Church encoding
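To check that the residual program computes the same thing as the original, both can be placed side by side. In this sketch main is renamed to sumTo and sumTo' (hypothetical names) so the two versions can live in one module, and Prelude's sum is hidden to keep the slide's definition.

```haskell
import Prelude hiding (sum)

-- Original program: builds a list and then consumes it.
sumTo :: Int -> Int
sumTo n = sum (range 0 n)

sum :: [Int] -> Int
sum xs = case xs of
  []     -> 0
  (y:ys) -> y + sum ys

range :: Int -> Int -> [Int]
range i n = if i > n then [] else i : range (i + 1) n

-- Residual program: no list constructors remain.
sumTo' :: Int -> Int
sumTo' n = sum' 0 n

sum' :: Int -> Int -> Int
sum' i n = range' i n

range' :: Int -> Int -> Int
range' i n = if i > n then 0 else i + sum' (i + 1) n
```

Evaluating sumTo 10 and sumTo' 10 in GHCi should give 55 in both cases.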

Special Cases
• let x = C y z
  – inline x, after sharing y and z
• let x = f y z, where f produces data
  – inlining may break sharing
  – only if one use of x

What isn't Optimised?
• This optimisation does a lot
• But doesn't always produce optimal code
• What can we do better?
  – Ignore "better algorithms"

Call overhead
    f1 x y = f2 x y
    f2 x y = f3 y x
    f3 y x = g x + y
• My optimisation gives loads of these!
(GHC is very good at this)

Strictness/Boxing
• Lazy evaluation requires "thunks"
• Strictness avoids these thunks
• Int is a box stored in the heap
• Int# is more like a C int
(Again, GHC is good at this)
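The thunk point can be seen in an accumulator loop. The sketch below is illustrative only; sumLazy and sumStrict are hypothetical names. Without optimisation, the lazy version builds a chain of unevaluated additions, while the bang patterns force the accumulator at each step. With optimisation turned on, GHC's strictness analysis will typically make the first behave like the second.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Lazy accumulator: acc may become a chain of (+) thunks.
sumLazy :: [Int] -> Int
sumLazy = go 0
  where
    go acc []     = acc
    go acc (x:xs) = go (acc + x) xs

-- Strict accumulator: acc is evaluated before each recursive call.
sumStrict :: [Int] -> Int
sumStrict = go 0
  where
    go !acc []     = acc
    go !acc (x:xs) = go (acc + x) xs
```

Both return the same result; they differ only in how much heap they may use along the way.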

Sharing/lets
    g (f x) (f x) ⇒ let y = f x in g y y
• Common sub expression
    map (g 100) ys
    g x y = f x + y
• Strength reduction
(Can cause space leaks)
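A sketch of what the two rewrites could look like when done by hand. Here f is a made-up stand-in for an expensive function, h plays the role of the two-argument g in the first example, and the unshared/shared names are hypothetical.

```haskell
-- Stand-in for an expensive computation (an assumption for illustration).
f :: Int -> Int
f x = x * x + 1

h :: Int -> Int -> Int
h a b = a + b

-- Common subexpression: f x is computed twice, then once.
twiceUnshared :: Int -> Int
twiceUnshared x = h (f x) (f x)

twiceShared :: Int -> Int
twiceShared x = let y = f x in h y y

g :: Int -> Int -> Int
g x y = f x + y

-- map (g 100) ys recomputes f 100 for every element.
mapUnshared :: [Int] -> [Int]
mapUnshared ys = map (g 100) ys

-- Floating f 100 out of the lambda computes it once for the whole list.
mapShared :: [Int] -> [Int]
mapShared ys = let fx = f 100 in map (\y -> fx + y) ys
```

The shared versions keep y and fx alive for longer, which is where the space-leak risk comes from.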

Constant movement
    countLines xs = count '\n' xs
    count n [] = 0
    count n (x:xs) | n == x    = 1 + count n xs
                   | otherwise = count n xs
• This one remains in linecount example
• Should make the Haskell faster
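Moving the constant argument means n is no longer passed on every recursive call. A hand-written sketch of the before and after follows; count' and go are hypothetical names, and the base case for the empty list is an addition so that the functions are total.

```haskell
-- Before: n is threaded through every recursive call unchanged.
count :: Char -> String -> Int
count _ [] = 0
count n (x:xs)
  | n == x    = 1 + count n xs
  | otherwise = count n xs

-- After: n is captured once and the inner loop takes only the list.
count' :: Char -> String -> Int
count' n = go
  where
    go [] = 0
    go (x:xs)
      | n == x    = 1 + go xs
      | otherwise = go xs

countLines :: String -> Int
countLines = count' '\n'
```

Once n is fixed at '\n', a compiler is free to specialise go to a comparison against a literal character.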

Can Haskell beat C?
• A question of abstraction
  – In C, abstraction is painful
  – For linecount, not worth it
• Haskell can remove abstraction better than C
  – Won't win on micro-benchmarks (may draw)
  – May win on real programs

Faster than C
http://shootout.alioth.debian.org/
    print . sum . map readInt . lines =<< getContents
    readInt :: String → Int
• Haskell can optimise sum/readInt
• C can't optimise between them
• NB. Not actually tried, yet…
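The slide gives only the type of readInt, so the definition below is an assumption: a minimal parser for non-negative decimal integers that stops at the first non-digit. sumLines is a hypothetical name for the pure part of the pipeline.

```haskell
import Data.Char (isDigit, ord)
import Data.List (foldl')

-- Assumed definition: parses leading decimal digits, non-negative only.
readInt :: String -> Int
readInt = foldl' (\acc c -> acc * 10 + (ord c - ord '0')) 0 . takeWhile isDigit

-- The pipeline from the slide without the IO wrapper.
sumLines :: String -> Int
sumLines = sum . map readInt . lines
```

The full program would then be print . sumLines =<< getContents. Because sum, map and readInt are all visible to the optimiser, the intermediate list of Ints is a candidate for removal.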

More Benchmarks
• Needs refactoring
  – Some transformations in Yhc.Core
  – Some in the optimiser
  – Don't glue together nicely
• GHC sometimes "over-optimises"
  – Turns getchar into a constant!
  – Need to integrate with GHC's IO Monad

Conclusion
• Haskell can be made faster
  – Nearly the speed of C (sometimes)
  – But always more beautiful
• You can't draw conclusions from small benchmarks
