Supero: Making Haskell Faster, by Neil Mitchell and Colin Runciman



SLIDE 1

Supero: Making Haskell Faster

Neil Mitchell, Colin Runciman

www.cs.york.ac.uk/~ndm/supero

SLIDE 2

The Goal

Make Haskell ‘faster’

– Reduce the runtime
– But keep high-level declarative style

Without user annotations

– Different from foldr/build, stream/unstream

SLIDE 3

Word Counting

In Haskell

main = print . length . words =<< getContents

Very high level
A nice ‘specification’ of the problem

SLIDE 4

And in C

#include <ctype.h>
#include <stdio.h>

int main() {
    int i = 0, c, last_space = 1;
    while ((c = getchar()) != EOF) {
        int this_space = isspace(c);
        if (last_space && !this_space) i++;
        last_space = this_space;
    }
    printf("%i\n", i);
    return 0;
}

About 3 times faster than Haskell (gcc vs ghc)

SLIDE 5

Why is Haskell slower?

Intermediate lists! (and other things)

– GHC allocates and garbage collects memory
– C requires a fixed ~13 KB

length . words =<< getContents

– getContents produces a list
– words consumes a list, produces a list of lists
– length consumes the outer list
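To make the cost of the intermediate lists concrete, here is a minimal sketch of a hand-written word count that mirrors the C loop and never builds the list of words. The names countWords and step are hypothetical, chosen for illustration; this is not Supero's output, and it still consumes the input String one character at a time.

```haskell
import Data.Char (isSpace)
import Data.List (foldl')

-- Count words in a single pass, carrying (count, wasLastCharASpace),
-- in the same way the C version carries i and last_space.
countWords :: String -> Int
countWords = fst . foldl' step (0, True)
  where
    step (n, lastSpace) c =
      let thisSpace = isSpace c
          n' = if lastSpace && not thisSpace then n + 1 else n
      in n' `seq` (n', thisSpace)  -- force the counter so no thunks pile up
```

On any input this should agree with length . words, since words also splits on isSpace; the difference is only that no list of words is ever allocated.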

SLIDE 6

Removing the lists

GHC already has foldr/build fusion

– e.g. map f (map g x) == map (f . g) x

But getContents is trapped under IO

– Much harder to fuse automatically
– Don’t want to rewrite everything as foldr
– Easy to go wrong (take function in GHC 6.6)
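The map example above can be checked directly. The sketch below writes the two sides of the law as ordinary functions (twoPasses and onePass are hypothetical names); it only illustrates the equation and does not show how GHC's rewrite rules implement fusion.

```haskell
-- Left-hand side: two traversals and one intermediate list.
twoPasses :: [Int] -> [Int]
twoPasses xs = map (+ 1) (map (* 2) xs)

-- Right-hand side: one traversal over the composed function.
onePass :: [Int] -> [Int]
onePass xs = map ((+ 1) . (* 2)) xs
```

Both functions return the same list for every input; the second simply avoids allocating the list produced by the inner map.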

SLIDE 7

Supero: Optimiser

No annotations or special functions
Uses ideas of supercompilation
Whole program
Evaluate the program at compile time

– Start at main, and execute

Residuate when you reach a primitive

– The primitive is in the optimised program

SLIDE 8

Optimising an Expression

Ο [case x of alts] = case Ο [x] of alts
Ο [let v = x in y] = let v = Ο [x] in Ο [y]
Ο [x y] = Ο [x] y
Ο [f] = unfold f, if f is not a primitive
Ο* = apply Ο until no further changes

Optimise the head of the expression
Also apply standard simplification rules
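The four rules can be sketched over a toy expression type. Everything here is an assumption made for illustration: the Expr type, the association-list environment, and the fuel counter in optStar (added because Ο* need not terminate) are not taken from Supero's source, and the "standard simplification rules" are left out entirely.

```haskell
-- A toy core language; alternatives are keyed by a pattern string.
data Expr
  = Fun String                  -- reference to a top-level function
  | Var String                  -- local variable
  | App Expr Expr
  | Case Expr [(String, Expr)]
  | Let String Expr Expr
  deriving (Eq, Show)

-- Bodies of the non-primitive functions; primitives are simply absent.
type Env = [(String, Expr)]

-- One application of the rules: only the head of the expression changes.
step :: Env -> Expr -> Expr
step env (Case x alts) = Case (step env x) alts
step env (Let v x y)   = Let v (step env x) (step env y)
step env (App x y)     = App (step env x) y
step env (Fun f)       = maybe (Fun f) id (lookup f env)  -- unfold unless primitive
step _   e             = e

-- Apply step until nothing changes, or until the fuel runs out.
optStar :: Int -> Env -> Expr -> Expr
optStar fuel env e
  | fuel <= 0 = e
  | e' == e   = e
  | otherwise = optStar (fuel - 1) env e'
  where
    e' = step env e
```

Because a primitive has no entry in the environment, step leaves it in place, which corresponds to residuating the primitive into the optimised program.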

SLIDE 9

The tie back

Once an expression is optimised with Ο*

– The outermost expression is frozen
– The inner expressions are assigned names

Each name and expression is then optimised further

Identical expressions receive identical names

– Finitely many expressions/names

SLIDE 10

An Example

sum x = case x of
    [] → 0
    x:xs → x + sum xs

range i n = case i > n of
    True → []
    False → i : range (i+1) n

main n = sum (range 0 n)

SLIDE 11

Evaluation proceeds

main n
sum (range 0 n)
Generalise: main n = main2 0 n where main2 i n = sum (range i n)
case range i n of {[] → 0; x:xs → x + sum xs}
case (case i > n of {True → []; False → …}) of …
case i > n of {True → 0; False → i + sum (range (i+1) n)}
tie back: main2 (i+1) n

SLIDE 12

The Residual Program

main n = main2 0 n
main2 i n = if i > n then 0 else i + main2 (i+1) n

Lists have gone entirely
Everything is now strict
Using sum as foldl or foldl’ would have given an accumulator version
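The original and residual programs can be placed side by side to check that they compute the same result. In this sketch sum is renamed sumList to avoid clashing with the Prelude, and main is renamed mainOrig and mainRes; mainAcc is a guess at what the accumulator version mentioned above might look like, not something shown in the talk.

```haskell
-- Original program: builds the list [0..n], then consumes it.
sumList :: [Int] -> Int
sumList xs = case xs of
  []     -> 0
  y : ys -> y + sumList ys

range :: Int -> Int -> [Int]
range i n = case i > n of
  True  -> []
  False -> i : range (i + 1) n

mainOrig :: Int -> Int
mainOrig n = sumList (range 0 n)

-- Residual program: the same recursion with the list removed.
mainRes :: Int -> Int
mainRes n = main2 0 n

main2 :: Int -> Int -> Int
main2 i n = if i > n then 0 else i + main2 (i + 1) n

-- Hypothetical accumulator version, as a foldl-style sum might yield.
mainAcc :: Int -> Int
mainAcc n = go 0 0
  where
    go acc i = if i > n then acc else go (acc + i) (i + 1)
```

All three definitions agree on every n, including negative n, where the range is empty and the result is 0.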

SLIDE 13

Termination

Ο* does not necessarily terminate
Some expressions may keep getting bigger
Size bound on an expression

– If an expression exceeds a threshold
– Then freeze the outermost expression shell

case map head xs of
    [] → True
    (y:ys) → and ys

SLIDE 14

Termination Problems

Different programs favour different bounds
Ad hoc numeric parameters
A better method may be based on homeomorphic embedding

– Positive Supercompilation for a higher order call-by-value language, by Peter A. Jonsson
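For readers unfamiliar with the term, the sketch below gives the usual textbook definition of homeomorphic embedding on first-order terms. The Term type and the function names are assumptions for illustration; this is not necessarily the formulation used in the cited paper or in any version of Supero.

```haskell
-- First-order terms: variables, or a symbol applied to arguments.
data Term = V String | T String [Term]
  deriving (Eq, Show)

-- s `embeds` t holds when s can be obtained from t by deleting
-- parts of t; all variables are treated as interchangeable.
embeds :: Term -> Term -> Bool
embeds s t = couple s t || dive s t

-- Diving: s is embedded in one of t's arguments.
dive :: Term -> Term -> Bool
dive s (T _ ts) = any (embeds s) ts
dive _ _        = False

-- Coupling: same head symbol, and arguments embed pairwise.
couple :: Term -> Term -> Bool
couple (V _) (V _)       = True
couple (T f ss) (T g ts) =
  f == g && length ss == length ts && and (zipWith embeds ss ts)
couple _ _               = False
```

A termination check based on this idea would stop unfolding when an earlier expression embeds in the current one, which signals that the expression is growing in a repeating pattern rather than relying on a fixed size threshold.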

SLIDE 15

‘Supero’ Compilation

Haskell → (Yhc) → Core → (Supero) → Core → (Yhc.Core) → Haskell → (GHC) → Executable

SLIDE 16

GHC’s Contributions

GHC is a mature optimising compiler
Primitives (Integer etc)
Strictness analysis and unboxing
STG code generation
Machine code generation

SLIDE 17

Comparative Runtime (40Mb file)

[Bar chart: runtime in seconds for charcount, linecount and wordcount, comparing C (gcc), Supero+GHC and GHC]

SLIDE 18

Runtime as % of GHC time

[Bar chart: runtime as a percentage of GHC time for digits-e1, digits-e2, exp3, primes and queens]

SLIDE 19

Conclusions

Still more work to be done

– Complete nofib suite is the target
– Termination is the ‘open issue’

Haskell can perform as fast as C
Haskell programs can go faster