SLIDE 1

Compilers and computer architecture: Just-in-time compilation

Martin Berger, 1 December 2019

Email: M.F.Berger@sussex.ac.uk, Office hours: Wed 12-13 in Chi-2R312

SLIDE 2

Recall the function of compilers

SLIDE 3

Welcome to the cutting edge

SLIDE 5

Welcome to the cutting edge

Compilers are used to translate from programming languages humans can understand to machine code executable by computers. Compilers come in two forms:

◮ Conventional ahead-of-time (AOT) compilers, where translation is done once, long before program execution.

◮ Just-in-time (JIT) compilers, where translation of program fragments happens at the last possible moment and is interleaved with program execution.

We spend the whole term learning about the former. Today I want to give you a brief introduction to the latter.

SLIDE 6

Why learn about JIT compilers?

SLIDE 9

Why learn about JIT compilers?

In the past, dynamically typed languages (e.g. Python, JavaScript) were much slower than statically typed languages (a factor of 10 or worse). Even OO languages (e.g. Java) were a lot slower than procedural languages like C.

In the last couple of years, this gap has narrowed considerably. JIT compilers were the main cause of this performance revolution.

JIT compilers are cutting (bleeding) edge technology, and considerably more complex than normal compilers, which are already non-trivial. Hence today's presentation will be massively simplifying.

SLIDE 10

If JIT compilers are the answer ... what is the problem?

SLIDE 11

If JIT compilers are the answer ... what is the problem?

Let’s look at two examples. Remember the compilation of objects and classes?

[Diagram: each instance of A or B holds a dptr to its class’s method table; the table’s entries point to the code for the method bodies (f_A, g_A, f_B), which are shared by all instances. B overrides f but inherits g_A from A.]

To deal with inheritance of methods, invoking a method is indirect, via the method table. Each invocation has to follow two pointers. Without inheritance, there is no need for indirection.
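To make the two indirections concrete, here is a minimal, hypothetical sketch (my illustration, not the JVM's actual object layout); DispatchSketch, methodTableA/B and invokeF are invented names, and the method table is simulated by an array of function objects:

import java.util.function.IntUnaryOperator;

class DispatchSketch {
    // "Method table" for class A: slot 0 holds the code pointer for f.
    static final IntUnaryOperator[] methodTableA = { n -> n };
    // "Method table" for class B: f is overridden, so slot 0 differs.
    static final IntUnaryOperator[] methodTableB = { n -> 2 * n };

    // Invoking o.f(n): follow the object's dptr to its method table,
    // then follow the slot's pointer to the code. Two indirections.
    static int invokeF(IntUnaryOperator[] dptr, int n) {
        return dptr[0].applyAsInt(n);
    }

    public static void main(String[] args) {
        System.out.println(invokeF(methodTableA, 21)); // prints 21
        System.out.println(invokeF(methodTableB, 21)); // prints 42
    }
}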

SLIDE 13

If JIT compilers are the answer ... what is the problem?

Of course an individual indirection takes < 1 nanosecond on a modern CPU. So why worry? Answer: loops!

interface I { int f(int n); }

class A implements I {
    public int f(int n) { return n; }
}

class B implements I {
    public int f(int n) { return 2 * n; }
}

class Main {
    public static void main(String[] args) {
        I o = new A();
        for (int i = 0; i < 1000000; i++) {
            for (int j = 0; j < 1000000; j++) {
                o.f(i + j);
            }
        }
    }
}

Performance penalties add up.

SLIDE 16

If JIT compilers are the answer ... what is the problem?

But, I hear you say, it’s obvious, even at compile time, that the object o is of class A. A good optimising compiler should be able to work this out and replace the indirect invocation of f with a cheaper direct jump.

class Main {
    public static void main(String[] args) {
        I o = new A();
        for (int i = 0; i < 1000000; i++) {
            for (int j = 0; j < 1000000; j++) {
                o.f(i + j);
            }
        }
    }
}

Yes, in this simple example, a good optimising compiler can do this. But what about the following?

SLIDE 22

If JIT compilers are the answer ... what is the problem?

class Main {
    public static void main(String[] args) {
        I o = null;
        if (args[0].equals("hello"))
            o = new A();
        else
            o = new B();
        for (int i = 0; i < 1000000; i++) {
            for (int j = 0; j < 1000000; j++) {
                o.f(i + j);
            }
        }
    }
}

Now the type of o is determined only at run-time. What is the problem? Not enough information at compile-time to carry out the optimisation! At run-time we do have this information, but that’s too late (for normal compilers).

(Aside: can you see a hack to deal with this problem in an AOT compiler?)
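One possible such hack (my guess at the intended answer, not spelled out on the slide): the AOT compiler can emit one specialised copy of the loop nest per plausible receiver class, and select between them with a single up-front type test, so that inside each copy the receiver's class is known exactly and the call can be devirtualised. A sketch, reusing I, A and B from above:

class MainDevirtualised {
    public static void main(String[] args) {
        I o = args[0].equals("hello") ? new A() : new B();
        // One type test outside the loops selects a specialised copy.
        if (o instanceof A) {
            A oa = (A) o;   // receiver class now known exactly
            for (int i = 0; i < 1000000; i++)
                for (int j = 0; j < 1000000; j++)
                    oa.f(i + j);   // candidate for a direct jump
        } else {
            B ob = (B) o;
            for (int i = 0; i < 1000000; i++)
                for (int j = 0; j < 1000000; j++)
                    ob.f(i + j);
        }
    }
}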

SLIDE 26

If JIT compilers are the answer ... what is the problem?

Dynamically typed languages have a worse problem. Simplifying a little, variables in dynamically typed languages store not just the usual value, e.g. 3, but also the type of the value, e.g. Int, and sometimes even more. Whenever you carry out an innocent operation like

x = x + y

under the hood something like the following happens:

let tx = typeof( x )
let ty = typeof( y )
if ( tx == Int && ty == Int )
    let vx = value( x )
    let vy = value( y )
    let res = integer_addition( vx, vy )
    x_result_part = res
    x_type_part = Int
else
    ... // even more complicated
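As a concrete (and heavily simplified) model of such tagged values, here is a hypothetical Java sketch; DynValue, its Tag field and dynAdd are inventions for illustration, not any real VM's representation:

class DynValue {
    enum Tag { INT, STR }   // the run-time type tag
    Tag tag;
    int intVal;             // payload when tag == INT
    String strVal;          // payload when tag == STR

    static DynValue ofInt(int v) {
        DynValue d = new DynValue();
        d.tag = Tag.INT;
        d.intVal = v;
        return d;
    }

    // What "x + y" costs at run-time: check both tags before adding.
    static DynValue dynAdd(DynValue x, DynValue y) {
        if (x.tag == Tag.INT && y.tag == Tag.INT)
            return ofInt(x.intVal + y.intVal);
        // string concatenation etc.: even more complicated
        throw new UnsupportedOperationException("non-Int addition");
    }

    public static void main(String[] args) {
        System.out.println(dynAdd(ofInt(1), ofInt(2)).intVal); // prints 3
    }
}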

SLIDE 28

If JIT compilers are the answer ... what is the problem?

Imagine this in a nested loop!

for ( int i = 0; i < 1000000; i++ ) {
    for ( int j = 0; j < 1000000; j++ ) {
        let tx = typeof( x )
        let ty = typeof( y )
        if ( tx == Int && ty == Int )
            let vx = value( x )
            let vy = value( y )
            let res = integer_addition( vx, vy )
            x_result_part = res
            x_type_part = Int
        ...

This is painful. This is why dynamically typed languages are slow(er).

SLIDE 33

If JIT compilers are the answer ... what is the problem?

But ... in practice, variables usually do not change their types in inner loops. Why? Because typically innermost loops work on big, uniform data structures (usually big arrays). So the compiler should move the type-checks outside the loops.

SLIDE 35

If JIT compilers are the answer ... what is the problem?

Recall that in dynamically typed languages

for ( int i = 0; i < 1000000; i++ ) {
    for ( int j = 0; j < 1000000; j++ ) {
        a[ i, j ] = a[ i, j ] + 1
    }
}

is really

for ( int i = 0; i < 1000000; i++ ) {
    for ( int j = 0; j < 1000000; j++ ) {
        let ta = typeof( a[ i, j ] )   // always same
        let t1 = typeof( 1 )           // always same
        if ( ta == Int && t1 == Int ) {
            let va = value( a[ i, j ] )
            let v1 = value( 1 )        // simplifying
            let res = integer_addition( va, v1 )
            a[ i, j ]_result_part = res
            a[ i, j ]_type_part = Int
        } else {
            ...
        }
    }
}

SLIDE 37

If JIT compilers are the answer ... what is the problem?

So the program from the last slide can be optimised to

let ta = typeof( a )
let t1 = typeof( 1 )
if ( ta == Array [...] of Int && t1 == Int ) {
    for ( int i = 0; i < 1000000; i++ ) {
        for ( int j = 0; j < 1000000; j++ ) {
            let va = value( a[ i, j ] )
            let v1 = value( 1 )   // simplifying
            let res = integer_addition( va, v1 )
            a[ i, j ]_result_part = res
        }
    }
} else {
    ...
}

Alas, at compile-time the compiler does not have enough information to make this optimisation safely.
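Continuing the hypothetical DynValue sketch from earlier (again my own illustration): if a dynamic array carries a single element-type tag, the check can be hoisted and performed once, leaving a raw int loop inside.

class DynArraySketch {
    // A uniform dynamic array: one type tag covers every element.
    DynValue.Tag elemTag = DynValue.Tag.INT;
    int[] ints = new int[1_000_000];

    void incrementAll() {
        if (elemTag == DynValue.Tag.INT) {   // checked once, outside the loop
            for (int i = 0; i < ints.length; i++)
                ints[i] += 1;                // no tags inside the hot loop
        } else {
            // fall back to the slow, fully tag-checked path
            throw new UnsupportedOperationException("slow path");
        }
    }
}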

SLIDE 40

If JIT compilers are the answer ... what is the problem?

Let’s summarise the situation.

◮ Certain powerful optimisations cannot be done at compile-time, because the compiler has not got enough information to know they are safe.

◮ At run-time we have enough information to carry out these optimisations.

Hmmm, what could we do ...

SLIDE 44

How about we compile and optimise only at run-time? But there is no run-time if we don’t have a compilation process, right? Enter interpreters!

SLIDE 45

Interpreters

SLIDE 48

Interpreters

Recall from the beginning of the course that interpreters are a second way to run programs.

[Diagram: a compiler translates the source program into an executable, which is then, at runtime, run on the data to produce output; an interpreter takes the source program and the data directly at runtime and produces the output.]

◮ Compilers generate a program that has an effect on the world.

◮ Interpreters affect the world directly.

SLIDE 49

Interpreters


◮ The advantage of compilers is that generated code is faster, because a lot of work has to be done only once (e.g. lexing, parsing, type-checking, optimisation), and the results of this work are shared in every execution. The interpreter has to redo this work every time.

◮ The advantage of interpreters is that they are much simpler than compilers.

SLIDE 50

JIT compiler, key idea

SLIDE 59

JIT compiler, key idea

Interpret the program, and compile (parts of) the program at run-time. This suggests the following questions.

◮ When shall we compile, and which parts of the program?

◮ How do interpreter and compiled program interact?

◮ But most of all: compilation is really slow, especially optimising compilation. Don’t we make performance worse if we slow an already slow interpreter down with a lengthy compilation process?

In other words, we are facing the following conundrum:

◮ We want to optimise as much as possible, because optimised programs run faster.

◮ We want to optimise as little as possible, because running the optimisers is really slow.

Hmmmm ...

SLIDE 61

Pareto principle and compiler/interpreter ∆ to our rescue

[Diagram: timelines for interpreter vs compiler. The interpreter starts running immediately, but its cost is paid on every execution; the compiler spends time compiling first, a cost paid only once, after which the compiled program runs.]

Interpretation is much faster than (optimising) compilation. But a compiled program is much faster than interpretation. And we have to compile only once. Combine this with the Pareto principle, and you have a potent weapon at hand.

SLIDE 62

Pareto principle, aka 80-20 rule

SLIDE 66

Pareto principle, aka 80-20 rule

Vilfredo Pareto, late 19th / early 20th century Italian economist, noticed:

◮ 80% of the land in Italy was owned by 20% of the population.

◮ 20% of the pea pods in his garden contained 80% of the peas.

This principle applies in many other areas of life, including program execution: the great majority of a program’s execution time is spent running in a tiny fragment of the code. Such code is referred to as hot.

SLIDE 67

Putting the pieces together

SLIDE 71

Putting the pieces together

[Diagram: the interpreter vs compiler timelines from before.]

Clearly, compiling code that is executed infrequently at run-time will slow down execution. The trade-offs are different for hot code: an innermost loop may be executed billions of times, and the more often it runs, the more optimising compilation pays off.

Pareto’s principle tells us that (typically) a program contains some hot code. With the information available at run-time, we can aggressively optimise such hot code and get a massive speed-up. The rest is interpreted. The sluggishness of interpretation doesn’t matter, because it accounts for only a fraction of program execution time.

SLIDE 72

There is just one problem ... how do we find hot code?

SLIDE 79

There is just one problem ... how do we find hot code?

Remember, at compile time the optimiser couldn’t work it out (reliably). Let’s use counters at run-time! We instrument the interpreter with counters that are incremented every time a method is called, or every time we go round a loop.

Whenever one of these counters reaches a threshold, we assume that the associated code is hot. We compile that hot code, and jump to the compiled code. (Making this play nice with garbage collection, exceptions, concurrency and debugging isn’t easy ...) When the compiled code terminates, we switch back to interpretation.
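A minimal sketch of such instrumentation, assuming a toy interpreter that calls a hook at every loop back-edge; the counter map, the threshold value and compileAndRun are illustrative inventions:

import java.util.HashMap;
import java.util.Map;

class HotnessCounters {
    static final int THRESHOLD = 10_000;   // illustrative value
    static final Map<Integer, Integer> counters = new HashMap<>();

    // The interpreter calls this on every backward jump (loop head).
    static void onBackwardJump(int target) {
        int n = counters.merge(target, 1, Integer::sum);
        if (n == THRESHOLD)
            compileAndRun(target);         // the loop at 'target' is hot
    }

    static void compileAndRun(int target) {
        // Placeholder: a real JIT would compile the loop at 'target'
        // and jump into the generated machine code.
        System.out.println("loop at " + target + " is hot; compiling");
    }

    public static void main(String[] args) {
        for (int i = 0; i < 20_000; i++) onBackwardJump(42);
    }
}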

SLIDE 80

In a picture

SLIDE 81

In a picture

[Flowchart: Source code → Interpret, incrementing counters as we go. Hot code? No: keep interpreting. Yes: compile and optimise the hot code, execute the compiled hot code to termination, then return to the interpreter.]

SLIDE 82

Aside

SLIDE 85

Aside

Have you noticed that Java programs start up quite slowly? This is because at the beginning, everything is interpreted, hence slow. Then JIT compilation starts, also slow. Eventually, the hot code is detected and compiled with a great deal of optimisation. Then execution gets really fast.
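(For example, HotSpot's -Xint flag forces pure interpretation, and -Xcomp forces everything to be compiled up front; comparing either with the default adaptive mode makes this warm-up behaviour easy to observe.)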

SLIDE 86

The devil is in the details

SLIDE 91

The devil is in the details

This picture omits many subtleties. Chief among them: making the handover of control from the interpreter to the compiled code and back work seamlessly. Also, we don’t want to recompile code, so we typically use a cache of already compiled code. And how do we actually do the optimisations, taking the information available at run-time into account? Etc. etc.

SLIDE 92

JIT compilers summary

SLIDE 96

JIT compilers summary

JIT compilers are the cutting edge of compiler technology. They were first conceived (in rudimentary form) in the 1960s, but came to life in the last 10 years or so. JIT compilers are very complicated. The JVM, probably the best-known JIT compiler, took an estimated 1000+ person-years to build. So what’s next in compiler technology? Let me introduce you to ...

SLIDE 97

Tracing JIT compilers

SLIDE 98

Tracing JIT compilers

Tracing JIT compilers are a form of JIT compilation where optimisation is especially aggressive.

SLIDE 102

Tracing JIT compilers

Hot code can contain code that is not used (much). Imagine the compilation of:

for ( x = 1 to 1000000 )
    for ( y = 1 to 1000000 )
        try
            a[ x ][ y ] = a[ x+1 ][ a[ y-1 ][ y+1 ] ]
        catch ...   // error handling

Clearly the try-catch block sits in an innermost loop, so it is potentially hot code. But if the programmer does a good job, the exception handling will never be triggered. Yet we have all this exception-handling code (which tends to be large) in the hot loop. This causes all manner of problems, e.g. cache locality is destroyed.

SLIDE 105

Tracing JIT compilers

It is difficult, even at run-time (!), to find such rarely used parts. Why can’t we use counters? Yes, but ... counters only give us some of the relevant information; for good optimisation we need more. Traces give us this information. What are traces?

SLIDE 112

Tracing JIT compilers

Tracing JIT compilers have not one, but several compilers (or interpreters) inside (simplifying greatly).

After the interpreter has found hot code, the hot code is compiled and executed once (called the tracing execution). During the tracing execution, the machine code actually executed is recorded, yielding the trace of the hot code. Note that if the machine code to be traced branches, only the branch actually taken ends up in the trace. Traces are linear: no branching. This makes the optimisation algorithms much simpler and faster.

Once tracing has finished, e.g. once the body of the hot loop has been executed once, the trace is analysed and optimised. Based on this analysis, another compiler generates another (highly optimised) executable, which is run to termination; then control goes back to the interpreter.
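To make "linear trace with guards" concrete, here is a hypothetical sketch (my own illustration, not a real tracer's data structures): the taken branch is recorded as straight-line operations, and the untaken branch becomes a guard that bails out to the interpreter when its assumption fails.

import java.util.List;

class TraceSketch {
    interface Op { boolean exec(int[] regs); }   // false = guard failed

    // Trace recorded for the loop body "if (r0 > 0) { r1 += r0; r0 -= 1 }"
    // when the branch r0 > 0 was taken during the tracing execution:
    static final List<Op> trace = List.of(
        regs -> regs[0] > 0,                            // guard: r0 > 0
        regs -> { regs[1] += regs[0]; return true; },   // r1 = r1 + r0
        regs -> { regs[0] -= 1; return true; }          // r0 = r0 - 1
    );

    public static void main(String[] args) {
        int[] regs = { 5, 0 };
        // Re-run the linear trace until a guard fails; at that point a
        // real system would fall back to the interpreter.
        outer:
        while (true) {
            for (Op op : trace)
                if (!op.exec(regs)) break outer;
        }
        System.out.println(regs[1]);   // prints 15 = 5+4+3+2+1
    }
}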

SLIDE 113

Tracing JIT compilers

Analysing and optimising the trace:

◮ Find out if variables change type in the loop; if not, move the type-checking out of the loop. (For dynamically typed languages.)

◮ Find out if objects change type in the loop; if not, use short-cut method invocations, with no need to go via the method table.

◮ Let the interpreter handle the rarely used parts of the hot loop (e.g. error handling).

◮ ...

◮ Finally, enter the third phase, the ’normal’ execution of the optimised trace.

SLIDE 114

A tracing JIT compiler in a picture

[Flowchart: Source code → Interpret, incrementing counters. Hot code? No: keep interpreting. Yes: compile, execute the compiled code while recording a trace, analyse the trace, optimise the trace, then execute the optimised trace.]

SLIDE 117

Difficulties

As with normal JIT compilers, we have to orchestrate the interplay of all these compiler phases, e.g. the handover of control from interpreter to compiler, to tracing, to execution of the optimised trace, and back. Garbage collection, exceptions, concurrency etc. must all also work.

Typical optimisations: type specialisation, bypassing method invocation, function inlining, register allocation, dead-code elimination. Etc. etc.

SLIDE 118

Example compilers

SLIDE 123

Example compilers

The JVM (from Oracle). It is a method-based JIT compiler, meaning that methods are the units of compilation. It is not tracing.

The first implementation of a tracing JIT was HP’s Dynamo. It does not compile from a high-level language to a low-level language; instead it optimises machine code.

HotpathVM was the first tracing JIT for a high-level language (Java).

TraceMonkey, one of Firefox’s JavaScript implementations, was the first JIT compiler for JavaScript. (NB: Firefox’s current SpiderMonkey is not tracing.)

It is hard to say exactly who uses what (e.g. Apple’s Safari), since companies rarely say what they’re using, and they can use more than one. Trade secrets.

SLIDE 125

Example compilers

Open source: PyPy, a meta-tracing framework for Python. Meta-tracing, what’s that?

SLIDE 126

Meta-tracing?

SLIDE 133

Meta-tracing

Background: writing compilers is hard, writing optimising compilers is harder, writing JIT compilers is harder still, but writing tracing JIT compilers is the hardest.

Designers of new programming languages cannot realistically produce a good code generator for a new language. Typically, language designers write interpreters for new languages. But that means the new language is hampered by slow execution. This impedes progress in programming languages.

Great idea: how about using a JIT compiler to compile the interpreter, hoping that JITing will speed up the interpreter, and hence the new language?

This idea is ingenious, simple, old and ... wrong! The problem is that interpreter loops are the kind of loops that JITs do not optimise well. Let’s explain this in detail.

SLIDE 134

Why JIT compilers can’t optimise interpreter loops

SLIDE 137

Why JIT compilers can’t optimise interpreter loops

An interpreter is a big loop that gets the next command and acts on it, e.g.

while true do:
    cmd = getNextCommand
    if cmd is:
        "x := E"              then ...
        "if C then M else N"  then ...
        "while C do M"        then ...
        "repeat M until C"    then ...
        "print(M)"            then ...
        ...

Now, JIT compilers are really good at optimising loops, so why do they fail with interpreter loops?

SLIDE 142

Key requirements for good JIT optimisation of loops

The essence of JIT compilation is tight inner loops that are executed a large number of times. This insight can be split into separate parts.

◮ Because they are executed a large number of times, the effect of the optimisation is magnified.

◮ Optimising these inner loops heavily gives substantial performance benefits.

◮ Each iteration (or at least most of them) does the same thing.

The last requirement is violated in interpreter loops.

SLIDE 145

Why can’t interpreter loops be JITed?

The problem is that the source language to be interpreted has loops too. Let’s assume this is the program we are interpreting.

while i > 0:
    j = j+i
    i = i-1

This gives rise to something like the following bytecode:

loop: br r17 exit
      add r21 r33 r21
      subabs r33 1 r33
      jump loop
exit: ...

SLIDE 149

Why can’t interpreter loops be JITed?

Let’s have the bytecode and the bytecode interpreter side by side:

loop: br r17 exit
      add r21 r33 r21
      subabs r33 1 r33
      jump loop
exit: ...

while true:
    op = mem[ pc ]
    pc = pc+1
    case op = br:
        r = mem[ pc ]
        pc = pc+1
        if mem[ r ] == 0:
            pc := mem[ pc ]
    case op = add:
        r1 = mem[ pc ]
        pc = pc+1
        ...

Now every round of the interpreter loop takes a different branch. The tracing JIT can optimise just one branch through the loop. This is the worst-case scenario: we pay the price of tracing and optimisation (since the loop is executed a lot), only to throw away the optimisation and go back to interpretation.

SLIDE 154

Why can’t interpreter loops be JITed?

Profiling detects the wrong loop as hot code! We want profiling to detect the (code corresponding to the) user loop, not the interpreter loop. Note that the (code corresponding to the) user loop consists of several rounds of the interpreter loop. This is too difficult for profiling to detect, since user programs can vary greatly.

SLIDE 156

Why can’t interpreter loops be JITed?

The interpreter writer knows what the user loops are like:

while true do:
    cmd = getNextCommand
    if cmd is:
        "x := E"              then ...
        "if C then M else N"  then ...
        "while C do M"        then ...
        "repeat M until C"    then ...
        "print(M)"            then ...
        ...

The idea of meta-tracing is to let the interpreter writer annotate the interpreter code with ’hooks’ that tell the tracing JIT compiler where user loops start and end. The profiler can then identify the hot loops in (the interpretation of) user code.

SLIDE 159

Why can’t interpreter loops be JITed?

while true do:
    beginInterpreterLoop
    cmd = getNextCommand
    if cmd is:
        "x := E"           then ...
        "while C do M"     then beginUserLoop ... endUserLoop
        "repeat M until C" then beginUserLoop ... endUserLoop
        ...
    endInterpreterLoop

The annotations are used by the profiler for finding hot user loops. The user loops are then traced & optimised. Result: speedup! And it is simple to annotate an interpreter.

SLIDE 167

Meta-tracing as game changer in PL development

The real advantage of this is that it divides the problem of developing a high-performance JIT compiler for a language into several parts, each of which separately is much more manageable:

1. Develop a (meta-)tracing JIT compiler. Hard, but needs to be done only once.

2. Develop an interpreter for the given source language. Easy!

3. Add annotations in the interpreter to expose user loops. Easy!

4. Run the interpreter using the tracing JIT from (1). Easy!

The tracing JIT from (1) can be reused for an unlimited number of language interpreters. Once a meta-tracing JIT is available, we can easily develop new languages and have high-performance compilers for them (almost) for free. The PyPy meta-tracing framework runs Python substantially faster than e.g. the CPython implementation.

SLIDE 168

Brief remarks on performance

SLIDE 172

Brief remarks on performance

JIT compilers are built upon many trade-offs. Although JIT compilers can give lightning-fast execution on typical programs, their worst-case execution time can be dreadful.

JIT compilers work best for languages that do a lot of stuff at run-time (e.g. type-checking). For bare-bones languages like C, there is little to optimise at run-time, and code generated by a conventional C compiler with heavy (hence slow) optimisation will almost always beat a modern JIT compiler.

SLIDE 173

Compiler development in industry

SLIDE 176

Compiler development in industry

Lots of research is going on into compilers, both conventional and (tracing) JITs. It’s super high-tech.

Big companies (Google, Microsoft, Oracle, Intel, Arm, Apple) compete heavily on the quality (e.g. speed, energy usage) of their compilers. They have large teams working on this, and find it difficult to hire, because advanced compiler knowledge is rare.

Much work is left to be done.

SLIDE 177

Interested?

SLIDE 180

Interested?

Compilers (and related subjects) make a great subject for final-year projects. There is also the JRA (Junior Research Assistant) scheme in the summer 2019. Feel free to talk to me about this.
