SLIDE 1

Compiler Construction

Mayer Goldberg \ Ben-Gurion University Monday 2nd December, 2019

Mayer Goldberg \ Ben-Gurion University Compiler Construction 1 / 104

SLIDE 2

Chapter 4

Roadmap

▶ Scope
▶ The lexical environment
▶ Boxing

SLIDE 3

Scope

▶ Scope has to do with names as bindings:
  ▶ Variables
  ▶ Functions
  ▶ Methods
  ▶ Modules
  ▶ Packages
  ▶ Namespaces
  etc.
▶ Scope has to do with where, and under what circumstances, a binding is valid
☞ Where, and under what circumstances, is a name visible/accessible?

SLIDE 4

Scope (continued)

Here are the kinds of questions we might ask:

▶ Point at a specific variable declaration in your code: What parts of your code can access this variable?
  ⚠ Not another variable by the same name!
▶ Point at a variable occurrence anywhere in the code: Where is it defined?

If you grew up on 1-2 programming languages, you might wonder whether these are serious questions: After all, how many different ways can there be to relate names and bindings?

▶ Well, we’re going to see two very different ways: Dynamic Scope, and Lexical (aka Static) Scope

SLIDE 5

Scope (continued)

Dynamic Scope

▶ When you call a function, the function may take arguments. The values of these arguments are bound to special variables we call function parameters or just parameters.
▶ As we mentioned previously, the term dynamic means that something happens, is known, or is computed during run-time. It’s another way of saying that it does not, or cannot, happen, be known, or be computed at compile-time.
▶ As we examine dynamic scope, you should ask yourself what the dynamic properties of this particular scope are, and what they imply about the experience of programming in languages that make use of dynamic scope…

SLIDE 6

Scope (continued)

Dynamic Scope

Dynamic scope begins with a very natural intuition about variable name bindings:

▶ Upon a function call, push the bindings, i.e., the associations of names & values, onto the run-time stack
▶ When evaluating/executing the body of the function, look up the values of any variable names on the run-time stack
▶ Upon returning from the function call, pop the parameter-value bindings off of the stack, so the top of the stack is where it was just before the call
💤 Because this intuition is so natural, it took a couple of decades before people realized just how problematic it really is!

SLIDE 7

Scope (continued)

Dynamic Scope (continued)

Let us see what is implied by such a model

▶ Consider the following code, written in some dialect of Scheme that uses dynamic scope:

    > (define foo
        (lambda (n)
          (if (zero? n)
              k
              (+ n (foo (- n 1))))))
    > (let ((k 0))
        (foo 1000000))
    500000500000

SLIDE 8

Scope (continued)

Dynamic Scope (continued)

How does the code (let ((k 0)) (foo 1000000)) evaluate?

▶ The binding k:0 is pushed onto the stack
▶ The binding n:1000000 is pushed onto the stack
▶ The binding n:999999 is pushed onto the stack
▶ The binding n:999998 is pushed onto the stack
▶ …
▶ The binding n:0 is pushed onto the stack
▶ The value of k is searched from the top of the stack…
  ▶ The access time for variable lookup is dynamic, and a function of run-time data
  ▶ The access time is vastly different for different variables
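The trace above can be imitated with a small Python model. This is only a sketch under stated assumptions: the class `DeepBindingStack` and its method names are hypothetical, and the recursion depth is cut to 1000 so the demo runs instantly. It counts how many bindings are examined before a name is found.

```python
class DeepBindingStack:
    """One shared stack of (name, value) bindings; the top is the end of the list."""

    def __init__(self):
        self._bindings = []

    def push(self, name, value):
        self._bindings.append((name, value))

    def pop(self):
        self._bindings.pop()

    def lookup(self, name):
        """Search from the top of the stack; return (value, bindings_examined)."""
        for examined, (bound_name, value) in enumerate(reversed(self._bindings), start=1):
            if bound_name == name:
                return value, examined
        raise NameError(name)


stack = DeepBindingStack()
stack.push("k", 0)                    # models (let ((k 0)) ...)
for n in range(1000, -1, -1):         # models (foo 1000), (foo 999), ..., (foo 0)
    stack.push("n", n)

n_value, n_cost = stack.lookup("n")   # found at the very top of the stack
k_value, k_cost = stack.lookup("k")   # found only after walking past every n
```

In this model the cost of finding `n` is constant, while the cost of finding `k` grows with the depth of the recursion, which is the run-time dependence the slide points at.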

SLIDE 9

Scope (continued)

Dynamic Scope (continued)

▶ This behaviour is very different from what you see in C/C++ or Java or Scheme…
▶ This is called the deep binding implementation of dynamic scope
  ▶ It’s not the ideas here that are very “deep”, but rather the stack!
▶ Deep binding was used to implement LISP way back in 1959, when LISP was invented
▶ The terrible performance that resulted from deep binding gave LISP a bad reputation it had a hard time shaking off decades later…
▶ Because the problem was understood in terms of efficiency, a solution was soon found that was sufficiently efficient: The shallow binding implementation of dynamic scope

SLIDE 10

Scope (continued)

Dynamic Scope (continued)

Shallow binding is a difgerent implementation of dynamic scope:

▶ Rather than a single stack onto which bindings are pushed, a different data structure is used:
  ▶ A hash-table of stacks, with one stack per variable name:

[Figure: a hash table keyed by variable name (n, k, …), where each entry points to that variable's own stack of values]

▶ The current value of a variable is always at the top of the stack associated with that variable name, so access time is constant
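A hash table of stacks can be sketched in a few lines of Python. The class `ShallowBindingTable` and its method names are assumptions made for illustration, and a Python `dict` stands in for the hash table.

```python
class ShallowBindingTable:
    """A hash table mapping each variable name to its own stack of values."""

    def __init__(self):
        self._stacks = {}

    def push(self, name, value):
        # Entering a scope that binds `name`: push onto that name's stack only.
        self._stacks.setdefault(name, []).append(value)

    def pop(self, name):
        # Leaving the scope: the previous binding of `name` becomes current again.
        self._stacks[name].pop()

    def lookup(self, name):
        # The current value is always the top of the name's stack.
        stack = self._stacks.get(name)
        if not stack:
            raise NameError(name)
        return stack[-1]


table = ShallowBindingTable()
table.push("k", 0)
for n in range(1000, -1, -1):
    table.push("n", n)

k_value = table.lookup("k")          # no walk past the bindings of n
n_value = table.lookup("n")
table.pop("n")                       # return from the innermost call
n_after_return = table.lookup("n")
```

Each lookup is one hash-table access plus one read of the top of a stack, regardless of how many bindings of other names are live.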

SLIDE 11

Scope (continued)

Dynamic Scope (continued)

▶ Variable-lookup under shallow binding is much faster than under deep binding, although it’s still not as efficient as the situation in C/C++/Java, etc.
▶ The performance of dynamically-scoped programming languages was no longer atrocious
▶ Shallow binding replaced deep binding as the implementation technique for dynamic scope

SLIDE 12

Scope (continued)

Dynamic Scope (continued)

The problem is that there’s still something quite dynamic about shallow binding

▶ that has nothing to do with efficiency
▶ that poses a more significant problem than performance

SLIDE 13

Scope (continued)

Dynamic Scope (continued)

Imagine the following implementation of the map function:

▶ Problem: Find a function f and a list s, such that:
  ▶ s should be a finite list, without any circularities
  ▶ f should terminate on all inputs

And yet, despite the reasonableness of f & s, (map f s) should go into an infinite loop…

SLIDE 14

Scope (continued)

Dynamic Scope (continued)

▶ Let f be (lambda (a) (set! s '(la la la la)) a)
▶ Let s be '(mary had a little lambda!)
▶ And the following code enters into an infinite loop:

    > (map (lambda (a) (set! s '(la la la la)) a)
           '(mary had a little lambda!))

SLIDE 15

Scope (continued)

The defjnition of map

    (define map
      (lambda (f s)
        (if (null? s)
            '()
            (let ((x (f (car s))))
              (cons x (map f (cdr s)))))))

Enter an infinite loop…

    > (map (lambda (a) (set! s '(la la la la)) a)
           '(mary had a little lambda!))
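To see why this loops, note that under dynamic scope the `set!` inside f assigns to the most recent binding of s, which is map's own parameter. The Python sketch below models that behaviour with one stack of values per name. All helper names (`bind`, `unbind`, `lookup`, `assign`, `dyn_map`) and the `fuel` cap are assumptions added so the demonstration halts instead of looping forever.

```python
OUT_OF_FUEL = object()      # sentinel: the call budget ran out before s became empty
bindings = {}               # variable name -> stack of values (shallow binding)

def bind(name, value):
    bindings.setdefault(name, []).append(value)

def unbind(name):
    bindings[name].pop()

def lookup(name):
    return bindings[name][-1]

def assign(name, value):
    # Models set!: overwrite the most recent binding, whoever created it.
    bindings[name][-1] = value

def dyn_map(f, s, fuel):
    """map under dynamic scope; `fuel` caps the number of recursive calls."""
    if fuel == 0:
        return OUT_OF_FUEL
    bind("f", f)
    bind("s", s)
    try:
        if not lookup("s"):
            return []
        x = lookup("f")(lookup("s")[0])
        # If f assigned to s, (cdr s) below is taken from the clobbered list.
        rest = dyn_map(lookup("f"), lookup("s")[1:], fuel - 1)
        return OUT_OF_FUEL if rest is OUT_OF_FUEL else [x] + rest
    finally:
        unbind("s")
        unbind("f")

def clobber(a):
    assign("s", ["la", "la", "la", "la"])   # hits map's parameter s
    return a

looping = dyn_map(clobber, ["mary", "had", "a", "little", "lambda!"], fuel=50)
well_behaved = dyn_map(str.upper, ["mary", "had"], fuel=50)
```

With `clobber`, every recursive call receives a three-element list, so the empty-list test never succeeds and only the fuel cap stops the recursion.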

SLIDE 16

Scope (continued)

Dynamic Scope (continued)

So here’s the real problem with dynamic scope, regardless of how it is implemented (whether deep binding or shallow binding):

▶ The variable-bindings are visible across procedure boundaries:
  ▶ This means that if a procedure f has a local variable x, and f calls g, then g can access x, and any procedure called by g can access x
  ▶ This means that as long as your code calls functions written elsewhere, these functions may access any of your local variables for read/write
  ▶ This is a form of code-injection that cannot be sanitized away!

SLIDE 17

Scope (continued)

Dynamic Scope (continued)

So here’s the real problem with dynamic scope, regardless of how it is implemented (whether deep binding or shallow binding):

▶ Or from the perspective of the callee:
  ▶ If the callee relies on some global function, and some code along the call-chain defined a local variable by the same name, the callee is unable to access the global function, because it is overridden by a “closer”, local binding

SLIDE 18

Scope (continued)

Dynamic Scope (continued)

So here’s the real problem with dynamic scope, regardless of how it is implemented (whether deep binding or shallow binding):

▶ Under dynamic scope, the whole concept of correctness of the code takes on a weird, unexpected meaning:
  ▶ Correctness is no longer a feature of the source code, but of the specific circumstance in which it is used:
  ▶ Under dynamic scope, the correctness of the code becomes a feature of a specific instance of running the code: It becomes situational:
    ▶ One way of using the code will be correct, and yield correct results
    ▶ Another way of using the code will be incorrect, and yield incorrect results
  ▶ We cannot prove correctness and be assured the program will act correctly under all circumstances

SLIDE 19

Scope (continued)

Dynamic Scope (continued)

So here’s the real problem with dynamic scope, regardless of how it is implemented (whether deep binding or shallow binding):

▶ Dynamic scope loses something that today we generally take for granted
  ▶ There’s a name for this: It’s called referential transparency
  ▶ Referential transparency means that we may replace an expression with its value in any context in which the original expression appears, without changing the meaning/behaviour of the program
☞ Dynamic scope creates complex relations between different parts of a program, that are difficult to trace or predict, resulting in bugs that are difficult to find and remove

SLIDE 20

Scope (continued)

Dynamic Scope (continued)

▶ The main objection to dynamic scope isn’t efficiency:
  ▶ The issue of efficiency merely underscored the problems with a particular implementation of dynamic scope, namely deep binding
▶ The main objection to dynamic scope is from a software-engineering perspective
  ▶ Dynamically-scoped code is difficult to understand, and prone to errors that are difficult to trace
  ▶ Loss of referential transparency

SLIDE 21

Scope (continued)

Dynamic Scope (continued)

▶ From the late 1950’s up to the 1970’s, dynamic scope was popular in many dynamic programming languages: Early dialects of LISP, early dialects of Smalltalk, Snobol, Logo, most dialects of APL, Mathematica, Macsyma, bash, and many others
▶ Dynamic scope is optional in some modern programming languages

SLIDE 22

Scope (continued)

Dynamic Scope (continued)

▶ Dynamic scope pretty much disappeared as a mechanism for managing the scope of variables
▶ Newly-created languages do not use dynamic scope for variables
▶ Dynamic scope has nevertheless not retired:
  ☞ It is now used to manage the scope of exception handlers in most programming languages
  ▶ Unlike the management of the scope of variables, the scope of exception handlers mandates deep binding

SLIDE 23

Dynamic Scope (continued)

Exception handling

▶ Exception handling is a control mechanism that provides for a non-local exit-point where exceptional circumstances are handled
▶ Exceptional circumstances include error conditions, the unavailability of resources, and other problems
  ▶ Some exceptional circumstances warrant the termination of the program, in which case, action is taken to stabilize the computing environment: Closing files & network connections, de-allocating resources, etc.
  ▶ Some exceptional circumstances can be handled by the program, after which execution proceeds as usual
▶ Non-local exit-points are placed in the program where action is taken in response to exceptional circumstances

SLIDE 24

Exception handling (continued)

Non-locality

The significance of non-locality of the exception handler is that errors may be handled at very different places from where they occur:

▶ Consider an SQL query at one point in the program:
  ▶ The query fails because of the exceptional circumstance of a failure of a DBMS connection
  ▶ The code that sends the query might know nothing about the DBMS connection itself
  ▶ The code that handles the exceptional situation might know nothing about the SQL query
▶ So the exception handler and the SQL query appear in different places of the program
  ▶ Each having its own state
  ▶ Each having its own concerns

SLIDE 25

Exception handling (continued)

Non-locality (continued)

There may be more than a single handler for an exception:

▶ A later handler may override an earlier handler, or
▶ A later handler may provide some handling of the situation, and then delegate control to an earlier handler, to continue handling
  ▶ A later handler could save information on the query that had failed, so that it could be re-issued later
  ▶ An earlier handler could add a line to the system log
  ▶ Yet an earlier handler could attempt to restore the DBMS connection or try alternative servers elsewhere

And all these exception handlers would each do something, and then raise the same exception, triggering a cascading sequence of events
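The cascade described above can be sketched with nested handlers that each do their part and then re-raise. Everything here is hypothetical and for illustration only: the exception class `DBConnectionError`, the function names, and the `events` list that records the order in which handlers run.

```python
class DBConnectionError(Exception):
    """Hypothetical exception standing in for a failed DBMS connection."""

events = []   # records which handler ran, in order

def send_query(sql):
    # Pretend the connection is down: the failure originates here.
    raise DBConnectionError(sql)

def query_layer(sql):
    try:
        send_query(sql)
    except DBConnectionError:
        events.append("latest handler: saved query for re-issue")
        raise                       # delegate to the next-earlier handler

def logging_layer(sql):
    try:
        query_layer(sql)
    except DBConnectionError:
        events.append("earlier handler: wrote to system log")
        raise

def connection_layer(sql):
    try:
        logging_layer(sql)
    except DBConnectionError:
        events.append("earliest handler: tried to restore connection")

connection_layer("SELECT 1")
```

The handlers run from the most recently installed to the earliest, which is the LIFO order in which they were pushed as the calls nested.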

SLIDE 26

Dynamic Scope (continued)

Exception handling (continued)

▶ From the description of how exception handling is used, it follows that exception handlers must be arranged in a linear, LIFO structure: A stack
▶ Handlers must be picked in a LIFO manner, by their name/type
▶ When an exception handler is reached, all exception handlers for any exception that were placed on the stack after this handler are removed
▶ This implies deep-binding dynamic scope

SLIDE 27

Dynamic Scope (continued)

Exception handling (continued)

▶ If a program often raises exceptions whose handlers are far from the top of the stack, a heavy performance penalty will result
⚠ The situation in C++, and other languages that do not use dynamic memory management, is actually worse than in languages that do, because when an exception is raised, all objects allocated on the stack after the target handler must be de-allocated (as on function/method return).

SLIDE 28

Scope (continued)

Question

Code compiled to use dynamic scoping:

👏 Is always less efficient than code compiled to use lexical scoping
👏 Always uses more memory than code compiled to use lexical scoping
👏 Always behaves the same as code compiled to use lexical scoping
👏 Never contains pre-computed, static memory addresses
👎 Breaks our notion of correctness

SLIDE 29

Scope (continued)

Lexical Scope

▶ With dynamic scope, when a program calls some function or refers to some variable, and this function or variable is not a parameter of the function, it is often not possible to know where the function or variable is defined.
▶ The idea behind lexical scope (aka static scope) is that such questions should be answerable at compile-time, and known by the compiler
  ▶ Static scope means that the scope of a name can be determined before the program is executed
  ▶ Lexical scope means that the scope of a name is a lexical property of the code, namely, a property of the syntax
  ▶ Both lexical scope & static scope mean exactly the same thing, seen from two different angles

SLIDE 30

Scope (continued)

Lexical Scope

▶ That we can know where a name is defined means that we can know its address
▶ The absolute, physical (64-bit) address is a run-time artifact
▶ What can be known statically is the lexical address of any name
  ▶ The lexical address abstracts over the physical address
  ▶ The lexical address can be used to generate efficient assembly code
  ▶ The lexical address is relative

SLIDE 31

Scope (continued)

Lexical Scope

We recognize three kinds of variables in Scheme (and all lexically-scoped languages do something similar):

▶ Parameters
▶ Bound variables
▶ Free variables

SLIDE 32

Lexical Scope

Parameters

▶ Parameters are the variables procedures use to access their arguments
  ▶ Example: The parameters of the procedure (lambda (a b) (+ (* a a) (* b b))) are a, b
  ▶ The variables +, * are not parameters!
▶ The lexical address of a parameter is its 0-based index in the parameter-list of the lambda-expression in which it is defined
  ▶ Example: In the above code, a would be parameter-0, and b would be parameter-1

SLIDE 33

Lexical Scope (continued)

Bound variables

▶ The value of a lambda-expression is a closure
▶ A closure is a tagged data-structure that encloses a lexical environment and a code pointer:

[Figure: a CLOSURE record with two fields: lexical environment, code pointer]

▶ The lexical environment is similar to the state of an object
▶ The code pointer is similar to the address of a method
☞ Closures are like objects with a single method called apply

SLIDE 34

Example: Closure, bound variable

    > (define count
        (let ((n 0))
          (lambda ()
            (set! n (+ n 1))
            n)))
    > (count)
    1
    > (count)
    2

▶ The variable n is in the lexical environment of count

▶ n is a bound variable within the body of count

▶ The procedure count takes no parameters!

SLIDE 35

Lexical Scope

Bound variables

    (lambda (a)
      (a (lambda (b)
           (a b (lambda (c)
                  (a b c))))))

▶ Parameter occurrences (on the stack)
▶ Bound variable occurrences (in the lexical environment, stored in the heap)

SLIDE 36

Lexical Scope

The lexical environment

▶ We see that as the very same variable is accessed from within inner lambda-expressions, what used to be a parameter becomes a bound variable, and what used to reside on the stack now resides in the heap
▶ The process of copying values from the stack onto the heap extends the lexical environment of the outer lambda-expression, and the extended environment becomes the lexical environment of the inner lambda-expression
▶ The extended lexical environment is the environment used in the closure of the inner lambda-expression
▶ The maximal size of the lexical environment is the number of nested lambda-expressions minus 1

SLIDE 37

Lexical Scope

The lexical environment (continued)

▶ Question: What is the size of the largest lexical environment generated by the following code:

    (define foo
      (lambda (q a z)
        (lambda (w s x)
          (lambda (e d c)
            (lambda (r f v)
              (lambda (t g b)
                (+ q a z w s x e d c r f v t g b)))))))

☞ Answer: 4
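The answer can be checked mechanically. The sketch below is an illustration, not the course's code: it represents S-expressions as nested Python lists, handles only `lambda` and generic list forms, and uses a hypothetical helper `max_env_size` that counts, for every lambda-expression, how many lambda-expressions enclose it (that count is the number of ribs in its closure's environment).

```python
def max_env_size(expr, enclosing=0):
    """Largest environment size of any closure created by `expr`.

    `enclosing` is the number of lambda-expressions that surround `expr`.
    """
    if isinstance(expr, str):                       # a variable or keyword
        return 0
    if expr and expr[0] == "lambda":
        _, _params, body = expr
        # This lambda's closure holds one rib per enclosing lambda.
        return max(enclosing, max_env_size(body, enclosing + 1))
    return max((max_env_size(e, enclosing) for e in expr), default=0)


body = ["+"] + "q a z w s x e d c r f v t g b".split()
foo = ["define", "foo",
       ["lambda", ["q", "a", "z"],
        ["lambda", ["w", "s", "x"],
         ["lambda", ["e", "d", "c"],
          ["lambda", ["r", "f", "v"],
           ["lambda", ["t", "g", "b"], body]]]]]]

largest = max_env_size(foo)
```

Five nested lambda-expressions give a largest environment of size 4, matching the rule of nesting depth minus 1.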

SLIDE 38

Lexical Scope

The lexical environment (continued)

▶ The lexical addressing for bound variables is tightly coupled with the representation of the environment
  ▶ We can discuss lexical addressing abstractly, without committing to any specific data structures with which to represent the environment
  ▶ To implement the lexical environment efficiently, we must delve into the nitty-gritty details of how and where things are represented, so we can compute the lexical address of a variable
▶ The diagram on the next slide describes
  ▶ The structure of the system stack
  ▶ The structure of the activation frame
  ▶ The lexical environment: its structure & location on the stack
  ▶ The extended environment: its structure & relation to the environment it extends

SLIDE 39

Lexical Scope

The lexical environment (continued)

[Figure: the pushdown (system) stack and the lexical environment. An activation frame holds, pushed by the caller: argument n-1, argument n-2, …, argument 0, arg count = n, lexical environment, return address; and, pushed by the callee: old frame pointer, local data. The lexical environment and the extended lexical environment are drawn as vectors pointing to ribs.]

Extending the lexical environment involves copying the major ribs from the old environment, allocating the new, zeroth rib, according to the number of parameters in the current frame, and copying the parameters from the stack onto the zeroth rib

SLIDE 40

Lexical Scope

The lexical environment

☞ You should commit the above diagram to memory, and understand each and every component in it!
☞ Our implementation of lexical scope is very different from the one you learned in PPL (Principles of Programming Languages); We shall discuss both implementations later on, but in the meantime, know that our implementation is designed for efficiency
☞ Our implementation of lexical scope is the same as that of most functional and classless object-oriented programming languages, and mutatis mutandis similar to that of most class-based object-oriented programming languages

SLIDE 41

Lexical Scope

The lexical environment (continued)

▶ The lexical environment is a vector of vectors
  ▶ This is not the same as a two-dimensional array: The inner vectors needn’t be the same size
▶ The lexical address of a bound variable consists of two integers, indexing the major and minor vectors, respectively
▶ Example: What is the address of x in:

    (lambda (x)
      (lambda (y z)
        (lambda (t)
          x)))

▶ Answer: bound-1-0.

SLIDE 42

Lexical Scope

The lexical environment (continued)

▶ Example: What are the addresses of x in:

    (lambda (x)
      (x (lambda (y)
           (x y (lambda (z)
                  (x y z))))))

▶ x
  ▶ parameter-0
  ▶ bound-0-0
  ▶ bound-1-0

SLIDE 43

Lexical Scope

The lexical environment (continued)

▶ Example: What are the addresses of y & z in:

    (lambda (x)
      (x (lambda (y)
           (x y (lambda (z)
                  (x y z))))))

▶ y
  ▶ parameter-0
  ▶ bound-0-0
▶ z
  ▶ parameter-0

SLIDE 44

Scope (continued)

Lexical Scope (continued)

Summing up the above example:

▶ Notice that different variables can have the same lexical address
▶ Notice that the same variable can have different lexical addresses
▶ The lexical address is relative
  ▶ Relative to the lexical environment
  ▶ Relative to the frame pointer

SLIDE 45

Scope (continued)

Lexical Scope (continued)

▶ An empty lexical environment is one that contains no variables
▶ The lexical environment is constructed incrementally, by extending an existing environment, starting all the way from the empty lexical environment
🤕 When is the lexical environment extended?
☞ Answer: There are two possibilities:
  ▶ During application (PPL course)
  ▶ As part of the creation of a new closure (Compilers course)
▶ The lexical address of free variables is the address of their values in the top-level
  ▶ The top-level & its structure shall be discussed later on, as we approach the topic of code generation

SLIDE 46

Lexical Scope (continued)

Extending the lexical environment

▶ Customarily all LISP/Scheme interpreters (ever since the first one in 1959) extend the environment during application: Recall that your Scheme interpreter:
  ▶ did not use a stack
  ▶ did not distinguish between parameters and bound variables
  ▶ implemented all variables in an “environment”
    ▶ Including free variables, which made up the initial environment
  ▶ was not very efficient 😊
▶ Customarily all stack-based LISP/Scheme compilers extend the environment during the creation of closures:
  ▶ Parameters are distinguished from bound variables
  ▶ Parameters live on the stack
  ▶ Bound variables live in lexical environments
  ▶ Free variables live in hash tables

SLIDE 47

Scope (continued)

Lexical Scope (continued)

What you did in your PPL interpreters:

▶ You extended the environment on applications
▶ You copied the address of the environment on closure-creation

This means that

▶ Constructing new closures was very cheap
▶ Applications were expensive

🤕 Which is more common?

SLIDE 48

Scope (continued)

Lexical Scope (continued)

What we shall do in our compilers:

▶ We extend the environment on closure-creation
▶ We push [the address of] the environment onto the stack on application

This means that

▶ Constructing new closures is expensive
▶ Applications are cheap

🤕 Which is more common?

SLIDE 49

Scope (continued)

Lexical Scope (continued)

In this course:

▶ By the lexical environment we mean only the vector of vectors that implements bound variables, and that is part of the closure data-structure
▶ Lexical environments do not maintain:
  ▶ Parameters: Those live on the stack
  ▶ Global variables: Those live in the top-level environment, which is a hash-table, rather than a lexical environment
☞ In PPL, the global environment was also called the initial environment, and all lexical environments were extensions of this initial environment
☞ In the Compilers Course, unlike in PPL, we do not build lexical environments on top of the global, top-level environment

SLIDE 50

Scope (continued)

Lexical Scope (continued)

In this course:

▶ Variable names are not important in the compiler:
  ▶ Names are used for writing code and for debugging the compiler
  ▶ Names are compiled away into lexical addresses, which are symbolic representations for locations:
    ▶ Parameters live on the stack
    ▶ Bound variables live in lexical environments
    ▶ Free variables live in the top-level
☞ By assigning to each class of variables its own access-mechanism, supported by appropriate data structures and addressing schemas, we achieve faster access-time!

SLIDE 51

Scope (continued)

Lexical Scope (continued)

Extending the lexical environment during the creation of closures isn’t very inefficient either, however:

▶ The average size of the lexical environment is around 1.2. This means that
  ▶ Most people don’t write code with nested lambda-expressions
  ▶ Most people don’t rely on nested procedures for abstraction
☞ Moving the cost of extending the lexical environment to the creation of new closures pays off, because people create far fewer closures than they apply!
▶ The same is true in OOPLs
  🤕 How often have you seen code that uses nested classes, and how deep was the nesting?

SLIDE 52

Scope (continued)

What happens during closure creation

① A new environment is allocated
  ▶ The size of the new env is 1 + the size of the old env
② The addresses of the minor vectors are copied from the old env to the ext env
  ▶ $\mathrm{ExtEnv}_{j+1} \leftarrow \mathrm{Env}_j,\quad j = 0, 1, \ldots, |\mathrm{Env}| - 1$
③ A new rib is allocated for $\mathrm{ExtEnv}_0$:
  ▶ $\mathrm{ExtEnv}_0[j] \leftarrow \mathrm{Param}_j,\quad j = 0, 1, \ldots, \mathrm{ParamCount} - 1$
☞ The env is now extended!
④ A closure data-structure is allocated
⑤ The closure is set to point to the extended lexical environment, and to the code
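The five steps can be sketched in Python, with lists standing in for vectors. This is an illustration under assumptions, not the course's generated code: the names `Closure`, `extend_env` and `make_closure` are hypothetical, and `params` stands for the parameter values that would be read off the current stack frame.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Closure:
    env: List[list]      # the lexical environment: a vector of ribs
    code: Callable       # the code pointer


def extend_env(env, params):
    ext_env = [None] * (1 + len(env))     # step 1: size is 1 + size of old env
    for j, rib in enumerate(env):         # step 2: copy rib addresses, not contents
        ext_env[j + 1] = rib
    ext_env[0] = list(params)             # step 3: new zeroth rib holds the parameters
    return ext_env


def make_closure(env, params, code):
    # steps 4 and 5: allocate the closure and point it at the extended env and the code
    return Closure(env=extend_env(env, params), code=code)


old_env = [["a0"], ["b0", "b1"]]
closure = make_closure(old_env, params=[1, 2, 3], code=lambda: None)
```

Only references to the old ribs are copied, so the cost of creating a closure is proportional to the size of the environment plus the number of parameters, and the old ribs are shared rather than duplicated.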

SLIDE 53

Scope (continued)

Lexical Scope (continued)

To acquire some intuition as to why it is more efficient to extend the environment during the creation of closures rather than during function application, just think about the analogous situation in OOPLs:

▶ The creation of closures is very similar to the creation of objects
▶ The application of closures is very similar to the application of methods
🤕 Which implementation of an OOPL would you rather have:
  ▶ One that makes it cheap to create objects, and expensive to call their methods
  ▶ One that makes it expensive to create objects, and cheap to call their methods
🤕 Which does your code do more: Create objects or call methods?

SLIDE 54

Scope (continued)

Lexical Scope (continued)

[Figure: the pushdown (system) stack and the lexical environment. An activation frame holds, pushed by the caller: argument n-1, argument n-2, …, argument 0, arg count = n, lexical environment, return address; and, pushed by the callee: old frame pointer, local data. The lexical environment and the extended lexical environment are drawn as vectors pointing to ribs.]

Extending the lexical environment involves copying the major ribs from the old environment, allocating the new, zeroth rib, according to the number of parameters in the current frame, and copying the parameters from the stack onto the zeroth rib

SLIDE 55

Scope (continued)

What happens during procedure calls

▶ Before the call
  ① The arguments are evaluated and pushed from last to first
  ② The number of arguments is pushed
    ▶ This supports procedures with an indefinite number of arguments
  ③ The procedure-expression is evaluated
    ▶ Verify that the value is indeed a closure!
  ④ The lexical environment of the closure is pushed
  ⑤ Call the code-pointer of the closure
    ▶ Calls in tail-position are handled differently! More on this later…
▶ After the call
  ① The stack is restored to the state before the call
    ▶ Again, tail-calls make this tricky! More on this later…
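The calling sequence can be modelled with a Python list as the stack. This sketch makes several assumptions: the names `Closure`, `call` and `add_all` are hypothetical, the return address and old frame pointer are omitted because Python's own call mechanism plays that role, and tail-calls are not modelled.

```python
class Closure:
    def __init__(self, env, code):
        self.env = env       # lexical environment
        self.code = code     # code pointer: here, a function that reads the stack


def call(stack, proc, args):
    depth_before = len(stack)
    try:
        for arg in reversed(args):          # 1. push arguments, last to first
            stack.append(arg)
        stack.append(len(args))             # 2. push the argument count
        if not isinstance(proc, Closure):   # 3. verify the operator is a closure
            raise TypeError("attempt to apply a non-closure")
        stack.append(proc.env)              # 4. push the closure's environment
        return proc.code(stack)             # 5. call the code pointer
    finally:
        del stack[depth_before:]            # after the call: restore the stack


def add_all(stack):
    """Variadic callee: stack[-1] is the env, stack[-2] the count, stack[-3 - i] argument i."""
    arg_count = stack[-2]
    return sum(stack[-3 - i] for i in range(arg_count))


stack = []
total = call(stack, Closure(env=[], code=add_all), [1, 2, 3, 4])
```

Because the count sits at a fixed offset from the top of the frame, the callee can find argument i without knowing in advance how many arguments were passed.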

SLIDE 56

Chapter 4

Roadmap

🗹 Scope
▶ The lexical environment
▶ Boxing

SLIDE 57

Lexical Scope (continued)

Sharing the lexical environment

Closures can share code-pointers, environments, and also parts of the lexical environment:

▶ Closures with different environments and the same code-pointer
▶ Closures with the same environment and different code-pointers
▶ Closures with partly-shared environments and different code-pointers

SLIDE 58

Lexical Scope (continued)

Sharing the lexical environment

Closures with different environments and the same code-pointer:

(define ^count
  (lambda ()
    (let ((n 0))
      (lambda ()
        (set! n (+ n 1))
        n))))

(define count-1 (^count))
(define count-2 (^count))


Lexical Scope (continued)

Sharing the lexical environment

Closures with different environments and the same code-pointer:

> (count-1)
1
> (count-1)
2
> (count-1)
3
> (count-2)
1
> (count-2)
2
> (count-1)
4


Lexical Scope (continued)

Sharing the lexical environment

Closures with the same environment and different code-pointers:

(define count #f)
(define reset #f)

(let ((n 0))
  (set! count
    (lambda ()
      (set! n (+ 1 n))
      n))
  (set! reset
    (lambda ()
      (set! n 0))))


Lexical Scope (continued)

Sharing the lexical environment

Closures with the same environment and different code-pointers:

> (count)
1
> (count)
2
> (count)
3
> (reset)
> (count)
1
> (count)
2


Lexical Scope (continued)

Sharing the lexical environment

Closures sharing parts of their lexical environments in a tree-like fashion:

(define f
  (lambda (q a)
    (lambda (z w s)
      (lambda (x e d c)
        ... ))))

(define f12 (f 1 2))
(define f34 (f 3 4))
(define f12345 (f12 3 4 5))
(define f12678 (f12 6 7 8))
(define f12345abcd (f12345 'a 'b 'c 'd))
(define f12345efgh (f12345 'e 'f 'g 'h))


Lexical Scope (continued)

Closures sharing parts of their lexical environments in a tree-like fashion:

[Figure: the environments Env 1 through Env 6, built from ribs holding the values (1 2), (3 4), (3 4 5), (6 7 8), (a b c d), and (e f g h).]


Lexical Scope (continued)

Sharing the lexical environment

Closures sharing parts of their lexical environments in a tree-like fashion: What parts of the code are shared:

[Figure: sharing diagram relating Env 1, Env 3, Env 4, Env 2, Env 5, and Env 6.]
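The tree-like sharing can be observed in a small Scheme model. This is a sketch under assumptions, not the slides' own code: environments are modelled as vectors of ribs, extend-env is a hypothetical helper, and the env-… names below are illustrative and do not reproduce the Env 1 to Env 6 numbering of the figure.

```scheme
;; Hypothetical model: an environment is a vector of ribs (vectors of values).
;; The new environment holds a fresh zeroth rib followed by the old rib pointers.
(define extend-env
  (lambda (old-env params)
    (list->vector
     (cons (list->vector params) (vector->list old-env)))))

(define empty-env (vector))

;; Environments as they would arise from the calls to f, f12 and f12345.
(define env-12       (extend-env empty-env '(1 2)))
(define env-34       (extend-env empty-env '(3 4)))
(define env-12-345   (extend-env env-12 '(3 4 5)))
(define env-12-678   (extend-env env-12 '(6 7 8)))
(define env-abcd     (extend-env env-12-345 '(a b c d)))
(define env-efgh     (extend-env env-12-345 '(e f g h)))

;; The rib (1 2) is one object, reachable from several environments.
(eq? (vector-ref env-12-345 1) (vector-ref env-12-678 1)) ; → #t
;; Separate calls to f create separate ribs.
(eq? (vector-ref env-12 0) (vector-ref env-34 0))         ; → #f
```

Each extension allocates only one new rib, so environments that stem from the same outer call form a tree whose interior ribs are shared.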


Scope (continued)

Lexical addressing in Comp vs PPL

▶ In the compiler-construction course, we extend the lexical environment during the creation of new closures, and push it during application
  ▶ As a result, we distinguish between parameters & bound variables
▶ In the PPL course, we extend the lexical environment during application, and save it during the creation of new closures
  ▶ As a result, we do not distinguish between parameters & bound variables
  ▶ What are referred to as parameters in the compiler-construction course are simply the zeroth rib in the lexical environment
▶ This difference has an effect on the computation of lexical addresses


Scope (continued)

Example:

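Each variable occurrence below is annotated with its lexical address: [p i] marks parameter number i, and [b i,j] marks a bound variable at rib (major index) i, position (minor index) j.

Lex Addr for Comp Const

(lambda (x)
  (x[p0]
   (lambda (y)
     (x[b0,0] y[p0]
      (lambda (z)
        (x[b1,0] y[b0,0] z[p0]))))))

Lex Addr for PPL

(lambda (x)
  (x[b0,0]
   (lambda (y)
     (x[b1,0] y[b0,0]
      (lambda (z)
        (x[b2,0] y[b1,0] z[b0,0]))))))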
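The two conventions can be compared with a small sketch that computes the address of a variable occurrence. The representation is an assumption made for illustration: params is the parameter list of the innermost enclosing lambda, env is the list of parameter lists of the enclosing lambdas (innermost first), and the names index-of, comp-address and ppl-address are hypothetical.

```scheme
;; Position of x in lst, or #f if absent.
(define index-of
  (lambda (x lst)
    (let loop ((lst lst) (i 0))
      (cond ((null? lst) #f)
            ((eq? (car lst) x) i)
            (else (loop (cdr lst) (+ i 1)))))))

;; Compiler-construction convention: parameters are addressed separately
;; from bound variables, and rib 0 belongs to the enclosing lambda.
(define comp-address
  (lambda (x params env)
    (let ((minor (index-of x params)))
      (if minor
          (list 'param minor)
          (let loop ((env env) (major 0))
            (if (null? env)
                (list 'free x)
                (let ((minor (index-of x (car env))))
                  (if minor
                      (list 'bound major minor)
                      (loop (cdr env) (+ major 1))))))))))

;; PPL convention: the parameters simply are rib 0.
(define ppl-address
  (lambda (x params env)
    (comp-address x '() (cons params env))))

;; Occurrences inside the innermost (lambda (z) ...) of the example:
(comp-address 'x '(z) '((y) (x))) ; → (bound 1 0)
(comp-address 'z '(z) '((y) (x))) ; → (param 0)
(ppl-address  'x '(z) '((y) (x))) ; → (bound 2 0)
(ppl-address  'z '(z) '((y) (x))) ; → (bound 0 0)
```

Under this model the PPL major index of a bound variable is always one more than its compiler-construction major index, which matches the two annotated versions of the example.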


Scope (continued)

Distinguishing scope

Suppose we’re working on some system for which we have no manual… How can we tell the scope?

▶ The trick is to refer to some free variable from within a procedure
  ▶ Define the variable globally with one value
  ▶ Define another variable by the same name locally within a second procedure, with a different value
  ▶ Call the second procedure, which in turn calls the first
▶ If we get the local value, we’re running under dynamic scope
▶ If we get the global value, we’re running under lexical scope


Scope (continued)

Distinguishing scope

(define *free-variable* 'lexical-scope)

(define return-free-variable
  (lambda ()
    *free-variable*))

(define get-scope
  (lambda ()
    (let ((*free-variable* 'dynamic-scope))
      (return-free-variable))))

Call the procedure get-scope to find whether you’re running under lexical scope or dynamic scope…
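In a standard, lexically scoped Scheme, get-scope returns lexical-scope. To see what the other answer would look like, the sketch below emulates a dynamic binding by saving the global value, assigning the new one, and restoring it afterwards. The procedure get-scope-emulating-dynamic is a hypothetical addition and only approximates what a dynamically scoped system would do on its own.

```scheme
(define *free-variable* 'lexical-scope)

(define return-free-variable
  (lambda ()
    *free-variable*))

;; The test from the slide: the let creates a new, lexical binding that
;; return-free-variable cannot see.
(define get-scope
  (lambda ()
    (let ((*free-variable* 'dynamic-scope))
      (return-free-variable))))

;; Emulation of a dynamic binding: save, assign, call, restore.
(define get-scope-emulating-dynamic
  (lambda ()
    (let ((saved *free-variable*))
      (set! *free-variable* 'dynamic-scope)
      (let ((result (return-free-variable)))
        (set! *free-variable* saved)
        result))))

(get-scope)                   ; → lexical-scope
(get-scope-emulating-dynamic) ; → dynamic-scope
```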


Chapter 4

Roadmap

🗹 Scope
▶ The lexical environment
▶ Boxing


The OOP world

What we learned about lexical scope is not unique to LISP/Scheme or even to [quasi-]functional programming languages

▶ All modern programming languages use lexical scope
▶ Any language that supports higher-order procedures supports closures and the sharing of parts of environments
▶ Compiling methods is very similar to compiling closures
▶ The run-time behaviour of methods is very similar to that of closures
▶ Objects are very similar to lexical environments

☞ We would like to explore these similarities

▶ Learn how to compile OOPLs
▶ Leverage our intuition about OOPLs
▶ Leverage our intuition about functional programming


The OOP world (continued)

Closures & Objects

▶ A closure is a data structure that combines a lexical environment & some code:

[Figure: a CLOSURE record with two fields: lexical environment, code pointer]

▶ What if we wanted to have more than one code-pointer?


The OOP world (continued)

Closures & Objects (continued)

Closures with more than one code pointer:

① Have a function return a list/vector of functions
② Have a function take a function of several arguments and apply it to several functions


The OOP world (continued)

Simple, Object-Oriented-like count/reset

① Here’s how to define it

(define make-counter
  (lambda ()
    (let ((n 0))
      (let ((count (lambda () (set! n (+ n 1)) n))
            (reset (lambda () (set! n 0))))
        (list count reset)))))


The OOP world (continued)

Simple, Object-Oriented-like count/reset

① Here’s how to use it

> (define c1 #f)
> (define c2 #f)
> (define r1 #f)
> (define r2 #f)
> (apply (lambda (_c1 _r1)
           (set! c1 _c1)
           (set! r1 _r1))
         (make-counter))
> (apply (lambda (_c2 _r2)
           (set! c2 _c2)
           (set! r2 _r2))
         (make-counter))


The OOP world (continued)

Simple, Object-Oriented-like count/reset

① Here’s how to use it

> (c1)
1
> (c1)
2
> (c2)
1
> (c2)
2
> (r1)
> (c2)
3
> (c1)
1


The OOP world (continued)

Simple, Object-Oriented-like count/reset

② Here’s how to define it

(define make-counter
  (lambda ()
    (let ((n 0))
      (let ((count (lambda () (set! n (+ n 1)) n))
            (reset (lambda () (set! n 0))))
        (lambda (u) (u count reset))))))


The OOP world (continued)

Simple, Object-Oriented-like count/reset

② Here’s how to use it

> ((make-counter)
   (lambda (_c1 _r1)
     (set! c1 _c1)
     (set! r1 _r1)))
> ((make-counter)
   (lambda (_c2 _r2)
     (set! c2 _c2)
     (set! r2 _r2)))


The OOP world (continued)

Simple, Object-Oriented-like count/reset

② Here’s how to use it

> (c1)
1
> (c1)
2
> (c2)
1
> (c2)
2
> (r1)
> (c2)
3
> (c1)
1


The OOP world (continued)

Simple, Object-Oriented-like count/reset

▶ As you can see, the two implementations behave identically
▶ We can associate any number of “methods” with the same lexical environment
▶ These “methods” are used similarly to how methods are used in OOPLs


The OOP world (continued)

Closures & Objects (continued)

▶ A classless object (JavaScript-like) is a structure very similar to a closure:
  ▶ It contains state, in the form of instance variables
  ▶ It contains pointers to code

☞ Issue: The size of the object can be very large

[Figure: CLASSLESS OBJECT, with fields ivar-1, ivar-2, ⋯, ivar-n, method-1, method-2, method-3, method-4, ⋯, method-m]


The OOP world (continued)

Closures & Objects (continued)

A classless object (JavaScript-like):

[Figure: CLASSLESS OBJECT, with fields ivar-1, ivar-2, ⋯, ivar-n, method-1, method-2, method-3, method-4, ⋯, method-m]

☞ Issue: The size of the object can be very large

▶ Observation: While the ivars may change, the code pointers do not
▶ Observation: The code pointers are common to all objects of this kind


The OOP world (continued)

Introducing classes

▶ Classes contain all data that is shared by all objects of the same kind
  ▶ Virtual-Method Table (VMT)
  ▶ Static, class vars
  ▶ Static, class methods
  ▶ In Smalltalk: Collection of all instances of the class
  ▶ Various data in support of administration & reflection
▶ Moving from instance-based OOP to class-based OOP can be thought of as an extreme case of refactoring using the flyweight pattern
▶ After refactoring, the instance becomes much smaller, consisting of only the instance variables + a pointer to the corresponding class


The OOP world (continued)

Introducing classes

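[Figure: CLASS Foo, comprising CVARS (cvar-1, cvar-2, ⋯, cvar-k), the VMT (method-1, method-2, method-3, method-4, ⋯, method-m), and CMETHODS (method-1, method-2, method-3, ⋯, method-m); and OBJECT Foo, comprising a class: pointer to CLASS Foo followed by ivar-1, ivar-2, ⋯, ivar-n]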


The OOP world (continued)

Object creation

▶ Allocate memory for object
▶ Initialize instance variables
▶ Link the object to its class
  ▶ In Smalltalk: Add the instance to the instances-container in the class object
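The creation steps above can be sketched with vectors. The layout is an assumption chosen for illustration: a class is a vector holding a name, a VMT, and a list of instances, and an object is a vector whose slot 0 points to its class and whose remaining slots are the instance variables. The names make-class, class-instances and new-object are hypothetical.

```scheme
;; Hypothetical layout:
;;   class  = #(name vmt instances)
;;   object = #(class ivar-1 ... ivar-n)
(define make-class
  (lambda (name vmt)
    (vector name vmt '())))

(define class-instances
  (lambda (cls)
    (vector-ref cls 2)))

(define new-object
  (lambda (cls ivar-values)
    ;; Allocate memory for the object: one slot for the class pointer,
    ;; one slot per instance variable.
    (let ((obj (make-vector (+ 1 (length ivar-values)))))
      ;; Initialize the instance variables.
      (let loop ((i 1) (vals ivar-values))
        (if (pair? vals)
            (begin
              (vector-set! obj i (car vals))
              (loop (+ i 1) (cdr vals)))))
      ;; Link the object to its class.
      (vector-set! obj 0 cls)
      ;; Smalltalk-style: record the instance in the class object.
      (vector-set! cls 2 (cons obj (vector-ref cls 2)))
      obj)))

;; Usage: a class with an empty VMT and one instance.
(define point-class (make-class 'point (vector)))
(define p (new-object point-class '(3 4)))
```

The instance carries only its own state plus one pointer; everything shared lives in the class vector.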


The OOP world (continued)

Virtual-method call

▶ Upon call
  ▶ Evaluate method arguments, push values from last to first
  ▶ Optionally: Push the number of arguments
  ▶ Push this/self
  ▶ De-reference this → class → VMT[· · ·] to arrive at the address of the method
  ▶ Call the method
  ▶ For tail-calls, handle differently: More on this later, when we cover the tail-call optimization
▶ Upon return
  ▶ Restore stack to its position before the call
  ▶ Again, tail-calls make this tricky! More on this later…
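The de-referencing chain this → class → VMT[· · ·] can be sketched as follows. The layout is assumed for illustration only: an object is a vector whose slot 0 points to its class, a class is a vector whose slot 0 is the VMT, and methods take this as their first argument. The names send, counter-class, c1 and c2 are hypothetical; pushing arguments and restoring the stack are left to the host Scheme.

```scheme
;; Hypothetical layout: object = #(class ivar ...), class = #(vmt).
(define send
  (lambda (this method-index . args)
    (let* ((cls (vector-ref this 0))               ; this → class
           (vmt (vector-ref cls 0))                ; class → VMT
           (method (vector-ref vmt method-index))) ; VMT[index]
      (apply method this args))))                  ; call with this/self

;; A counter class: VMT slot 0 is count, slot 1 is reset.
;; The single instance variable n lives in slot 1 of each object.
(define counter-class
  (vector
   (vector
    (lambda (this)
      (vector-set! this 1 (+ (vector-ref this 1) 1))
      (vector-ref this 1))
    (lambda (this)
      (vector-set! this 1 0)))))

;; Two instances sharing one VMT.
(define c1 (vector counter-class 0))
(define c2 (vector counter-class 0))
```

Compared with the closure-based make-counter, the code pointers are stored once in the class, and each object holds only n and a pointer to counter-class.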


The OOP world (continued)

Closures & Objects (continued)

Summary:

▶ Objects & closures are similar
▶ Calling a method & calling a closure are similar
▶ The lexical environment & this/self are similar
▶ Bound variables & instance variables are similar
▶ Functions are constructors for objects of the type of the body of the function:
  ▶ cos is a constructor of floating-point numbers
  ▶ string-append is a constructor of strings
  etc.
▶ Closures are objects with a single method, apply


Further reading

🔘 The Flyweight Pattern
🕯 Design Patterns: Elements of Reusable Object-Oriented Software
🕯 Refactoring to Patterns


Chapter 4

Roadmap

🗹 Scope
🗹 The lexical environment
▶ Boxing


Lexical scope, sharing (continued)

We mentioned before that as we evaluate inner lambda-expressions, parameters are copied from the stack onto the extended lexical environment of the closure that is the value of the inner lambda-expression:

▶ Until control is returned from the outer lambda-expression, the variables are both on the stack and in a lexical environment
▶ That the value of a variable is duplicated and appears simultaneously at addresses A & B raises the question of whether a change to the object at A would/could/should be observable at location B

☞ Solution: Move a pointer away!


Lexical scope, sharing (continued)

Moving a pointer away

▶ The object appears in only one place: C
▶ Both A & B contain the address of the object, i.e., C
▶ “Changing the object at address A” means de-referencing the pointer in A, which gives C, and changing the one and only occurrence of the object, which is in C. Such a change would be observable from anywhere that contains the address of C.
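The difference between duplicating a value and duplicating a pointer can be seen in a few lines of Scheme. In this sketch a one-element vector plays the role of location C; the variable names are hypothetical.

```scheme
;; Without the extra pointer: A and B each hold their own copy of the value.
(define a 5)
(define b a)
(set! a 6)
b                        ; → 5, the change to A is not observable at B

;; One pointer away: the object lives only in cell-c (location C),
;; and both loc-a and loc-b hold the address of that cell.
(define cell-c (vector 5))
(define loc-a cell-c)
(define loc-b cell-c)
(vector-set! loc-a 0 6)  ; "changing the object at A" changes the contents of C
(vector-ref loc-b 0)     ; → 6, observable through B
```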


Lexical scope, sharing (continued)

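interface I { void foo(int x); }
...
int z;
...
I obj1 = new I() {
  public void foo(int x) {
    I obj2 = new I() {
      public void foo(int y) {
        z = (++x) * (--y);
      }
    };
    ...
  }
};

A parameter of a method of an outer class that is used within an inner class must be declared final (since Java 8 it may instead be effectively final, i.e., never assigned after initialization)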


Lexical scope, sharing (continued)

interface I { void foo(final int x); }
...
int z;
...
I obj1 = new I() {
  public void foo(final int x) {
    I obj2 = new I() {
      public void foo(final int y) {
        z = (++x) * (--y);
      }
    };
    ...
  }
};
...

This, of course, is an error, since the parameters are declared final!


Lexical scope, sharing (continued)

interface I { void foo(final int [] x); }
...
int z;
...
I obj1 = new I() {
  public void foo(final int [] x) {
    I obj2 = new I() {
      public void foo(final int [] y) {
        z = (++x[0]) * (--y[0]);
      }
    };
    ...
  }
};

This is one solution…


Lexical scope, sharing (continued)

The process of moving one pointer away from an object is known as “boxing”:

▶ We can use a container object in Java, or an array of size 1
▶ For each variable being boxed, all references to it are replaced with de-references, either for set or for get
▶ The reference is indeed final, and does not change
  ▶ This is why it can be copied any number of times!
▶ What does change is the contents of the de-referenced object
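What the replacement of references by de-references amounts to can be shown by boxing a variable by hand in Scheme. This is a sketch of the idea, not the output of any particular compiler: a one-element vector serves as the box, and make-counter and make-counter-boxed are hypothetical names.

```scheme
;; Before boxing: n is read and written directly.
(define make-counter
  (lambda (n)
    (list (lambda () (set! n (+ n 1)) n)
          (lambda () (set! n 0)))))

;; After boxing n by hand: on entry, n is replaced by a box holding its value;
;; every get becomes a vector-ref and every set becomes a vector-set!.
(define make-counter-boxed
  (lambda (n)
    (set! n (vector n))
    (list (lambda ()
            (vector-set! n 0 (+ (vector-ref n 0) 1))
            (vector-ref n 0))
          (lambda ()
            (vector-set! n 0 0)))))

;; Usage: count twice, reset, count again.
(define counter (make-counter-boxed 0))
(define count (car counter))
(define reset (cadr counter))
```

After the initial set!, the variable n itself never changes again; only the contents of the box do, which is why the pointer can be copied freely into several environments.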


Lexical scope, sharing (continued)

▶ In Java, boxing has not been supported so far, but is scheduled for a future version
▶ Until Java supports boxing transparently, it must be done explicitly by the programmer, as shown in the above example
▶ Boxing is automatic & transparent in LISP/Scheme/Smalltalk: it is done when needed
▶ Boxing raises the access cost for variables, so we want to box as few variables as possible

☞ We shall add support for boxing variables in our compiler


Lexical scope, sharing (continued)

When a variable must be boxed

▶ We present a criterion that is sufficient, but not necessary
  ▶ This means that sometimes we box variables in situations where boxing is not, in fact, necessary
  ▶ Our criterion is conservative: It shall always box variables when necessary
▶ For our compiler, you should box a variable if:
  ▶ The variable has [at least] one occurrence for read within some closure, and [at least] one occurrence for write in another closure
  ▶ Both occurrences do not already refer to the same rib in a lexical environment
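The criterion can be phrased as a predicate over the occurrences of one variable. The encoding below is a simplifying assumption, not the analysis a real compiler performs on the AST: each occurrence is given as a two-element list (closure-id rib-id), naming the closure in which it appears and the rib (or the stack, for a parameter access) through which it reaches the variable. The names any? and should-box? are hypothetical.

```scheme
;; #t if pred holds for at least one element of lst.
(define any?
  (lambda (pred lst)
    (cond ((null? lst) #f)
          ((pred (car lst)) #t)
          (else (any? pred (cdr lst))))))

;; reads, writes: lists of occurrences, each of the form (closure-id rib-id).
;; Box if some read and some write are in different closures
;; and do not already refer to the same rib.
(define should-box?
  (lambda (reads writes)
    (any? (lambda (r)
            (any? (lambda (w)
                    (and (not (eq? (car r) (car w)))
                         (not (eq? (cadr r) (cadr w)))))
                  writes))
          reads)))

;; Read in closure a via rib ra, writes in closures a and b via ribs ra and rb.
(should-box? '((a ra)) '((a ra) (b rb)))   ; → #t
;; Same closures, but both reach the variable through one shared rib r.
(should-box? '((a r)) '((a r) (b r)))      ; → #f
;; Read and write within the same closure.
(should-box? '((l stack)) '((l stack)))    ; → #f
;; No read occurrence at all.
(should-box? '() '((a r) (b r)))           ; → #f
```

The hard part, which this sketch leaves out, is computing the closure and rib of each occurrence from the nesting of lambda-expressions.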


Example of boxing

We should box

(lambda (n)
  (list (lambda () (set! n (+ n 1)) n)
        (lambda () (set! n 0))))

▶ Read occurrence within a closure
▶ Write occurrence within another closure
▶ Both occurrences do not already share a rib


Example of boxing

We should not box

(lambda (n)
  (lambda ()
    (list (lambda () (set! n (+ n 1)) n)
          (lambda () (set! n 0)))))

▶ Read occurrence within a closure
▶ Write occurrence within another closure
▶ Both occurrences already share a rib


Example of boxing

We should not box

(lambda (n) (set! n (+ n 1)) n)

▶ The read/write occurrences are within the same closure


Example of boxing

We should not box

(lambda (n)
  (lambda (u)
    (u (lambda () (set! n (+ n 1)) n)
       (lambda () (set! n 0)))))

▶ Both the set & get occurrences of n, though in two different closures, do share the same rib in their lexical environments


Example of boxing

We should box

(lambda (n)
  (list (begin (set! n (* n n)) n)
        (lambda () n)))

▶ The set & get occurrences of n occur within two different closures (note that the set occurrence is to a parameter!)
▶ They do not share the same rib in their respective lexical environments


Example of boxing

We should not box

(lambda (n) (lambda () (set! n 0)) (lambda () (set! n 1)))

▶ There is no get occurrence for n


Chapter 4

Roadmap

🗹 Scope
🗹 The lexical environment
🗹 Boxing


Further reading
