SLIDE 1

Compiler Construction

Mayer Goldberg \ Ben-Gurion University Thursday 3rd December, 2020

Mayer Goldberg \ Ben-Gurion University Compiler Construction 1 / 112

SLIDE 2

Chapter 3

Roadmap

▶ Expressions in Scheme

▶ The expr datatype
▶ The Tag-Parser
▶ Macros & special forms
▶ Lexical hygiene

SLIDE 3

Expressions in Scheme

Recall that Scheme has two parsers

▶ A parser for data, known as a reader
▶ A parser for code, known as a tag-parser

Having completed our study of the reader, we now turn our attention to the tag-parser

SLIDE 4

Expressions in Scheme

The abstract syntax for code in Scheme can be described in terms of

▶ Constants
▶ Variables
▶ if-Expressions
▶ or-Expressions
▶ Sequences
▶ Assignments
▶ Definitions
▶ Lambda-expressions of various kinds (more on that later)
▶ Applications

☞ The above forms in the abstract syntax represent a choice that is reasonably complete, and easy to implement efficiently

☞ This is not the only possible choice

SLIDE 5

Expressions in Scheme

What about all the rest?

▶ All other forms (e.g., let, cond, etc.) can be written in terms of these core forms
▶ The process of translating other forms (e.g., let-expressions) into the language of the core forms is called macro expansion
▶ The Scheme programming language provides facilities for user-defined macros
  ▶ Users can invent new syntactic forms
  ▶ Describe their translation into pre-existing forms using Scheme code
  ▶ The Scheme system uses the user-code to translate user-defined macros into the core forms

SLIDE 6

The choice of the core forms

The choice of the core forms is guided by minimalism & efficiency:

Minimalism

▶ Supporting many forms in the compiler will make the compiler huge, complex, and harder to modify
▶ Allowing the compiler to focus on optimizing a small number of core forms results in smaller, simpler compilers
▶ Debugging macro-expanded code is very difficult

SLIDE 7

Minimalism

The choice of the core forms is guided by minimalism & efficiency:

Efficiency

There is a debate as to whether fewer forms would make for a better compiler:

▶ On the one hand, the minimalist compiler can focus more on less
▶ On the other hand, the minimalist compiler knows less about the programmer’s intentions

The question of the advantages vs the disadvantages of minimalism is an age-old dilemma:

Archilochus: πόλλ' οἶδ' ἀλώπηξ, ἀλλ' ἐχῖνος ἓν μέγα
Erasmus: Multa novit vulpes, verum echinus unum magnum
The fox knows many things, but the hedgehog knows one great thing

SLIDE 8

Minimalism (continued)

For example, as we shall see later on, we’re going to translate let-expressions into applications. But is this such a good idea?

Pro: The semantic analysis & code generation will be smaller and simpler
Con: The resulting code is going to be inefficient
Pro: A good optimizing compiler can optimize away this inefficiency
Con: But mapping the variable from memory to a register, for efficiency, would be harder
Pro: …

This debate can go on for a while!

SLIDE 9

Minimalism (continued)

For your project, we shall attempt to strike a balance:

▶ The compiler should not be difficult to write
▶ The code should not be overly inefficient 😊

The list of core forms suggested achieves this balance

SLIDE 10

Minimalism (continued)

Just a quick question

What is the absolute minimum number of core forms that are needed to be Turing-complete?

SLIDE 11

Minimalism (continued)

Answer

▶ In principle, applications and lambda-expressions are all you need
▶ Alternately, applications, and the following one, single constant are sufficient:

(define love
  (lambda (x)
    ((x (lambda (x) (lambda (y) (lambda (z) ((x z) (y z))))))
     (lambda (x) (lambda (y) x)))))

And “love is all you need!” (with apologies to the Beatles)…

SLIDE 12

Expressions in Scheme

We now define the core forms in terms of their AST node types, which are represented as type-constructors in the disjoint type expr:

type expr =
  | Const of sexpr
  | Var of string
  | If of expr * expr * expr
  | Seq of expr list
  | Set of expr * expr
  | Def of expr * expr
  | Or of expr list
  | LambdaSimple of string list * expr
  | LambdaOpt of string list * string * expr
  | Applic of expr * (expr list);;

SLIDE 13

Expressions in Scheme

Constants (type expr, type-constructor Const)

▶ Type: Const of sexpr
▶ The type-constructor Const is used both for self-evaluating and non-self-evaluating constants

🤕 A self-evaluating constant is one you can type at the Scheme prompt and see it printed back at you: Numbers, chars, Booleans, strings

⟨self-evaluating sexpr⟩ = Const(⟨self-evaluating sexpr⟩)
(quote ⟨sexpr⟩) = Pair(Symbol("quote"), Pair(⟨sexpr⟩, Nil)) = Const(⟨sexpr⟩)

SLIDE 14

Expressions in Scheme

Constants (type expr, type-constructor Const)

⚠ A tricky example:

(quote (quote ⟨sexpr⟩))
= Pair(Symbol("quote"), Pair(Pair(Symbol("quote"), Pair(⟨sexpr⟩, Nil)), Nil))
= Const(Pair(Symbol("quote"), Pair(⟨sexpr⟩, Nil)))
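The two rules for constants can be sketched in OCaml as follows. This is only an illustration: the sexpr type is a simplified stand-in (integers only, with a Void constructor), and the names tag_parse_const and X_syntax_error are assumptions made for this example rather than part of the course code.

```ocaml
(* Simplified stand-in for the reader's sexpr type (an assumption) *)
type sexpr =
  | Void
  | Nil
  | Bool of bool
  | Number of int
  | Char of char
  | String of string
  | Symbol of string
  | Pair of sexpr * sexpr

(* Only the constructor needed for this example *)
type expr = Const of sexpr

(* Hypothetical exception for rejected input *)
exception X_syntax_error of sexpr

let tag_parse_const (s : sexpr) : expr =
  match s with
  (* Self-evaluating constants are wrapped as they are *)
  | Bool _ | Number _ | Char _ | String _ -> Const s
  (* (quote <sexpr>): strip exactly one level of quote *)
  | Pair (Symbol "quote", Pair (quoted, Nil)) -> Const quoted
  (* Anything else is not a constant, e.g. (quote . a) *)
  | _ -> raise (X_syntax_error s)
```

Applied to the tricky example, only the outer quote is consumed, so the inner (quote ⟨sexpr⟩) survives as data inside the Const node.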

SLIDE 15

Expressions in Scheme

Variables (type expr, type-constructor Var)

▶ Type: Var of string
▶ Variables are literal symbols that are not reserved words
▶ The latest version of Scheme (R6RS) does not have many reserved words
▶ Not having reserved words makes the parser more complex
▶ We’re going to ignore this, and assume that words that are used for syntax are reserved words. These include:
  ▶ and, begin, cond, define, else, if, lambda, let, let*, letrec, or, quasiquote, quote, set!, unquote, unquote-splicing
▶ There are additional reserved words, but we’ll ignore those

SLIDE 16

Expressions in Scheme

Conditionals (type expr, type-constructor If)

▶ Type: If of expr * expr * expr
▶ Scheme supports an if-then variant without an else-clause
  ▶ These are used when the then-clause contains side-effects
  ▶ The “missing”/implicit else-clause is defined to be Const(Void)
▶ We shall support the if-then variant, and tacitly add the implicit else-clause
▶ This is your first recursive case of the expr datatype: An expr that contains sub-exprs.
▶ Obviously, the tag-parser will have to be recursive!
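A minimal sketch of the recursive case for conditionals, assuming a simplified sexpr type with a Void constructor and a hypothetical X_syntax_error exception. It handles only constants, variables, and the two if variants, and it does not check for reserved words.

```ocaml
(* Simplified stand-in for the reader's sexpr type (an assumption) *)
type sexpr =
  | Void
  | Nil
  | Bool of bool
  | Number of int
  | Char of char
  | String of string
  | Symbol of string
  | Pair of sexpr * sexpr

(* Only the constructors needed for this example *)
type expr =
  | Const of sexpr
  | Var of string
  | If of expr * expr * expr

(* Hypothetical exception for rejected input *)
exception X_syntax_error of sexpr

let rec tag_parse (s : sexpr) : expr =
  match s with
  | Bool _ | Number _ | Char _ | String _ -> Const s
  | Symbol name -> Var name
  (* if-then: tacitly add the implicit else-clause Const(Void) *)
  | Pair (Symbol "if", Pair (test, Pair (dit, Nil))) ->
      If (tag_parse test, tag_parse dit, Const Void)
  (* if-then-else: recurse over all three sub-expressions *)
  | Pair (Symbol "if", Pair (test, Pair (dit, Pair (dif, Nil)))) ->
      If (tag_parse test, tag_parse dit, tag_parse dif)
  | _ -> raise (X_syntax_error s)
```

Any if-form with fewer than two or more than three operands falls through to the last case and is rejected.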

SLIDE 17

Expressions in Scheme

Sequences (type expr, type-constructor Seq)

▶ Type: Seq of expr list
▶ There are two types of sequences:
  ▶ Explicit sequences (begin-expressions)
  ▶ Implicit sequences
    ▶ Body of lambda
    ▶ In the ribs of cond
    ▶ In the body of let, let*, letrec
    ▶ Other syntactic forms we shall not support
▶ Both implicit & explicit sequences are encoded as single expressions using the type-constructor Seq
▶ All expressions within a sequence are generally executed from first to last

SLIDE 18

Expressions in Scheme

Sequences (type expr, type-constructor Seq)

▶ The value of a sequence is generally the value of the last expression in the sequence
▶ Sequences have to do with side-effects: Expressions that have no side effects are redundant (can be removed) in all but the last position of a sequence
▶ Continuation-handling forms, such as call/cc, affect the behaviour of sequences
  ▶ We shall not consider call/cc in this course
▶ It is possible to macro-expand sequences
  ▶ We will learn the expansion later on
  ▶ The expansion results in impractically-inefficient code
  ▶ We support sequences directly for reasons of efficiency

SLIDE 19

Expressions in Scheme

Assignments (type expr, type-constructor Set)

▶ Type: Set of expr * expr
▶ The AST node for set! (pronounced “set-bang”) expressions

☠ The mother of all change; The essence of side-effects: While assignment is difficult to analyze, and does most of its damage at run-time, there’s not much to say about it syntactically.

SLIDE 20

Expressions in Scheme

Definitions (type expr, type-constructor Def)

▶ Type: Def of expr * expr
▶ The AST node for define-expressions
▶ Two syntaxes for define:
  ▶ ① (define ⟨var⟩ ⟨expr⟩)
  ▶ Example:

(define pi (* 4 (atan 1)))

SLIDE 21

Expressions in Scheme

Definitions (type expr, type-constructor Def)

▶ Two syntaxes for define:
  ▶ ② (define (⟨var⟩ . ⟨arglist⟩) . (⟨expr⟩+))
  ▶ Called MIT-define, because of its use in the MIT course (and textbook by the same name) The Structure and Interpretation of Computer Programs (see bibliography)
  ▶ This form is macro-expanded into

(define ⟨var⟩ (lambda ⟨arglist⟩ . (⟨expr⟩+)))

  ▶ Used to define functions without specifying the λ: This is almost always a bad idea!

☞ Note the implicit sequences!

▶ Example: (define (square x) (* x x))
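The MIT-define expansion is a rewrite from one sexpr to another, so it can be sketched before any expr is built. The sexpr type below is a simplified stand-in and the function name expand_mit_define is hypothetical; forms that are not MIT-defines are returned unchanged.

```ocaml
(* Simplified stand-in for the reader's sexpr type (an assumption) *)
type sexpr =
  | Void
  | Nil
  | Bool of bool
  | Number of int
  | Char of char
  | String of string
  | Symbol of string
  | Pair of sexpr * sexpr

(* (define (var . arglist) . body)  ==>  (define var (lambda arglist . body)) *)
let expand_mit_define (s : sexpr) : sexpr =
  match s with
  | Pair (Symbol "define", Pair (Pair ((Symbol _ as var), arglist), body)) ->
      let lambda = Pair (Symbol "lambda", Pair (arglist, body)) in
      Pair (Symbol "define", Pair (var, Pair (lambda, Nil)))
  (* Not an MIT-define: leave the form as it is *)
  | _ -> s
```

Because body is carried over as the tail of the lambda form, the implicit sequence is preserved without being inspected here.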

SLIDE 22

Expressions in Scheme

Disjunctions (type expr, type-constructor Or)

▶ Type: Or of expr list
▶ (or) = #f (by definition)
▶ (or ⟨expr⟩) = ⟨expr⟩ (#f is the unit element of or)
▶ The real work is done here:

(or ⟨expr1⟩ · · · ⟨exprn⟩) = Or([⟨expr1⟩; · · · ; ⟨exprn⟩])

▶ It is possible to macro-expand disjunctions
  ▶ We will learn the expansion later on
  ▶ The expansion results in impractically-inefficient code
  ▶ We support disjunctions directly for reasons of efficiency
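The three cases for disjunctions might be written as below. To keep the sketch short, the operands are assumed to have already been collected into an OCaml list, and the recursive parser is passed in as the parameter parse; both choices, and the name tag_parse_or, are assumptions of this example.

```ocaml
(* Simplified stand-in for the reader's sexpr type (an assumption) *)
type sexpr =
  | Void
  | Nil
  | Bool of bool
  | Number of int
  | Char of char
  | String of string
  | Symbol of string
  | Pair of sexpr * sexpr

(* Only the constructors needed for this example *)
type expr =
  | Const of sexpr
  | Var of string
  | Or of expr list

let tag_parse_or (parse : sexpr -> expr) (operands : sexpr list) : expr =
  match operands with
  (* (or) = #f, by definition *)
  | [] -> Const (Bool false)
  (* (or <expr>) = <expr> *)
  | [e] -> parse e
  (* Two or more operands: build the Or node *)
  | es -> Or (List.map parse es)
```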

SLIDE 23

Expressions in Scheme

Applications (type expr, type-constructor Applic)

▶ Type: Applic of expr * (expr list)
▶ The AST node separates the expression in the procedure position from the list of arguments
▶ The tag-parser recurses over the procedure & the list of arguments:

(⟨expr⟩ ⟨expr⟩1 · · · ⟨expr⟩n) = Applic(⟨expr⟩, [⟨expr⟩1; · · · ; ⟨expr⟩n])
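The rule for applications can be sketched as follows, using (* 4 (atan 1)) as the running example. The sexpr type is a simplified stand-in, list_of_sexpr and X_syntax_error are hypothetical helpers, and no special forms are recognised, so every proper list is treated as an application.

```ocaml
(* Simplified stand-in for the reader's sexpr type (an assumption) *)
type sexpr =
  | Void
  | Nil
  | Bool of bool
  | Number of int
  | Char of char
  | String of string
  | Symbol of string
  | Pair of sexpr * sexpr

(* Only the constructors needed for this example *)
type expr =
  | Const of sexpr
  | Var of string
  | Applic of expr * (expr list)

(* Hypothetical exception for rejected input *)
exception X_syntax_error of sexpr

(* Convert a proper list of sexprs into an OCaml list; reject improper lists *)
let rec list_of_sexpr (s : sexpr) : sexpr list =
  match s with
  | Nil -> []
  | Pair (a, d) -> a :: list_of_sexpr d
  | _ -> raise (X_syntax_error s)

let rec tag_parse (s : sexpr) : expr =
  match s with
  | Bool _ | Number _ | Char _ | String _ -> Const s
  | Symbol name -> Var name
  (* Recurse over the procedure position and over each argument *)
  | Pair (proc, args) ->
      Applic (tag_parse proc, List.map tag_parse (list_of_sexpr args))
  | _ -> raise (X_syntax_error s)
```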

SLIDE 24

Expressions in Scheme

Lambdas (type expr, type-constructor LambdaSimple, LambdaOpt)

▶ Types:
  ▶ LambdaSimple of string list * expr
  ▶ LambdaOpt of string list * string * expr
▶ Scheme has three lambda-forms, and we’re going to represent these three forms using the two AST nodes LambdaSimple & LambdaOpt.

SLIDE 25

Expressions in Scheme

Lambdas (type expr, type-constructor LambdaSimple, LambdaOpt)

▶ The general form of lambda-expressions is (lambda ⟨arglist⟩ . (⟨expr⟩+)):
  ▶ ① If ⟨arglist⟩ is a proper list of unique variable names, then the lambda-expression is said to be simple, and we represent it using the AST node LambdaSimple

SLIDE 26

Expressions in Scheme

Lambdas (type expr, type-constructor LambdaSimple, LambdaOpt)

▶ The general form of lambda-expressions is (lambda ⟨arglist⟩ . (⟨expr⟩+)):
  ▶ ② If ⟨arglist⟩ is the improper list (v1 · · · vn . vs), then the lambda-expression is said to take at least n arguments:
    ▶ The first n arguments are mandatory, and are assigned to v1 through vn respectively (unique variable names)
    ▶ The list of values of any additional arguments is going to be the value of the optional parameter vs
    ▶ If precisely n arguments are given, then the value of vs is going to be the empty list
    ▶ We represent lambda-expressions with optional arguments by using the AST node LambdaOpt

SLIDE 27

Expressions in Scheme

Lambdas (type expr, type-constructor LambdaSimple, LambdaOpt)

▶ The general form of lambda-expressions is (lambda ⟨arglist⟩ . (⟨expr⟩+)):
  ▶ ③ If ⟨arglist⟩ is the symbol vs, then the lambda-expression is said to be variadic, and may be applied to any number of arguments:
    ▶ The list of values of the arguments is going to be the value of the optional parameter vs
    ▶ If no arguments are given, then the value of vs is going to be the empty list
    ▶ We represent variadic lambda-expressions using the AST node LambdaOpt (with an empty list, and the optional var)
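One way to tell the three lambda-forms apart is to walk the argument list and see how it ends. The sketch below assumes a simplified sexpr type; split_arglist, make_lambda and X_syntax_error are hypothetical names, the body is taken as an already-parsed expr, and the check for duplicate parameter names is left out.

```ocaml
(* Simplified stand-in for the reader's sexpr type (an assumption) *)
type sexpr =
  | Void
  | Nil
  | Bool of bool
  | Number of int
  | Char of char
  | String of string
  | Symbol of string
  | Pair of sexpr * sexpr

(* Only the constructors needed for this example *)
type expr =
  | Var of string
  | LambdaSimple of string list * expr
  | LambdaOpt of string list * string * expr

(* Hypothetical exception for rejected input *)
exception X_syntax_error of sexpr

(* Return the mandatory names, and the optional name if the list is improper *)
let rec split_arglist (s : sexpr) : string list * string option =
  match s with
  | Nil -> ([], None)                 (* proper list ends here *)
  | Symbol vs -> ([], Some vs)        (* improper tail, or a lone symbol *)
  | Pair (Symbol v, rest) ->
      let (vars, opt) = split_arglist rest in
      (v :: vars, opt)
  | _ -> raise (X_syntax_error s)

let make_lambda (arglist : sexpr) (body : expr) : expr =
  match split_arglist arglist with
  | (vars, None) -> LambdaSimple (vars, body)
  | (vars, Some vs) -> LambdaOpt (vars, vs, body)
```

The variadic form needs no case of its own: a lone symbol yields an empty list of mandatory names together with the optional name.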

SLIDE 28

Expressions in Scheme

Lambda With Optional Arguments: Demonstration

> (define f (lambda (a b c . d) `((a ,a) (b ,b) (c ,c) (d ,d))))
> (f 1)
Exception: incorrect number of arguments to #<procedure f>
Type (debug) to enter the debugger.

SLIDE 29

Expressions in Scheme

Lambda With Optional Arguments: Demonstration

> (f 1 2)
Exception: incorrect number of arguments to #<procedure f>
Type (debug) to enter the debugger.
> (f 1 2 3)
((a 1) (b 2) (c 3) (d ()))
> (f 1 2 3 4 5)
((a 1) (b 2) (c 3) (d (4 5)))

SLIDE 30

Expressions in Scheme

Variadic Lambda: Demonstration

> (define g (lambda s `(s ,s)))
> (g)
(s ())
> (g 1 2 3)
(s (1 2 3))

SLIDE 31

Roadmap

▶ Expressions in Scheme

🗹 The expr datatype

▶ The Tag-Parser
▶ Macros & special forms
▶ Lexical hygiene

SLIDE 32

The Tag-Parser

The tag-parser is a function, mapping from sexprs to exprs:

▶ Not all valid sexprs are valid exprs
▶ Some sexprs need to be disambiguated before they can be converted to an expr (most notably, the lambda-forms)
▶ Plenty of testing needs to be done to ensure that syntactically-incorrect forms are rejected

SLIDE 33

The Tag-Parser (continued)

Example: The syntax for (* 4 (atan 1))

Concrete syntax

[Figure: the sexpr tree, a chain of PAIR nodes (CAR/CDR) terminated by NIL, with leaves SYMBOL *, INTEGER 4, SYMBOL atan, INTEGER 1]

Abstract syntax

[Figure: the expr tree, an APPLIC node with PROC VAR * and ARGS CONST 4 and a nested APPLIC (PROC VAR atan, ARGS CONST 1)]

SLIDE 34

The Tag-Parser (continued)

Some examples of syntactically-incorrect exprs

These are all valid sexprs and invalid expressions:

▶ (lambda (x) . x) (the body is not a [proper] list of exprs)
▶ (quote . a) (not a proper list)
▶ (quote he said he understood unquote) (length does not equal 2)
▶ (lambda (a b c a a b) (+ a b c)) (the param list contains duplicates)

SLIDE 35

The Tag-Parser (continued)

How to write a tag-parser

▶ A recursive function tag_parse : sexpr -> expr

🤕 The concrete syntax for expr is the abstract syntax for sexpr

▶ For the core forms (we shall deal with macro-expanded forms later on)
  ▶ ① Use pattern-matching to match over the concrete syntax of various syntactic forms
    ▶ Check out the expr datatype so you know what you must support
  ▶ ② Perform any additional testing necessary
    ▶ For example, that argument-lists in lambda-expressions contain no duplicates!
  ▶ ③ Call tag_parse recursively for sub-expressions
  ▶ ④ Generate the corresponding AST
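The additional test mentioned in step ② might look like the following. This is a sketch only: has_duplicates is a hypothetical helper operating on parameter names that have already been extracted as strings, and it is quadratic in the length of the list, which is acceptable for parameter lists.

```ocaml
(* True when some name occurs more than once in the list *)
let rec has_duplicates (names : string list) : bool =
  match names with
  | [] -> false
  | name :: rest -> List.mem name rest || has_duplicates rest
```

A tag-parser would reject a lambda-expression such as (lambda (a b c a a b) (+ a b c)) when this test succeeds on its parameter names.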

SLIDE 36

Roadmap

▶ Expressions in Scheme

🗹 The expr datatype
🗹 The Tag-Parser

▶ Macros & special forms
▶ Lexical hygiene

SLIDE 37

Macros & special forms

What are macros

▶ Transformers on source-code
  ▶ They take source code, and rewrite it
  ▶ They operate on the concrete syntax
▶ Generally execute at compile time
  ▶ Some work on run-time macro-expansion has been done in the past, but is considered esoteric
▶ Used to provide shallow support for syntactic forms
  ▶ Macros are syntactic sugar or notational conveniences
  ▶ They are “expanded away”, and then they are gone
  ▶ Macros are not supported deep within the compiler
  ▶ The semantic analysis or code-generation stages of the compiler pipeline know nothing about macros

SLIDE 38

Macros & special forms (continued)

Illustrating “shallow support” vs “deep support”

▶ When you think of the word “home”, perhaps you think of:
  ▶ stability, security, safety
  ▶ family, relations
  ▶ mortgage
  etc.
▶ All this is part of the meaning of “home”
▶ Meaning in the compiler has to do with the semantic analysis phase of the compiler
▶ The meaning of “home” enriches anything you say and do that pertains to your home

SLIDE 39

Macros & special forms (continued)

Illustrating “shallow support” vs “deep support” (cont)

▶ Suppose home for you is just shorthand for a location
  ▶ You could then translate the word “home” to USC 32.0260699N, 34.7580834E
  ▶ This translation would take place early on, so that
    ▶ “going home” would mean going to a specific coordinate
    ▶ “longing for home” would mean longing to be at a specific coordinate
    etc.
▶ Such sentences would carry none of the meaning, significance, feelings, associated with the word “home”
▶ The word “home” would be disconnected from any meaning other than a geographical location
▶ This would be insanity!

SLIDE 40

Macros & special forms (continued)

Illustrating “shallow support” vs “deep support” (cont)

Back to the compiler:

▶ When you have a notion of “loop” in your language, the compiler can associate with it many things:
  ▶ a code fragment that gets executed over and over
  ▶ termination conditions
  ▶ branch prediction information
  etc.
▶ We can macro-expand a loop into a recursive function with all recursive calls in tail-position
▶ All intentions about the code (namely, its loop-like behaviour) are gone

SLIDE 41

Macros & special forms (continued)

Illustrating “shallow support” vs “deep support” (cont)

▶ Some information can be reconstructed through analysis during the semantic analysis phase
▶ Other information is lost
  ▶ For example, we might want our compiler to keep the index variables of the loop in registers, or generate branch-prediction hints
  ▶ These would be simple to do when considering loops
  ▶ These would be difficult, and require a great deal of analysis with functions

☞ Keeping around the meaning of syntactic forms would make it easier to translate them efficiently

☞ This meaning having been lost in macro-expansion, it is difficult to recover completely

SLIDE 42

Macros & special forms (continued)

▶ We now consider some special forms in Scheme
  ▶ Some of these will be implemented in our compiler
  ▶ Some of these are of theoretical interest only
  ▶ We want to understand what special forms can be dispensed with
  ▶ We don’t want our compiler to be overly inefficient

SLIDE 43

Scheme, Boolean values, Boolean operators

▶ Some languages (such as Java or OCaml) are strict about Booleans:
  ▶ Conjunctions, disjunctions, conditionals, etc. only take expressions that evaluate to Boolean values
  ▶ The distinction is between false and true
  ▶ Not false is exactly true:

# not false;;
- : bool = true
# not true;;
- : bool = false
# 4 && 5;;
Characters 0-1:
  4 && 5;;
  ^
Error: This expression has type int but an expression was expected of type bool

SLIDE 44

Scheme, Boolean values, Boolean operators

▶ Some languages (such as Java or OCaml) are strict about Booleans:
  ▶ Conjunctions, disjunctions, conditionals, etc. only take expressions that evaluate to Boolean values
  ▶ The distinction is between false and true
  ▶ Not false is exactly true:

# 4 || 5;;
Characters 0-1:
  4 || 5;;
  ^
Error: This expression has type int but an expression was expected of type bool
# if "moshe" then "then" else "else";;
Characters 3-10:
  if "moshe" then "then" else "else";;
     ^^^^^^^
Error: This expression has type string but an expression was expected of type bool

SLIDE 45

Scheme, Boolean values, Boolean operators

▶ Some languages (such as C or Scheme) are lenient about Booleans:
  ▶ Conjunctions, disjunctions, conditionals, etc. can take any expression
  ▶ The distinction is between false and not false
  ▶ Not false is not the same as true:

> (not #f)
#t
> (not #t)
#f
> (not 'moshe)
#f
> (not (not 'moshe))
#t

SLIDE 46

Scheme, Boolean values, Boolean operators

▶ Some languages (such as C or Scheme) are lenient about Booleans:
  ▶ Conjunctions, disjunctions, conditionals, etc. can take any expression
  ▶ The distinction is between false and not false
  ▶ Not false is not the same as true:

> (and 2 3 4)
4
> (or 2 3 4)
2
> (if 3 'then 'else)
then
> (if '() 'then 'else)
then
> (if #f 'then 'else)
else

SLIDE 47

Macros & special forms (continued)

and

▶ Conjunctions are easily expanded into nested if-expressions:
  ▶ (and) = #t (by definition)
  ▶ (and ⟨expr⟩) = ⟨expr⟩ (#t is the unit element of and)
  ▶ (and ⟨expr1⟩ ⟨expr2⟩ · · · ⟨exprn⟩) = (if ⟨expr1⟩ (and ⟨expr2⟩ · · · ⟨exprn⟩) #f)
▶ The assembly code generated for and-expansions is no different from the assembly code that would have been generated had we supported and-expressions as a core syntactic form

☞ You should implement this macro-expansion in your tag-parser
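The three clauses of the and-expansion can be sketched as a rewrite on sexprs. The sexpr type is a simplified stand-in, and expand_and and X_syntax_error are hypothetical names. Unlike the one-step rule on the slide, this version takes the operands of the and-form (everything after the symbol and) and expands all the way down to nested ifs in one pass.

```ocaml
(* Simplified stand-in for the reader's sexpr type (an assumption) *)
type sexpr =
  | Void
  | Nil
  | Bool of bool
  | Number of int
  | Char of char
  | String of string
  | Symbol of string
  | Pair of sexpr * sexpr

(* Hypothetical exception for rejected input *)
exception X_syntax_error of sexpr

let rec expand_and (operands : sexpr) : sexpr =
  match operands with
  (* (and) = #t *)
  | Nil -> Bool true
  (* (and e) = e *)
  | Pair (e, Nil) -> e
  (* (and e1 e2 ...) = (if e1 (and e2 ...) #f) *)
  | Pair (e, rest) ->
      Pair (Symbol "if", Pair (e, Pair (expand_and rest, Pair (Bool false, Nil))))
  (* The operands did not form a proper list *)
  | _ -> raise (X_syntax_error operands)
```

The result is still an sexpr, so it would be handed back to the tag-parser, which then sees only if-expressions.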

SLIDE 48

Macros & special forms (continued)

or

Macro-expanding or-expressions is very different from macro-expanding and-expressions

▶ The first two clauses are similar to what we do with and-expressions:
  ▶ (or) = #f (by definition)
  ▶ (or ⟨expr⟩) = ⟨expr⟩ (because #f is the unit of or)
▶ For the third clause, you might consider something like:

(or ⟨expr1⟩ ⟨expr2⟩ · · · ⟨exprn⟩) = (if ⟨expr1⟩ #t (or ⟨expr2⟩ · · · ⟨exprn⟩))

▶ This macro-expansion is, of course, incorrect! (think why)

SLIDE 49

Macros & special forms (continued)

or (continued)

▶ Take another look at (or ⟨expr1⟩ ⟨expr2⟩ · · · ⟨exprn⟩) = (if ⟨expr1⟩ #t (or ⟨expr2⟩ · · · ⟨exprn⟩))
▶ Suppose we implemented or-expressions in this way: What would be the value of (or 2 3)?

😟 It would be #t

▶ Scheme returns 2

SLIDE 50

Macros & special forms (continued)

or (continued)

Let us consider a simpler version of our problem: How to macro-expand (or ⟨expr1⟩ ⟨expr2⟩)

▶ This is fine, because or-expressions associate!
▶ What about this macro-expansion: (or ⟨expr1⟩ ⟨expr2⟩) = (if ⟨expr1⟩ ⟨expr1⟩ ⟨expr2⟩)
▶ This macro-expansion is, of course, incorrect! (think why)

SLIDE 51

Macros & special forms (continued)

or (continued)

▶ Take another look at (or ⟨expr1⟩ ⟨expr2⟩) = (if ⟨expr1⟩ ⟨expr1⟩ ⟨expr2⟩)
▶ Suppose we implemented or-expressions in this way: What would be the output of (or (begin (display "*\n") #t) 'moshe)

😟 It would print * twice and return #t

▶ Scheme prints * once and returns #t
▶ We told you side-effects were tricky! 😊

☞ How might we make sure to evaluate ⟨expr1⟩ only once?

SLIDE 52

Macros & special forms (continued)

or (continued)

Suppose we used a let-expression to store the value of ⟨expr1⟩:

▶ What about the expansion: (or ⟨expr1⟩ ⟨expr2⟩) = (let ((x ⟨expr1⟩)) (if x x ⟨expr2⟩))
▶ This macro-expansion is, of course, incorrect! (think why)

SLIDE 53

Macros & special forms (continued)

or (continued)

▶ Take another look at (or ⟨expr1⟩ ⟨expr2⟩) = (let ((x ⟨expr1⟩)) (if x x ⟨expr2⟩))
▶ Suppose we implemented or-expressions in this way: What would be the value of (let ((x 'ha-ha!)) (or #f x))

😟 It would be #f

▶ Scheme returns ha-ha!

☞ Why would this expansion evaluate to #f?

SLIDE 54

Macros & special forms (continued)

or (continued)

Look at how macro-expansion proceeds with our example:

(let ((x 'ha-ha!)) (or #f x))
= (let ((x 'ha-ha!)) (let ((x #f)) (if x x x)))

▶ Notice how the macro-expansion introduced a new let between the original let and the or-expression
▶ This new let introduced a variable binding that just so happens to use the same variable as in the outer let-binding
▶ This new variable binding contaminated the code: The value of x was found in the wrong lexical environment!

SLIDE 55

Macros & special forms (continued)

or (continued)

Look at how macro-expansion proceeds with our example:

(let ((x 'ha-ha!)) (or #f x))
= (let ((x 'ha-ha!)) (let ((x #f)) (if x x x)))

▶ Notice how the macro-expansion introduced a new let between the original let and the or-expression
▶ This is known as a variable-name capture
▶ Macro-expansions that result in variable-name captures are said to be unhygienic

SLIDE 56

Macros & special forms (continued)

or (continued)

A hygienic macro-expansion for or would require that no user-code may see any variables introduced by our macro-expansion

▶ This is often impossible to accomplish without resorting to tricks
▶ This often requires tricky, circuitous, counter-intuitive expansions that incur great performance penalties
▶ This is the case with or
▶ This is why we support disjunctions directly as a core form in our compiler

☞ You should not macro-expand or-expressions in your compiler!

SLIDE 57

Macros & special forms (continued)

or (continued)

This is how to macro-expand or-expressions (if someone were to hold a gun to your head and force you):

(or ⟨expr1⟩ ⟨expr2⟩)
= (let ((x ⟨expr1⟩) (y (lambda () ⟨expr2⟩))) (if x x (y)))
= ((lambda (x y) (if x x (y))) ⟨expr1⟩ (lambda () ⟨expr2⟩))

☞ Notice that ⟨expr1⟩, ⟨expr2⟩ cannot access the variables x, y introduced by the expansion!
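For illustration only (the slides advise against using this expansion in the project), the hygienic binary expansion can be built as an sexpr. The sexpr type is a simplified stand-in, and sexpr_of_list and expand_or2 are hypothetical names. The sketch produces the second, lambda-application form shown above.

```ocaml
(* Simplified stand-in for the reader's sexpr type (an assumption) *)
type sexpr =
  | Void
  | Nil
  | Bool of bool
  | Number of int
  | Char of char
  | String of string
  | Symbol of string
  | Pair of sexpr * sexpr

(* Build a proper list of sexprs out of an OCaml list *)
let rec sexpr_of_list (items : sexpr list) : sexpr =
  match items with
  | [] -> Nil
  | a :: d -> Pair (a, sexpr_of_list d)

(* (or e1 e2)  ==>  ((lambda (x y) (if x x (y))) e1 (lambda () e2)) *)
let expand_or2 (e1 : sexpr) (e2 : sexpr) : sexpr =
  let x = Symbol "x" and y = Symbol "y" in
  (* (if x x (y)): e2 runs only when the thunk y is applied *)
  let body = sexpr_of_list [Symbol "if"; x; x; sexpr_of_list [y]] in
  let proc = sexpr_of_list [Symbol "lambda"; sexpr_of_list [x; y]; body] in
  (* e2 is wrapped in a thunk, outside the scope of x and y *)
  let thunk = sexpr_of_list [Symbol "lambda"; Nil; e2] in
  sexpr_of_list [proc; e1; thunk]
```

With e1 = #f and e2 = x, the user's x ends up inside (lambda () x), which sits in the argument position and so is not in the scope of the x introduced by the expansion.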

SLIDE 58

Macros & special forms (continued)

or (continued)

The cost of the hygienic expansion of or-expressions is high:

▶ Two lambda-expressions, and hence the creation of two closures
  ▶ This is the same as allocating two objects in an OOPL
▶ Two applications

for each pair of two expressions

☞ For an or-expression that has n + 1 disjuncts, we would need:
  ▶ To create/allocate 2n closures
  ▶ To evaluate 2n applications

☞ By contrast, implementing the or-expression as a core form in our compiler requires no allocation of closures and no applications, regardless of the number of disjuncts!

SLIDE 59

Macros & special forms (continued)

begin

▶ The general form: (begin ⟨expr1⟩ · · · ⟨exprn⟩)
▶ Sequences are associative, so we need only consider the binary case: (begin ⟨expr1⟩ ⟨expr2⟩)
▶ How might we possibly expand it? How about

(begin ⟨expr1⟩ ⟨expr2⟩) = (let ((x ⟨expr1⟩)) ⟨expr2⟩)

▶ This macro-expansion is, of course, incorrect! (think why)

SLIDE 60

Macros & special forms (continued)

begin (continued)

▶ Take another look at (begin ⟨expr1⟩ ⟨expr2⟩) = (let ((x ⟨expr1⟩)) ⟨expr2⟩)
▶ This expansion introduces the variable x
  ▶ Notice that ⟨expr2⟩ can access this variable!
  ▶ This means that this expansion is not hygienic!
▶ Suppose we implemented begin-expressions in this way: What would be the value of (let ((x 3)) (begin 2 x))

😟 It would be 2

▶ Scheme returns 3

SLIDE 61

Macros & special forms (continued)

begin (continued)

How about (begin ⟨expr1⟩ ⟨expr2⟩) = (if ⟨expr1⟩ ⟨expr2⟩ ⟨expr2⟩)

▶ We see that first ⟨expr1⟩ evaluates, and then ⟨expr2⟩, so the order is correct
▶ No variables are introduced, so there’s no issue of lexical hygiene
▶ This expansion is actually correct, but bad! (think why)

SLIDE 62

Macros & special forms (continued)

begin (continued)

▶ Take another look at (begin ⟨expr1⟩ ⟨expr2⟩) = (if ⟨expr1⟩ ⟨expr2⟩ ⟨expr2⟩)
▶ The text of ⟨expr2⟩ actually appears twice in the expanded form!
▶ This means that a begin with n + 1 expressions will expand to an expression of size O(2^n)
▶ This is clearly not practical!
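The exponential growth can be checked on small inputs with a sketch like the one below. It assumes a simplified sexpr type, reads (begin e1 e2 ... ) as the right-nested (begin e1 (begin e2 ...)), and uses the hypothetical names expand_begin and count_symbol. Returning Void for an empty sequence is also an assumption of this example.

```ocaml
(* Simplified stand-in for the reader's sexpr type (an assumption) *)
type sexpr =
  | Void
  | Nil
  | Bool of bool
  | Number of int
  | Char of char
  | String of string
  | Symbol of string
  | Pair of sexpr * sexpr

(* (begin e1 rest...)  ==>  (if e1 R R), where R is the expansion of the rest *)
let rec expand_begin (exprs : sexpr list) : sexpr =
  match exprs with
  | [] -> Void
  | [e] -> e
  | e :: rest ->
      let r = expand_begin rest in
      Pair (Symbol "if", Pair (e, Pair (r, Pair (r, Nil))))

(* Count the textual occurrences of a symbol in an sexpr *)
let rec count_symbol (name : string) (s : sexpr) : int =
  match s with
  | Symbol n when n = name -> 1
  | Pair (a, d) -> count_symbol name a + count_symbol name d
  | _ -> 0
```

For a sequence of n + 1 expressions, the last one appears 2^n times in the expanded text; for example, with four expressions the last one occurs eight times. In memory the two copies of r are shared, but the printed form, and whatever a compiler generates from it, is not.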

SLIDE 63

Macros & special forms (continued)

begin (continued)

How about (begin ⟨expr1⟩ ⟨expr2⟩) = (and (or ⟨expr1⟩ #t) ⟨expr2⟩)

▶ We see that (or ⟨expr1⟩ #t) evaluates first, and that its value is always not false
  ▶ So the and continues on to evaluate ⟨expr2⟩
  ▶ The ordering is correct!
▶ No variables are introduced so there’s no issue of lexical hygiene
▶ No expression is duplicated

👎 This expansion actually works!

SLIDE 64

Macros & special forms (continued)

begin (continued)

▶ How about (begin ⟨expr1⟩ ⟨expr2⟩) = (or (and ⟨expr1⟩ #f) ⟨expr2⟩)
▶ We see that first (and ⟨expr1⟩ #f) evaluates, and then ⟨expr2⟩, so the order is correct
▶ No variables are introduced, so there’s no issue of lexical hygiene

👎 This expansion actually works!

SLIDE 65

Macros & special forms (continued)

begin (continued)

How about (begin ⟨expr1⟩ ⟨expr2⟩) = (let ((x ⟨expr1⟩) (y (lambda () ⟨expr2⟩))) (y))

▶ We introduced two variables x & y:
  ▶ Neither ⟨expr1⟩ nor ⟨expr2⟩ can access these variables!
  ▶ The solution is hygienic!
▶ ⟨expr1⟩ evaluates in parallel with the creation of the closure for (lambda () ⟨expr2⟩)

slide-66
SLIDE 66

Macros & special forms (continued)

begin (continued)

How about

(begin ⟨expr1⟩ ⟨expr2⟩) = (let ((x ⟨expr1⟩) (y (lambda () ⟨expr2⟩))) (y))

▶ Evaluating (lambda () ⟨expr2⟩) does not evaluate ⟨expr2⟩

▶ ⟨expr2⟩ is evaluated only when the closure is applied!

👎 This expansion actually works!

slide-67
SLIDE 67

Macros & special forms (continued)

begin (continued)

How about

(begin ⟨expr1⟩ ⟨expr2⟩) = (let ((x ⟨expr1⟩) (y (lambda () ⟨expr2⟩))) (y))

▶ This expansion is actually more efficient than the previous two (which used and & or):

▶ and & or with more than one expression always involve a test and a conditional jump

▶ Sequencing does not logically require tests or conditional jumps

▶ So this solution is less expensive

slide-68
SLIDE 68

Macros & special forms (continued)

begin (continued)

How about

(begin ⟨expr1⟩ ⟨expr2⟩) = (let ((x ⟨expr1⟩) (y (lambda () ⟨expr2⟩))) (y))

▶ This expansion is actually more efficient than the previous two (which used and & or):

▶ It’s still pretty horrible: Two applications, and two closures created for every pair of expressions in a begin:

▶ Sequences of n + 1 expressions require 2n applications and the creation of 2n closures, which are then garbage-collected

slide-69
SLIDE 69

Macros & special forms (continued)

begin (continued)

How about

(begin ⟨expr1⟩ ⟨expr2⟩) = (let ((x ⟨expr1⟩) (y (lambda () ⟨expr2⟩))) (y))

▶ This is why we support sequences natively within our compiler

☞ You should not implement this macro-expansion in your compilers!

slide-70
SLIDE 70

Macros & special forms (continued)

let

▶ The let-expression is a way of defining any number of local variables, and assigning them initial values.

▶ Once the local variables have been initialized, they are accessible to an implicit sequence of expressions that are evaluated in their lexical scope.

▶ The syntax looks like this:

(let ((v1 ⟨Expr1⟩) · · · (vn ⟨Exprn⟩)) ⟨expr1⟩ · · · ⟨exprm⟩)

slide-71
SLIDE 71

Macros & special forms (continued)

let (continued)

We wish to macro-expand let-expressions:

▶ Local variables are parameters of lambda-expressions

▶ Expressions that can access local variables come from the bodies of lambda-expressions

▶ The parameters of lambda-expressions get their values when lambda-expressions are applied to arguments

▶ The values of the arguments are the initial values of the parameters

slide-72
SLIDE 72

Macros & special forms (continued)

let (continued)

Putting it all together, we get the following macro-expansion:

(let ((v1 ⟨Expr1⟩) · · · (vn ⟨Exprn⟩))
  ⟨expr1⟩ · · · ⟨exprm⟩)
=
((lambda (v1 · · · vn) ⟨expr1⟩ · · · ⟨exprm⟩)
 ⟨Expr1⟩ · · · ⟨Exprn⟩)

▶ The expansion is hygienic (think why)

☞ You should implement this macro-expansion in your compiler!
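As an illustration, the let expansion can be sketched as a procedure over S-expressions. This is only a sketch written in Scheme itself: the name expand-let is an assumption made for this example, the input is assumed to be a well-formed let, and your own expander will be written in whatever language your compiler uses.

```scheme
;; Hypothetical expander: takes the S-expression of a let form and
;; returns the S-expression of the equivalent application.
(define (expand-let sexpr)
  (let ((ribs (cadr sexpr))      ; ((v1 Expr1) ... (vn Exprn))
        (body (cddr sexpr)))     ; (expr1 ... exprm)
    (let ((vars (map car ribs))
          (vals (map cadr ribs)))
      ;; ((lambda (v1 ... vn) expr1 ... exprm) Expr1 ... Exprn)
      `((lambda ,vars ,@body) ,@vals))))

(define example
  (expand-let '(let ((x 1) (y 2)) (display x) (+ x y))))
;; example => ((lambda (x y) (display x) (+ x y)) 1 2)
```

Note that the expander introduces no new variable names, which is one way to see why the expansion is hygienic.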


slide-73
SLIDE 73

Macros & special forms (continued)

let*

▶ Recall that the ordering of let-bindings is undefined, and that we may not assume they take place in any particular sequence

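▶ This follows from the fact that a let-expression macro-expands into an application, and Scheme does not specify the order in which the arguments of an application are evaluated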

▶ The asterisk in the name of the let*-form is meant to suggest the Kleene-star

▶ A let*-expression denotes nested let-expressions. The following equations define the behaviour of the tag-parser on let*-expressions:

▶ ① This is the first of the two base cases:

(let* () ⟨expr1⟩ · · · ⟨exprm⟩) = (let () ⟨expr1⟩ · · · ⟨exprm⟩)

slide-74
SLIDE 74

Macros & special forms (continued)

let* (continued)

▶ ② This is the second base case:

(let* ((v ⟨Expr⟩)) ⟨expr1⟩ · · · ⟨exprm⟩) = (let ((v ⟨Expr⟩)) ⟨expr1⟩ · · · ⟨exprm⟩)

☞ Think why two base cases are needed here!

slide-75
SLIDE 75

Macros & special forms (continued)

let* (continued)

▶ ③ This is the inductive case:

(let* ((v1 ⟨Expr1⟩) (v2 ⟨Expr2⟩) · · · (vn ⟨Exprn⟩))
  ⟨expr1⟩ · · · ⟨exprm⟩)
=
(let ((v1 ⟨Expr1⟩))
  (let* ((v2 ⟨Expr2⟩) · · · (vn ⟨Exprn⟩))
    ⟨expr1⟩ · · · ⟨exprm⟩))
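The three equations can be sketched as a single expander over S-expressions. The name expand-let* is hypothetical, and the sketch assumes a well-formed let* as input. It performs one expansion step only: in the inductive case the result still contains a let*, which the tag-parser would expand again when it recurses.

```scheme
;; Hypothetical one-step expander for let*, following the three equations.
(define (expand-let* sexpr)
  (let ((ribs (cadr sexpr))
        (body (cddr sexpr)))
    (cond ((null? ribs)                       ; base case 1: no ribs
           `(let () ,@body))
          ((null? (cdr ribs))                 ; base case 2: exactly one rib
           `(let (,(car ribs)) ,@body))
          (else                               ; inductive case: peel off one rib
           `(let (,(car ribs))
              (let* ,(cdr ribs) ,@body))))))

(define step
  (expand-let* '(let* ((a 1) (b (+ a 1)) (c (* b 2))) (list a b c))))
;; step => (let ((a 1)) (let* ((b (+ a 1)) (c (* b 2))) (list a b c)))
```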


slide-76
SLIDE 76

Macros & special forms (continued)

let* (continued)

The expansion for let*-expressions seems terribly inefficient:

▶ A nested let-expression for every rib in the original let*-expression

▶ This means

▶ One more closure created

▶ One application performed

▶ One closure garbage-collected

for each rib in the original let*-expression

slide-77
SLIDE 77

Macros & special forms (continued)

let* (continued)

In fact, this is all an illusion because each call is performed in tail position, so:

▶ The call is tail-call-optimized and replaced with a branch

▶ The allocation/creation of a new closure is avoided through a simple analysis in the semantic analysis phase

▶ Rather than allocating new frames, the old frames are being overwritten with the information of the new frames

▶ So in fact, this code is optimized into simple assignments

☞ The bottom line: The macro-expansion is efficient when compiled by a reasonably clever, optimizing compiler

☞ You should implement this macro-expansion in your compiler!

slide-78
SLIDE 78

Macros & special forms (continued)

letrec

Let’s re-examine the special form let:

▶ We can use let to define local variables

▶ The value of these variables can be anything really, including functions:

▶ Here’s how one might define local procedures using let:

(let ((square (lambda (x) (* x x))))
  ;; here we can use the procedure square
  (sqrt (+ (square a)
           (square b)
           (* -2 a b (cos theta)))))

slide-79
SLIDE 79

Macros & special forms (continued)

letrec (continued)

Nevertheless, let has one shortcoming when it comes to defining local procedures: Recursive procedures cannot be defined “as is”

▶ If we were to try to define and use the factorial function using let, it might look like:

(let ((fact (lambda (n)
              (if (zero? n)
                  1
                  (* n (fact (- n 1)))))))
  (fact 5))

slide-80
SLIDE 80

Macros & special forms (continued)

letrec (continued)

Which expands to

((lambda (fact) (fact 5))
 (lambda (n)
   (if (zero? n)
       1
       (* n (fact (- n 1))))))

▶ Notice the body of fact is not able to access the parameter fact of the procedure (lambda (fact) (fact 5))

▶ The parameter fact is only accessible in the body of this procedure

▶ The reference to fact within the text of the body of the factorial procedure is free, and refers to a global variable defined at the top-level.

slide-81
SLIDE 81

Macros & special forms (continued)

letrec (continued)

To see things more clearly, we macro-expand the let. Note the parameter fact and whence it can be accessed:

((lambda (fact) (fact 5))
 (lambda (n)
   (if (zero? n)
       1
       (* n (fact (- n 1))))))

😟 This just looks like an example of recursion, but in fact it isn’t!
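One way to observe the problem is to define a global fact that is deliberately not the factorial function, and then compare let with letrec. The global definition and the names with-let and with-letrec are assumptions made purely for this demonstration.

```scheme
;; A global variable that happens to be called fact (deliberately wrong).
(define fact (lambda (n) 100))

;; With let, the inner reference to fact is free in the lambda-expression,
;; so it refers to the global fact above, not to the local one.
(define with-let
  (let ((fact (lambda (n)
                (if (zero? n)
                    1
                    (* n (fact (- n 1)))))))
    (fact 5)))
;; with-let => 500, that is, (* 5 100)

;; With letrec, the inner reference is to the local procedure itself.
(define with-letrec
  (letrec ((fact (lambda (n)
                   (if (zero? n)
                       1
                       (* n (fact (- n 1)))))))
    (fact 5)))
;; with-letrec => 120
```

Without the global definition, the let version would instead signal an unbound-variable error on the inner call.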


slide-82
SLIDE 82

Macros & special forms (continued)

letrec (continued)

An expansion that does work would be something like:

(let ((fact 'whatever))
  (set! fact
    (lambda (n)
      (if (zero? n)
          1
          (* n (fact (- n 1))))))
  (fact 5))

▶ Can you see why it works?

slide-83
SLIDE 83

Macros & special forms (continued)

letrec (continued)

Let’s expand the let-expression and see why this expansion works:

((lambda (fact)
   (set! fact
     (lambda (n)
       (if (zero? n)
           1
           (* n (fact (- n 1))))))
   (fact 5))
 'whatever)

☞ The text of the body of the factorial procedure appears within the body of the (lambda (fact) ...) procedure, which is why it may access the parameter fact: This is recursion!

slide-84
SLIDE 84

Macros & special forms (continued)

letrec (continued)

The general macro-expansion implied by the last example is presented below:

(letrec ((f1 ⟨Expr1⟩) (f2 ⟨Expr2⟩) · · · (fn ⟨Exprn⟩))
  ⟨expr1⟩ · · · ⟨exprm⟩)
=
(let ((f1 'whatever) (f2 'whatever) · · · (fn 'whatever))
  (set! f1 ⟨Expr1⟩)
  (set! f2 ⟨Expr2⟩)
  · · ·
  (set! fn ⟨Exprn⟩)
  ⟨expr1⟩ · · · ⟨exprm⟩)

slide-85
SLIDE 85

Macros & special forms (continued)

letrec (continued)

This expansion is almost right:

▶ The main problem with the expansion has to do with nested define-expressions

▶ It is possible to use define to define a local function within a let or lambda form

▶ Several of such definitions may appear at the top of the body

▶ Several of such definitions may be grouped together within a begin-expression

▶ All the nested define-expressions must appear at the top of the body even if they are grouped in different begin-expressions

▶ It is a syntax error to have a nested define after a non-define-expression in the body of a lambda or let

slide-86
SLIDE 86

Macros & special forms (continued)

letrec (continued)

This means that such definitions are possible:

(define f
  (lambda (a b c)
    (define g1 (lambda () ... ))
    (begin
      (define g2 (lambda (x) ... ))
      (begin
        (define g3 (lambda (x y) ... ))
        (define g4 (lambda (z) ... ))))
    ... ))

slide-87
SLIDE 87

Macros & special forms (continued)

letrec (continued)

And now we have a problem:

💤 If nested define-expressions appear within the body of a letrec-expression, and we macro-expand the letrec-expression into a let-expression with assignments at the top of its body, then the nested define-expressions will appear after the set!-expressions, and this would be syntactically illegal!

▶ In fact, we shall not support nested define-expressions in our compiler

☞ You should implement this macro-expansion in your compilers!

▶ We still need to find a macro-expansion for letrec that can live with nested define-expressions…

slide-88
SLIDE 88

Macros & special forms (continued)

letrec (continued)

This macro-expansion does the trick:

(letrec ((f1 ⟨Expr1⟩) (f2 ⟨Expr2⟩) · · · (fn ⟨Exprn⟩))
  ⟨expr1⟩ · · · ⟨exprm⟩)
=
(let ((f1 'whatever) (f2 'whatever) · · · (fn 'whatever))
  (set! f1 ⟨Expr1⟩)
  (set! f2 ⟨Expr2⟩)
  · · ·
  (set! fn ⟨Exprn⟩)
  (let () ⟨expr1⟩ · · · ⟨exprm⟩))
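A sketch of this second expansion as a procedure over S-expressions might look as follows. The name expand-letrec is hypothetical, the input is assumed to be a well-formed letrec, and the placeholder value 'whatever is taken directly from the equation above.

```scheme
;; Hypothetical expander for letrec: bind every name to a placeholder,
;; assign the real values with set!, and wrap the body in (let () ...).
(define (expand-letrec sexpr)
  (let ((ribs (cadr sexpr))
        (body (cddr sexpr)))
    (let ((inits (map (lambda (rib) (list (car rib) ''whatever)) ribs))
          (sets  (map (lambda (rib) (list 'set! (car rib) (cadr rib))) ribs)))
      `(let ,inits ,@sets (let () ,@body)))))

(define expanded
  (expand-letrec
   '(letrec ((fact (lambda (n)
                     (if (zero? n) 1 (* n (fact (- n 1)))))))
      (fact 5))))
;; expanded =>
;; (let ((fact 'whatever))
;;   (set! fact (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))
;;   (let () (fact 5)))
```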


slide-89
SLIDE 89

Macros & special forms (continued)

letrec (continued)

Why does this macro-expansion work?

▶ Notice that the body of the original letrec-expression is now wrapped within a (let () ...), and any nested define-expressions will appear at the top of that let, which is perfectly acceptable

slide-90
SLIDE 90

Macros & special forms (continued)

letrec (continued)

There is something fundamentally unsavory about the last two macro-expansions for letrec:

▶ The letrec form has to do with defining locally-recursive procedures

▶ Recursion forms a cornerstone for functional programming

▶ In both cases, the macro-expanded code contains assignments, which are side-effects

▶ Side-effects are specifically excluded in pure functional programming

☞ It seems as if there is something about recursion that requires side effects, and this raises doubts about the entire functional programming agenda: Is functional programming not powerful enough to express one of its most basic ideas?

slide-91
SLIDE 91

Macros & special forms (continued)

letrec (continued)

▶ The short answer is that yes, functional programming can express the idea of recursion in a way that is natural and native to functional programming, without any side-effects

▶ To understand this answer, we will need to

▶ Study some fixed-point theory

▶ Think harder about what recursion really means

▶ The full answer to this question is one of the most beautiful and exciting topics in the foundations of computer science and in programming languages theory

▶ For the time being, we move on to further topics that are necessary for you to work on your compiler projects

▶ We shall return to the topic in several weeks… Stay tuned!

slide-92
SLIDE 92

Macros & special forms (continued)

cond

The cond form has the general form:

(cond ⟨rib1⟩ · · · ⟨ribn⟩)

There are 3 kinds of cond-ribs:

▶ ① The common form (⟨expr⟩ ⟨expr1⟩ · · · ⟨exprm⟩), where ⟨expr⟩ is the test-expression: It is evaluated, and if not false, the rib is satisfied, all subsequent ribs are ignored, the corresponding implicit sequence is evaluated, and the value of its final expression is returned.

slide-93
SLIDE 93

Macros & special forms (continued)

cond (continued)

The cond form has the general form:

(cond ⟨rib1⟩ · · · ⟨ribn⟩)

There are 3 kinds of cond-ribs:

▶ ② The arrow form (⟨expr⟩ => ⟨exprf⟩), where ⟨expr⟩ is evaluated: If non-false, the rib is satisfied, and the return value is the application of ⟨exprf⟩ to the value of ⟨expr⟩.

slide-94
SLIDE 94

Macros & special forms (continued)

cond (continued)

The cond form has the general form:

(cond ⟨rib1⟩ · · · ⟨ribn⟩)

There are 3 kinds of cond-ribs:

▶ ③ The else-rib has the form (else ⟨expr1⟩ · · · ⟨exprm⟩). It is satisfied immediately, and all subsequent ribs are ignored. The implicit sequence is evaluated, and the value of its final expression is returned.

slide-95
SLIDE 95

Macros & special forms (continued)

cond (continued)

The cond form macro-expands into nested if-expressions:

▶ ① The general form of the rib converts into an if-expression with a condition and an explicit sequence for the then-clause. The else-clause of the if-expression continues the expansion of the cond:

slide-96
SLIDE 96

Macros & special forms (continued)

cond (continued)

The cond form macro-expands into nested if-expressions:

▶ ② The arrow-form of the rib converts into a let that captures the value of the test, and if not false, passes it on to the function. For test-expression ⟨expr⟩, and function-expression ⟨exprf⟩, the following expansion would do:

(let ((value ⟨expr⟩)
      (f (lambda () ⟨exprf⟩))
      (rest (lambda () ⟨continue with cond-ribs⟩)))
  (if value
      ((f) value)
      (rest)))

slide-97
SLIDE 97

Macros & special forms (continued)

cond (continued)

The cond form macro-expands into nested if-expressions:

▶ ③ The else-form of the rib converts into a begin-expression, and subsequent ribs are ignored

slide-98
SLIDE 98

An example of expanding cond

cond form

(cond ((zero? n) (f x) (g y))
      ((h? x) => (p q))
      (else (h x y) (g x))
      ((q? y) (p x) (q y)))

Expanded form

(if (zero? n)
    (begin (f x) (g y))
    (let ((value (h? x))
          (f (lambda () (p q)))
          (rest (lambda () (begin (h x y) (g x)))))
      (if value
          ((f) value)
          (rest))))


slide-99
SLIDE 99

Macros & special forms (continued)

The expansion of quasiquote-expressions

▶ Quasiquote-expressions are expanded twice:

▶ Once in the reader, when the forms `⟨sexpr⟩, ,⟨sexpr⟩, ,@⟨sexpr⟩, and in fact '⟨sexpr⟩ too, are converted to their list forms: (quasiquote ⟨sexpr⟩), (unquote ⟨sexpr⟩), (unquote-splicing ⟨sexpr⟩), and (quote ⟨sexpr⟩), respectively

▶ And a second time when quasiquote-expressions are expanded away in the tag-parser

☞ This is what we’re focusing on here!

slide-100
SLIDE 100

Macros & special forms (continued)

The expansion of quasiquote-expressions (cont)

▶ Since R6RS, quasiquote-expressions can be nested, which means we can quasiquote quasiquoted expressions… This is complex, and not terribly useful, so we’re not going to support it: We assume ordinary quasiquote-expressions not to include quasiquote-expressions

▶ We assume we have already received the form (quasiquote ⟨sexpr⟩), and are now going to reason about ⟨sexpr⟩

slide-101
SLIDE 101

Macros & special forms (continued)

The expansion of quasiquote-expressions (cont)

▶ ① Upon receiving the expression (unquote ⟨sexpr⟩), we return ⟨sexpr⟩

▶ ② Upon receiving the expression (unquote-splicing ⟨sexpr⟩), we return the same expression preceded by a quote: (quote (unquote-splicing ⟨sexpr⟩))

▶ ③ Given either the empty list or a symbol, we wrap (quote · · · ) around it

▶ ④ Given a vector, we convert it to a list, expand the list using the quasiquote-expander, and convert it back to a vector using the built-in procedure list->vector

slide-102
SLIDE 102

Macros & special forms (continued)

The expansion of quasiquote-expressions (cont)

This is the heart of the algorithm:

▶ ⑤ Given a pair, let A be the car, and let B be the cdr, respectively.

▶ If A = (unquote-splicing ⟨sexpr⟩), then return (append ⟨sexpr⟩ B), with B replaced by its own expansion

▶ Otherwise, return (cons A B), with A and B replaced by their own expansions

☞ You should implement this quasiquote-expansion in your tag-parser!
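Rules ① through ⑤ can be sketched as a short recursive procedure. The names expand-qq and tagged? are hypothetical. Two choices are assumptions of this sketch: atoms other than symbols and the empty list (numbers, strings, booleans) are returned unchanged as self-evaluating, and a vector expands to a call to list->vector on the expanded list. The inputs are written with the reader abbreviations ,x and ,@x inside a quoted datum.

```scheme
;; Is sexpr a two-element list of the form (tag something)?
(define (tagged? sexpr tag)
  (and (pair? sexpr)
       (eq? (car sexpr) tag)
       (pair? (cdr sexpr))
       (null? (cddr sexpr))))

;; Hypothetical quasiquote-expander: sexpr is the argument of quasiquote.
(define (expand-qq sexpr)
  (cond ((tagged? sexpr 'unquote) (cadr sexpr))                   ; rule 1
        ((tagged? sexpr 'unquote-splicing) (list 'quote sexpr))   ; rule 2
        ((or (null? sexpr) (symbol? sexpr)) (list 'quote sexpr))  ; rule 3
        ((vector? sexpr)                                          ; rule 4
         (list 'list->vector (expand-qq (vector->list sexpr))))
        ((pair? sexpr)                                            ; rule 5
         (let ((a (car sexpr))
               (b (cdr sexpr)))
           (if (tagged? a 'unquote-splicing)
               (list 'append (cadr a) (expand-qq b))
               (list 'cons (expand-qq a) (expand-qq b)))))
        (else sexpr)))                 ; assumption: other atoms self-evaluate

(define ex1 (expand-qq '(,a b)))       ; => (cons a (cons 'b '()))
(define ex2 (expand-qq '(,@a b)))      ; => (append a (cons 'b '()))
(define ex3 (expand-qq '(,@a . ,b)))   ; => (append a b)
```

The sketch produces the unoptimized output, so for some inputs it yields a longer but equivalent expression than a hand-optimized expansion would.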


slide-103
SLIDE 103

Macros & special forms (continued)

The expansion of quasiquote-expressions (cont)

Some examples:

sexpr      expansion
,x         x
,@x        error
(a b)      (cons 'a (cons 'b '()))
(,a b)     (cons a (cons 'b '()))
(a ,b)     (cons 'a (cons b '()))
(,@a b)    (append a (cons 'b '()))
(a ,@b)    (cons 'a (append b '()))

slide-104
SLIDE 104

Macros & special forms (continued)

The expansion of quasiquote-expressions (cont)

Some examples:

sexpr          expansion
(,a ,@b)       (cons a (append b '()))
(,@a ,@b)      (append a (append b '()))
(,@a . ,b)     (append a b)
(,a . ,b)      (cons a b)
(,a . ,@b)     (list a 'unquote-splicing 'b)
(((,@a)))      (cons (cons (append a '()) '()) '())
#(a ,b c ,d)   (vector 'a b 'c d)

slide-105
SLIDE 105

Macros & special forms (continued)

The expansion of quasiquote-expressions (cont)

▶ The expansion is a trivial recursive procedure

▶ The expansion isn’t always the shortest

▶ The expansion can be made to compete with the best expanders, if you optimize the output using a pattern-based post-processor that iterates over the output until it reaches a fixed point

slide-106
SLIDE 106

Macros & special forms (continued)

The expansion of quasiquote-expressions (cont)

☞ The real question is what to optimize:

▶ Optimizing for time: The expanded form should call fewer functions

▶ Optimizing for space: The expanded form should avoid recreating constants when possible

▶ Example: Consider `(a ,b c d e f)

▶ Naïve output: (cons 'a (cons b (cons 'c (cons 'd (cons 'e (cons 'f '()))))))

▶ Optimized for time: (list 'a b 'c 'd 'e 'f) (save on function calls)

▶ Optimized for space: (cons 'a (cons b '(c d e f))) (save on pairs)

slide-107
SLIDE 107

The Tag-Parser (continued)

How to write a tag-parser (continued)

▶ For macro-expanded forms

▶ ① Use pattern-matching to match over the concrete syntax of various syntactic forms

▶ Consult the list of special forms that need to be macro-expanded

▶ ② Perform any additional testing necessary

▶ ③ Call macro-expanders, which are functions that are specific to each special form

▶ For example, expand_let, expand_cond, etc

▶ The macro-expanders should convert the concrete syntax of the special form to the concrete syntax of the expanded form

▶ You may still use expandable special forms in the expansion

▶ For example, expanding letrec may result in a let-expression, that will be expanded by the expand_let expander…

slide-108
SLIDE 108

The Tag-Parser (continued)

How to write a tag-parser (continued)

▶ For macro-expanded forms (continued)

▶ ④ Call the tag_parser recursively on what the macro-expanders return

▶ For example, tag_parse(expand_let e)
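The four steps can be sketched in miniature, here in Scheme for continuity with the rest of the chapter. Everything in the sketch is an assumption made for illustration: the tags var, const, lambda-simple, seq, and applic stand in for whatever constructors your expr datatype uses, only a handful of forms are handled, and expand-let is the hypothetical expander for let.

```scheme
;; Hypothetical macro-expander for let (concrete syntax to concrete syntax).
(define (expand-let sexpr)
  `((lambda ,(map car (cadr sexpr)) ,@(cddr sexpr))
    ,@(map cadr (cadr sexpr))))

;; A toy tag-parser producing tagged lists as a stand-in for the expr datatype.
(define (tag-parse sexpr)
  (cond ((symbol? sexpr) (list 'var sexpr))
        ((not (pair? sexpr)) (list 'const sexpr))
        ((eq? (car sexpr) 'quote) (list 'const (cadr sexpr)))
        ;; macro-expanded form: expand, then tag-parse the result again
        ((eq? (car sexpr) 'let) (tag-parse (expand-let sexpr)))
        ((eq? (car sexpr) 'lambda)
         (list 'lambda-simple
               (cadr sexpr)
               (cons 'seq (map tag-parse (cddr sexpr)))))
        (else (list 'applic
                    (tag-parse (car sexpr))
                    (map tag-parse (cdr sexpr))))))

(define parsed (tag-parse '(let ((x 1)) (f x))))
;; parsed =>
;; (applic (lambda-simple (x) (seq (applic (var f) ((var x)))))
;;         ((const 1)))
```

The let clause never builds an abstract-syntax node of its own: it hands the expanded concrete syntax back to tag-parse, so any special forms remaining in the expansion are handled by the same dispatch.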

slide-109
SLIDE 109

Roadmap

▶ Expressions in Scheme

🗹 The expr datatype

🗹 The Tag-Parser

🗹 Macros & special forms

▶ Lexical hygiene

slide-110
SLIDE 110

Lexical hygiene

We had to deal with lexical hygiene issues in some of our macro-expansions. So at the end of the chapter on macro-expansion, this is worth recapitulating:

▶ Hygiene problems occur when variables that are introduced during macro-expansion are visible to user-code

▶ When user-code refers to such variables, this is always an error

▶ We refer to such an error as variable-name capture

▶ We can avoid variable-name capture by

▶ Using gensym to generate an uninterned symbol

▶ By finding in the expansion a “safe” variable-name to use

▶ By changing the macro-expansion so that administrative variables are invisible to user-code
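A small sketch may help make capture concrete. It uses a hypothetical two-argument form (my-or e1 e2), which is not a form from this chapter, and compares hand-written expansions of the user code (let ((x 5)) (my-or #f x)), which should evaluate to 5. The procedure gensym is assumed to be provided by the implementation (it is in Chez Scheme and several others), since it is not part of the standard reports.

```scheme
;; Naive expansion: (my-or e1 e2) = (let ((x e1)) (if x x e2)).
;; The administrative variable x captures the user's x inside e2.
(define naive
  (let ((x 5))
    (let ((x #f)) (if x x x))))
;; naive => #f, which is wrong

;; Changed expansion: e2 is wrapped in a thunk that is created outside
;; the scope of the administrative variables, so they are invisible to it.
(define thunked
  (let ((x 5))
    (let ((x #f) (y (lambda () x)))
      (if x x (y)))))
;; thunked => 5

;; Expander that uses gensym to obtain a fresh name for the temporary.
(define (expand-my-or e1 e2)
  (let ((tmp (gensym)))
    `(let ((,tmp ,e1)) (if ,tmp ,tmp ,e2))))
```

With the gensym-based expander, the temporary can never coincide with a name written in user code, so the user's x in e2 keeps referring to the user's own binding.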


slide-111
SLIDE 111

Roadmap

🗹 Expressions in Scheme

🗹 The expr datatype

🗹 The Tag-Parser

🗹 Macros & special forms

🗹 Lexical hygiene

slide-112
SLIDE 112

Further reading

🕯 The Structure and Interpretation of Computer Programs, by Abelson and Sussman: A classical introductory text on programming, abstraction, interpretation. This is the textbook used for a legendary course at MIT by the same title.

🕯 Scheme and the Art of Programming, by Springer and Friedman: One of the best books on Scheme and functional programming ever written. Very thorough and systematic.