Compiler Construction
Mayer Goldberg \ Ben-Gurion University Friday 23rd October, 2020
Mayer Goldberg \ Ben-Gurion University Compiler Construction 1 / 193
Chapter 1

Goals
▶ Establishing common language & vocabulary
▶ Understanding the “big picture”

Agenda
▶ Some background in programming languages
  ▶ Abstraction
  ▶ Dynamic vs Static
  ▶ Functional vs Imperative languages
▶ Introduction to compiler construction
▶ Introduction to the ocaml programming language
▶ Abstraction is a way of moving from the particular to the general
▶ Abstraction appears in mathematics, logic, and computer science
▶ Abstraction is a force-multiplier, and a great time-saver
▶ Abstraction in logic: going from propositions to quantified
▶ Specific: Hx0 → Ex0 means that if [the specific] x0 is hungry,
▶ General:
▶ Abstraction in mathematics: going from expression to function
▶ Consider the expression x · sin²(1 + x)
▶ This expression denotes a number; not a function
▶ Writing
▶ When we write
▶ f(x) = x · sin²(1 + x)
▶ f′(x) = sin²(1 + x) + 2x · sin(1 + x) · cos(1 + x)
▶ Abstraction in mathematics: going from expression to function
▶ The only thing “wrong” here is that we gave the function a
▶ This is “wrong” because the choice of the name is arbitrary
▶ Functions can be written anonymously using λ-notation:
▶ The symbol λ (“lambda”) just means “anonymous function”
▶ In theory (the λ-calculus): λx.(x · sin²(1 + x))
▶ In programming (Scheme): (lambda (x) (* x (square
▶ The expression x · sin²(1 + x) is concrete, and for a specific x
▶ The expression λx.(x · sin²(1 + x)) is an abstraction: We say
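The step from expression to anonymous function can be sketched in ocaml. This is only an illustration: the names value, f and f' and the sample point 2.0 are assumptions, not part of the slides.

```ocaml
(* The concrete expression: it denotes a number only once x is fixed. *)
let x = 2.0
let value = x *. (sin (1.0 +. x)) ** 2.0

(* Abstracting over x gives an anonymous function, as in lambda-notation. *)
let f = fun x -> x *. (sin (1.0 +. x)) ** 2.0

(* The derivative given on the slide, written the same way. *)
let f' = fun x ->
  (sin (1.0 +. x)) ** 2.0 +. 2.0 *. x *. sin (1.0 +. x) *. cos (1.0 +. x)
```

Applying f to 2.0 recovers the concrete value, which is the sense in which the function generalizes the expression.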
▶ Abstraction in programming
▶ Functional programming: Similar to abstraction
  ▶ Expressions are abstracted into functions
  ▶ Functions are abstracted into higher-order functions
  ▶ Collections of functions are abstracted into modules
  ▶ Modules are abstracted into functors
  ▶ Modules are abstracted into signatures
▶ Procedural programming
  ▶ Statements are abstracted into procedures

▶ Abstraction in programming (continued)
▶ Object-oriented programming
  ▶ Objects are abstracted into classes
  ▶ Classes are abstracted into generics & templates
  ▶ Classes are abstracted into packages
▶ Logic programming
  ▶ Similar to abstraction in logic
  ▶ Propositions are abstracted into predicates
▶ Textual abstraction
  ▶ Text is abstracted into templates
▶ Notice the text variables that are embedded within the template
▶ All junk mail, whether printed or electronic, is generated in this
▶ In word-processing, this functionality is known as mail merge:
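Textual abstraction can be sketched as a function from text variables to text. The template wording and the labels name, city and prize below are hypothetical; the slide's own template is not reproduced here.

```ocaml
(* A template is text with holes; filling the holes is function application. *)
let letter ~name ~city ~prize =
  Printf.sprintf
    "Dear %s,\nYou and your neighbours in %s may already have won %s!"
    name city prize

(* Mail merge: apply the same template to every record in a list. *)
let mail_merge records =
  List.map (fun (name, city, prize) -> letter ~name ~city ~prize) records
```

Each record yields one letter, so the template is written once and reused for every recipient.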
▶ Dynamic & Static are used in many areas of computer science
▶ In programming languages theory, these terms have a very
▶ Programming languages where
▶ Types exist at compile-time only
▶ Terms exist at run-time only
▶ You’ve already been exposed to functional programming in your
▶ Functional languages are often described as languages lacking in
▶ Imperative languages have side effects
▶ What can we say positively about functional programming?
▶ Computer science is the illegitimate child of mathematics and
▶ Electrical engineering gave us the hardware, the actual machine
▶ Nearly all ideas come from mathematics
▶ Digital electronics: Gates, flip-flops, latches, memory, etc
▶ Boolean functions that read their inputs from memory & write
▶ Reading & writing are synchronized using a clock (with a
▶ A finite-state machine
▶ We cannot design large software systems while thinking at the
▶ We need some theoretical foundations for programming &
▶ A language
▶ A set of ideas, notions, definitions, techniques, results, all
▶ Programming is based on a computable mathematics
▶ Theoretical computer science uses all of mathematics
▶ A language
▶ Notions such as functions, variables, expressions, evaluation,
▶ Operations such as arithmetic, Boolean, structural (e.g., on n
▶ Computers can only approximate real numbers
▶ Computers cannot implement infinite tape (Turing machines)
▶ Mathematical objects are cheaper than objects created on a
▶ Functions are mappings; They take no time!
▶ Knowing that an object exists is often all we need!
▶ Bad things cannot happen:
  ▶ No exceptions, errors, incorrect results
  ▶ Nothing is lost, nothing is “too big” or “too much”
▶ Closer to mathematics
▶ Easier to reason about
▶ Easier to transform
▶ Easier to generate automatically

▶ Farther from mathematics
▶ Harder to reason about
▶ Harder to transform
▶ Harder to generate automatically
▶ To simplify matters, let’s pretend there is a string type in C
▶ You teach them a simplified version of printf:
▶ Only takes a single string argument
▶ Returns an int: the number of characters in the string
▶ Roughly: printf : string -> int
▶ But the logician objects: He already knows of a function from
▶ strlen : string -> int
▶ He wants to know the difference between printf and strlen
▶ You: “Simple, printf also prints the string to the screen!”
▶ Logician: “What does it mean to print??”
▶ You: “Seriously?? The printf function prints its argument to
▶ Logician: “But you said the domain of printf is string -> int,
▶ You: “Yes, so?”
▶ Logician: “Then where’s the screen??”
▶ You: “In front of you!”
▶ Logician: “Where’s the screen in the domain of the function
▶ You: “It isn’t in the domain. You can think of the screen as a
▶ Logician: “I have no idea what you mean: How can the screen
▶ You: “But that’s the whole point of this printing being a side
▶ Logician: “Well, then printf isn’t a function!”
▶ You: “Ummm…”
▶ Logician (having a Eureka!-moment): “I get it now! You got the
▶ The real type of printf is string × screen → int × screen
▶ The underlined parts of the type are implicit, i.e., they are not
▶ The implicit parts of the type form the environment
▶ The function call mentions only the explicit arguments
▶ Leaving out the implicit arguments in the function call creates
▶ In fact, nothing has changed: The screen in the domain has
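A minimal ocaml sketch of the logician's reading of printf, under the assumption that a screen can be modelled as the text printed so far. The names screen and printf_pure are hypothetical.

```ocaml
(* Assumption: the screen is just the text that has been printed so far. *)
type screen = string

(* strlen really is a function from strings to integers. *)
let strlen (s : string) : int = String.length s

(* printf with the screen made explicit:
   string * screen -> int * screen *)
let printf_pure ((s, scr) : string * screen) : int * screen =
  (String.length s, scr ^ s)
```

With the screen explicit, printf_pure and strlen differ in their types, which answers the logician's question about how the two differ.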
▶ Introducing side effects introduces discrete time
▶ Having introduced time, we must now introduce sequencing:
▶ The notion of sequencing is, like time, illusory:
▶ The screen object in the range of printf("Hello "); is the
▶ So the two printf expressions are nested, and this is why their
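The nesting can be made visible in ocaml by threading the screen by hand. This is a sketch: the screen model and printf_pure are assumptions, and the second string "world!\n" is a hypothetical stand-in for the second call.

```ocaml
type screen = string

let printf_pure (s, (scr : screen)) : int * screen =
  (String.length s, scr ^ s)

(* Two calls in sequence, with the screen passed along explicitly. *)
let run (scr0 : screen) : int * screen =
  let (_, scr1) = printf_pure ("Hello ", scr0) in
  let (n, scr2) = printf_pure ("world!\n", scr1) in
  (n, scr2)

(* The same computation as a single nested expression: the screen in the
   range of the first call is the screen in the domain of the second. *)
let run_nested (scr0 : screen) : int * screen =
  printf_pure ("world!\n", snd (printf_pure ("Hello ", scr0)))
```

Both versions compute the same pair, so the order of the two calls is fixed by data dependency rather than by any separate notion of time.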
▶ The return value of the second call
▶ The screen after the second call to
▶ Closer to the mathematical notions of function, variable,
▶ Nothing is implicit
▶ Easier to reason about
▶ Side effects are not an explicit part of the language (although
▶ Offers many other advantages
▶ Farther away from the mathematical notions such as function,
▶ Hides information through the use of implicit arguments (for
▶ Harder to reason about: Contains notions such as side effects,
▶ Abstraction is harder, prone to errors
▶ Side effects create implicit, knotty inter-dependencies between
▶ Values ⇒ Expressions ⇒ Functions
▶ Higher-order functions
▶ Mathematical operators: mapping, folding, filtering, partitioning,
▶ The interpreter evaluates expressions

▶ State ⇒ Change ⇒ Commands ⇒ Procedures
▶ Object-oriented programming
▶ Imperative ≡ Based upon commands (imperare means to
▶ The interpreter performs commands
▶ There are very few strictly functional languages, i.e., languages
▶ Most languages are quasi-functional: They don’t make it
▶ Most new imperative languages do include features from
▶ anonymous functions (“lambda”)
▶ higher-order functions
▶ modules/namespaces/functors
▶ L1: Language
▶ L2: Language
▶ P_L: A program in language L
▶ Values: A set of values
▶ ⟦·⟧: Semantic brackets

▶ An interpreter is a function Int_L : L → Values
▶ Interpreters map expressions to their values
▶ For example: In functional Scheme, we have
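An interpreter as a function from expressions to values can be sketched in ocaml. The tiny arithmetic language below is a hypothetical stand-in for L, not the Scheme subset used in the course.

```ocaml
(* A hypothetical language L of arithmetic expressions. *)
type expr =
  | Num of int
  | Add of expr * expr
  | Mul of expr * expr

(* interp : expr -> int, i.e. a function from L to Values. *)
let rec interp (e : expr) : int =
  match e with
  | Num n -> n
  | Add (a, b) -> interp a + interp b
  | Mul (a, b) -> interp a * interp b
```

Because the language has no side effects, the value of an expression depends only on the expression itself.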
▶ An interpreter is a function
▶ Interpreters map the product of an expression and an
▶ The environments are implicit in the imperative language
▶ When the environment in the domain is not equal to the
▶ For example: In imperative Scheme, we have
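For the imperative case, a sketch in ocaml with the environment made explicit. The language, the constructor names and the association-list environment are all assumptions chosen for brevity.

```ocaml
(* Assumption: an environment maps variable names to integer values. *)
type env = (string * int) list

type cmd =
  | Num of int
  | Var of string
  | Add of cmd * cmd
  | Set of string * cmd   (* assignment: the only construct that changes env *)
  | Seq of cmd * cmd      (* do the first, then the second *)

(* interp : cmd * env -> int * env
   Var raises Not_found when the variable is unbound. *)
let rec interp (e, (env : env)) : int * env =
  match e with
  | Num n -> (n, env)
  | Var x -> (List.assoc x env, env)
  | Add (a, b) ->
    let (va, env1) = interp (a, env) in
    let (vb, env2) = interp (b, env1) in
    (va + vb, env2)
  | Set (x, rhs) ->
    let (v, env1) = interp (rhs, env) in
    (v, (x, v) :: List.remove_assoc x env1)
  | Seq (a, b) ->
    let (_, env1) = interp (a, env) in
    interp (b, env1)
```

When the environment returned differs from the one passed in, the program has had a side effect; here only Set can cause that.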
▶ A compiler is a function $\mathrm{Comp}_{L_1}^{L_2} : L_1 \to L_2$
▶ Compilers translate programs from one language to another
▶ Let $P_{L_1} \in L_1$, then $\mathrm{Comp}_{L_1}^{L_2}[\![P_{L_1}]\!] \in L_2$
▶ The correctness of the translation is established using
$\mathrm{Int}_{L_1}[\![P_{L_1}]\!] = \mathrm{Int}_{L_2}[\![\mathrm{Comp}_{L_1}^{L_2}[\![P_{L_1}]\!]]\!]$
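The correctness condition can be checked on a toy example in ocaml. Both languages below are hypothetical: L1 is a language of arithmetic expressions, and L2 is a list of stack-machine instructions.

```ocaml
(* L1: arithmetic expressions. *)
type expr = Num of int | Add of expr * expr | Mul of expr * expr

(* L2: instructions for a small stack machine. *)
type instr = Push of int | IAdd | IMul

(* The interpreter for L1. *)
let rec interp_l1 = function
  | Num n -> n
  | Add (a, b) -> interp_l1 a + interp_l1 b
  | Mul (a, b) -> interp_l1 a * interp_l1 b

(* The compiler from L1 to L2: operands first, then the operator. *)
let rec compile = function
  | Num n -> [Push n]
  | Add (a, b) -> compile a @ compile b @ [IAdd]
  | Mul (a, b) -> compile a @ compile b @ [IMul]

(* The interpreter for L2: run the instructions against a stack. *)
let interp_l2 (prog : instr list) : int =
  let step stack i =
    match i, stack with
    | Push n, _ -> n :: stack
    | IAdd, b :: a :: rest -> (a + b) :: rest
    | IMul, b :: a :: rest -> (a * b) :: rest
    | _ -> failwith "stack underflow"
  in
  match List.fold_left step [] prog with
  | [v] -> v
  | _ -> failwith "ill-formed program"
```

For any expression p, interp_l1 p and interp_l2 (compile p) should agree; that equality is the correctness statement specialised to this pair of languages.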
▶ We may chain any number of compilers together:
$\mathrm{Int}_{L_1}[\![P_{L_1}]\!] = \mathrm{Int}_{L_2}[\![\mathrm{Comp}_{L_1}^{L_2}[\![P_{L_1}]\!]]\!] = \mathrm{Int}_{L_3}[\![\mathrm{Comp}_{L_2}^{L_3}[\![\mathrm{Comp}_{L_1}^{L_2}[\![P_{L_1}]\!]]\!]]\!]$
▶ Returns the same value (as the original code) under
▶ The interpreter takes fewer resources to evaluate the code.
▶ Computer memory
▶ Network traffic
▶ Microprocessor registers
▶ Compilers to g-code, a language for CNCs, will try to minimize
▶ Compilers of graphs to visual presentations will try to minimize
▶ Compilers of document-layout languages will try to minimize
▶ Good code is hard to write
▶ Some programmers are incompetent
▶ Compilers are faster at detecting opportunities for optimizations
▶ Consistent output: Once debugged, compilers make no mistakes
▶ Code I
  ▶ less general than Code III
  ▶ less efficient than Code II
  ▶ less maintainable than Code III
▶ Code II
  ▶ less general than Code I
  ▶ more efficient than Code I, Code III
  ▶ less maintainable than Code I
▶ Code III
  ▶ more general than Code I, Code II
  ▶ less efficient than Code II
  ▶ more maintainable than Code I, Code II
▶ Clearly Code III is better!
▶ It is more general
▶ It is more maintainable
▶ It is more flexible
▶ Most compilers optimize away such inefficiency
▶ Want to program in a more general, maintainable, flexible way
▶ Compilation is a point-in-time where generality is traded for
▶ Compilers are opportunistic
▶ Compilers identify opportunities to trade generality for efficiency
▶ The resulting code is unreadable, unmaintainable,
▶ We [normally] don’t touch the compiled code: For
▶ Processors running programs written
▶ Interpreters written in L′, running
▶ Compilers written in L′, running on
▶ Programs written in L′, running on L
▶ Hardware-implementations of
▶ They have only 1 docking point:
[Diagram: a processor block; docking point: provides lang]
▶ Programs have at least 1 docking
▶ Mach Lang programs are
▶ Other programs are written in a
[Diagram: a machine-language program block (docking point: runs) and a program block (docking points: src lang, runs)]
▶ Interpreters come with 3 docking
▶ The language they provide
▶ The language [interpreter] on
▶ The [source] language in which
[Diagram: an interpreter block; docking points: src lang, provides lang, runs]
▶ Compilers come with 4 docking
▶ The language they compile from
▶ The language they compile to
▶ The language in which they were
▶ The language [interpreter] on
[Diagram: a compiler block (compiling program); docking points: src lang, dst lang, runs]
▶ Interpreters & compilers are often composed in complex ways
▶ Diagrams provide a simple, visual way to make sure that the
▶ The program must be written in the
▶ The two blocks join naturally
[Diagram: a machine-language program block docked onto a processor block]
▶ The program must have been
▶ The two blocks join naturally
▶ We are not saying anything about
[Diagram: a program block (src lang, runs) docked onto a processor block]
▶ Interpreters are similar to processors
▶ They execute/evaluate programs
▶ Interpreters are programs too!
▶ Written in some source language
▶ Compiled into some target
▶ They run on an interpreter:
  ▶ a processor (hardware)
  ▶ a program (another interpreter)
[Diagram: a program block docked onto an interpreter block, which is docked onto a processor block]
▶ Interpreters can execute/evaluate
▶ takes the program (on top) as
▶ outputs the program (on the
▶ It is the target program that is
[Diagram: a compiler block (src lang, dst lang, runs) joined with two program blocks, an interpreter block, and a processor block at their docking points]
▶ We may add additional details
▶ the processor/interpreter on
▶ We are still missing details
▶ the compiler that compiled the
▶ the compiler that compiled the
▶ …this can go on!
[Diagram: the previous diagram extended with a second processor block: a compiler, two programs, an interpreter, and two processors joined at their docking points]
▶ The processor on which the compiler runs is different from the
▶ It crosses the boundaries of architectures.
▶ The Java compiler javac is an example of a cross-compiler:
  ▶ It runs on [e.g.,] x86
  ▶ It generates Java-byte-code that runs on the JVM
  ▶ The JVM is an interpreter (java) running on [e.g.,] x86
▶ Interpreters may be stacked up in a
▶ Towers of interpreters consume
▶ Unless there is a marked
▶ Virtual machines (VMs) can be
▶ IBM mainframe architecture
[Diagram: a tower of interpreters: a program block on top of three stacked interpreter blocks, with a processor block at the bottom]
▶ Using previously-written compilers and interpreters, of course!
▶ This process is known as bootstrapping, and it is a specific form
▶ c1 is an assembler acting as a cross-compiler
▶ c2 already runs on our PC, but it was created on an IBM
▶ All the effort of writing an assembler (in Pascal) has to be
▶ Any updates, upgrades, bug fixes, changes, require that we
▶ c3 is essentially c2 compiling itself
▶ With c3, we are free from our old environment: Pascal on IBM
▶ With c4 we’re diverging:
▶ c4 is a C compiler
▶ We don’t yet support many features
▶ c5 is a C compiler written in C! Notice that
  ▶ it is written in an older version of C (v. 0.1)
  ▶ it supports a newer version of C (v. 0.2)
▶ In writing c6, we finally get to use all the language features our
▶ Almost all of you shall use compilers
▶ Most of you shall never work on another compiler after this
▶ Better understanding of programming languages
▶ Reduce the learning-curve for learning a new programming
▶ Better understanding of what compilers can & cannot do
▶ Demystify the compilation process
▶ Like having a team of programmers working for you
▶ Save time; Write code quickly
▶ Gain consistency
▶ Spread your bugs everywhere 🎊
  ▶ A bug in the code generator will spread bugs to many places
  ▶ This actually makes the bugs easier to find!
  ▶ Once debugged, the code is generated again, and the same
▶ Knowledge of how compilers work will make you a more effective
▶ Learning to write code that writes code is a force-multiplying
▶ Abstraction
▶ Dynamic vs Static
▶ Functional vs Imperative languages
▶ The language in which to write the compiler: ocaml
▶ The language we shall be compiling: Scheme
▶ The language we shall be compiling to: x86/64 assembly
▶ ML is a family of statically-typed, quasi-functional programming
▶ The main members of ML are
▶ SML (Standard ML)
▶ ocaml
▶ In Microsoftese, ocaml is called F#…

▶ is used all over the world
▶ is used in commercial and open source projects
▶ is powerful, efficient, convenient, modern, elegant, and has a
▶ supports both functional and object-oriented programming
▶ The ocaml object system is very powerful!
▶ makes it very difficult to have run-time errors!
▶ Very rich language
▶ Great support for abstractions of various kinds
▶ Great library support: dbms, networking, web programming,
▶ Compiles efficiently, either to bytecode or native
▶ Pattern-matching, modules, object-orientation, and types make
▶ Easy to enforce an API
▶ https://ocaml.org/, or
▶ I run ocaml under GNU Emacs
▶ Create the file .ocamlinit in your home directory, and place it
▶ You are free to use ocaml under any editor/environment you like
▶ For example, for ocaml under Eclipse, try OcaIDE at
▶ Ocaml is a functional programming language
▶ Ocaml is interactive
▶ You enter expressions at the prompt, and get their values and
▶ Expressions are ended with ;;
▶ We shall learn about modules later on, as part of the module
▶ In the meantime, modules are ways of aggregating functions &
▶ Functionality in ocaml is managed via loading and using
▶ Commands that start with #
▶ Non-programmable
▶ Tell you about the run-time system
▶ Change the run-time system

▶ #list;; to list available modules
▶ #cd <string>;; to change to a directory
▶ #require <string>;; to specify that a module is required
▶ #show_module <module>;; to see the signature of the module
▶ #trace <function>;; to trace a function
▶ #untrace <function>;; to untrace a function
▶ The module Pervasives (superseded by Stdlib in recent ocaml releases) contains all the “builtin” procedures in
▶ Try executing #show_module Pervasives;; and see what you
▶ lists of integers
▶ lists of strings
▶ lists of user-defined data-types
▶ 'a is called alpha and is often written using α
▶ 'b is called beta and is often written using β
▶ 'c is called gamma and is often written using γ
▶ … etc.
▶ With non-empty lists, ocaml can figure out the type of the list
▶ With empty lists, ocaml is unable to figure out the type of the
▶ You may specify the type of α:
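A short sketch of how list types are inferred and how the type variable of an empty list can be fixed by an annotation. The names are illustrative only.

```ocaml
let ints = [1; 2; 3]              (* inferred: int list *)
let strs = ["a"; "b"]             (* inferred: string list *)
let empty = []                    (* inferred: 'a list, alpha is left open *)

(* Two ways to fix alpha to int by annotating the empty list. *)
let empty_ints : int list = []
let also_empty_ints = ([] : int list)
```

The unannotated empty list stays polymorphic, so it can be used as the tail of an int list and of a string list alike.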
▶ Syntax for named functions
▶ Syntax for anonymous functions
▶ Syntax for functions with pattern-matching
▶ Syntax for recursive functions
▶ Syntax for mutually recursive functions
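The five forms listed above can be sketched with one small example each. The function names are illustrative and are not taken from the slides.

```ocaml
(* Named function *)
let square x = x * x

(* Anonymous function, bound to a name only so that it can be called below *)
let cube = fun x -> x * x * x

(* Function with pattern-matching *)
let is_empty = function
  | [] -> true
  | _ :: _ -> false

(* Recursive function: note the keyword rec *)
let rec fact n = if n <= 1 then 1 else n * fact (n - 1)

(* Mutually recursive functions: joined with and *)
let rec even n = if n = 0 then true else odd (n - 1)
and odd n = if n = 0 then false else even (n - 1)
```

The definitions of even and odd assume a non-negative argument; a negative one would recurse without end.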
▶ You don’t ordinarily need parentheses
▶ It’s not an error to have unneeded parentheses
▶ Sometimes they’re really needed!
▶ Syntax for anonymous functions
▶ Syntax for functions with pattern-matching
▶ Syntax for recursive functions
▶ Syntax for mutually recursive functions
▶ Syntax for functions with pattern-matching
▶ Syntax for recursive functions
▶ Syntax for mutually recursive functions
▶ Syntax for recursive functions
▶ Syntax for mutually recursive functions
[Equation: an expansion of log_b(a)]
▶ You can carry this algorithm to any number of steps. The
▶ This algorithm can be implemented using a simple recursive
▶ Syntax for mutually recursive functions
▶ A function is a subset of the Cartesian product of a domain-set
▶ Suppose the domain is itself a Cartesian product:
▶ Then f ⊆ ((D1 × · · · × Dn) × R).
▶ The structure ((D1 × · · · × Dn) × R) is isomorphic to
▶ The structure D1 × (D2 · · · × (Dn × R) · · · ) is the domain of a
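For two arguments, the isomorphism can be sketched in ocaml as a pair of conversions. The names curry, uncurry, add_pair, add and add5 are illustrative.

```ocaml
(* From a function on pairs to a function that takes its arguments one by one *)
let curry (f : 'a * 'b -> 'c) : 'a -> 'b -> 'c = fun x y -> f (x, y)

(* ...and back again *)
let uncurry (g : 'a -> 'b -> 'c) : 'a * 'b -> 'c = fun (x, y) -> g x y

let add_pair (x, y) = x + y      (* int * int -> int *)
let add = curry add_pair         (* int -> int -> int *)
let add5 = add 5                 (* int -> int, via partial application *)
```

The two conversions undo each other, which is what it means for the two structures to be isomorphic.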
▶ In applications, Ocaml Curries arguments naturally. This means
▶ Parameters to named functions Curry naturally. For example:
▶ To avoid Currying, you may pass tuples: In C, f(x, y, z) is a
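A small sketch of the difference between Curried parameters and a tuple parameter. The functions f, g and h are hypothetical examples.

```ocaml
(* Curried: int -> int -> int -> int *)
let f x y z = x + y * z

(* Partial application: supplying only the first argument yields a function *)
let g = f 1

(* A single tuple argument, as in C's f(x, y, z): int * int * int -> int *)
let h (x, y, z) = x + y * z
```

f can be applied to one argument at a time, whereas h must receive the whole triple at once.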
▶ Boolean Connectives
▶ Predicates
▶ Quantifiers
▶ Syllogisms
▶ Formalization using FOPL
▶ The goal of this tutorial is to make you proficient in the topic of
▶ You’ve studied some basic FOPL in your freshman course Logic
▶ Traditionally, formalization of sentences from a natural language
▶ By the end of this tutorial —
▶ You should be able to read sentences in FOPL and grasp their
▶ You should be able to translate sentences from a natural
▶ FOPL is the language of the exact sciences (mathematics &
▶ It is precise
▶ It is concise
▶ It is unambiguous
▶ It is language-neutral
▶ It is easy to check
▶ FOPL is used in your courses on calculus, discrete mathematics,
▶ FOPL is also used in the Compiler-Construction Course
▶ In recent years, FOPL emerged as a great vehicle for testing
▶ This does not mean that questions that use FOPL are
▶ We only use the language of FOPL as a way of expressing
▶ In principle, most of this tutorial should be familiar to you, if you
▶ Which is why this is a self-study tutorial!
▶ You may think of this tutorial as a refresher
▶ The aim of this tutorial is to ensure that all students have
▶ Predicates
▶ Quantifiers
▶ Syllogisms
▶ Formalization using FOPL
▶ If α is a proposition, then ¬α is a proposition
▶ If α is true, then ¬α is false, and vice versa
▶ The truth-table for negation is given by
▶ The property of double negation states that for any proposition
▶ If α, β are propositions, then α ∧ β is a proposition
▶ For α ∧ β to be true, both α, β must be true
▶ The truth-table for conjunction is given by:
▶ If α, β are propositions, then α ∨ β is a proposition
▶ For α ∨ β to be true, at least one of α, β must be true
▶ The truth-table for disjunction is given by:
▶ α ∧ β ⇔ β ∧ α
▶ α ∨ β ⇔ β ∨ α
▶ (α ∧ (β ∧ γ)) ⇔ ((α ∧ β) ∧ γ)
▶ (α ∨ (β ∨ γ)) ⇔ ((α ∨ β) ∨ γ)
▶ (α ∧ (β ∨ γ)) ⇔ ((α ∧ β) ∨ (α ∧ γ))
▶ (α ∨ (β ∧ γ)) ⇔ ((α ∨ β) ∧ (α ∨ γ))

▶ (¬(α ∨ β)) ⇔ ((¬α) ∧ (¬β))
▶ (¬(α ∧ β)) ⇔ ((¬α) ∨ (¬β))
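Since each law involves only two or three propositions, it can be checked by trying every assignment of truth values. The following ocaml sketch does that for the distributive and De Morgan laws; the helper names holds2 and holds3 are assumptions.

```ocaml
let bools = [false; true]

(* A law in two variables holds if it is true under all four assignments. *)
let holds2 law =
  List.for_all (fun a -> List.for_all (fun b -> law a b) bools) bools

(* Likewise for three variables and all eight assignments. *)
let holds3 law =
  List.for_all (fun a -> holds2 (fun b c -> law a b c)) bools

let de_morgan_or = holds2 (fun a b -> not (a || b) = (not a && not b))
let de_morgan_and = holds2 (fun a b -> not (a && b) = (not a || not b))
let distrib_and =
  holds3 (fun a b c -> (a && (b || c)) = ((a && b) || (a && c)))
let distrib_or =
  holds3 (fun a b c -> (a || (b && c)) = ((a || b) && (a || c)))
```

A check of this kind is exactly a truth-table comparison: two formulas are equivalent when their columns agree on every row.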
▶ If α, β are propositions, then α → β is a proposition
▶ Material-Implication captures the idea of if-then:
▶ Read α → β as
  ▶ If α then β
  ▶ If α is true, then β is true
  ▶ α entails β
  ▶ From α being true it follows that β is true
▶ In α → β, α is called the antecedent of the implication, and β
▶ Material-Implication is sometimes written as α ⊃ β
▶ We’ll get to that soon!
▶ Material-implication fails to hold only when the antecedent is
▶ The truth-table for Material-Implication is given by:
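The table can be generated in ocaml. The sketch assumes the standard definition of material implication as "not α, or β"; the names implies and truth_table are illustrative.

```ocaml
(* Material implication: false only when the antecedent is true
   and the conclusion is false. *)
let implies a b = (not a) || b

(* The four rows of the truth table, as (alpha, beta, alpha -> beta). *)
let truth_table =
  [ (true, true); (true, false); (false, true); (false, false) ]
  |> List.map (fun (a, b) -> (a, b, implies a b))
```

Only the row with a true antecedent and a false conclusion yields false.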
▶ “If dogs are green, then cats are green too”
▶ This is a true statement
▶ Neither the antecedent nor the conclusion holds true
▶ Therefore the implication does hold true!
▶ In natural language, when we use conditional statements
▶ We hardly ever use them vacuously
▶ When we do, it’s only for comic effect
▶ We normally intend to underscore some deep, often causal
▶ “If you drop the clock, then it surely shall break”
▶ None of these are captured by the material-implication!
▶ “If 1 + 2 = 3 then n! = Γ(n + 1)”
▶ Both the antecedent and the conclusion hold true
▶ Therefore the implication does hold true!
▶ There is no obvious way of getting from the antecedent to the
▶ These are two, unrelated, true mathematical propositions
▶ This antecedent cannot serve as evidence in establishing the
▶ There is something unnatural and superficial about the
▶ Philosophers have concerned themselves with many other kinds
▶ Modal
▶ Counterfactual
▶ Temporal
▶ Causal
▶ The word “material” in the term “material-implication” is meant
▶ Material-implication has one advantage though: It is the only
▶ The truth-functionality of material-implication means that the
▶ Any deeper implicative relation between antecedent and
▶ Material-implication can be written in terms of negation and
▶ Material implication naturally associates to the right:
▶ If α, β are propositions, then α ↔ β is a proposition ▶ Bi-implication captures the idea of if-and-only-if:
▶ Read α ↔ β as ▶ α if-and-only-if (ifg) β ▶ If α is true, then β is true, and vice versa ▶ Either α, β are both true, or both false ▶ α is equivalent to β
▶ Bi-implication is also known by the names equivalence (≡), and
▶ Bi-implication captures the idea of two propositions having the same truth value
▶ Here are two ways to define bi-implication:
▶ (α ↔ β) ⇔ ((α → β) ∧ (β → α))
▶ (α ↔ β) ⇔ ((α ∧ β) ∨ ((¬α) ∧ (¬β)))
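The two definitions can be checked against each other by enumerating all four truth assignments. The Python sketch below is only an illustration; the function names are assumptions, not notation from the slides.

```python
from itertools import product

def implies(a, b):
    """Material implication."""
    return (not a) or b

def iff_via_implications(a, b):
    """(a -> b) and (b -> a)."""
    return implies(a, b) and implies(b, a)

def iff_via_cases(a, b):
    """(a and b) or ((not a) and (not b))."""
    return (a and b) or ((not a) and (not b))

def definitions_agree():
    """Both definitions coincide with 'a and b have the same truth value'."""
    return all(
        iff_via_implications(a, b) == iff_via_cases(a, b) == (a == b)
        for a, b in product([False, True], repeat=2)
    )

print(definitions_agree())
```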
▶ Let Bool = {F, T} be the set of Boolean values
▶ The set of all Boolean functions is the set of all functions in
▶ The set of Boolean functions {¬, ∧} can be used to express any Boolean function
▶ The property of being able to express any function is called functional completeness
▶ The set {¬, ∧} is said to be functionally complete
▶ Another set of functionally-complete Boolean functions is {¬, ∨}
▶ The functions nand, nor are also functionally-complete:
▶ Hint: Try to define negation, conjunction, disjunction using
▶ Nand is also known as the Sheffer stroke, and is written as
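One possible answer to the hint, sketched in Python: negation, conjunction, and disjunction are each built from nand alone, and again from nor alone, then checked on every input. The helper names (`not_from_nand` and so on) are assumptions made for this illustration.

```python
from itertools import product

def nand(a, b):
    return not (a and b)

def nor(a, b):
    return not (a or b)

# Definitions using only nand.
def not_from_nand(a):
    return nand(a, a)

def and_from_nand(a, b):
    return nand(nand(a, b), nand(a, b))

def or_from_nand(a, b):
    return nand(nand(a, a), nand(b, b))

# Definitions using only nor.
def not_from_nor(a):
    return nor(a, a)

def or_from_nor(a, b):
    return nor(nor(a, b), nor(a, b))

def and_from_nor(a, b):
    return nor(nor(a, a), nor(b, b))

def all_definitions_correct():
    """Compare each derived connective with Python's built-in one."""
    for a, b in product([False, True], repeat=2):
        if not_from_nand(a) != (not a) or not_from_nor(a) != (not a):
            return False
        if and_from_nand(a, b) != (a and b) or and_from_nor(a, b) != (a and b):
            return False
        if or_from_nand(a, b) != (a or b) or or_from_nor(a, b) != (a or b):
            return False
    return True

print(all_definitions_correct())
```

Since {¬, ∧} is functionally complete, being able to define ¬ and ∧ from nand alone is enough to conclude that nand is functionally complete; the same argument works for nor via {¬, ∨}.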
▶ Quantifiers
▶ Syllogisms
▶ Formalization using FOPL
▶ You may think of predicates as functions from some type α to Bool
▶ The number of arguments taken by a predicate is its arity
▶ We speak of
▶ 1-place, or unary predicates
▶ 2-place, or binary predicates
▶ n-place, or n-ary predicates
▶ Predicates extend our language to express properties and
▶ When α is the type of an object in our domain of discourse, we
▶ When α is a product-type, populated by tuples of objects in our
▶ Predicates are written as Px or P(x), where x : α
▶ Example:
▶ We define the following predicates:
▶ Let Bx denote that x is a boy
▶ Let Gx denote that x is a girl
▶ Let Lxy denote that x loves y
▶ We can now express simple propositions using these predicates:
▶ “Tarzan is a boy”: B(Tarzan)
▶ “Jane is a girl”: G(Jane)
▶ “Tarzan loves Jane”: L(Tarzan, Jane)
▶ “Jane does not love Tarzan”: ¬L(Jane, Tarzan)
▶ Quantifiers extend our language so we can talk about the
▶ The universal quantifier ∀ is used to assert that some
▶ The existential quantifier ∃ is used to assert that some
▶ Example:
▶ Continuing our previous predicates:
▶ Let Bx denote that x is a boy
▶ Let Gx denote that x is a girl
▶ Let Lxy denote that x loves y
▶ How would we formalize the following:
▶ “There exist at least two boys”
▶ Answer: ∃x∃y(Bx ∧ By ∧ x ≠ y)
▶ Continuing the example with B, G, L:
▶ How would we formalize the following:
▶ “Not all girls love themselves”
▶ Answer: ¬∀x(Gx → Lxx)
▶ Later on, we shall see other ways of encoding this proposition
▶ How would we formalize the following:
▶ “All boys like Mary”
▶ Answer: ∀x(Bx → L(x, Mary))
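Over a finite domain, quantified formulas such as these can be evaluated mechanically: ∀ becomes `all` and ∃ becomes `any`. The sketch below assumes a small, made-up domain and made-up facts about who loves whom; only the predicate names B, G, L come from the slides.

```python
# Hypothetical domain of discourse and facts, invented for illustration.
DOMAIN = ["Tarzan", "Mowgli", "Jane", "Mary"]
BOYS = {"Tarzan", "Mowgli"}
GIRLS = {"Jane", "Mary"}
LOVES = {
    ("Tarzan", "Jane"),
    ("Tarzan", "Mary"),
    ("Mowgli", "Mary"),
    ("Mary", "Mary"),
}

def B(x):
    return x in BOYS

def G(x):
    return x in GIRLS

def L(x, y):
    return (x, y) in LOVES

def implies(a, b):
    return (not a) or b

# "There exist at least two boys": ∃x∃y (Bx ∧ By ∧ x ≠ y)
at_least_two_boys = any(
    B(x) and B(y) and x != y for x in DOMAIN for y in DOMAIN
)

# "Not all girls love themselves": ¬∀x (Gx → Lxx)
not_all_girls_love_themselves = not all(
    implies(G(x), L(x, x)) for x in DOMAIN
)

# "All boys like Mary": ∀x (Bx → L(x, Mary))
all_boys_like_mary = all(implies(B(x), L(x, "Mary")) for x in DOMAIN)

print(at_least_two_boys, not_all_girls_love_themselves, all_boys_like_mary)
```

Note how the implication inside the universal quantifier restricts the claim to boys (or girls): for every other member of the domain the antecedent is false, so the implication holds vacuously.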
▶ Rather than writing ∀x∀y (α), we can write ∀x, y (α)
▶ Rather than writing ∃x∃y (α), we can write ∃x, y (α)
▶ The order of quantifiers can be switched among their own kind:
▶ Universal: ∀x, y (α) is equivalent to ∀y, x (α)
▶ Existential: ∃x, y (α) is equivalent to ∃y, x (α)
▶ But the order cannot be swapped when the quantifiers are of different kinds:
▶ ∀x∃y (α) is not equivalent to ∃y∀x (α)
▶ Example: “For every person, there exists a sandwich, such that
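A small finite model makes the difference between ∀x∃y and ∃y∀x concrete. The sketch below assumes a hypothetical relation `eats(x, y)` between people and sandwiches; the names and facts are invented for illustration and are not taken from the slides.

```python
# Hypothetical model: each person eats a different sandwich.
PEOPLE = ["Ann", "Bob", "Cid"]
SANDWICHES = ["tuna", "egg", "falafel"]
EATS = {("Ann", "tuna"), ("Bob", "egg"), ("Cid", "falafel")}

def eats(x, y):
    return (x, y) in EATS

# ∀x∃y eats(x, y): every person eats some sandwich (possibly a different one each).
forall_exists = all(any(eats(x, y) for y in SANDWICHES) for x in PEOPLE)

# ∃y∀x eats(x, y): there is one single sandwich that every person eats.
exists_forall = any(all(eats(x, y) for x in PEOPLE) for y in SANDWICHES)

print(forall_exists, exists_forall)
```

In this model the first formula is true and the second is false, so the two quantifier orders cannot be equivalent. The converse direction does hold in general: ∃y∀x (α) implies ∀x∃y (α).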
▶ Formalization using FOPL
▶ A syllogism is an argument-form; a way of reasoning deductively
▶ Many of the syllogisms we present were discovered by the Greek,
▶ Although this is a refresher tutorial with the aim of
▶ We present syllogisms in the form:
▶ The symbol ∴ should be read as “therefore”
▶ It’s raining; If it’s raining, John carries an umbrella ⟹ John carries an umbrella
▶ John is not currently carrying an umbrella; If it’s raining, John carries an umbrella ⟹ It’s not raining
▶ Modus Tollendo Ponens is also known as Disjunctive Syllogism
▶ The coffee is not black; Coffee is either black, or with milk ⟹ The coffee is with milk
▶ This food contains sugar; A food cannot both contain sugar and
▶ If it’s raining, John carries an umbrella =
▶ It is not the case that this is not complicated ⟹ This is complicated
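Propositional argument-forms like these can be checked for validity by brute force: a form is valid if no truth assignment makes all premises true and the conclusion false. The following Python sketch illustrates this for a few familiar forms; the function `is_valid` and the variable names are assumptions made for the example.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def is_valid(premises, conclusion, arity):
    """Valid iff the conclusion holds in every assignment satisfying all premises."""
    return all(
        conclusion(*v)
        for v in product([False, True], repeat=arity)
        if all(p(*v) for p in premises)
    )

# p, (p -> q), therefore q
modus_ponens = is_valid(
    [lambda p, q: p, lambda p, q: implies(p, q)], lambda p, q: q, 2
)

# (not q), (p -> q), therefore (not p)
modus_tollens = is_valid(
    [lambda p, q: not q, lambda p, q: implies(p, q)], lambda p, q: not p, 2
)

# (not p), (p or q), therefore q
disjunctive_syllogism = is_valid(
    [lambda p, q: not p, lambda p, q: p or q], lambda p, q: q, 2
)

# q, (p -> q), therefore p: a well-known fallacy, included for contrast
affirming_the_consequent = is_valid(
    [lambda p, q: q, lambda p, q: implies(p, q)], lambda p, q: p, 2
)

print(modus_ponens, modus_tollens, disjunctive_syllogism, affirming_the_consequent)
```

The last form fails on the assignment where p is false and q is true: John may carry an umbrella even when it is not raining.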
▶ We have now gone through the preliminaries, and we are ready
▶ Formalization is the subject of translating faithfully propositions
▶ You have seen some examples of formalization in your calculus
▶ For example, the limit of a sequence {a_n}_{n=0}^∞ is defined as a
▶ This description contains many ideas that are not rigorous
▶ What we mean to say is that the distance between a_n and L can
▶ But this description is still problematic:
▶ What does it mean “as small as we like”?
▶ How large is “sufficiently large”?
▶ In encoding the idea of the limit of a sequence, we must clarify
▶ The distance between a_n and L is given by |a_n − L|
▶ To say that a_n can be made arbitrarily close to L means that
▶ To say that something can be made arbitrarily small means that
▶ We now need to relate this closeness to n
▶ What we are trying to say is that from a certain value of n,
▶ The way of saying this precisely is to say that there exists an
▶ “When” means “for all”
▶ Because the specific value of N depends upon the choice of ε, it
▶ The real-valued sequence {a_n}_{n=0}^∞ has a limit L, if ∀ε > 0 ∃N ∈ ℕ ∀n > N (|a_n − L| < ε)
▶ It was harder to teach, learn, and understand
▶ It was harder to verify proofs
▶ It was harder to detect subtle flaws in arguments
▶ It was harder to see the limitations of various results
▶ It was hard to see the possibilities, let alone advance into
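The quantifier structure of the limit definition (for every ε there exists an N, which may depend on ε, such that for all n > N the distance is below ε) can be illustrated numerically. The sketch below uses the sequence a_n = 1/(n + 1) with L = 0, chosen here only as an example; a finite check like this illustrates the definition but proves nothing.

```python
import math

def a(n):
    """Example sequence a_n = 1 / (n + 1), an assumption made for this sketch."""
    return 1.0 / (n + 1)

LIMIT = 0.0

def witness_N(eps):
    """An N that works for this sequence: n > N >= 1/eps gives 1/(n+1) < eps."""
    return math.ceil(1.0 / eps)

def holds_beyond(eps, N, how_many=10_000):
    """Check |a_n - LIMIT| < eps for the first `how_many` indices after N."""
    return all(abs(a(n) - LIMIT) < eps for n in range(N + 1, N + 1 + how_many))

for eps in (0.5, 0.1, 0.001):
    N = witness_N(eps)
    print(eps, N, holds_beyond(eps, N))
```

The witness N grows as ε shrinks, which is exactly why ∃N must appear after ∀ε in the formula.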
▶ The language of FOPL, like any other language, takes time and
▶ Our goal for the remainder of this tutorial is to take you through
▶ Some of these examples will be taken from mathematics, some
▶ The 2-place predicate P is called reflexive if it holds for all
▶ The 2-place predicate P is called symmetric if the order of its
▶ The 2-place predicate P is called antisymmetric if whenever the
▶ The 2-place predicate P is called transitive if it composes:
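In their standard first-order form, these four properties read: reflexive, ∀x Pxx; symmetric, ∀x, y (Pxy → Pyx); antisymmetric, ∀x, y ((Pxy ∧ Pyx) → x = y); transitive, ∀x, y, z ((Pxy ∧ Pyz) → Pxz). The Python sketch below checks them over a finite domain; the function names and the two sample relations are assumptions made for illustration.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def is_reflexive(P, dom):
    return all(P(x, x) for x in dom)

def is_symmetric(P, dom):
    return all(implies(P(x, y), P(y, x)) for x, y in product(dom, repeat=2))

def is_antisymmetric(P, dom):
    return all(
        implies(P(x, y) and P(y, x), x == y) for x, y in product(dom, repeat=2)
    )

def is_transitive(P, dom):
    return all(
        implies(P(x, y) and P(y, z), P(x, z))
        for x, y, z in product(dom, repeat=3)
    )

DOM = range(5)

def leq(x, y):
    return x <= y

def neq(x, y):
    return x != y

print(is_reflexive(leq, DOM), is_antisymmetric(leq, DOM), is_transitive(leq, DOM))
print(is_symmetric(neq, DOM), is_reflexive(neq, DOM), is_transitive(neq, DOM))
```

On this domain ≤ is reflexive, antisymmetric, and transitive but not symmetric, while ≠ is symmetric but neither reflexive nor transitive (1 ≠ 2 and 2 ≠ 1, yet 1 = 1).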
▶ We assume a membership predicate (∈)
▶ You can always name it Member(x, y)
▶ The predicate ∉ can be defined as x ∉ y ≡ ¬(x ∈ y)
▶ The existence of the empty set: ∃∅∀x (x ∉ ∅)
▶ Subset of x, y (x ⊆ y): ∀z (z ∈ x → z ∈ y)
▶ Set-equality of x, y: ∀z (z ∈ x ↔ z ∈ y)
▶ Intersection (z = x ∩ y): ∀u (u ∈ z ↔ (u ∈ x ∧ u ∈ y))
▶ Union (z = x ∪ y): ∀u (u ∈ z ↔ (u ∈ x ∨ u ∈ y))
▶ Power-set (y = ℘(x)):
▶ Using the definition of ⊆: ∀z (z ∈ y ↔ z ⊆ x)
▶ Expanding the definition of ⊆: ∀z (z ∈ y ↔ ∀u (u ∈ z → u ∈ x))
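The power-set condition can be tested on small finite sets. The sketch below checks the bi-implication form ∀z (z ∈ y ↔ ∀u (u ∈ z → u ∈ x)), with z ranging over a finite universe of candidate sets; the function names and the choice of universe are assumptions made for illustration.

```python
from itertools import chain, combinations

def power_set(x):
    """All subsets of x, as a frozenset of frozensets."""
    items = list(x)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return frozenset(frozenset(s) for s in subsets)

def satisfies_power_set_formula(y, x, universe):
    """∀z (z ∈ y ↔ ∀u (u ∈ z → u ∈ x)), with z ranging over `universe`."""
    return all((z in y) == all(u in x for u in z) for z in universe)

x = frozenset({1, 2})
universe = power_set(frozenset({1, 2, 3}))  # candidate values for z

print(len(power_set(x)))
print(satisfies_power_set_formula(power_set(x), x, universe))
print(satisfies_power_set_formula(frozenset(), x, universe))
```

The last check shows why both directions are needed: the empty collection contains only subsets of x (vacuously), so a one-directional condition z ∈ y → z ⊆ x would wrongly accept it as the power-set.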
▶ “Nothing satisfies P”
▶ Answer: ¬∃x Px, or its equivalent:
▶ Answer: ∀x ¬Px
▶ “There is something that satisfies P”
▶ Answer: ∃x Px
▶ There may be more than one thing!
▶ “There is exactly one thing that satisfies P”
▶ Answer: ∃x (Px ∧ ∀y (Py → x = y))
▶ Reading: “Something exists, that satisfies P, and we shall call it
▶ “There is at most one thing that satisfies P”
▶ Approach 1: “Either nothing satisfies P or one thing only
▶ Approach 2: “It is not the case that at least two things satisfy
▶ “There are at least two distinct things that satisfy P”
▶ Answer: ∃x, y (x ≠ y ∧ Px ∧ Py)
▶ Reading: “There exist two things x, y that are distinct, that
▶ “There are exactly two things that satisfy P”
▶ Answer: ∃x, y (x ≠ y ∧ Px ∧ Py ∧ ∀z (Pz → (z = x ∨ z = y)))
▶ Reading: “There exist two things x, y that are distinct, that
▶ “There are at most two things that satisfy P”
▶ Approach 1: “Either nothing satisfies P, or exactly one thing
▶ Approach 2: “It is not the case that at least three things satisfy
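These counting formulas can be validated against ordinary counting on a finite domain. The Python sketch below transcribes "exactly one", "at least two", "at most one" (as the negation of "at least two"), and "exactly two", then compares each with the number of elements satisfying P, for every possible P over a four-element domain. The function names are assumptions made for illustration.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def exactly_one(P, dom):
    # ∃x (Px ∧ ∀y (Py → x = y))
    return any(P(x) and all(implies(P(y), x == y) for y in dom) for x in dom)

def at_least_two(P, dom):
    # ∃x, y (x ≠ y ∧ Px ∧ Py)
    return any(x != y and P(x) and P(y) for x, y in product(dom, repeat=2))

def at_most_one(P, dom):
    # It is not the case that at least two things satisfy P
    return not at_least_two(P, dom)

def exactly_two(P, dom):
    # ∃x, y (x ≠ y ∧ Px ∧ Py ∧ ∀z (Pz → (z = x ∨ z = y)))
    return any(
        x != y and P(x) and P(y)
        and all(implies(P(z), z == x or z == y) for z in dom)
        for x, y in product(dom, repeat=2)
    )

def agrees_with_counting(dom):
    """Try every predicate P over dom and compare with len() of its extension."""
    for bits in product([False, True], repeat=len(dom)):
        members = {d for d, b in zip(dom, bits) if b}
        P = lambda x: x in members
        k = len(members)
        if exactly_one(P, dom) != (k == 1):
            return False
        if at_least_two(P, dom) != (k >= 2):
            return False
        if at_most_one(P, dom) != (k <= 1):
            return False
        if exactly_two(P, dom) != (k == 2):
            return False
    return True

print(agrees_with_counting([0, 1, 2, 3]))
```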
▶ Let
▶ Ux denote: x is thoroughly miserable and has not been condemned to live in Ireland
▶ The encoding of Swift’s saying is ¬∃xUx
▶ Let
▶ Mx denote: x is thoroughly miserable
▶ Cx denote: x has been condemned to live in Ireland
▶ The encoding of Swift’s saying is ¬∃x(Mx ∧ ¬Cx), which after
▶ Let
▶ Mxt denote: x is miserable at time t
▶ This lets us formalize “thoroughly miserable” as “miserable at
▶ Cxy denote: x has condemned y to live in Ireland
▶ The encoding of Swift’s saying is ∀x, t (Mxt → ∃y Cyx)
▶ The meaning of “thoroughly miserable” has been reduced to
▶ The meaning of “condemned” has been reduced to “condemned
▶ We may have lost more meaning than we managed to express!
▶ The language of numbers, pairs, and the empty list in Scheme:
▶ Nil(x) denotes that x is the empty list ()
▶ Num(x) denotes that x is a number
▶ Pair(x, y, z) denotes that x is a pair with car-field y and cdr-field z
▶ Sum(x, y, z) denotes that x = y + z
▶ Using this language, here are some problems to formalize:
▶ P1(x) denotes that x is a proper-list of length 2
▶ Answer: P1(x) ≡ ∃y, z, t, w (Pair(x, y, z) ∧ Pair(z, t, w) ∧ Nil(w))
▶ Reading: x is a pair consisting of the car y, and the cdr z; z is a
▶ P2(x, n) denotes that x is a list of 3 numbers, the sum of which is n
▶ Answer:
▶ Reading: x consists of 3 nested pairs, the innermost ending with
▶ The car-fields of the 3 nested pairs are y, w, p
▶ n = y + w + p
▶ P3(x) denotes that x is a proper list, i.e., a list the rightmost
▶ Answer:
▶ P3(x) ≡ Nil(x) ∨ ∃y, z (Pair(x, y, z) ∧ P3(z))
▶ Reading:
▶ x is a proper list if it is either the empty list, or if it is a pair the cdr of which is a proper list
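To make the predicates P1, P2, and P3 concrete, here is a minimal Python sketch that models Scheme data and tests each property directly. The representation is an assumption made for illustration: the empty list is `None`, a pair is a 2-tuple `(car, cdr)`, and numbers are Python ints or floats. The existential quantifiers of the formulas become field accesses, since the car and cdr of a pair are uniquely determined.

```python
NIL = None  # stands for the Scheme empty list ()

def is_nil(x):
    return x is NIL

def is_num(x):
    return isinstance(x, (int, float)) and not isinstance(x, bool)

def is_pair(x):
    return isinstance(x, tuple) and len(x) == 2

def p1(x):
    """x is a proper list of length 2: Pair(x, y, z), Pair(z, t, w), Nil(w)."""
    return is_pair(x) and is_pair(x[1]) and is_nil(x[1][1])

def p2(x, n):
    """x is a list of 3 numbers whose sum is n."""
    if not (is_pair(x) and is_pair(x[1]) and is_pair(x[1][1])
            and is_nil(x[1][1][1])):
        return False
    y, w, p = x[0], x[1][0], x[1][1][0]  # the three car-fields
    return is_num(y) and is_num(w) and is_num(p) and n == y + w + p

def p3(x):
    """x is a proper list: Nil(x), or a pair whose cdr is a proper list."""
    return is_nil(x) or (is_pair(x) and p3(x[1]))

one_two = (1, (2, NIL))             # (1 2)
one_two_three = (1, (2, (3, NIL)))  # (1 2 3)
improper = (1, (2, 3))              # (1 2 . 3)

print(p1(one_two), p2(one_two_three, 6), p3(one_two_three), p3(improper))
```

The recursion in `p3` mirrors the recursive use of P3 on the right-hand side of its own definition.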
▶ P4(x, n) denotes that x has n parentheses in its canonical form
▶ Remember that the canonical form
▶ …of (1 . (2 . (3 . ()))) is (1 2 3)
▶ …of (1 . (2 . 3)) is (1 2 . 3)
▶ Answer: We introduce the auxiliary predicate P5:
▶ Reading:
▶ The number of parentheses in the empty list is 2
▶ The number of parentheses in a number is 0
▶ The number of parentheses in a pair with car-field y and
▶ The number of parentheses in the car-field, and
▶ The number of parentheses in the cdr-field
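One plausible way to compute the parenthesis count is sketched below in Python. It is not a transcription of the slide's P4 and P5, whose formulas are not shown here; in particular, the helper `parens_in_tail`, which handles a cdr printed without its own outer parentheses, is an assumption of this sketch. Data is modelled with `None` for the empty list and 2-tuples `(car, cdr)` for pairs, and the count is cross-checked against a small canonical-form printer.

```python
NIL = None  # stands for the Scheme empty list ()

def is_pair(x):
    return isinstance(x, tuple) and len(x) == 2

def parens(x):
    """Number of parentheses printed for x in canonical form."""
    if x is NIL:
        return 2                      # "()"
    if not is_pair(x):
        return 0                      # a number prints without parentheses
    return 2 + parens(x[0]) + parens_in_tail(x[1])

def parens_in_tail(d):
    """Parentheses contributed by a cdr, which shares the enclosing pair's parentheses."""
    if is_pair(d):
        return parens(d[0]) + parens_in_tail(d[1])
    return 0  # () just closes the list; a number prints after a dot

def show(x):
    """Print x in canonical form, e.g. (1 2 3) or (1 2 . 3)."""
    if x is NIL:
        return "()"
    if not is_pair(x):
        return str(x)
    parts = []
    while is_pair(x):
        parts.append(show(x[0]))
        x = x[1]
    tail = "" if x is NIL else " . " + show(x)
    return "(" + " ".join(parts) + tail + ")"

def count_in_text(x):
    s = show(x)
    return s.count("(") + s.count(")")

samples = [
    NIL,
    5,
    (1, (2, (3, NIL))),      # (1 2 3)
    (1, (2, 3)),             # (1 2 . 3)
    ((1, NIL), (2, NIL)),    # ((1) 2)
    (NIL, NIL),              # (())
]
for s in samples:
    print(show(s), parens(s), count_in_text(s))
```

The base cases agree with the reading above: the empty list has 2 parentheses and a number has 0.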
▶ The language of FOPL can be used to formalize many problem
▶ You should be able to recognize and use the various Boolean
▶ You should be familiar with the most commonly-used syllogisms
▶ You should be familiar with the language of predicates &
▶ We have seen examples from set theory, natural language,
▶ We are ready to use the language of FOPL in the compilers