

slide-1
SLIDE 1

Aslan Askarov aslan@cs.au.dk

acknowledgments: E.Ernst, M.I.Schwartzbach, J. Midtgaard, G. Morrisett, S. Zdancewic

Compilation 2016

slide-2
SLIDE 2

What is a compiler?

slide-3
SLIDE 3

What is a compiler?

A program that translates …

A) human-readable program into machine code
B) programs in a source language into programs in a target language
C) programs in a source language into programs in a target language, while preserving semantics
D) a source language into a target language while preserving the semantics

slide-4
SLIDE 4

What is a compiler?

  • Translator from one programming language (source) into another (target)
  • preserves the semantics
  • the compiler also implicitly defines the semantics, though it’s harder to reason about programs with compiler-defined semantics

  • Typically:
  • the source language is high-level
  • the target language is low-level
  • Not always:
  • Java compiler: Java to interpretable bytecode
  • Java JIT: bytecode to executable
slide-5
SLIDE 5

Why use compilers?

  • Economy
  • takes care of hundreds of low-level micro-decisions that would otherwise need to be handled by programmers
  • Performance
  • the best compilers generate better code than most programmers
  • e.g.: automatic parallelization on multi-core
  • Safety & Security
slide-6
SLIDE 6

First compilers

1952: Grace Hopper introduces the term “compiler” for the A-0 programming language

slide-7
SLIDE 7

1957: Fortran – first real compiler

“We went on to raise the question ‘…can a machine translate a sufficiently rich mathematical language into a sufficiently economical program at a sufficiently low cost to make the whole affair feasible?’”

J. Backus, The History of Fortran I, II, and III (1978)

slide-8
SLIDE 8

1957: Fortran – first real compiler

  • Led by John Backus at IBM
  • Motivated by the economics of programming
  • Had to overcome deep skepticism
  • Focused on efficiency of the generated code
  • Pioneered many concepts and techniques
  • Revolutionized computer programming
slide-9
SLIDE 9

How good are today’s compilers?

Source C program:

    #include <stdio.h>
    #include <stdlib.h>

    long factorial(long X) {
      if (X == 0) return 1;
      return X*factorial(X-1);
    }

    int main(int argc, char **argv) {
      printf("%ld\n", factorial(10));
      return 0;
    }

    $ clang factorial.c -S -O3 -o-

Compiled assembly:

    …
    Ltmp9:
        .cfi_def_cfa_register %rbp
        leaq L_.str(%rip), %rdi
        movl $3628800, %esi    ## imm = 0x375F00
        xorl %eax, %eax
        callq _printf
        xorl %eax, %eax
        popq %rbp
        ret
        .cfi_endproc
        .section __TEXT,__cstring,cstring_literals
    L_.str:    ## @.str
        .asciz "%ld\n"

The call factorial(10) has been evaluated at compile time: the constant 3628800 is 10!.

slide-10
SLIDE 10

Basic phases of a compiler

High-level source code → Lexing/Parsing → Elaboration → Lowering → Optimization → Code generation → Low-level target code

Compiler phases suggest modular design: 1 phase = 1 module
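The "1 phase = 1 module" idea can be sketched as a chain of functions, each consuming the output of the previous one. This is a toy illustration only: the source language (sums of integers such as "1 + 0 + 2"), the stack-machine instructions `push` and `add`, and all function names are assumptions made for this sketch, not part of the course material.

```python
from functools import reduce

def lex(source):
    # Strings to tokens: "1 + 0 + 2" becomes ["1", "+", "0", "+", "2"].
    return source.replace("+", " + ").split()

def parse(tokens):
    # Tokens to a left-nested tree: ("+", ("+", 1, 0), 2). No error recovery.
    tree = int(tokens[0])
    for i in range(1, len(tokens), 2):
        if tokens[i] != "+":
            raise SyntaxError(f"expected '+', got {tokens[i]!r}")
        tree = ("+", tree, int(tokens[i + 1]))
    return tree

def elaborate(tree):
    # Nothing to check when every value is an integer; a real phase type-checks here.
    return tree

def lower(tree):
    # Tree to a flat IR for a hypothetical stack machine.
    if isinstance(tree, int):
        return [("push", tree)]
    _, left, right = tree
    return lower(left) + lower(right) + [("add",)]

def optimize(ir):
    # Peephole rewrite: "push 0; add" adds nothing, so drop both instructions.
    out = []
    for instr in ir:
        if instr == ("add",) and out and out[-1] == ("push", 0):
            out.pop()
        else:
            out.append(instr)
    return out

def codegen(ir):
    # IR to text, one instruction per line.
    return "\n".join(" ".join(str(part) for part in instr) for instr in ir)

PHASES = [lex, parse, elaborate, lower, optimize, codegen]

def compile_source(source):
    # Run the phases in order, feeding each result to the next phase.
    return reduce(lambda data, phase: phase(data), PHASES, source)

print(compile_source("1 + 0 + 2"))
```

Because each phase only depends on the data structure it receives, a phase can be replaced or tested on its own, which is the practical benefit of the modular design.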

slide-11
SLIDE 11

Front end

  • Lexing & Parsing
  • From strings to data structures
  • First two steps in processing from raw data to structured information

  • Elegant application of CS theory
  • Regular expressions (finite state automata)
  • Context-free grammars (push-down automata)
  • Established & streamlined tool support

String/Files → Lexing → Tokens → Parsing → Abstract Syntax Tree

slide-12
SLIDE 12

Example: function in Tiger language

    function printint(i: int) =
      let function f(i: int) =
            if i>0 then (
              f(i/10);
              print(chr(i-i/10*10+ord("0")))
            )
      in if i<0 then (print("-"); f(-i))
         else if i>0 then f(i)
         else print("0")
      end

slide-13
SLIDE 13

Stream of tokens

    keyword: "function"  identifier: "printint"  symbol: "("  identifier: "i"  symbol: ":"  identifier: "int"  symbol: ")"  symbol: "="
    keyword: "let"  keyword: "function"  identifier: "f"  symbol: "("  identifier: "i"  symbol: ":"  identifier: "int"  symbol: ")"  symbol: "="
    keyword: "if"  identifier: "i"  symbol: ">"  intliteral: "0"  keyword: "then"  symbol: "("
    identifier: "f"  symbol: "("  identifier: "i"  symbol: "/"  intliteral: "10"  symbol: ")"  symbol: ";"
    identifier: "print"  symbol: "("  identifier: "chr"  symbol: "("  identifier: "i"  symbol: "-"  identifier: "i"  symbol: "/"  intliteral: "10"  symbol: "*"  intliteral: "10"  symbol: "+"  identifier: "ord"  symbol: "("  symbol: "\""  stringliteral: "0"  symbol: "\""  symbol: ")"  symbol: ")"  symbol: ")"  symbol: ")"
    keyword: "in"
    ...
    keyword: "end"
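A token stream like the one above can be produced with regular expressions, one per token class. The following is a minimal sketch, assuming a small keyword set and symbol set that cover only this example (not the full Tiger lexical grammar); the names `TOKEN_SPEC` and `tokenize` are made up for illustration. Unlike the listing above, it reports a string literal as a single token rather than emitting the quote characters as separate symbols.

```python
import re

# Assumed subset of Tiger keywords, enough for the printint example.
KEYWORDS = {"function", "let", "in", "end", "if", "then", "else"}

# One regular expression per token class; order matters for the alternation.
TOKEN_SPEC = [
    ("skip",          r"\s+"),
    ("intliteral",    r"\d+"),
    ("stringliteral", r'"[^"\n]*"'),
    ("identifier",    r"[A-Za-z][A-Za-z0-9_]*"),
    ("symbol",        r"[()=:;<>+\-*/,]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "skip":
            continue  # whitespace separates tokens but is not a token itself
        if kind == "identifier" and text in KEYWORDS:
            kind = "keyword"
        if kind == "stringliteral":
            text = text[1:-1]  # drop the surrounding quotes
        tokens.append((kind, text))
    return tokens

for kind, text in tokenize('if i>0 then f(i/10)'):
    print(f'{kind}: "{text}"')
```

Classifying keywords after matching an identifier keeps the regular expressions simple; a generated lexer would typically encode the same decision in its automaton.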

slide-14
SLIDE 14

Abstract syntax

    FunctionDec[
     (printint,[ (i,true,int)], NONE,
      LetExp([
       FunctionDec[
        (f,[ (i,true,int)], NONE,
         IfExp(
          OpExp(GtOp, VarExp( SimpleVar(i)), IntExp(0)),
          SeqExp[
           CallExp(f,[ OpExp(DivideOp, VarExp( SimpleVar(i)), IntExp(10))]),
           CallExp(print,[
            CallExp(chr,[
             OpExp(PlusOp,
              OpExp(MinusOp, VarExp( SimpleVar(i)),
               OpExp(TimesOp,
                OpExp(DivideOp, VarExp( SimpleVar(i)), IntExp(10)),
                IntExp(10))),
              CallExp(ord,[ StringExp("0")]))])])]))]],
       SeqExp[
        IfExp(
         OpExp(LtOp, VarExp( SimpleVar(i)), IntExp(0)),
         SeqExp[
          CallExp(print,[ StringExp("-")]),
          CallExp(f,[ OpExp(MinusOp, IntExp(0), VarExp( SimpleVar(i)))])],
         IfExp(
          OpExp(GtOp, VarExp( SimpleVar(i)), IntExp(0)),
          CallExp(f,[ VarExp( SimpleVar(i))]),
          CallExp(print,[ StringExp("0")])))]))]
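The nesting of the OpExp nodes above reflects operator precedence and left associativity. A minimal recursive-descent sketch for arithmetic expressions shows how a parser might arrive at that shape. It is an assumption-laden illustration: it handles only integers, variables, parentheses and the four arithmetic operators, represents nodes as Python tuples, drops the SimpleVar wrapper for brevity, and its crude tokenizer silently skips characters it does not recognize.

```python
import re

def parse_expression(source):
    # Crude tokenizer for this sketch: integers, names, operators, parentheses.
    tokens = re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        return token

    def atom():
        token = advance()
        if token == "(":
            node = additive()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token.isdigit():
            return ("IntExp", int(token))
        if token.isidentifier():
            return ("VarExp", token)
        raise SyntaxError(f"unexpected token {token!r}")

    def multiplicative():
        # '*' and '/' bind tighter than '+' and '-', and associate to the left.
        node = atom()
        while peek() in ("*", "/"):
            op = "TimesOp" if advance() == "*" else "DivideOp"
            node = ("OpExp", op, node, atom())
        return node

    def additive():
        node = multiplicative()
        while peek() in ("+", "-"):
            op = "PlusOp" if advance() == "+" else "MinusOp"
            node = ("OpExp", op, node, multiplicative())
        return node

    tree = additive()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse_expression("i-i/10*10"))
```

Each grammar level gets its own function, so precedence is encoded in which function calls which; this is one common hand-written alternative to a generated parser.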

slide-15
SLIDE 15

Abstract Syntax Tree

[Figure: the abstract syntax of the printint function drawn as a tree, with nodes such as function, let, if, seq, call, var, simplevar, int and string.]

slide-16
SLIDE 16

Elaboration

  • Resolving scope
  • Type checking
  • Resolving variable types, modules, etc
  • Check that operators and function calls are given values of the right types
  • Infer types for sub-expressions
  • Most errors are reported to the user by the end of this phase

Untyped Abstract Syntax Tree → Elaboration → Typed Abstract Syntax Tree
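The checks listed for elaboration can be sketched as a recursive function that infers a type for each sub-expression and rejects ill-typed operators and calls. This is only a sketch under stated assumptions: expressions are Python tuples, variables are looked up in a flat dictionary `env` rather than properly nested scopes, and the signatures given for `print`, `chr` and `ord` are assumed for illustration.

```python
INT, STRING, UNIT = "int", "string", "unit"

# Assumed signatures (parameter types, result type) for the library
# functions that appear in the printint example.
BUILTINS = {
    "print": ([STRING], UNIT),
    "chr": ([INT], STRING),
    "ord": ([STRING], INT),
}

def type_of(exp, env):
    """Infer the type of exp, raising TypeError on the first mistake found."""
    kind = exp[0]
    if kind == "IntExp":
        return INT
    if kind == "StringExp":
        return STRING
    if kind == "VarExp":
        name = exp[1]
        if name not in env:  # scope resolution: is the name declared?
            raise TypeError(f"undefined variable {name}")
        return env[name]
    if kind == "OpExp":
        _, op, left, right = exp
        for side in (left, right):
            if type_of(side, env) != INT:
                raise TypeError(f"{op} expects int operands")
        return INT
    if kind == "CallExp":
        _, func, args = exp
        if func not in BUILTINS:
            raise TypeError(f"undefined function {func}")
        params, result = BUILTINS[func]
        actual = [type_of(arg, env) for arg in args]
        if actual != params:
            raise TypeError(f"{func} expects {params}, got {actual}")
        return result
    raise TypeError(f"unknown expression kind {kind}")

# chr(i + ord("0")) with i declared as an int
digit = ("CallExp", "chr", [("OpExp", "PlusOp", ("VarExp", "i"),
                             ("CallExp", "ord", [("StringExp", "0")]))])
print(type_of(("CallExp", "print", [digit]), {"i": INT}))
```

A full elaborator would also return the tree annotated with the inferred types, so that later phases do not need to recompute them.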

slide-17
SLIDE 17

Lowering

  • Translate high-level features into a small number of target-like constructs
  • while and for loops are all compiled to code using jumps
  • embed array-bound checks, etc.

Typed Abstract Syntax Tree → Lowering → Intermediate Representation
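How loops turn into jumps can be sketched on a tiny statement language. The statement shapes, the textual IR with `goto` and labels, and the label naming scheme are all assumptions made for this illustration; conditions and expressions are kept as opaque strings so that the focus stays on control flow.

```python
from itertools import count

def lower(statements, labels=None):
    """Lower assignments, while loops and for loops to labels and jumps."""
    labels = labels if labels is not None else count(1)
    ir = []
    for stmt in statements:
        if stmt[0] == "while":
            _, cond, body = stmt
            n = next(labels)
            test, done = f"L{n}_test", f"L{n}_done"
            ir.append(f"{test}:")
            ir.append(f"  if not ({cond}) goto {done}")
            ir.extend(lower(body, labels))
            ir.append(f"  goto {test}")
            ir.append(f"{done}:")
        elif stmt[0] == "for":
            # Desugar "for var := lo to hi" into an assignment plus a while loop.
            # Simplification: hi is re-evaluated on every iteration and
            # integer overflow of var is ignored.
            _, var, lo, hi, body = stmt
            step = ("assign", var, f"{var} + 1")
            ir.extend(lower([("assign", var, lo),
                             ("while", f"{var} <= {hi}", body + [step])], labels))
        else:
            _, target, expr = stmt
            ir.append(f"  {target} := {expr}")
    return ir

program = [("assign", "s", "0"),
           ("for", "i", "1", "10", [("assign", "s", "s + i")])]
print("\n".join(lower(program)))
```

Sharing one label counter across nested calls keeps labels unique even when loops are nested.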

slide-18
SLIDE 18

Intermediate Representation

slide-19
SLIDE 19

Optimization

  • Detect expensive sequences of operations that can be rewritten into less expensive ones
  • Ex:
  • constant folding: 2 + 2 → 4
  • lifting invariant computation out of a loop
  • parallelize a loop

Intermediate Representation → Optimization → Intermediate Representation
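Constant folding, the first example above, can be sketched as a bottom-up rewrite of an expression tree. The tuple representation and the name `fold` are assumptions of this sketch. Division is deliberately left unfolded to avoid deciding about division by zero and rounding here, and Python's unbounded integers stand in for the target's fixed-width arithmetic, which a real compiler would have to respect.

```python
import operator

# Operators that this sketch is willing to evaluate at compile time.
FOLDABLE = {"PlusOp": operator.add, "MinusOp": operator.sub, "TimesOp": operator.mul}

def fold(exp):
    """Replace operator nodes whose operands are both constants by their value."""
    if exp[0] != "OpExp":
        return exp
    _, op, left, right = exp
    left, right = fold(left), fold(right)  # fold the children first
    if op in FOLDABLE and left[0] == "IntExp" and right[0] == "IntExp":
        return ("IntExp", FOLDABLE[op](left[1], right[1]))
    return ("OpExp", op, left, right)

two = ("IntExp", 2)
print(fold(("OpExp", "PlusOp", two, two)))
print(fold(("OpExp", "PlusOp", ("VarExp", "i"), ("OpExp", "TimesOp", two, ("IntExp", 3)))))
```

The second call shows that folding applies to constant sub-expressions even when the whole expression depends on a variable.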

slide-20
SLIDE 20

Code generation

  • Translate intermediate representation into target code
  • Register assignment
  • Instruction selection
  • Instruction scheduling
  • Machine-specific optimizations

Intermediate Representation → Code generation → Machine Code
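A deliberately naive code generator gives a feel for this phase. The sketch below emits 32-bit x86 instructions in AT&T syntax for an expression tree, always leaving the result in %eax and spilling intermediate values to the stack. It performs no register assignment or scheduling, and the tuple tree shape, the `offsets` table of frame offsets and the function name `gen` are assumptions for illustration only.

```python
# Instruction selection for the three operators this sketch supports.
OPCODES = {"PlusOp": "addl", "MinusOp": "subl", "TimesOp": "imull"}

def gen(exp, offsets):
    """Return AT&T-syntax x86 lines that leave the value of exp in %eax."""
    kind = exp[0]
    if kind == "IntExp":
        return [f"movl ${exp[1]}, %eax"]
    if kind == "VarExp":
        # Variables are assumed to live in the stack frame at a known offset.
        return [f"movl {offsets[exp[1]]}(%ebp), %eax"]
    _, op, left, right = exp
    return (
        gen(left, offsets)
        + ["pushl %eax"]            # save the left operand on the stack
        + gen(right, offsets)
        + ["movl %eax, %ecx",       # right operand into %ecx
           "popl %eax",             # left operand back into %eax
           f"{OPCODES[op]} %ecx, %eax"]
    )

tree = ("OpExp", "MinusOp", ("VarExp", "i"), ("IntExp", 10))
print("\n".join(gen(tree, {"i": -4})))
```

Keeping every intermediate value on the stack is simple but slow; assigning values to registers and choosing better instruction sequences is where the bullet points above come in.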

slide-21
SLIDE 21

x86 Instructions

    .text
    # PROCEDURE tigermain
    .globl tigermain
    .func tigermain
    .type tigermain, @function
    tigermain:
        # FRAME tigermain(1 formals, 4 locals)
        pushl %ebp
        movl %esp, %ebp
        subl $20, %esp
        # SP, FP, calleesaves, argregs have values
    L16_blocks:
        movl -4(%ebp), %ebx
        movl $123, %ebx
        movl %ebx, -4(%ebp)
        movl -4(%ebp), %ebx
        pushl %ebx
        pushl %ebp
        call L2_printint
        jmp L15_block_done
    L15_block_done:
        # FP, SP, RV, calleesaves still live
        leave
        ret
        .size tigermain, .-tigermain
    ...

slide-22
SLIDE 22

Binary code

Hex contents of .o file:

    0000000 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
    0000020 01 00 03 00 01 00 00 00 00 00 00 00 00 00 00 00
    0000040 84 04 00 00 00 00 00 00 34 00 00 00 00 00 28 00
    0000060 0f 00 0c 00 55 89 e5 83 ec 14 8b 5d fc bb 7b 00
    0000100 00 00 89 5d fc 8b 5d fc 53 55 e8 fc ff ff ff eb
    0000120 00 c9 c3 55 89 e5 83 ec 40 8b 5d 0c 89 5d fc 8b
    0000140 5d fc 83 fb 00 7c 3b eb 00 8b 5d 0c 89 5d f8 8b
    0000160 5d f8 83 fb 00 7f 72 eb 00 8b 5d f4 bb 00 00 00
    0000200 00 89 5d f4 8b 5d f4 53 8b 5d 08 89 5d f0 8b 5d
    0000220 f0 8b 4b 08 89 4d ec 8b 5d ec 53 e8 fc ff ff ff
    ...
    0005140 06 00 00 00 01 16 00 00 10 00 00 00 01 01 00 00
    0005160
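The first bytes of the dump are the ELF file header, and a few of its fields can be decoded by hand. The sketch below copies the first 24 bytes from the dump (the offsets in the left column are octal, so row 0000020 starts at byte 16) and unpacks them with Python's `struct` module. The variable names are chosen for this sketch; the field layout follows the ELF header format.

```python
import struct

# First 24 bytes of the object file, copied from the dump above.
header = bytes.fromhex(
    "7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 "
    "01 00 03 00 01 00 00 00"
)

magic = header[:4]     # 0x7f followed by "ELF"
ei_class = header[4]   # 1 = 32-bit objects
ei_data = header[5]    # 1 = little-endian

# After the 16 identification bytes come two 16-bit fields and one 32-bit field.
e_type, e_machine, e_version = struct.unpack_from("<HHI", header, 16)

print(magic, ei_class, ei_data)      # file identification
print(e_type, e_machine, e_version)  # 1 = relocatable file, 3 = Intel 80386
```

The decoded values agree with the earlier slides: a 32-bit, little-endian, relocatable x86 object file, as expected for a .o produced from the assembly shown before.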

slide-23
SLIDE 23

Bootstrapping compilers (1/5)

  • We have
  • a source programming language L
  • a target machine language M
  • We want a compiler from L to M implemented in M
  • so we can compile natively on the M architecture
  • Implementing this directly in M is hard
  • Idea: introduce auxiliary intermediate languages for which the task of compilation is more practical

T-diagram: [L → M] written in M (source lang → target lang, implementation lang)

slide-24
SLIDE 24

Bootstrapping compilers (2/5)

  • We define:
  • L↓ is a simple subset of L
  • M↓ is naive and inefficient M code
  • Step 1: Implement L↓ to M↓ compiler in M↓: [L↓ → M↓] written in M↓
  • Step 2: Implement L to M compiler in L↓ (can be done in parallel to Step 1): [L → M] written in L↓

slide-25
SLIDE 25

Bootstrapping compilers (3/5)

  • Step 3: combine the two compilers

[L → M] written in L↓, compiled by [L↓ → M↓] written in M↓, gives [L → M] written in M↓

a compiler from the full source lang L to M that produces efficient programs, but is inefficient itself

slide-26
SLIDE 26

Bootstrapping compilers (4/5)

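  • Step 4: use the previous compiler to compile the compiler source once more, producing the efficient compiler (as we wanted in the first place!)

[L → M] written in L↓, compiled by [L → M] written in M↓, gives [L → M] written in M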

slide-27
SLIDE 27

Bootstrapping compilers (5/5)

  • Say we now want a compiler from another language K to M
  • We start by implementing [K → M] written in L
  • Then we get the desired compiler by combining: [K → M] written in L, compiled by [L → M] written in M, gives [K → M] written in M
  • And so on…
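The bookkeeping behind the T-diagrams can be sketched by modelling a compiler as a triple of languages and "combining" as running one compiler on the source of another. Everything here is an assumption made for illustration: the class and function names, the use of strings for languages, and the rule that "X↓" counts as a restricted form of X (so L↓ programs are L programs, and M↓ code runs on machine M).

```python
from typing import NamedTuple

class Compiler(NamedTuple):
    source: str          # language the compiler accepts
    target: str          # language the compiler emits
    implementation: str  # language the compiler itself is written in

def is_sublanguage(a, b):
    # Assumption of this sketch: "X↓" is a restricted form of X.
    return a == b or a == b + "↓"

def compile_with(program, compiler, machine="M"):
    """Translate the compiler `program` using `compiler`, which must run on `machine`."""
    if not is_sublanguage(compiler.implementation, machine):
        raise ValueError(f"{compiler} cannot run on {machine}")
    if not is_sublanguage(program.implementation, compiler.source):
        raise ValueError(f"{compiler} does not accept {program.implementation}")
    # Only the implementation language changes; source and target stay the same.
    return Compiler(program.source, program.target, compiler.target)

simple = Compiler("L↓", "M↓", "M↓")       # Step 1: written by hand in naive M code
full_source = Compiler("L", "M", "L↓")   # Step 2: written in the simple subset
slow = compile_with(full_source, simple)  # Step 3
fast = compile_with(full_source, slow)    # Step 4
k_compiler = compile_with(Compiler("K", "M", "L"), fast)  # another language K

print(slow)
print(fast)
print(k_compiler)
```

Steps 3 and 4 compile the same source twice; only the compiler doing the work differs, which is why the second result is implemented in efficient M code.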

slide-28
SLIDE 28

Summary

  • Compiler: source/target translation
  • Many languages, very different, DSLs
  • First compiler: FORTRAN
  • Today: compilers are good!
  • Compiler architecture – phases
  • Example: Tiger printint translation
  • Bootstrapping