Compiler Development (CMPSC 401), Janyl Jumadinova, January 17, 2018

SLIDE 1

Compiler Development (CMPSC 401)

Janyl Jumadinova January 17, 2018

Janyl Jumadinova Compiler Development (CMPSC 401) January 17, 2018 1 / 34

SLIDE 2

What is a compiler?

SLIDE 4

Compilers are translators

SLIDE 5

Compilers are optimizers

SLIDE 9

Why Study Compilers?

  • Compilers provide portability
  • Compilers enable high performance and productivity
  • Techniques used for compiler design are also useful for other things
  • Compilers and interpreters for domain-specific languages are popular
  • Bring together: data structures and algorithms, formal languages, computer architecture
  • Influence: language design, architecture (influence is bi-directional); techniques used influence other areas (program analysis, testing, ...)

SLIDE 13

Common compiler types

  • High-level language → assembly language (e.g., gcc)
  • High-level language → machine-independent bytecode (e.g., javac)
  • Bytecode → native machine code (e.g., Java’s JIT compiler)
  • High-level language → high-level language (e.g., domain-specific languages, many research languages)

SLIDE 14

View from 50,000 feet

SLIDE 17

Analysis-Synthesis Model of Compilation

There are two parts to compilation:

1. Analysis: determines the operations implied by the source program, which are recorded in a tree structure

2. Synthesis: takes the tree structure and translates the operations therein into the target program

SLIDE 18

Common Compiler Phases

  • Lexical Analysis (scanning)
  • Syntax Analysis (parsing)
  • Semantic Analysis (type checking)
  • Intermediate Code Generation
  • Machine Code Generation
  • Optimization
  • Memory Management

SLIDE 22

Grouping of phases

Compiler front and back ends:

  • Front-end: analysis (machine independent)
  • Back-end: synthesis (machine dependent)

Compiler passes:

  • A collection of phases is done only once (single pass) or multiple times (multi pass)
  • Single pass: usually requires everything to be defined before being used in the source program
  • Multi pass: the compiler may have to keep the entire program representation in memory
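The difference between the two pass styles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy program format (a list of `("def", name)` and `("use", name)` pairs) and both function names are hypothetical, not part of any real compiler.

```python
def single_pass_errors(program):
    """Check uses in one sweep: a name must be defined before it is used."""
    defined = set()
    errors = []
    for kind, name in program:
        if kind == "def":
            defined.add(name)
        elif kind == "use" and name not in defined:
            errors.append(name)
    return errors


def multi_pass_errors(program):
    """Pass 1 collects all definitions; pass 2 checks uses against them."""
    # The whole program must stay in memory so it can be traversed twice.
    defined = {name for kind, name in program if kind == "def"}
    return [name for kind, name in program if kind == "use" and name not in defined]


# Hypothetical toy program: `helper` is used before it is defined.
program = [("def", "main"), ("use", "helper"), ("def", "helper"), ("use", "main")]
```

The single-pass checker rejects the forward reference to `helper`, while the two-pass checker accepts it at the cost of holding the full program representation.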

SLIDE 24

Compiler Construction Tools

Software development tools are available to implement one or more compiler phases:

  • Scanner generators
  • Parser generators
  • Syntax-directed translation engines
  • Automatic code generators
  • Data-flow engines

SLIDE 25

Traditional Two-Pass Compiler

SLIDE 26

The Front-End

SLIDE 29

Scanner

  • Compiler starts by seeing only program text
  • Scanner converts program text into a string of tokens
  • But we still don’t know what the syntactic structure of the program is
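A scanner for a tiny expression language might look like the sketch below. The token classes in `TOKEN_SPEC` and the function name `scan` are assumptions made for illustration; the slides do not prescribe a token set.

```python
import re

# Hypothetical token classes for a tiny expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(text):
    """Convert program text into a flat list of (kind, lexeme) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":      # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

The result is a flat sequence: nothing in it says which operator binds to which operand, which is the job of the parser.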

SLIDE 33

Parser

  • Converts a string of tokens into a parse tree or an abstract syntax tree
  • Captures the syntactic structure of the code
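One common way to build such a tree is recursive descent. The sketch below assumes tokens arrive as `(kind, lexeme)` pairs with the hypothetical kinds `NUMBER`, `IDENT`, and `OP`, and it represents the abstract syntax tree as nested tuples; both choices are illustrative only.

```python
def parse(tokens):
    """Parse a list of (kind, lexeme) tokens into a nested-tuple AST."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def factor():                      # factor := NUMBER | IDENT | "(" expr ")"
        kind, lexeme = advance()
        if kind == "NUMBER":
            return ("num", int(lexeme))
        if kind == "IDENT":
            return ("var", lexeme)
        if lexeme == "(":
            node = expr()
            if advance()[1] != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {lexeme!r}")

    def term():                        # term := factor (("*" | "/") factor)*
        node = factor()
        while peek()[1] in ("*", "/"):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def expr():                        # expr := term (("+" | "-") term)*
        node = term()
        while peek()[1] in ("+", "-"):
            op = advance()[1]
            node = (op, node, term())
        return node

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError(f"unexpected token {peek()[1]!r}")
    return tree


# Tokens for the text "1 + 2 * x".
tokens = [("NUMBER", "1"), ("OP", "+"), ("NUMBER", "2"), ("OP", "*"), ("IDENT", "x")]
```

Because `term` handles `*` before `expr` handles `+`, the tree for `1 + 2 * x` nests the multiplication under the addition, which is exactly the structure the flat token string could not express.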

SLIDE 36

Semantic actions

  • Interpret the semantics of syntactic constructs
  • Up to now we have been concerned only with what the syntax of the code is
  • What’s the difference?

SLIDE 40

Syntax vs. Semantics

Syntax: the “grammatical” structure of a language

  • Which symbols, in what order, are a legal part of the language?
  • But something that is syntactically correct may not mean anything!
  • “colorless green ideas sleep furiously”
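A programming-language analogue of that sentence can be shown with Python's own `ast` module. The statement chosen here is just an illustrative example: it is grammatical, so the parser builds a tree for it, yet it has no valid meaning.

```python
import ast

source = 'total = 1 + "one"'

# Syntax: the parser accepts the statement and builds a tree for it.
tree = ast.parse(source)
is_assignment = isinstance(tree.body[0], ast.Assign)

# Semantics: adding an int to a str has no meaning in Python,
# so the error only surfaces when the statement is executed.
try:
    exec(compile(tree, "<example>", "exec"), {})
    outcome = "ran"
except TypeError:
    outcome = "TypeError"
```

In a dynamically typed language this mismatch is found at run time; a statically typed language would typically reject it during semantic analysis.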

SLIDE 44

Syntax vs. Semantics

Semantics: the meaning of the language

  • What does a particular set of symbols, in a particular order, mean?
  • What does it mean to be an if statement?
  • Evaluate the conditional; if it is true, execute the then clause, otherwise execute the else clause
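The meaning of an if statement described above can be written down as a tiny interpreter. The tuple-based statement format and the single comparison operator `lt` are hypothetical simplifications chosen to keep the sketch short.

```python
def execute(node, env):
    """Execute a statement in a tiny, hypothetical tuple-based AST."""
    kind = node[0]
    if kind == "assign":                    # ("assign", name, value)
        _, name, value = node
        env[name] = value
    elif kind == "if":                      # ("if", condition, then_branch, else_branch)
        _, condition, then_branch, else_branch = node
        if evaluate(condition, env):        # evaluate the conditional
            execute(then_branch, env)       # true: run the then clause
        else:
            execute(else_branch, env)       # false: run the else clause
    else:
        raise ValueError(f"unknown statement kind {kind!r}")


def evaluate(expr, env):
    """Evaluate ("lt", name, constant): is env[name] less than constant?"""
    op, name, constant = expr
    if op != "lt":
        raise ValueError(f"unknown operator {op!r}")
    return env[name] < constant


# if a < 4 then b = 5 else b = 0
statement = ("if", ("lt", "a", 4), ("assign", "b", 5), ("assign", "b", 0))
```

Running `execute(statement, env)` with `env = {"a": 1}` takes the then clause, and with `env = {"a": 7}` it takes the else clause.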

SLIDE 45

Semantic Actions

Actions taken by the compiler based on the semantics of program statements:

  • Building a symbol table
  • Generating intermediate representations

SLIDE 46

Symbol tables

A list of every declaration in the program, along with other information:

  • Variable declarations: types, scope
  • Function declarations: return types, number and type of arguments
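A minimal sketch of such a table follows, assuming one table per scope with a link to the enclosing scope. The class names `Symbol` and `SymbolTable` and their fields are hypothetical; real compilers record considerably more.

```python
from dataclasses import dataclass, field


@dataclass
class Symbol:
    name: str
    kind: str                 # "variable" or "function"
    type: str                 # variable type, or function return type
    params: list = field(default_factory=list)   # argument types, functions only


class SymbolTable:
    """Declarations for one scope, with a link to the enclosing scope."""

    def __init__(self, parent=None):
        self.symbols = {}
        self.parent = parent

    def declare(self, symbol):
        if symbol.name in self.symbols:
            raise NameError(f"{symbol.name!r} is already declared in this scope")
        self.symbols[symbol.name] = symbol

    def lookup(self, name):
        scope = self
        while scope is not None:          # search outward through enclosing scopes
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        raise NameError(f"{name!r} is not declared")


# Usage: a global function and a local variable inside its body.
global_scope = SymbolTable()
global_scope.declare(Symbol("max", "function", "int", ["int", "int"]))
body_scope = SymbolTable(parent=global_scope)
body_scope.declare(Symbol("x", "variable", "float"))
```

Looking up `max` from `body_scope` succeeds by walking to the parent, while looking up `x` from `global_scope` fails, which is how scope is enforced in this sketch.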

SLIDE 48

Intermediate representation

  • Also called IR
  • A (relatively) low-level representation of the program
  • But not machine-specific!
  • One example: three-address code

      bge a, 4, done
      mov 5, b
    done: // done!

  • Each instruction can take at most three operands (variables, literals, or labels)
  • Note: no registers!
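To show how a tree becomes three-address code, the sketch below flattens a nested-tuple expression into instructions. It uses the `result = a op b` notation rather than the mnemonic style shown on the slide, and the temporary names `t1`, `t2`, ... and the function name are assumptions for illustration.

```python
import itertools


def to_three_address(tree):
    """Flatten a nested-tuple expression AST into three-address instructions."""
    code = []
    temps = (f"t{n}" for n in itertools.count(1))

    def emit(node):
        kind = node[0]
        if kind in ("num", "var"):
            return str(node[1])          # literals and variables are operands as-is
        op, left, right = node
        a = emit(left)
        b = emit(right)
        result = next(temps)             # fresh temporary, not a machine register
        code.append(f"{result} = {a} {op} {b}")
        return result

    emit(tree)
    return code


# a + b * 4
tree = ("+", ("var", "a"), ("*", ("var", "b"), ("num", 4)))
```

Each emitted instruction names at most three operands, and the unlimited supply of temporaries is what keeps the representation independent of any machine's register file.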

SLIDE 52

Optimizer

  • Transforms code to make it more efficient
  • Different kinds, operating at different levels
  • High-level optimizations
      • Loop interchange, parallelization
      • Operate at the level of the AST, or even source code
  • Scalar optimizations
      • Dead code elimination, common subexpression elimination
      • Operate on the IR
  • Local optimizations
      • Strength reduction, constant folding
      • Operate on small sequences of instructions
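The two local optimizations named above can be sketched over a hypothetical instruction format of `(dest, op, a, b)` tuples. The format, the `copy` pseudo-operation, and the choice to rewrite only multiplication by 2 are illustrative assumptions.

```python
import operator

FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def optimize(instructions):
    """Apply constant folding and strength reduction to (dest, op, a, b) tuples."""
    result = []
    for dest, op, a, b in instructions:
        if op in FOLDABLE and isinstance(a, int) and isinstance(b, int):
            # Constant folding: compute the value at compile time.
            result.append((dest, "copy", FOLDABLE[op](a, b), None))
        elif op == "*" and b == 2:
            # Strength reduction: replace a multiply with a cheaper add.
            result.append((dest, "+", a, a))
        else:
            result.append((dest, op, a, b))
    return result


before = [("t1", "*", 3, 4), ("t2", "*", "x", 2), ("t3", "-", "y", "t1")]
```

Each rewrite looks at a single instruction at a time, which is what makes these optimizations local.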

SLIDE 53

Code generation

Generate assembly from the intermediate representation:

  • Select which instructions to use
  • Schedule instructions
  • Decide which registers to use
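A deliberately naive code generator makes these decisions visible. The mnemonics (`LOAD`, `STORE`, `ADD`, `SUB`, `MUL`) belong to an imaginary load/store machine, and the `(dest, op, a, b)` input format is a hypothetical IR; neither comes from a real instruction set.

```python
# Hypothetical mnemonics for an imaginary load/store machine.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


def generate(instructions):
    """Naively translate (dest, op, a, b) tuples into assembly-like text."""
    asm = []
    for dest, op, a, b in instructions:
        asm.append(f"LOAD r1, {a}")               # register choice: always r1 and r2
        asm.append(f"LOAD r2, {b}")
        asm.append(f"{OPCODES[op]} r1, r1, r2")   # instruction selection by operator
        asm.append(f"STORE {dest}, r1")
    return asm
```

This sketch reloads every operand from memory and performs no scheduling, so a real back end would improve on it mainly through better register allocation and instruction ordering.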

SLIDE 55

Overall structure of a compiler

Many of these can be combined!

SLIDE 56

Front-end vs. Back-end

SLIDE 59

Design Considerations

  • Compiler and language designs influence each other
  • High-level languages are harder to compile
      • More work to bridge the gap between language and assembly
  • Flexible languages are often harder to compile
      • Dynamic typing (Ruby, Python) makes a language very flexible, but it is hard for a compiler to catch errors
  • Compiler design is influenced by architectures
