Compiler Construction
Lecture 12: Intermediate representations and three-address code
2020-02-18
Michael Engel
Overview
- Intro to Intermediate representations
- Classification of IRs
- Graphical IRs: from parse tree to AST
- Linear IRs
- Example: LLVM IR
- Implementation
- Three-address code
- Stack machines
- Hybrid approaches
What is missing?
[Figure: compiler pipeline from source code to machine-level program: lexical analysis, syntax analysis (syntax tree), semantic analysis, code optimization, code generation]
Semantic analysis: attributed syntax tree
- Name analysis (check def. & scope of symbols)
- Type analysis (check correct type of expressions)
- Creation of symbol tables (map identifiers to their types and positions in the source code)
Intermediate code
Code generation
- A syntax tree is a representation of the syntactic structure of a given program
- but we want to execute the program, i.e., we need its control and data flow
- Different levels of abstraction are required
- a representation for all of the knowledge the compiler derives about the program being compiled
- Most passes in the compiler consume IR
- the scanner is an exception
- Most passes in the compiler produce IR
- passes in the code generator can be exceptions
- Many optimizations work for different processors
- optimizations on the IR level can be reused
- IR serves as primary & definitive representation of the code [1]
A compiler using an IR
IR generation
- Transform the syntax tree into an intermediate representation
IR optimization
- Perform generic (non-target-specific) optimizations on the IR level
- Compilers support many different optimizations, executed in sequence on the IR
[Figure: extended pipeline. Source code → lexical analysis → syntax analysis (syntax tree) → semantic analysis → IR generation (IR) → IR optimization (IR) → code generation → machine-level program]
Types of IR
- Graphical IRs encode the compiler's knowledge in a graph
- algorithms are expressed in terms of graphical objects: nodes, edges, lists, or trees
- Our parse trees are a graphical IR
- Linear IRs resemble pseudo-code for an abstract machine
- algorithms iterate over simple, linear operation sequences
- Hybrid IRs combine elements of graphical and linear IRs
- attempt to capture their strengths and avoid their weaknesses
- a low-level linear IR is used to represent blocks of straight-line code and a graph to represent the flow of control
Graphical IRs: syntax tree → AST
- So far, we have just talked about syntax trees
- To be precise, the syntax tree is simply the parse tree generated by the parser
- The abstract syntax tree (AST) is an optimized form
- Uses less memory, faster to process
Grammar:

1  Start  → Expr
2  Expr   → Expr + Term
3         | Expr - Term
4         | Term
5  Term   → Term × Factor
6         | Term ÷ Factor
7         | Factor
8  Factor → "(" Expr ")"
9         | number
10        | ident

[Figure: parse tree for a×2+a×2×b, with Start, Expr, Term, and Factor nodes down to the leaves ident(a), number(2), and ident(b)]
Graphical IRs: syntax tree → AST
- The abstract syntax tree (AST) …
- retains the essential structure of the parse tree
- but eliminates the extraneous (nonterminal symbol) nodes
- Precedence and meaning of the expression remain
[Figure: parse tree for a×2+a×2×b (left) and the corresponding AST (right). The AST keeps only operators and operands: +( ×(a, 2), ×( ×(a, 2), b ) )]
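As a sketch of how such an AST can be represented in memory (illustrative C, not code from the lecture), each node carries a tag and either a leaf value or an operator with two children:

/* Minimal AST node sketch (illustrative, not from the lecture). */
typedef enum { AST_NUM, AST_IDENT, AST_BINOP } ast_kind;

typedef struct ast {
    ast_kind kind;
    union {
        int num;                  /* AST_NUM: constant value        */
        const char *ident;        /* AST_IDENT: variable name       */
        struct {                  /* AST_BINOP: operator + children */
            char op;              /* '+', '-', '*', '/'             */
            struct ast *lhs, *rhs;
        } bin;
    } u;
} ast;

For a×2+a×2×b, the root is an AST_BINOP node for + whose children are the two × subtrees.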
From source to machine code level
- ASTs are a near-source-level representation
- Because of its rough correspondence to a parse tree, the parser can build an AST directly
- Trees provide a natural representation for the grammatical structure of the source code discovered by parsing
- their rigid structure makes them less useful for representing other properties of programs
- Idea: model these aspects of program behavior differently
- Different types of IR are used in one compiler for different tasks
- Compilers often use more general graphs as IRs
- Control-flow graphs
- Dependence graphs
Directed acyclic graphs (DAGs)
- DAGs can represent code duplications in the tree
- DAG = contraction of the AST that avoids duplications
- DAG nodes can have multiple parents, identical subtrees are reused
- sharing makes a DAG more compact than its corresponding AST
- Example: a×2+a×2×b
- Here, the expression "a×2" occurs twice
- the DAG can share a single copy of the subtree for this expression
- The DAG encodes an explicit hint for evaluating the expression:
- If the value of a cannot change between the two uses of a, then the compiler should generate code to evaluate a×2 once and use the result twice
[Figure: DAG for a×2+a×2×b (left) and AST for a×2+a×2×b (right); in the DAG, the subtree ×(a, 2) is shared by both the + node and the outer × node]
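One common way to build such a DAG (a sketch; real compilers use a hash table instead of the linear lookup shown here, and all names are illustrative) is to check for an identical existing node before creating a new one:

/* Sketch: DAG construction by node reuse. Leaf texts are assumed
   to be interned, so pointer comparison suffices. */
#define MAX_NODES 256

typedef struct {
    char op;           /* '+', '*', ... or 0 for a leaf     */
    int  lhs, rhs;     /* child node indices, -1 for leaves */
    const char *leaf;  /* identifier/number text for leaves */
} dag_node;

static dag_node nodes[MAX_NODES];
static int n_nodes = 0;

/* Return an existing identical node, or create a new one. */
int dag_get(char op, int lhs, int rhs, const char *leaf) {
    for (int i = 0; i < n_nodes; i++)
        if (nodes[i].op == op && nodes[i].lhs == lhs &&
            nodes[i].rhs == rhs && nodes[i].leaf == leaf)
            return i;          /* reuse: this creates the sharing */
    nodes[n_nodes] = (dag_node){ op, lhs, rhs, leaf };
    return n_nodes++;
}

For a×2+a×2×b, the second request for ×(a, 2) returns the index of the first, so both parent nodes point to one shared subtree.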
The level of abstraction
- Still, the AST here is close to the source code
- Compilers need additional details, e.g. for tree-based optimization and code generation
- A source-level tree lacks much of the detail needed to translate statements into assembly code
[Figure: source-level AST for w ← a-2×b next to a low-level AST for the same statement. The low-level tree spells out address computations using val rarp, num (2, 4, -16, 12), and lab @G nodes, with ◆ nodes marking memory dereferences]
Low-level ASTs add this information:
- val node: value already in a register
- num node: known constant
- lab node: assembly-level label, typically a relocatable symbol
- ◆: operator that dereferences a value
- treats the value as a memory address and returns the contents of memory at that address (in C: the "*" operator)
Graphs: control-flow graph
- The simplest unit of control flow in a program is a basic block (BB)
- a maximal-length sequence of straight-line (branch-free) code
- a sequence of operations that always execute together
- unless an operation raises an exception
- control always enters a basic block at its first operation and exits at its last operation
- A control-flow graph (CFG) models the flow of control between the basic blocks in a program
- A CFG is a directed graph, G = (N, E)
- each node n ∈ N corresponds to a basic block
- each edge e = (ni, nj) ∈ E corresponds to a possible transfer of control from block ni to block nj
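A minimal in-memory sketch of this definition (illustrative C; the field names are assumptions, not the lecture's code):

/* Sketch: a CFG as basic blocks plus successor edges. */
#define MAX_BLOCKS 64
#define MAX_SUCCS   2    /* e.g. fall-through and branch target */

typedef struct {
    int first_instr, last_instr;   /* range in the linear IR      */
    int succ[MAX_SUCCS];           /* indices of successor blocks */
    int n_succ;
} basic_block;

typedef struct {
    basic_block blocks[MAX_BLOCKS];
    int n_blocks;
    int entry;                     /* index of the entry block    */
} cfg;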
CFG example
- A CFG provides a graphical representation of the possible runtime control-flow paths
- The CFG differs from syntax-oriented IRs, such as an AST, in which the edges show grammatical structure
CFG for a while loop:

while (i < 100) {
    stmt1;
}
stmt2;

[Figure: CFG with a node for the test (i < 100), an edge into stmt1, a back edge from stmt1 to the test, and an exit edge to stmt2]

The AST for this loop would be acyclic!

CFG for if-then-else:

if (x == y) {
    stmt1;
} else {
    stmt2;
}
stmt3;

[Figure: the test (x == y) branches to stmt1 and stmt2; control always flows from stmt1 and stmt2 to stmt3]
Use of CFGs
- Compilers typically use a CFG in conjunction with another IR
- The CFG represents the relationships among blocks
- operations inside a block are represented with another IR, such as an expression-level AST, a DAG, or one of the linear IRs
- The resulting combination is a hybrid IR
- Many parts of the compiler rely on a CFG, either explicitly or implicitly
- optimization generally begins with control-flow analysis and CFG construction
- Instruction scheduling needs a CFG to understand how the scheduled code for individual blocks flows together
- Global register allocation relies on a CFG to understand how often each operation might execute and where to insert loads and stores for spilled values
Graphs: dependence graph
- Compilers also use graphs to encode the flow of values
- from the point where a value is created, a definition (def) …
- … to any point where it is used, a use
- Data-dependence graphs embody this relationship
- Nodes represent operations
- Most operations contain both definitions and uses
- Edges connect two nodes
- one that defines a value and another that uses it
- Dependence graphs are drawn with edges that run from definition to use; a sketch of how such edges can be derived follows below
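A sketch of deriving def→use edges from a linear IR (illustrative C; it assumes each operation records which registers it reads and writes, and it links every use to the most recent prior definition):

#include <stdio.h>

/* Sketch: compute def->use edges over a linear IR. */
#define MAX_REGS 64

typedef struct {
    int reads[2];    /* registers read (-1 if unused) */
    int writes;      /* register written (-1 if none) */
} op;

void build_dep_edges(const op *ir, int n) {
    int last_def[MAX_REGS];                 /* reg -> defining op index */
    for (int r = 0; r < MAX_REGS; r++) last_def[r] = -1;
    for (int i = 0; i < n; i++) {
        for (int k = 0; k < 2; k++) {       /* each use: edge from its def */
            int r = ir[i].reads[k];
            if (r >= 0 && last_def[r] >= 0)
                printf("edge %d -> %d\n", last_def[r], i);
        }
        if (ir[i].writes >= 0)
            last_def[ir[i].writes] = i;     /* record the new definition */
    }
}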
Dependence graph example
- To capture the data flow, the dependence graph extracts data-flow information from an IR representation (here: a linear low-level IR form of a tree)

Linear IR code for a ← a×2×b×c×d:

1  load  rarp, @a => ra
2  load  2        => r2
3  load  rarp, @b => rb
4  load  rarp, @c => rc
5  load  rarp, @d => rd
6  mult  ra, r2   => ra
7  mult  ra, rb   => ra
8  mult  ra, rc   => ra
9  mult  ra, rd   => ra
10 store ra       => rarp, @a

[Figure: dependence graph whose nodes are the linear IR line numbers 1-10, with edges running from each definition to its uses]
Interaction: CF and Dependence Graph
- References to a[i] are shown deriving their values from a node representing prior definitions of a
- This connects all uses of a together through a single node
- Without sophisticated analysis of the subscript expressions, the compiler cannot differentiate between references to individual array elements

1 x = 0;
2 i = 1;
3 while (i < 100) {
4     if (a[i] > 0)
5         x = x + a[i];
6     i = i + 1;
  }
7 print(x);

[Figure: dependence graph over statements 1-7; all references to a flow through one shared node for a]
Linear IRs
An alternative to graphs
- A sequence of instructions that execute in their order of appearance
- linear IRs used in compilers resemble the assembly code for an abstract machine
- Linear IRs must include a mechanism to encode transfers of control among points in the program
- control flow in a linear IR usually models the implementation of control flow on the target machine
- linear codes usually include conditional branches and jumps
- control flow demarcates the basic blocks in a linear IR
- blocks end at branches, at jumps, or just before labelled operations
Types of linear IRs
- One-address codes model the behavior of accumulator machines and stack machines
- These codes expose the machine's use of implicit names so that the compiler can tailor the code for it
- The resulting code is quite compact
- Two-address codes model a machine with destructive operations (see the sketch after this list)
- These codes fell into disuse as memory constraints became less important; three-address code can model destructive operations explicitly
- Three-address codes model a machine where most operations take two operands and produce a result
- The rise of RISC architectures in the 1980s/1990s made these codes popular, since they resemble a simple RISC machine
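As an illustration of the two-address style (written in the same arrow notation as the TAC example two slides ahead; not taken from the lecture), two-address code for a - 2 × b destroys one operand per operation:

t1 ← 2
t1 ← t1 × b    // t1 is overwritten, now holds 2×b
t2 ← a
t2 ← t2 - t1   // t2 is overwritten, now holds a - 2×b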
Linear IRs: stack machines
- Stack-machine code offers a compact and storage-efficient representation [3]
- one form of one-address code
- assumes the presence of a stack of operands
- Most operations take their operands from the stack and push their results back onto the stack
- e.g., an integer subtract operation would remove the top two elements from the stack and push their difference onto the stack
- Stack discipline creates a need for some new operations
- a swap operation interchanges the top two elements of the stack
- Lilith was a stack machine designed at ETHZ for running Modula-2 code [2]
Example: stack machine code
- Operations remove their operands from the stack and push the result
- Here, the stack grows from the top towards the bottom

Stack machine code for a - 2 × b:

push 2
push b
multiply
push a
subtract

Execution sequence and related stack state (top of stack listed first):

push 2       stack: 2
push b       stack: b, 2
multiply     stack: 2*b
push a       stack: a, 2*b
subtract     stack: a-2*b
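To make the evaluation rule concrete, here is a tiny stack-machine interpreter (an illustrative C sketch, not the lecture's code; note that subtract takes the top of the stack as its left operand, matching the trace above):

#include <stdio.h>

/* Sketch: minimal stack-machine evaluator. */
typedef enum { PUSH, MUL, SUB } opcode;
typedef struct { opcode op; int imm; } instr;

int run(const instr *code, int n) {
    int stack[64], sp = 0;           /* operand stack, sp = next free slot */
    for (int i = 0; i < n; i++) {
        int top, next;
        switch (code[i].op) {
        case PUSH: stack[sp++] = code[i].imm; break;
        case MUL:  top = stack[--sp]; next = stack[--sp];
                   stack[sp++] = top * next; break;
        case SUB:  top = stack[--sp]; next = stack[--sp];
                   stack[sp++] = top - next; break; /* top - next, as above */
        }
    }
    return stack[sp - 1];            /* result is on top of the stack */
}

int main(void) {
    /* a - 2*b for a = 10, b = 3 */
    instr prog[] = { {PUSH, 2}, {PUSH, 3}, {MUL, 0}, {PUSH, 10}, {SUB, 0} };
    printf("%d\n", run(prog, 5));    /* prints 4 */
    return 0;
}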
Example: Java Bytecode
- A compact representation of stack-machine code [4]
- usually represented in binary form
public static void main(String[] args) {
    int a = 1;
    int b = 2;
    int c = a + b;
}

public static void main(java.lang.String[]);
  descriptor: ([Ljava/lang/String;)V
  flags: (0x0009) ACC_PUBLIC, ACC_STATIC
  Code:
    stack=2, locals=4, args_size=1
       0: iconst_1
       1: istore_1
       2: iconst_2
       3: istore_2
       4: iload_1
       5: iload_2
       6: iadd
       7: istore_3
       8: return

You can disassemble Java bytecode using javap -v Test.class
Three-address code (TAC)
- Most operations in TAC have the form i = j op k
- one operator (op), two operands (j and k) and one result (i)
- some operators need fewer arguments
- e.g. immediate loads and jumps
- sometimes, an op with more than three addresses is needed
- Three-address code is reasonably compact
- most ops consist of four items: an operation and three names
- both the operation and the names are drawn from limited sets
- operations typically require 1 or 2 bytes
- names are typically represented by integers or table indices
- in either case, 4 bytes is usually enough; a sketch of such an encoding follows below
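A common in-memory encoding of TAC along these lines is the quadruple; a minimal sketch (illustrative C, field names are assumptions):

/* Sketch: TAC operation as a quadruple (op, two operands, result). */
typedef enum { OP_LOADI, OP_LOAD, OP_ADD, OP_SUB, OP_MUL, OP_JMP } tac_op;

typedef struct {
    tac_op op;        /* the operation, fits in 1-2 bytes             */
    int arg1, arg2;   /* operand names: temporaries or table indices  */
    int result;       /* result name; unused fields are set to -1     */
} quad;

In this encoding, t5 ← t4 - t3 from the next slide becomes (quad){ OP_SUB, 4, 3, 5 }.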
TAC example
- TAC resembles a RISC-like register machine
- Operands have to be loaded into registers
- Operations (other than load/store) operate on register values
- Results are delivered in registers
- Limited constraints for naming/allocating registers compared to real machines
TAC code for a - 2 × b:

t1 ← 2
t2 ← b
t3 ← t1 × t2
t4 ← a
t5 ← t4 - t3

ARM assembler code for a - 2 × b:

MOV  R1, #2        // R1 = 2
LDR  R2, =b
LDR  R2, [R2]      // R2 = b
MUL  R3, R1, R2    // R3 = 2*b
LDR  R4, =a
LDR  R4, [R4]      // R4 = a
SUB  R5, R4, R3    // R5 = R4 - R3 = a - 2*b
Example: LLVM IR
LLVM IR ("bitcode") is a typed TAC [5]
LLVM IR code for

int foo(int a, int b) {
    return a - 2 * b;
}

generated with clang -S -emit-llvm foo.c:

define i32 @foo(i32, i32) #0 {    ; "foo" takes two i32 params (%0, %1), returns i32
  %3 = alloca i32, align 4        ; reserve 2 × 4 bytes for temporaries,
  %4 = alloca i32, align 4        ;   pointers returned in %3 and %4
  store i32 %0, i32* %3, align 4  ; copy %0 → mem @ %3
  store i32 %1, i32* %4, align 4  ; copy %1 → mem @ %4
  %5 = load i32, i32* %3, align 4 ; mem @ %3 → %5
  %6 = load i32, i32* %4, align 4 ; mem @ %4 → %6
  %7 = mul nsw i32 2, %6          ; %7 = 2 * %6
  %8 = sub nsw i32 %5, %7         ; %8 = %5 (= %0) - %7
  ret i32 %8                      ; return %8 to the caller
}
What’s next?
- More on intermediate representations
- Efficient implementation
- Static single assignment (SSA) form
References
[1] James Stanier and Des Watson. 2013. Intermediate representations in imperative compilers: A survey. ACM Comput. Surv. 45, 3, Article 26 (July 2013). https://doi.org/10.1145/2480741.2480743
[2] Richard S. Ohran. 1984. Lilith: A Workstation Computer for Modula-2. Dissertation ETH No. 7646. http://www.bitsavers.org/pdf/eth/lilith/ETH7646_Lilith_A_Workstation_Computer_For_Modula-2_1984.pdf
[3] Philip J. Koopman, Jr. 1989. Stack Computers: The New Wave. Ellis Horwood. Available at http://users.ece.cmu.edu/~koopman/stack_computers/index.html
[4] Alex Buckley et al. 2014. The Java Virtual Machine Specification, Java SE 8 Edition. https://docs.oracle.com/javase/specs/jvms/se8/jvms8.pdf
[5] LLVM Language Reference Manual. https://llvm.org/docs/LangRef.html