Compiler Development (CMPSC 401): Intermediate Representations. Janyl Jumadinova, March 28, 2019.


SLIDE 1

Compiler Development (CMPSC 401)

Intermediate Representations Janyl Jumadinova March 28, 2019

Janyl Jumadinova Compiler Development (CMPSC 401) March 28, 2019 1 / 27

SLIDE 2

Compiler

SLIDE 4

Intermediate Representation Generation

The final phase of the compiler front-end. Goal: Translate the program into the format expected by the compiler back-end.

  • Generated code need not be optimized; that is handled by later passes.
  • Generated code need not be in assembly; that can also be handled by later passes.

SLIDE 8

Intermediate Representation Generation

Why do IR Generation?

Simplify certain optimizations:

  • Machine code has many constraints that inhibit optimization.
  • Working with an intermediate language makes optimizations easier and clearer.

Have many front-ends into a single back-end:

  • gcc can handle C, C++, Java, Fortran, Ada, and many other languages.
  • Each front-end translates source to the GENERIC language.

Have many back-ends from a single front-end:

  • Do most optimization on the intermediate representation before emitting code targeted at a single machine.

SLIDE 9

Designing a Good IR

IRs are like type systems: they are extremely hard to get right.

Need to balance the needs of the high-level source language and the low-level target language.

  • Too high level: can't optimize certain implementation details.
  • Too low level: can't use high-level knowledge to perform aggressive optimizations.

Often have multiple IRs in a single compiler.

SLIDE 10

Architecture of gcc

SLIDE 12

Survey of Intermediate Representations

Graphical Representations

  • Control Flow Graph
  • Dependence Graph
  • Concrete/Abstract Syntax Trees (ASTs)

Linear Representations

  • Stack based
  • Three-Address Code

SLIDE 13

IR

In most compilers, the parser builds an intermediate representation of the program, typically an AST. The rest of the compiler transforms the IR to improve ("optimize") it and eventually translates it to final code. Typically the compiler will transform the initial IR into one or more lower-level IRs along the way.

SLIDE 15

IR Design Consideration

Decisions affect the speed and efficiency of the rest of the compiler.

General rule: Compile time is important, but performance of the executable is more important. Typical case: compile few times, run many times. So make choices that improve compile time, as long as they don't impact performance of the generated code.

SLIDE 16

IR Design

Desirable properties:

  • Easy to generate
  • Easy to manipulate
  • Expressive
  • Appropriate level of abstraction

SLIDE 18

IR Design Dimensions

Structure:

  • Graphical (trees, graphs, etc.)
  • Linear (code for some abstract machine)
  • Hybrids are common (e.g., control-flow graphs with linear code in basic blocks)

Abstraction Level:

  • High-level, near to source language
  • Low-level, closer to machine, more exposed to compiler

SLIDE 23

Graphical IRs

IRs represented as a graph (or tree).

Nodes and edges typically reflect some structure of the program – e.g., source, control flow, data dependence.

May be large (especially syntax trees).

High-level examples: syntax trees, DAGs – generally used in early phases of compilers.

Other examples: control flow graphs and data dependence graphs – often used in optimization and code generation.

SLIDE 25

Graphical IR: Concrete Syntax Trees

The full grammar is needed to guide the parser, but it contains many extraneous details – e.g., syntactic tokens, rules that control precedence.

Typically the full syntax tree does not need to be used explicitly.

SLIDE 29

Graphical IR: Abstract Syntax Trees

Want only essential structural information (omit extra junk).

Can be represented explicitly as a tree or in a linear form, e.g., in the order of a depth-first traversal. For a[i+j], this might be:

Subscript Id(a) Plus Id(i) Id(j)

Common output from the parser; used for static semantics (type checking, etc.) and sometimes high-level optimization.
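The depth-first linearization above can be reproduced with a tiny sketch; the tuple-based node representation here is an illustrative assumption, not the course's actual data structure:

```python
def preorder(node):
    """Yield node labels in depth-first (preorder) order."""
    label, children = node
    yield label
    for child in children:
        yield from preorder(child)

# AST for a[i+j]: a Subscript node over Id(a) and a Plus node over Id(i), Id(j)
ast = ("Subscript", [("Id(a)", []),
                     ("Plus", [("Id(i)", []), ("Id(j)", [])])])

print(list(preorder(ast)))
# ['Subscript', 'Id(a)', 'Plus', 'Id(i)', 'Id(j)']
```

Reading the labels back in order, together with each node's arity, is enough to rebuild the tree, which is why the linear form loses no information.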

SLIDE 31

Graphical IR: DAG

DAG = Directed Acyclic Graph.

In compilers, typically used to refer to an AST-like structure where common components may be reused – e.g., the 2*a in 2*a + 2*a*b.

Pros: Saves space, makes common subexpressions explicit.

Cons: If we want to change just one occurrence, we need to split it off. If a variable's value may change between evaluations, we may not want to treat the expression as common.
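One standard way to get this sharing is hash-consing: the node constructor returns an existing node whenever an identical (op, children) pair has been built before. A minimal sketch – the DagBuilder class is illustrative, not a real library API:

```python
class DagBuilder:
    """Build expression DAGs by reusing structurally identical nodes."""
    def __init__(self):
        self.nodes = {}                    # (op, child ids) -> node id

    def node(self, op, *kids):
        key = (op, kids)
        if key not in self.nodes:
            self.nodes[key] = len(self.nodes)
        return self.nodes[key]

b = DagBuilder()
# 2*a + 2*a*b: the subexpression 2*a is constructed twice...
left  = b.node('*', b.node('2'), b.node('a'))
right = b.node('*', b.node('*', b.node('2'), b.node('a')), b.node('b'))
root  = b.node('+', left, right)

# ...but stored only once: 6 DAG nodes instead of the full tree's 9.
print(len(b.nodes))  # 6
```

The "split off one occurrence" con is visible here too: to change only the left 2*a, we would have to allocate a fresh node for it, since both parents point at the same shared node.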

SLIDE 34

Control Flow Graph (CFG)

Nodes are Basic Blocks: code that always executes together (i.e., no branches into or out of the middle of the block) – i.e., "straight-line code".

Edges represent paths that control flow could take – i.e., possible execution orderings. An edge from Basic Block A to Basic Block B means Block B could execute immediately after Block A completes.

Required for much of the analysis done in the optimizer.
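The partition into basic blocks can be computed with the classic "leaders" rule: the first instruction, every branch target, and every instruction immediately after a branch each start a new block. A sketch over a toy instruction list – the string-based instruction format and explicit target list are assumptions for illustration:

```python
def basic_blocks(instrs, branch_targets):
    """Split a linear instruction list into basic blocks via the leaders rule."""
    leaders = {0} | set(branch_targets)
    for i, ins in enumerate(instrs):
        if 'goto' in ins and i + 1 < len(instrs):   # after a branch: new leader
            leaders.add(i + 1)
    cuts = sorted(leaders)
    return [instrs[a:b] for a, b in zip(cuts, cuts[1:] + [len(instrs)])]

code = ['i = 0',
        'L1: if i >= n goto L2',
        'x = foo(x)',
        'i = i + 1',
        'goto L1',
        'L2: return x']
blocks = basic_blocks(code, branch_targets=[1, 5])   # L1 is index 1, L2 index 5
print(len(blocks))  # 4 blocks; the loop body is one straight-line block
```

CFG edges then connect each block to the blocks its last instruction can transfer to (fall-through and branch targets).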

SLIDE 40

Dependence Graph

Often used in conjunction with another IR.

In a data dependence graph, edges between nodes represent "dependencies" between the code represented by those nodes.

– If A and B access the same data, and A must occur before B to achieve correct behavior, then there is a dependence edge from A to B.
– A → B means the compiler can't move B before A.
– Granularity of nodes varies, depending on the abstraction level of the rest of the IR – e.g., nodes could be loads/stores, or whole statements.

E.g., a = 2; b = 2; c = a + 7; – where is the dependence?

SLIDE 43

Types of Dependencies

Read-after-write (RAW) / "flow dependence". E.g., a = 7; b = a + 1; – the read of 'a' must follow the write to 'a', otherwise it won't see the correct value.

Write-after-read (WAR) / "anti dependence". E.g., b = a * 2; a = 5; – the write to 'a' must follow the read of 'a', otherwise the read won't see the correct value.

Write-after-write (WAW) / "output dependence". E.g., a = 1; ... a = 2; – the writes to 'a' must happen in the correct order, otherwise 'a' will have the wrong final value.
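Given read and write sets per statement, classifying the dependence from an earlier statement A to a later statement B is one set intersection per kind. A simplified sketch – real compilers must also handle aliasing and control flow, which this ignores:

```python
def dependences(a_reads, a_writes, b_reads, b_writes):
    """Classify dependences from earlier statement A to later statement B."""
    deps = set()
    if a_writes & b_reads:
        deps.add('RAW')   # flow: B reads what A wrote
    if a_reads & b_writes:
        deps.add('WAR')   # anti: B overwrites what A read
    if a_writes & b_writes:
        deps.add('WAW')   # output: both write the same location
    return deps

# a = 7;  b = a + 1;   -> RAW on 'a'
print(dependences(set(), {'a'}, {'a'}, {'b'}))   # {'RAW'}
# b = a * 2;  a = 5;   -> WAR on 'a'
print(dependences({'a'}, {'b'}, set(), {'a'}))   # {'WAR'}
# a = 2; ... c = a + 7; (the earlier example)  -> RAW on 'a'
print(dependences(set(), {'a'}, {'a'}, {'c'}))   # {'RAW'}
```

Note that a = 2; b = 2; are adjacent but share no data, so no edge connects them and the compiler is free to reorder that pair.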

SLIDE 46

Loop-Carried Dependence

Loop-carried dependence: a dependence across iterations of a loop.

for (i = 0; i < size; i++) x = foo(x);

RAW loop-carried dependence: the read of 'x' depends on the write of 'x' in the previous iteration.

If the compiler "understands" the nature of the dependence, it can sometimes be removed or dealt with. Compilers often use sophisticated array subscript analysis for this.
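The difference between a carried chain and independent iterations can be seen directly; foo here is a stand-in body chosen for illustration, not a function from the slides:

```python
def foo(x):
    return 2 * x + 1        # hypothetical loop body

# Loop-carried RAW: iteration i reads the x written by iteration i-1,
# so the iterations form a chain and cannot be reordered or parallelized.
x = 1
for i in range(3):
    x = foo(x)
# x == foo(foo(foo(1)))

# No loop-carried dependence: each iteration touches only its own element,
# so the iterations could run in any order (or in parallel).
a = [1, 2, 3]
out = [0, 0, 0]
for i in range(3):
    out[i] = a[i] + 1
```

Array subscript analysis is what lets a compiler prove the second pattern: it shows that the location written in iteration i is never read or written by any other iteration.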

SLIDE 48

Linear IRs

Pseudo-code for some abstract machine.

Level of abstraction varies.

Simple, compact data structures; commonly used: arrays, linked structures.

Examples: three-address code, stack machine.
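A three-address-code generator for expression trees fits in a few lines: a post-order walk that emits one instruction and one fresh temporary per operator. The tuple node format is an illustrative assumption:

```python
from itertools import count

def to_tac(node, code, fresh):
    """Emit three-address instructions for an expression tree; return its name."""
    if isinstance(node, str):            # leaf: variable or constant
        return node
    op, left, right = node
    a = to_tac(left, code, fresh)
    b = to_tac(right, code, fresh)
    t = f't{next(fresh)}'
    code.append(f'{t} = {a} {op} {b}')
    return t

code = []
result = to_tac(('+', ('*', '2', 'a'), ('*', '2', 'b')), code, count(1))
print(code)
# ['t1 = 2 * a', 't2 = 2 * b', 't3 = t1 + t2']
```

A stack-machine translation of the same tree would instead emit push/multiply/add operations in the same post-order, with the operand stack playing the role of the temporaries.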

SLIDE 52

Abstraction Level Trade-Offs

High-level: good for some high-level optimizations and semantic checking, but can't optimize things that are hidden (e.g., address calculations in subscript operations).

Low-level: needed for good code generation and resource utilization in the back end, but loses some semantic knowledge (e.g., variables).

Medium-level: exposes more, but still keeps some semantic knowledge.

Many compilers use all three at different phases.

SLIDE 56

Hybrid IRs

Combination of structural and linear; level of abstraction varies.

Most common example: the control-flow graph – nodes are basic blocks, and within each node is a linear representation of the basic block's code.

May also see a dependence graph implemented as edges between linear instructions, possibly even inside CFG basic blocks.
SLIDE 59

What IR to use?

Common choice: all(!)

AST or other structural representation, built by the parser and used in early stages of the compiler:

  • Closer to source code
  • Good for semantic analysis
  • Facilitates some higher-level optimizations

Hybrid IR for optimization.

Lower to a low-level linear IR for later stages of the compiler:

  • Closer to machine code
  • Exposes machine-related optimizations
  • Good for resource allocation and scheduling
