Undergraduate Compilers Review (cont.)
CS553 Lecture, Undergraduate Compilers Review

Announcements
– Advice for the project writeups was posted on the mailing list
– SVN log will need more than one entry for a one-day extension

Today
– Generating intermediate representations
  – AST
  – 3-address code
  – Trees
  – Assem
– Generating MIPS assembly

Structure of a Typical Compiler

Analysis
    character stream
    ==> lexical analysis ==> tokens (“words”)
    ==> syntactic analysis ==> AST (“sentences”)
    ==> semantic analysis ==> annotated AST

Synthesis
    annotated AST
    ==> IR code generation ==> IR
    ==> optimization ==> IR
    ==> code generation ==> target language

Alternative to synthesis: interpreter

Program Representations

AST
– Usually language dependent

Intermediate Representation (IR)
– Usually a language-independent and target-independent representation
– Examples
  – 3-address code
  – RTL used in GCC (like 3-address code)
  – LLVM used in the LLVM compiler (like 3-address code but typed)
  – Tree data structure in the MiniJava Compiler (a little different)

AST ==> IR ==> target code

IR Code Generation

Goal
– Transforms AST into low-level intermediate representation (IR)

Simplifies the IR
– Removes high-level control structures: for, while, do, switch
– Removes high-level data structures: arrays, structs, unions, enums

Results in assembly-like code
– Semantic lowering
– Control flow expressed in terms of “gotos”
– Each expression is very simple (three-address code)

    e.g., x := a * b * c

          t := a * b
          x := t * c
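The lowering of x := a * b * c into three-address code can be sketched in a few lines. This is a minimal illustration, not the MiniJava compiler's implementation: the Var and BinOp node classes, the lower_assign function, and the t1, t2, ... temporary naming are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from itertools import count
from typing import List, Union


@dataclass
class Var:
    """Leaf AST node: a named variable (hypothetical node type)."""
    name: str


@dataclass
class BinOp:
    """Interior AST node: a binary operation (hypothetical node type)."""
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Var, BinOp]


def lower_assign(target: str, expr: Expr) -> List[str]:
    """Flatten `target := expr` so each instruction has at most one operator."""
    code: List[str] = []
    fresh = count(1)  # source of temporary numbers: t1, t2, ...

    def operand(e: Expr) -> str:
        # A variable is already a simple operand.
        if isinstance(e, Var):
            return e.name
        # A nested operation is computed into a fresh temporary first.
        left = operand(e.left)
        right = operand(e.right)
        temp = f"t{next(fresh)}"
        code.append(f"{temp} := {left} {e.op} {right}")
        return temp

    if isinstance(expr, Var):
        code.append(f"{target} := {expr.name}")
    else:
        # The outermost operation writes directly into the target.
        left = operand(expr.left)
        right = operand(expr.right)
        code.append(f"{target} := {left} {expr.op} {right}")
    return code


# x := a * b * c, assuming the parser built the left-associative tree (a * b) * c
example = lower_assign("x", BinOp("*", BinOp("*", Var("a"), Var("b")), Var("c")))
for line in example:
    print(line)
```

The sketch produces one temporary per interior node below the root, which matches the slide's two-instruction result apart from the temporary's name.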
A Low-Level IR Example

Source code

    for i = 1 to 10 do
      a[i] = x * 5;

High-level IR (AST)

    for
      i
      1
      10
      asg
        arr
          a
          i
        tms
          x
          5

Low-level IR (RTL)

           i := 1
    loop1: t1 := x * 5
           t2 := &a
           t3 := sizeof(int)
           t4 := t3 * i
           t5 := t2 + t4
           *t5 := t1
           i := i + 1
           if i <= 10 goto loop1

Register Transfer Language (RTL)
– Linear representation
– Typically language-independent
– Nearly corresponds to machine instructions

Example operations

    Operation     Form
    Assignment    x := y
    Unary op      x := op y
    Binary op     x := y op z
    Address of    p := &y
    Load          x := *(p+4)
    Store         *(p+4) := y
    Call          x := f()
    Branch        goto L1
    Cbranch       if (x==3) goto L1

Compiling Control Flow

Switch statements
– Convert switch into low-level IR

    e.g., switch (c) {
            case 0: f();
                    break;
            case 1: g();
                    break;
            case 2: h();
                    break;
          }

           if (c!=0) goto next1
           f()
           goto done
    next1: if (c!=1) goto next2
           g()
           goto done
    next2: if (c!=2) goto done
           h()
    done:

– Optimizations (depending on size and density of cases)
  – Create a jump table (store branch targets in table)
  – Use binary search

Compiling Arrays

Array declaration
– Store name, size, and type in symbol table

Array allocation
– Call malloc() or create space on the runtime stack

Array referencing
– e.g., A[i] becomes *(&A + i * sizeof(A_elem))

    t1 := &A
    t2 := sizeof(A_elem)
    t3 := i * t2
    t4 := t1 + t3
    *t4
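The array-referencing recipe above (take the base address, scale the index by the element size, add, then dereference) can be sketched as a small code generator. This is an illustration under stated assumptions: the function names lower_array_ref, lower_array_load, and lower_array_store are hypothetical, and the output is plain RTL-style text rather than any real compiler's IR objects.

```python
from itertools import count
from typing import Iterator, List, Tuple


def lower_array_ref(array: str, index: str, elem_type: str,
                    fresh: Iterator[int]) -> Tuple[List[str], str]:
    """Emit the address computation for array[index].

    Returns the instructions and the temporary that holds the element address.
    """
    # Draw four fresh temporaries in order: base, element size, offset, address.
    t_base, t_size, t_off, t_addr = (f"t{next(fresh)}" for _ in range(4))
    code = [
        f"{t_base} := &{array}",
        f"{t_size} := sizeof({elem_type})",
        f"{t_off} := {index} * {t_size}",
        f"{t_addr} := {t_base} + {t_off}",
    ]
    return code, t_addr


def lower_array_load(dest: str, array: str, index: str, elem_type: str,
                     fresh: Iterator[int]) -> List[str]:
    """Lower `dest = array[index]`: compute the address, then load through it."""
    code, addr = lower_array_ref(array, index, elem_type, fresh)
    return code + [f"{dest} := *{addr}"]


def lower_array_store(array: str, index: str, elem_type: str, value: str,
                      fresh: Iterator[int]) -> List[str]:
    """Lower `array[index] = value`: compute the address, then store through it."""
    code, addr = lower_array_ref(array, index, elem_type, fresh)
    return code + [f"*{addr} := {value}"]


load_code = lower_array_load("x", "A", "i", "A_elem", count(1))
store_code = lower_array_store("a", "i", "int", "v", count(1))
for line in load_code:
    print(line)
```

Loads and stores share the same address computation and differ only in the final instruction, which is why the sketch factors it into one helper.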
MiniJava Compiler Tree Language (Array Example)

    a2[0] = 3;
    x = a2[0];

[Figure: corresponding IR tree]

MiniJava Compiler Tree Language (IF Example)

    if (p<3) {
      System.out.println(p);
    } else {
      System.out.println(3);
    }

[Figure: corresponding IR tree]

Structure of a Typical Compiler

Analysis
    character stream
    ==> lexical analysis ==> tokens (“words”)
    ==> syntactic analysis ==> AST (“sentences”)
    ==> semantic analysis ==> annotated AST

Synthesis
    annotated AST
    ==> IR code generation ==> IR
    ==> optimization ==> IR
    ==> code generation ==> target language

Alternative to synthesis: interpreter

Compiling Procedures

Properties of procedures
– Procedures define scopes
– Procedure lifetimes are nested
– Can store information related to dynamic invocation of a procedure on a call stack (activation record, AR, or stack frame):
  – Space for saving registers
  – Space for passing parameters and returning values
  – Space for local variables
  – Return address of calling instruction

Stack management
– Push an AR on procedure entry
– Pop an AR on procedure exit
– Why do we need a stack?

Stack

    higher addresses
      AR: zoo
      AR: goo
      AR: foo
      AR: foo
    lower addresses
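One way to approach the "why do we need a stack?" question is to simulate activation records by hand. The sketch below is only an illustration: ActivationRecord, CallStack, and the factorial driver are hypothetical names, and the record holds just parameters, locals, and a symbolic return address (no saved registers). It shows that a recursive procedure has several live invocations at once, each needing its own copy of the parameters and locals.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ActivationRecord:
    """Per-invocation storage (a simplified, hypothetical layout)."""
    procedure: str
    return_address: str
    params: Dict[str, int]
    locals: Dict[str, int] = field(default_factory=dict)


class CallStack:
    def __init__(self) -> None:
        self.frames: List[ActivationRecord] = []
        self.max_depth = 0

    def push(self, ar: ActivationRecord) -> None:
        """Procedure entry: push a new activation record."""
        self.frames.append(ar)
        self.max_depth = max(self.max_depth, len(self.frames))

    def pop(self) -> ActivationRecord:
        """Procedure exit: pop the current activation record."""
        return self.frames.pop()


def factorial(n: int, stack: CallStack, return_address: str = "main") -> int:
    stack.push(ActivationRecord("factorial", return_address, {"n": n}))
    frame = stack.frames[-1]
    if frame.params["n"] <= 1:
        frame.locals["result"] = 1
    else:
        # The recursive call pushes its own record; this frame's n is untouched.
        inner = factorial(frame.params["n"] - 1, stack, "factorial")
        frame.locals["result"] = frame.params["n"] * inner
    result = frame.locals["result"]
    stack.pop()
    return result


stack = CallStack()
value = factorial(4, stack)
print(value, stack.max_depth, len(stack.frames))
```

With a single statically allocated record per procedure, the inner call would overwrite n before the outer multiplication used it; nested lifetimes are what make a last-in, first-out structure sufficient.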
Compiling Procedures (cont)

Code generation for procedures
– Emit code to manage the stack
– Are we done?

Translate procedure body
– References to local variables must be translated to refer to the current activation record
– References to non-local variables must be translated to refer to the appropriate activation record or global data space

Code Generation

Conceptually easy
– Three-address code is a generic machine language
– Instruction selection converts the low-level IR to real machine instructions

The source of heroic effort on modern architectures
– Alias analysis
– Instruction scheduling for ILP
– Register allocation
– More later...

MIPS code generation in MiniJava compiler

Assem data structure
– has string with source and destination spots to represent assembly instruction
– has list of uses, defs, and jump targets

    Instruction          Assem string
    add rd, rs, rt       “add `d0, `s0, `s1”
    beq rs, rt, label    “beq `s0, `s1, `j0”
    lw rt, address       “lw `d0, #(`s0)”
    sw rt, address       “sw `s0, #(`s1)”

Example MIPS program

MiniJava source

    class MultipleParams {
      public static void main(String[] a){
        System.out.println(new Foo().testing());}}

    class Foo {
      public int testing() {
        int local1;
        int local2;
        int local3;
        local1 = 1;
        local2 = 2;
        local3 = 3;
        return this.foobar(local1, local2+local3, local3);
      }

      public int foobar(int param1, int param2, int param3){
        return param1 - param2 * param3 ;
      }
    }

Generated MIPS

        .text
    main:
    main_framesize=32
        sw $fp, -4($sp)
        move $fp, $sp
        subu $sp, $sp, main_framesize
    L1:
        sw $ra, -8($fp)
        ...
    L0:
        # sink statement
        addu $sp, $sp, main_framesize
        lw $fp, -4($sp)
        j $ra
        .text
    testing:
    testing_framesize=64
        sw $fp, -4($sp)
        move $fp, $sp
        subu $sp, $sp, testing_framesize
    L3:
        sw $ra, -8($fp)
        li $t0, 1
        sw $t0, -24($fp)
        lw $t0, -24($fp)
        sw $t0, -12($fp)
        ...
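The Assem idea, an instruction string with numbered source, destination, and jump slots plus lists of uses, defs, and jump targets, can be sketched as follows. This is not the MiniJava compiler's actual class: Instr, its format method, and the reg_map argument are hypothetical names for illustration, and the temporary-to-register map is assumed to come from a later register allocation step.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Instr:
    """An assembly template with `dN, `sN and `jN slots (hypothetical sketch)."""
    template: str
    defs: List[str] = field(default_factory=list)   # temporaries written
    uses: List[str] = field(default_factory=list)   # temporaries read
    jumps: List[str] = field(default_factory=list)  # possible branch targets

    def format(self, reg_map: Dict[str, str]) -> str:
        """Fill each slot, mapping temporaries to registers via reg_map."""
        operands = {"d": self.defs, "s": self.uses, "j": self.jumps}

        def fill(match: "re.Match[str]") -> str:
            kind, idx = match.group(1), int(match.group(2))
            name = operands[kind][idx]
            # Labels are emitted as-is; names missing from reg_map
            # (such as $fp) are assumed to be real registers already.
            return name if kind == "j" else reg_map.get(name, name)

        return re.sub(r"`([dsj])(\d+)", fill, self.template)


# Assumed result of register allocation for three temporaries.
reg_map = {"t1": "$t0", "t2": "$t1", "t3": "$t2"}

add = Instr("add `d0, `s0, `s1", defs=["t3"], uses=["t1", "t2"])
beq = Instr("beq `s0, `s1, `j0", uses=["t1", "t2"], jumps=["L1"])
sw = Instr("sw `s0, -12(`s1)", uses=["t3", "$fp"])

for instr in (add, beq, sw):
    print(instr.format(reg_map))
```

Keeping defs, uses, and jump targets as separate lists, rather than parsing them back out of the string, is what lets later passes such as liveness analysis work without knowing any MIPS syntax.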
Concepts

Representations
– AST, low-level IR (RTL)

Code Generation
– 3-address code
– IR Trees in MiniJava Compiler
  – Assumes infinite temporaries
– MIPS
  – Requires mapping of all temporaries to an actual register

Next Time

Reading
– Ch 10

Lecture
– Control Flow Graphs
– Liveness Analysis