INF5110 – Compiler Construction
Spring 2017
Outline
1. Intermediate code generation
   Intro
   Intermediate code
   Three-address code
   P-code
   Generating P-code
   Generation of three address code
   Basic: From P-code to TA-Code and back:
This section is based on slides from Stein Krogdahl, 2015.
1 .exe files include more, and an “assembly” in .NET even more.
read x; { input an integer }
if 0 < x then
  fact := 1;
  repeat
    fact := fact * x;
    x := x - 1
  until x = 0;
  write fact { output factorial of x }
end

read x
t1 = x > 0
if_false t1 goto L1
fact = 1
label L2
t2 = fact * x
fact = t2
t3 = x - 1
x = t3
t4 = x == 0
if_false t4 goto L2
write fact
label L1
halt
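To make the meaning of the three-address code above concrete, here is a minimal sketch of an interpreter for it in OCaml. All type, constructor, and function names (`operand`, `Binop`, `IfFalse`, `run`, and so on) are assumptions made for this illustration and do not come from the slides. Encoding comparison results as 1 and 0 is likewise an assumption.

```ocaml
(* Hypothetical representation of the three-address instructions used above. *)
type operand = Const of int | Name of string

type instr =
  | Read of string
  | Write of string
  | Label of string
  | Copy of string * operand                      (* x = a *)
  | Binop of string * operand * string * operand  (* t = a op b *)
  | IfFalse of string * string                    (* if_false t goto L *)
  | Halt

(* Run a program on a list of input values; return the written values. *)
let run (prog : instr array) (input : int list) : int list =
  let env = Hashtbl.create 16 in
  let value = function Const n -> n | Name x -> Hashtbl.find env x in
  let find_label l =
    let rec go pc =
      if pc >= Array.length prog then failwith ("unknown label " ^ l)
      else if prog.(pc) = Label l then pc
      else go (pc + 1)
    in
    go 0
  in
  let rec exec pc input output =
    if pc >= Array.length prog then List.rev output
    else
      match prog.(pc) with
      | Halt -> List.rev output
      | Label _ -> exec (pc + 1) input output
      | Read x ->
        (match input with
         | n :: rest -> Hashtbl.replace env x n; exec (pc + 1) rest output
         | [] -> failwith "no input left")
      | Write x -> exec (pc + 1) input (Hashtbl.find env x :: output)
      | Copy (x, a) ->
        Hashtbl.replace env x (value a);
        exec (pc + 1) input output
      | Binop (x, a, op, b) ->
        let va = value a and vb = value b in
        let r =
          match op with
          | "*" -> va * vb
          | "-" -> va - vb
          | ">" -> if va > vb then 1 else 0
          | "==" -> if va = vb then 1 else 0
          | _ -> failwith ("unknown operator " ^ op)
        in
        Hashtbl.replace env x r;
        exec (pc + 1) input output
      | IfFalse (t, l) ->
        (* Jump when the condition variable holds 0 (false). *)
        if Hashtbl.find env t = 0 then exec (find_label l) input output
        else exec (pc + 1) input output
  in
  exec 0 input []

(* The factorial program from above, transcribed instruction by instruction. *)
let factorial = [|
  Read "x";
  Binop ("t1", Name "x", ">", Const 0);
  IfFalse ("t1", "L1");
  Copy ("fact", Const 1);
  Label "L2";
  Binop ("t2", Name "fact", "*", Name "x");
  Copy ("fact", Name "t2");
  Binop ("t3", Name "x", "-", Const 1);
  Copy ("x", Name "t3");
  Binop ("t4", Name "x", "==", Const 0);
  IfFalse ("t4", "L2");
  Write "fact";
  Label "L1";
  Halt;
|]

(* run factorial [5] evaluates to [120]; run factorial [0] evaluates to []. *)
```

Labels are resolved by a linear search at each jump, which keeps the sketch short. A realistic implementation would more likely precompute a table from labels to positions.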
2 There are also two-address codes, but those have more or less fallen into disuse.
3 One can use the a-grammar formalism also to describe the treatment of
4 So, the result is not 100% linear. In general, one should not produce a flat
[Figure: syntax tree with P-code fragments attached to its nodes; recoverable fragments: lod x, ldc 3, adi, ldc 4, lda x, stn]
type symbol = string

type expr =
  | Var of symbol
  | Num of int
  | Plus of expr * expr
  | Assign of symbol * expr

type instr = (* p-code instructions *)
  | LDC of int
  | LOD of symbol
  | LDA of symbol
  | ADI
  | STN
  | STO

type tree =
  | Oneline of instr
  | Seq of tree * tree

type program = instr list
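As a rough illustration of what the instructions in `instr` do, the following sketch runs a P-code program on a small stack machine. The reading of each instruction (in particular that `STN` stores and leaves the value on the stack, while `STO` stores and pops it) is the usual P-code interpretation and is assumed here. The names `value`, `Int`, `Addr`, and `run` are hypothetical and not part of the slides.

```ocaml
type symbol = string

type instr =
  | LDC of int      (* push a constant *)
  | LOD of symbol   (* push the value of a variable *)
  | LDA of symbol   (* push the address of a variable *)
  | ADI             (* pop two integers, push their sum *)
  | STN             (* store, but keep the stored value on the stack *)
  | STO             (* store and pop *)

(* Stack entries are either integers or variable addresses (hypothetical). *)
type value = Int of int | Addr of symbol

(* Run a program from an empty stack; return the final stack and environment. *)
let run (prog : instr list) (env : (symbol * int) list) =
  let step (stack, env) i =
    match i, stack with
    | LDC n, _ -> (Int n :: stack, env)
    | LOD x, _ -> (Int (List.assoc x env) :: stack, env)
    | LDA x, _ -> (Addr x :: stack, env)
    | ADI, Int b :: Int a :: rest -> (Int (a + b) :: rest, env)
    | STO, Int v :: Addr x :: rest ->
      (rest, (x, v) :: List.remove_assoc x env)
    | STN, Int v :: Addr x :: rest ->
      (Int v :: rest, (x, v) :: List.remove_assoc x env)
    | _ -> failwith "ill-formed P-code"
  in
  List.fold_left step ([], env) prog

(* With x initially 1, this program stores 4 in x and leaves 8 on the stack. *)
let result =
  run [LDA "x"; LOD "x"; LDC 3; ADI; STN; LDC 4; ADI] [("x", 1)]
```

Keeping the environment as an association list makes the state easy to inspect in a test. A real machine would use addressable memory instead.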
(* Translate an expression into a tree of p-code instructions. *)
let rec to_tree (e : expr) =
  match e with
  | Var s -> Oneline (LOD s)
  | Num n -> Oneline (LDC n)
  | Plus (e1, e2) -> Seq (to_tree e1, Seq (to_tree e2, Oneline ADI))
  | Assign (x, e) -> Seq (Oneline (LDA x), Seq (to_tree e, Oneline STN))

(* Flatten the instruction tree into a linear program. *)
let rec linearize (t : tree) : program =
  match t with
  | Oneline i -> [i]
  | Seq (t1, t2) -> linearize t1 @ linearize t2

let to_program e = linearize (to_tree e)
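The list append `@` in `linearize` copies its left argument each time, so a deeply left-nested `Seq` tree can make flattening quadratic. One possible alternative, sketched here, threads an accumulator through the traversal. The name `linearize_acc` is hypothetical. The type definitions and `to_tree` are repeated only to keep the example self-contained.

```ocaml
type symbol = string

type expr =
  | Var of symbol
  | Num of int
  | Plus of expr * expr
  | Assign of symbol * expr

type instr = LDC of int | LOD of symbol | LDA of symbol | ADI | STN | STO

type tree = Oneline of instr | Seq of tree * tree

type program = instr list

let rec to_tree (e : expr) : tree =
  match e with
  | Var s -> Oneline (LOD s)
  | Num n -> Oneline (LDC n)
  | Plus (e1, e2) -> Seq (to_tree e1, Seq (to_tree e2, Oneline ADI))
  | Assign (x, e) -> Seq (Oneline (LDA x), Seq (to_tree e, Oneline STN))

(* Flatten without (@): each instruction is consed onto the accumulator once. *)
let linearize_acc (t : tree) : program =
  let rec go t acc =
    match t with
    | Oneline i -> i :: acc
    | Seq (t1, t2) -> go t1 (go t2 acc)  (* t2 goes in first, t1 in front of it *)
  in
  go t []

(* Code for the expression (x = x + 3) + 4. *)
let example =
  linearize_acc (to_tree (Plus (Assign ("x", Plus (Var "x", Num 3)), Num 4)))
(* example = [LDA "x"; LOD "x"; LDC 3; ADI; STN; LDC 4; ADI] *)
```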
procedure genCode(T: treenode)
begin
  if T /= nil then
    "generate code to prepare for code for left child"   // prefix
    genCode(left child of T);                             // prefix
    "generate code to prepare for code for right child"  // infix
    genCode(right child of T);                            // infix
    "generate code to implement action(s) for T"         // postfix
end;
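The generic traversal above can be sketched in OCaml as a function parameterised by three hooks, one for each position where code may be emitted. The tree type `btree` and the names `gen_code`, `prefix`, `infix`, and `postfix` are assumptions chosen for this illustration, and emitted code is modelled simply as a list of strings.

```ocaml
(* A hypothetical binary syntax tree; Nil plays the role of the nil pointer. *)
type 'a btree =
  | Nil
  | Node of 'a btree * 'a * 'a btree

(* Emit code before the left child, between the children, and after both. *)
let rec gen_code ~prefix ~infix ~postfix (t : 'a btree) : string list =
  match t with
  | Nil -> []
  | Node (left, v, right) ->
    prefix v
    @ gen_code ~prefix ~infix ~postfix left
    @ infix v
    @ gen_code ~prefix ~infix ~postfix right
    @ postfix v

(* For a stack machine, only the postfix position emits anything. *)
let nothing _ = []

let example =
  Node (Node (Nil, "lod x", Nil), "adi", Node (Nil, "ldc 3", Nil))

let code =
  gen_code ~prefix:nothing ~infix:nothing ~postfix:(fun v -> [v]) example
(* code = ["lod x"; "ldc 3"; "adi"] *)
```

Constructs such as conditionals would typically use the infix position as well, for instance to emit a jump between the code for the two children.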
type symbol = string

type expr =
  | Var of symbol
  | Num of int
  | Plus of expr * expr
  | Assign of symbol * expr

type mem =
  | Var of symbol
  | Temp of symbol
  | Addr of symbol (* &x *)

type … =
  | … of int
  | Mem of …

type cond =
  | Bool of …
  | Not of …
  | Eq of … * …
  | Leq of … * …
  | Le of … * …

type rhs =
  | Plus of … * …
  | Times of … * …
  | Id of …

type instr =
  | Read of symbol
  | Write of symbol
  | Lab of symbol (* pseudo instruction *)
  | Assign of symbol * rhs
  | AssignRI of … * … * … (* a := b[i] *)
  | AssignLI of … * … * … (* a[i] := b *)
  | BranchComp of cond * label
  | Halt
  | Nop

type tree =
  | Oneline of instr
  | Seq of tree * tree
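To show how three-address code could be generated from the `expr` type, here is a minimal sketch that introduces a fresh temporary for every addition. It deliberately uses a much smaller, hypothetical instruction type (`tac`, with `Add` and `Copy`) and a hypothetical operand type (`operand`, with `Const` and `Name`) rather than the types of the slides. Treating the value of an assignment as the value of its right-hand side is also an assumption.

```ocaml
type symbol = string

type expr =
  | Var of symbol
  | Num of int
  | Plus of expr * expr
  | Assign of symbol * expr

(* Hypothetical, simplified three-address code for this sketch only. *)
type operand = Const of int | Name of symbol

type tac =
  | Add of symbol * operand * operand  (* t = a + b *)
  | Copy of symbol * operand           (* x = a *)

(* Return the generated code and the operand that holds the result. *)
let gen (e : expr) : tac list * operand =
  let counter = ref 0 in
  let fresh () = incr counter; "t" ^ string_of_int !counter in
  let rec go e =
    match e with
    | Var x -> ([], Name x)
    | Num n -> ([], Const n)
    | Plus (e1, e2) ->
      let (c1, a1) = go e1 in
      let (c2, a2) = go e2 in
      let t = fresh () in
      (c1 @ c2 @ [Add (t, a1, a2)], Name t)
    | Assign (x, e1) ->
      let (c, a) = go e1 in
      (* The assignment yields the operand that was assigned. *)
      (c @ [Copy (x, a)], a)
  in
  go e

(* Code for (x = x + 3) + 4: t1 = x + 3; x = t1; t2 = t1 + 4. *)
let (code, result) = gen (Plus (Assign ("x", Plus (Var "x", Num 3)), Num 4))
```

Returning the result operand alongside the code avoids introducing temporaries for variables and constants, which need no instructions of their own.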
5 That’s one possible semantics of assignment (C, Java).
6 In the p-code, the result of evaluating expressions (also assignments) ends
7 Whether it is a good design from the perspective of modular compiler
8 In C, arrays start at a 0-offset, as the first array index is 0. Details may
9 Still in TAC format. Apart from the “readable” notation, it’s just two
lda a
lod i
ldc 1
adi
ixa elem_size(a)
lda a
lod j
ldc 2
mpi
ixa elem_size(a)
ind
ldc 3
adi
sto
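Under the usual reading of these instructions, the sequence above stores a[j*2] + 3 into a[i+1]. The following sketch checks that reading on a tiny stack machine with integer addresses. The memory layout (i at address 0, j at address 1, the array a starting at address 2 with element size 1) and all OCaml names are assumptions made for the example. In particular, `ixa` is assumed to pop an index and a base address and push base + scale * index, and `ind` is assumed to replace an address by the value stored there.

```ocaml
type instr =
  | LDA of int   (* push an address *)
  | LOD of int   (* push the value stored at an address *)
  | LDC of int   (* push a constant *)
  | ADI          (* pop two integers, push their sum *)
  | MPI          (* pop two integers, push their product *)
  | IXA of int   (* pop index i and address a, push a + scale * i *)
  | IND          (* pop address a, push the value stored at a *)
  | STO          (* pop value v and address a, store v at a *)

(* Execute a program against a mutable memory, starting from an empty stack. *)
let run (mem : int array) (prog : instr list) : unit =
  let step stack i =
    match i, stack with
    | LDA a, _ -> a :: stack
    | LOD a, _ -> mem.(a) :: stack
    | LDC n, _ -> n :: stack
    | ADI, b :: a :: rest -> (a + b) :: rest
    | MPI, b :: a :: rest -> (a * b) :: rest
    | IXA scale, i :: a :: rest -> (a + scale * i) :: rest
    | IND, a :: rest -> mem.(a) :: rest
    | STO, v :: a :: rest -> mem.(a) <- v; rest
    | _ -> failwith "ill-formed P-code"
  in
  ignore (List.fold_left step [] prog)

(* Assumed layout: i = 1, j = 2, and a = [10; 20; 30; 40; 50; 60] from address 2. *)
let mem = [| 1; 2; 10; 20; 30; 40; 50; 60 |]

let program =
  [ LDA 2; LOD 0; LDC 1; ADI; IXA 1;       (* address of a[i+1] *)
    LDA 2; LOD 1; LDC 2; MPI; IXA 1; IND;  (* value of a[j*2] *)
    LDC 3; ADI; STO ]

let () = run mem program
(* Afterwards a[2], stored at mem.(4), holds a[4] + 3 = 53. *)
```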
10 “inside” a procedure. Inter-procedural control-flow refers to calls and
11 Gotos are almost trivial in code generation, as they are basically available
[Louden, 1997] Louden, K. (1997). Compiler Construction, Principles and Practice. PWS Publishing.