CSE443 Compilers, Dr. Carl Alphonce, alphonce@buffalo.edu, 343 Davis Hall



slide-1
SLIDE 1

CSE443 Compilers

  • Dr. Carl Alphonce

alphonce@buffalo.edu 343 Davis Hall

slide-2
SLIDE 2

Phases of a compiler

Figure 1.6, page 5 of text

Syntactic structure

slide-3
SLIDE 3

Bottom-up parsing

Top-down predictive parsing gave us a quick overview of issues related to parsing. With that context we can more easily describe bottom-up parsing.

slide-4
SLIDE 4

Example grammar

E -> E + T
E -> T
T -> T * F
T -> F
F -> ( E )
F -> id

Same expression grammar we used for top-down presentation.

slide-5
SLIDE 5

Terminology

If S ⇒*lm 𝛽 then we call 𝛽 a left-sentential form of the grammar (lm means leftmost).

If S ⇒*rm 𝛽 then we call 𝛽 a right-sentential form of the grammar (rm means rightmost).

slide-6
SLIDE 6

handle

"Informally, a 'handle' is a substring that matches the body of a production and whose reduction represents one step along the reverse of a rightmost derivation." [p. 235]

"Formally, if S ⇒*rm 𝛽A𝜔 ⇒rm 𝛽𝛾𝜔, then the production A -> 𝛾 in the position following 𝛽 is a handle of 𝛽𝛾𝜔." [p. 235]

"Alternatively, a handle of a right-sentential form 𝛿 is a production A -> 𝛾 and a position of 𝛿 where the string 𝛾 may be found, such that replacing 𝛾 at that position by A produces the previous right-sentential form in a rightmost derivation of 𝛿." [p. 235]

slide-7
SLIDE 7

As a picture

"A handle A -> 𝛾 in the parse tree for 𝛽𝛾𝜔" Fig 4.27 [p. 236]

[Figure: parse tree rooted at S with frontier 𝛽 𝛾 𝜔; the node A is the parent of 𝛾.]

slide-8
SLIDE 8

A rightmost derivation of the string id * id

[p.235]

Rightmost derivation    Production
E ⇒ T                   E -> T
  ⇒ T * F               T -> T * F
  ⇒ T * id              F -> id
  ⇒ F * id              T -> F
  ⇒ id * id             F -> id

Recall grammar:

E -> E + T
E -> T
T -> T * F
T -> F
F -> ( E )
F -> id

slide-9
SLIDE 9

A bottom-up parse: what we're aiming for! Table is reverse of that on previous slide.

figure 4.26 [p.235]

Right sentential form    Handle    Reducing production
id * id                  id        F -> id
F * id                   F         T -> F
T * id                   id        F -> id
T * F                    T * F     T -> T * F
T                        T         E -> T
E

slide-10
SLIDE 10

id * id has handle id (or more formally F -> id is a handle)

figure 4.26 [p.235]


slide-11
SLIDE 11

F * id has handle F (or more formally T -> F is a handle)

figure 4.26 [p.235]


slide-12
SLIDE 12

T * id has handle id (or more formally F -> id is a handle after T *)

figure 4.26 [p.235]


slide-13
SLIDE 13

T * F has handle T * F (or more formally T -> T * F is a handle)

figure 4.26 [p.235]


slide-14
SLIDE 14

T has handle T (or more formally E -> T is a handle)

figure 4.26 [p.235]


slide-15
SLIDE 15

What happens if we reduce something that's not a handle?

slide-16
SLIDE 16

Right sentential form    Handle    Reducing production
id * id                  id        F -> id
F * id                   F         T -> F
T * id                   id        F -> id

T * id has handle id (or more formally F -> id is a handle after T *)

figure 4.26 [p.235]

Consider this point in the previous table. We identified F -> id as a handle.

slide-17
SLIDE 17

Example - figure 4.26 [p.235]

Right sentential form    Handle    Reducing production
id * id                  id        F -> id
F * id                   F         T -> F
T * id                   T         E -> T

What if we made a different choice?

slide-18
SLIDE 18

Example - figure 4.26 [p.235]

T * id could be reduced to E * id using production E -> T, but E -> T is not a handle since that reduction does not represent "one step along the reverse of a rightmost derivation."

Right sentential form    Handle    Reducing production
id * id                  id        F -> id
F * id                   F         T -> F
T * id                   T         E -> T
E * id                   id        F -> id
E * F                    F         T -> F
E * T                    T         E -> T
E * E                    *FAIL*

E -> E + T
E -> T
T -> T * F
T -> F
F -> ( E )
F -> id
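One way to see why E * id is a dead end is to check that it is not a right-sentential form at all. The sketch below is only an illustration: the dictionary encoding of the grammar and the function name right_sentential_forms are assumptions made here, not part of the slides. It enumerates every right-sentential form of at most three symbols by always expanding the rightmost nonterminal.

```python
from collections import deque

# Expression grammar from the slides: head -> list of bodies (tuples of symbols).
GRAMMAR = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

def right_sentential_forms(start, max_len):
    """Collect every right-sentential form with at most max_len symbols."""
    seen = {(start,)}
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        # A rightmost derivation always expands the rightmost nonterminal.
        idx = next((i for i in range(len(form) - 1, -1, -1)
                    if form[i] in GRAMMAR), None)
        if idx is None:
            continue  # only terminals left: this form is a sentence
        for body in GRAMMAR[form[idx]]:
            new_form = form[:idx] + body + form[idx + 1:]
            # No production has an empty body, so forms never shrink and
            # discarding forms longer than max_len loses nothing shorter.
            if len(new_form) <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return seen

forms = right_sentential_forms("E", max_len=3)
print(("T", "*", "id") in forms)  # True: T * id is a right-sentential form
print(("E", "*", "id") in forms)  # False: E * id is not
```

Because E * id never appears in a rightmost derivation from E, no sequence of further reductions can turn it back into the start symbol.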

slide-19
SLIDE 19

Basic idea

If we know what the handle is for each right sentential form, we can run the rightmost derivation in reverse!

slide-20
SLIDE 20

Handle pruning [p 235]

"A rightmost derivation in reverse can be obtained by 'handle pruning'"

If 𝜔 ∈ L(G):

S = 𝛿0 ⇒rm 𝛿1 ⇒rm 𝛿2 ⇒rm … ⇒rm 𝛿n-1 ⇒rm 𝛿n = 𝜔

The rightmost derivation reads this chain left to right; handle pruning reads it right to left.
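Handle pruning can be sketched as a replay of reductions. The example below assumes the handles are already known (they are copied from the figure 4.26 table for id * id); the function name prune and the (position, head, body) encoding are illustrative choices, not from the text.

```python
def prune(form, handles):
    """Apply each (position, head, body) reduction in turn.

    Returns the list of sentential forms visited, starting with the input.
    """
    trace = [list(form)]
    for pos, head, body in handles:
        current = trace[-1]
        # The body must literally occur at the stated position.
        if current[pos:pos + len(body)] != list(body):
            raise ValueError(f"{body} does not occur at position {pos}")
        trace.append(current[:pos] + [head] + current[pos + len(body):])
    return trace

# Handles for id * id, read off the table from figure 4.26.
HANDLES = [
    (0, "F", ("id",)),
    (0, "T", ("F",)),
    (2, "F", ("id",)),
    (0, "T", ("T", "*", "F")),
    (0, "E", ("T",)),
]

trace = prune(["id", "*", "id"], HANDLES)
for step in trace:
    print(" ".join(step))
```

Reading the printed forms from bottom to top gives the rightmost derivation E ⇒ T ⇒ T * F ⇒ T * id ⇒ F * id ⇒ id * id.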

slide-21
SLIDE 21

Big question

How do we figure out the handles?

slide-22
SLIDE 22

Big question

How do we figure out the handles? We'll answer this in a bit, but first let's consider how a parse will proceed in a bit more detail.

slide-23
SLIDE 23

Shift-reduce parsing

STACK [Bottom…Top]    INPUT
$                     𝜔 $     (initial configuration)
$ S                   $       (final configuration, on success)

slide-24
SLIDE 24

[modified from fig 4.28, p 237] Revisit example, with input: id * id $

Stack         Lookahead     Handle    Action
$             id * id $               Shift
$ id          * id $        id        Reduce F -> id
$ F           * id $        F         Reduce T -> F
$ T           * id $                  Shift
$ T *         id $                    Shift
$ T * id      $             id        Reduce F -> id
$ T * F       $             T * F     Reduce T -> T * F
$ T           $             T         Reduce E -> T
$ E           $                       Accept
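The trace above can be replayed mechanically. This is a minimal sketch that assumes the sequence of actions is supplied by hand (deciding the actions automatically is the job of the parsing table built later); the function name run and the action encoding are hypothetical.

```python
def run(tokens, actions):
    """Replay shift/reduce actions; return (stack, remaining input) snapshots."""
    stack, rest = ["$"], list(tokens) + ["$"]
    snapshots = [(list(stack), list(rest))]
    for action in actions:
        if action == "shift":
            stack.append(rest.pop(0))  # move the next input symbol onto the stack
        else:
            _, head, body = action
            # The handle always sits on top of the stack when we reduce.
            if stack[-len(body):] != list(body):
                raise ValueError(f"{body} is not on top of the stack")
            del stack[-len(body):]
            stack.append(head)
        snapshots.append((list(stack), list(rest)))
    return snapshots

# The action column of the table, without the final Accept.
ACTIONS = [
    "shift",
    ("reduce", "F", ("id",)),
    ("reduce", "T", ("F",)),
    "shift",
    "shift",
    ("reduce", "F", ("id",)),
    ("reduce", "T", ("T", "*", "F")),
    ("reduce", "E", ("T",)),
]

snapshots = run(["id", "*", "id"], ACTIONS)
for stack, rest in snapshots:
    print(" ".join(stack).ljust(12), " ".join(rest))
```

Each snapshot corresponds to one row of the table; the last one has stack $ E and input $, which is the accepting configuration.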

slide-25
SLIDE 25

Observations [p 235]

𝜔, the string after the handle, must be ∈ T* (it contains only terminals).

We say "a handle" rather than "the handle" since the grammar may be ambiguous and may therefore allow more than one rightmost derivation of 𝛽𝛾𝜔.

If a grammar is unambiguous, then every right-sentential form of the grammar has exactly one handle.

slide-26
SLIDE 26

Items

"How does a shift-reduce parser know when to shift and when to reduce?" [p 242]

"…by maintaining states to keep track of where we are in a parse."

Each state is a set of items. An item is a grammar rule annotated with a dot, •, somewhere on the RHS.

slide-27
SLIDE 27

Rules and items

Rule:

A -> X Y Z

Items:

A -> • X Y Z
A -> X • Y Z
A -> X Y • Z
A -> X Y Z •

The • shows where in a rule we might be during a parse.

A rule with an empty right-hand side, A -> 𝜀, has the single item A -> •
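As a small illustration, an item can be represented as a rule plus a dot position. The triple encoding (head, body, dot) and the helper names items_of and show below are assumptions made for this sketch.

```python
def items_of(head, body):
    """Return every item of the rule head -> body as (head, body, dot) triples."""
    # The dot can sit before any symbol or after the last one.
    return [(head, tuple(body), dot) for dot in range(len(body) + 1)]

def show(item):
    """Render an item in the slide notation, e.g. 'A -> X • Y Z'."""
    head, body, dot = item
    symbols = list(body[:dot]) + ["•"] + list(body[dot:])
    return f"{head} -> {' '.join(symbols)}"

for item in items_of("A", ["X", "Y", "Z"]):
    print(show(item))

# A rule with an empty body yields exactly one item.
print([show(item) for item in items_of("A", [])])
```

A rule with n symbols on its right-hand side therefore has n + 1 items.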

slide-28
SLIDE 28

Building the finite control for a bottom-up parser

Build a finite state machine, whose states are sets of items.

Build a table (M) incorporating shift/reduce decisions.

slide-29
SLIDE 29

Augment grammar

Given a grammar G = (N,T,P,S) we augment to a grammar G' = (N∪{S'},T,P∪{S'->S},S'), where S'∉N.

G' has exactly one rule with S' on left.
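The augmentation step can be sketched directly from the definition. The function name augment and the tuple representation of G are assumptions; the loop that keeps adding primes is just one way to guarantee S' ∉ N.

```python
def augment(nonterminals, terminals, productions, start):
    """Return G' = (N ∪ {S'}, T, P ∪ {S' -> S}, S') for G = (N, T, P, S)."""
    new_start = start + "'"
    # Keep adding primes until the new start symbol is fresh.
    while new_start in nonterminals or new_start in terminals:
        new_start += "'"
    new_productions = [(new_start, (start,))] + list(productions)
    return nonterminals | {new_start}, terminals, new_productions, new_start

N = {"E", "T", "F"}
T = {"+", "*", "(", ")", "id"}
P = [
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]

N2, T2, P2, S2 = augment(N, T, P, "E")
print(S2, P2[0])
```

Having a single rule for the new start symbol means that reducing by it can only happen once, at the very end of a successful parse.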

slide-30
SLIDE 30

We need two operations to build our finite state machine

CLOSURE(I)
GOTO(I,X)

slide-31
SLIDE 31

CLOSURE(I)

I is a set of items. CLOSURE(I) is a fixed point construction:

CLOSURE0(I) = I
repeat {
    CLOSUREi+1(I) = CLOSUREi(I) ∪ { B -> •𝛿 | A -> 𝛽•B𝛾 ∈ CLOSUREi(I) and B -> 𝛿 ∈ P }
} until CLOSUREi+1(I) = CLOSUREi(I)

slide-32
SLIDE 32

CLOSURE(I)


Intuition: an item like A -> X • Y Z conveys that we've already seen X, and we're expecting to see a Y followed by a Z. The closure of this item is all the other items that are relevant at this point in the parse. For example, if Y -> R S T is a production, then Y -> • R S T is in the closure, because input that can be derived from Y may begin with something derived from R.
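The fixed point construction translates almost line for line into code. This is a sketch under assumed encodings: an item is a (head, body, dot) triple, the augmented expression grammar is a dictionary named GRAMMAR, and closure is a name chosen here.

```python
# Augmented expression grammar G': head -> list of bodies.
GRAMMAR = {
    "S'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

def closure(items):
    """Return CLOSURE(items) as a frozenset of (head, body, dot) triples."""
    result = set(items)
    while True:
        # For every item with the dot before a nonterminal B,
        # collect B -> • body for each production of B.
        new_items = {
            (body[dot], production, 0)
            for (_, body, dot) in result
            if dot < len(body) and body[dot] in GRAMMAR
            for production in GRAMMAR[body[dot]]
        }
        if new_items <= result:  # nothing new: fixed point reached
            return frozenset(result)
        result |= new_items

I0 = closure({("S'", ("E",), 0)})
print(len(I0))  # 7
```

Starting from S' -> • E, the loop adds the E items, then the T items, then the F items, and stops once a pass adds nothing.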

slide-33
SLIDE 33

GOTO(I,X)

GOTO(I,X) is the closure of the set of items A -> 𝛽X•𝛾 such that A -> 𝛽•X𝛾 ∈ I.

Construction of the sets of items for G' (figure 4.32):

void items(G') {
    C = { CLOSURE( { S' -> •S } ) }
    repeat {
        for each set of items I ∈ C
            for each grammar symbol X ∈ (N ∪ T)
                if ( GOTO(I,X) is not empty and not already in C )
                    add GOTO(I,X) to C
    } until no new sets of items are added to C
}
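A runnable version of GOTO and of the items(G') loop might look like the sketch below. The item encoding (head, body, dot), the GRAMMAR dictionary, and the names closure, goto and canonical_collection are assumptions made for illustration; closure is repeated so the block stands alone.

```python
# Augmented expression grammar G': head -> list of bodies.
GRAMMAR = {
    "S'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

def closure(items):
    """Fixed point: add B -> • body whenever the dot sits before nonterminal B."""
    result = set(items)
    while True:
        new_items = {
            (body[dot], production, 0)
            for (_, body, dot) in result
            if dot < len(body) and body[dot] in GRAMMAR
            for production in GRAMMAR[body[dot]]
        }
        if new_items <= result:
            return frozenset(result)
        result |= new_items

def goto(items, symbol):
    """Move the dot over symbol wherever possible, then take the closure."""
    moved = {(head, body, dot + 1)
             for (head, body, dot) in items
             if dot < len(body) and body[dot] == symbol}
    return closure(moved)  # empty frozenset when nothing could move

def canonical_collection(start_item):
    """Collect all sets of items reachable from CLOSURE({start_item})."""
    symbols = sorted(set(GRAMMAR) |
                     {s for bodies in GRAMMAR.values() for body in bodies for s in body})
    collection = [closure({start_item})]
    added = True
    while added:  # repeat until no new sets of items are added
        added = False
        for state in list(collection):
            for symbol in symbols:
                target = goto(state, symbol)
                if target and target not in collection:
                    collection.append(target)
                    added = True
    return collection

C = canonical_collection(("S'", ("E",), 0))
print(len(C))  # 12
```

For the expression grammar the loop stops with 12 sets of items, which become the states of the finite state machine.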

slide-34
SLIDE 34

Example [p 245]

Grammar G        Augmented Grammar G'
                 S' -> E
E -> E + T       E -> E + T
E -> T           E -> T
T -> T * F       T -> T * F
T -> F           T -> F
F -> ( E )       F -> ( E )
F -> id          F -> id

slide-35
SLIDE 35

SET OF ITEMS (I)    i    CLOSUREi(I)
{ S' -> • E }       0    { S' -> • E }

Compute items(G')

S' -> E
E -> E + T
E -> T
T -> T * F
T -> F
F -> ( E )
F -> id

slide-36
SLIDE 36

Compute items(G')

SET OF ITEMS (I)    i    CLOSUREi(I)
{ S' -> • E }       0    { S' -> • E }
                    1    CLOSURE0(I) ∪ { E -> • E + T , E -> • T }


slide-37
SLIDE 37

SET OF ITEMS (I)    i    CLOSUREi(I)
{ S' -> • E }       0    { S' -> • E }
                    1    CLOSURE0(I) ∪ { E -> • E + T , E -> • T }
                    2    CLOSURE1(I) ∪ { T -> • T * F , T -> • F }

Compute items(G')


slide-38
SLIDE 38

SET OF ITEMS (I)    i    CLOSUREi(I)
{ S' -> • E }       0    { S' -> • E }
                    1    CLOSURE0(I) ∪ { E -> • E + T , E -> • T }
                    2    CLOSURE1(I) ∪ { T -> • T * F , T -> • F }
                    3    CLOSURE2(I) ∪ { F -> • ( E ) , F -> • id }

Compute items(G')


slide-39
SLIDE 39

SET OF ITEMS (I)    i    CLOSUREi(I)
{ S' -> • E }       0    { S' -> • E }
                    1    CLOSURE0(I) ∪ { E -> • E + T , E -> • T }
                    2    CLOSURE1(I) ∪ { T -> • T * F , T -> • F }
                    3    CLOSURE2(I) ∪ { F -> • ( E ) , F -> • id }
                    4    CLOSURE3(I) ∪ ∅

Compute items(G')


slide-40
SLIDE 40

Terminology

Kernel items: S' -> • S and all items with • not at left edge.

Non-kernel items: all items with • at left edge, except S' -> • S.
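The kernel test is a one-line predicate once items carry an explicit dot position. The sketch below assumes items are (head, body, dot) triples and that the augmented start symbol is named S'; is_kernel is a hypothetical helper.

```python
def is_kernel(item, start_symbol="S'"):
    """Kernel: the initial item for the start symbol, or any item whose dot is not at the left edge."""
    head, body, dot = item
    return dot > 0 or head == start_symbol

# The seven items obtained as the closure of { S' -> • E }, written out by hand.
I0 = [
    ("S'", ("E",), 0),
    ("E", ("E", "+", "T"), 0),
    ("E", ("T",), 0),
    ("T", ("T", "*", "F"), 0),
    ("T", ("F",), 0),
    ("F", ("(", "E", ")"), 0),
    ("F", ("id",), 0),
]

kernel = [item for item in I0 if is_kernel(item)]
non_kernel = [item for item in I0 if not is_kernel(item)]
print(len(kernel), len(non_kernel))  # 1 6
```

Only the kernel needs to be stored for a state, since the remaining items can be recomputed with CLOSURE.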

slide-41
SLIDE 41

This gives us the first state of the finite state machine, I0

I0:
    S' -> • E         (kernel item)
    E -> • E + T      (non-kernel item)
    E -> • T          (non-kernel item)
    T -> • T * F      (non-kernel item)
    T -> • F          (non-kernel item)
    F -> • ( E )      (non-kernel item)
    F -> • id         (non-kernel item)

Non-kernel items are computed from CLOSURE(kernel), and therefore do not need to be explicitly stored.

slide-42
SLIDE 42

Next we compute GOTO(I0,X) ∀ X ∈ N ∪ T, where N ∪ T = { E , T , F , + , * , ( , ) , id }

N.B. the augmented start symbol S' can be ignored.

GOTO(I0,E) = CLOSURE( { S' -> E • , E -> E • + T } ) = { S' -> E • , E -> E • + T }

I1 (only kernel items):
    S' -> E •
    E -> E • + T

N.B. there is no non-terminal after the •, so no new items are added by the CLOSURE operation.

slide-43
SLIDE 43

GOTO(I0,T) = CLOSURE( { E -> T • , T -> T • * F } ) = { E -> T • , T -> T • * F }

I2 (only kernel items):
    E -> T •
    T -> T • * F

N.B. there is no non-terminal after the •, so no new items are added by the CLOSURE operation.

slide-44
SLIDE 44

GOTO(I0,F) = CLOSURE( { T -> F • } ) = { T -> F • }

I3 (only kernel items):
    T -> F •

N.B. there is no non-terminal after the •, so no new items are added by the CLOSURE operation.

slide-45
SLIDE 45

GOTO(I0, '(' ) = CLOSURE( { F -> ( • E ) } )
               = { F -> ( • E ) } ∪ { E -> • E + T , E -> • T } ∪ { T -> • T * F , T -> • F } ∪ { F -> • ( E ) , F -> • id }

N.B. there is a non-terminal after the •, so new items are added by the CLOSURE operation.

I4:
    F -> ( • E )      (kernel item)
    E -> • E + T      (non-kernel item)
    E -> • T          (non-kernel item)
    T -> • T * F      (non-kernel item)
    T -> • F          (non-kernel item)
    F -> • ( E )      (non-kernel item)
    F -> • id         (non-kernel item)

slide-46
SLIDE 46

GOTO(I0,id) = CLOSURE( { F -> id • } ) = { F -> id • }

I5 (only kernel items):
    F -> id •

N.B. there is no non-terminal after the •, so no new items are added by the CLOSURE operation.

GOTO( I0 , ')' ) = GOTO( I0 , + ) = GOTO( I0 , * ) = GOTO( I0 , $ ) = ∅

slide-47
SLIDE 47

The finite state machine at this point:

States:
    I0 = { S' -> • E , E -> • E + T , E -> • T , T -> • T * F , T -> • F , F -> • ( E ) , F -> • id }
    I1 = { S' -> E • , E -> E • + T }
    I2 = { E -> T • , T -> T • * F }
    I3 = { T -> F • }
    I4 = { F -> ( • E ) , E -> • E + T , E -> • T , T -> • T * F , T -> • F , F -> • ( E ) , F -> • id }
    I5 = { F -> id • }

Transitions out of I0:
    on E  to I1
    on T  to I2
    on F  to I3
    on (  to I4
    on id to I5

EXERCISE: complete the machine by computing GOTO(Ik,X) until no new states are added.