CSE443 Compilers
- Dr. Carl Alphonce
alphonce@buffalo.edu, 343 Davis Hall

Phases of a compiler: syntactic structure (Figure 1.6, page 5 of text)

Bottom-up parsing

Top-down predictive parsing gave us a quick overview of issues related to ...
Same expression grammar we used for top-down presentation.

"Informally, a 'handle' is a substring that matches the body of a production and whose reduction represents one step along the reverse of a rightmost derivation." [p. 235]

"Formally, if S ⇒*rm βAω ⇒rm βγω, then the production A -> γ in the position following β is a handle of βγω." [p. 235]

"Alternatively, a handle of a right-sentential form δ is a production A -> γ and a position of δ where the string γ may be found, such that replacing γ at that position by A produces the previous right-sentential form in a rightmost derivation of δ." [p. 235]

"A handle A -> γ in the parse tree for βγω", Fig 4.27 [p. 236]
A rightmost derivation of the string id * id

Rightmost derivation    Production
E ⇒ T                   E -> T
  ⇒ T * F               T -> T * F
  ⇒ T * id              F -> id
  ⇒ F * id              T -> F
  ⇒ id * id             F -> id

Recall grammar:
E -> E + T
E -> T
T -> T * F
T -> F
F -> ( E )
F -> id
A bottom-up parse: what we're aiming for! Table is reverse of that on previous slide.
Right sentential form   Handle   Reducing production
id * id                 id       F -> id
F * id                  F        T -> F
T * id                  id       F -> id
T * F                   T * F    T -> T * F
T                       T        E -> T
E
id * id has handle id (or more formally F -> id is a handle)
F * id has handle F (or more formally T -> F is a handle)
T * id has handle id (or more formally F -> id is a handle after T *)
T * F has handle T * F (or more formally T -> T * F is a handle)
T has handle T (or more formally E -> T is a handle)
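The handle-pruning table can be replayed mechanically. Below is a minimal sketch, assuming a sentential form is stored as a list of symbols and each step records where its handle starts. The helper `reduce_at` and the `steps` list are illustrative names, not something from the text.

```python
def reduce_at(form, pos, head, body):
    """Replace the handle `body` found at index `pos` of `form` by `head`."""
    if form[pos:pos + len(body)] != body:
        raise ValueError(f"{body} does not occur at position {pos} of {form}")
    return form[:pos] + [head] + form[pos + len(body):]

# (handle position, production head, production body), read off the table.
steps = [
    (0, "F", ["id"]),            # id * id  becomes  F * id
    (0, "T", ["F"]),             # F * id   becomes  T * id
    (2, "F", ["id"]),            # T * id   becomes  T * F
    (0, "T", ["T", "*", "F"]),   # T * F    becomes  T
    (0, "E", ["T"]),             # T        becomes  E
]

form = ["id", "*", "id"]
trace = [" ".join(form)]
for pos, head, body in steps:
    form = reduce_at(form, pos, head, body)
    trace.append(" ".join(form))

print(trace)
```

Read top to bottom, `trace` is the bottom-up parse; read bottom to top, it is the rightmost derivation.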
Right sentential form   Handle   Reducing production
id * id                 id       F -> id
F * id                  F        T -> F
T * id                  id       F -> id
T * id has handle id (or more formally F -> id is a handle after T *)
Consider this point in the previous table. We identified F -> id as a handle.
Right sentential form   Handle   Reducing production
id * id                 id       F -> id
F * id                  F        T -> F
T * id                  T        E -> T

What if … we made a different choice?
T * id could be reduced to E * id using production E -> T, but E -> T is not a handle since that reduction does not represent "one step along the reverse of a rightmost derivation."
Right sentential form   Handle   Reducing production
id * id                 id       F -> id
F * id                  F        T -> F
T * id                  T        E -> T
E * id                  id       F -> id
E * F                   F        T -> F
E * T                   T        E -> T
E * E                   *FAIL*
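The dead end can be confirmed by brute force. The sketch below tries every possible reduction, not only handles, and reports whether any sequence reaches the start symbol E. It assumes the grammar is encoded as (head, body) pairs. The function name `can_reach_start` is made up for this illustration, and the exhaustive search is only practical for tiny inputs.

```python
from functools import lru_cache

# Expression grammar from the slides, as (head, body) pairs.
GRAMMAR = [
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]
START = "E"

@lru_cache(maxsize=None)
def can_reach_start(form):
    """True if some sequence of reductions turns the tuple `form` into START."""
    if form == (START,):
        return True
    for head, body in GRAMMAR:
        n = len(body)
        for i in range(len(form) - n + 1):
            if form[i:i + n] == body:
                reduced = form[:i] + (head,) + form[i + n:]
                if can_reach_start(reduced):
                    return True
    return False

good = can_reach_start(("T", "*", "id"))   # before the wrong choice
bad = can_reach_start(("E", "*", "id"))    # after reducing T to E too early
print(good, bad)
```

Once E sits to the left of *, no production body matches a prefix ending in E *, so the parse cannot recover.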
E -> E + T
E -> T
T -> T * F
T -> F
F -> ( E )
F -> id
Rightmost derivation Handle pruning
STACK [Bottom…Top]   INPUT
$                    ω $
$ S                  $

Stack        Lookahead    Handle    Action
$            id * id $              Shift
$ id         * id $       id        Reduce F -> id
$ F          * id $       F         Reduce T -> F
$ T          * id $                 Shift
$ T *        id $                   Shift
$ T * id     $            id        Reduce F -> id
$ T * F      $            T * F     Reduce T -> T * F
$ T          $            T         Reduce E -> T
$ E          $                      Accept
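The trace above can be replayed with a few lines of Python. This is only a sketch: the action list is copied from the table, because deciding when to shift and when to reduce needs the item sets introduced next. The function name `run` and the action encoding are assumptions made for the illustration.

```python
def run(tokens, actions):
    """Replay shift/reduce actions; return the (stack, input) rows visited."""
    stack, rest = ["$"], list(tokens) + ["$"]
    rows = [(" ".join(stack), " ".join(rest))]
    for action in actions:
        if action == "shift":
            stack.append(rest.pop(0))          # move next input symbol onto the stack
        else:                                  # ("reduce", head, body)
            _, head, body = action
            if stack[-len(body):] != list(body):
                raise ValueError(f"handle {body} is not on top of the stack")
            del stack[-len(body):]             # pop the handle ...
            stack.append(head)                 # ... and push the production head
        rows.append((" ".join(stack), " ".join(rest)))
    return rows

actions = [
    "shift", ("reduce", "F", ["id"]), ("reduce", "T", ["F"]),
    "shift", "shift", ("reduce", "F", ["id"]),
    ("reduce", "T", ["T", "*", "F"]), ("reduce", "E", ["T"]),
]
rows = run(["id", "*", "id"], actions)
for stack, rest in rows:
    print(f"{stack:<12}{rest:>12}")
```

The handle always appears on top of the stack, which is why a reduce only ever inspects the stack's top symbols.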
ω, the string after the handle, must satisfy ω ∈ T*.

We say "a handle" rather than "the handle" since the grammar may be ambiguous and may therefore allow more than one rightmost derivation of βγω. If a grammar is unambiguous, then every right-sentential form of the grammar has exactly one handle.

"How does a shift-reduce parser know when to shift and when to reduce?" [p 242]
"…by maintaining states to keep track of where we are in a parse."
Each state is a set of items. An item is a grammar rule annotated with a dot, •, somewhere on the RHS.

Production:  A -> X Y Z
Items:       A -> • X Y Z
             A -> X • Y Z
             A -> X Y • Z
             A -> X Y Z •

Production:  A -> ε
Item:        A -> •
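Enumerating the items of a production is a matter of sliding the dot across the body. A small sketch, assuming a production body is a list of symbols and an ε-production has an empty body (`items_of` is an illustrative name):

```python
DOT = "•"

def items_of(head, body):
    """All items of the production head -> body (body is a list of symbols)."""
    result = []
    for dot in range(len(body) + 1):           # one item per dot position
        rhs = body[:dot] + [DOT] + body[dot:]
        result.append(f"{head} -> {' '.join(rhs)}")
    return result

print(items_of("A", ["X", "Y", "Z"]))
print(items_of("A", []))                       # the ε-production A -> ε
```

A production with n symbols in its body therefore has n + 1 items, and an ε-production has exactly one.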
Given a grammar G = (N, T, P, S), we augment it to a grammar G' = (N ∪ {S'}, T, P ∪ {S' -> S}, S'), where S' ∉ N. G' has exactly one rule with S' on the left.
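Augmentation is mechanical. The sketch below assumes N and T are sets of strings and P is a list of (head, body) pairs. It builds a fresh start symbol by appending primes to the old one, so for the expression grammar the new symbol is named E' rather than S'. The function name `augment` is an assumption for illustration.

```python
def augment(nonterminals, terminals, productions, start):
    """Return G' = (N ∪ {S'}, T, P ∪ {S' -> S}, S') with a fresh S'."""
    new_start = start + "'"
    while new_start in nonterminals or new_start in terminals:
        new_start += "'"                       # add primes until the name is unused
    return (
        set(nonterminals) | {new_start},
        set(terminals),
        [(new_start, (start,))] + list(productions),
        new_start,
    )

N = {"E", "T", "F"}
T = {"+", "*", "(", ")", "id"}
P = [
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]
N2, T2, P2, S2 = augment(N, T, P, "E")
print(S2, P2[0])
```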
CLOSURE_0(I) = I
repeat {
    CLOSURE_{i+1}(I) = CLOSURE_i(I) ∪ { B -> • δ | A -> β • B γ ∈ CLOSURE_i(I) and B -> δ ∈ P }
} until CLOSURE_{i+1}(I) = CLOSURE_i(I)
Intuition: an item like A -> X • Y Z conveys that we've already seen X, and we're expecting to see a Y followed by a Z. The closure of this item is all the other items that are relevant at this point in the parse. For example, if Y -> R S T is a production, then Y -> • R S T is in the closure, because if the next part of the input is derived from Y, it begins with something derived from R.
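The repeat-until definition of CLOSURE translates almost line for line into code. In this sketch an item is a triple (head, body, dot), meaning the dot sits just before body[dot]. That encoding, and the two-production grammar fragment taken from the intuition above, are assumptions made for illustration.

```python
def closure(items, productions):
    """Fixed-point closure of a set of items.

    An item is (head, body, dot): head -> body with the dot before body[dot].
    `productions` is a list of (head, body) pairs with tuple bodies.
    """
    current = set(items)                         # CLOSURE_0(I) = I
    while True:
        added = set()
        for head, body, dot in current:
            if dot < len(body):                  # some symbol B follows the dot
                b = body[dot]
                for p_head, p_body in productions:
                    if p_head == b:
                        added.add((p_head, p_body, 0))   # the item B -> • δ
        nxt = current | added                    # CLOSURE_{i+1}(I)
        if nxt == current:                       # nothing new: fixed point reached
            return current
        current = nxt

# Hypothetical grammar fragment from the intuition above.
productions = [("A", ("X", "Y", "Z")), ("Y", ("R", "S", "T"))]
result = closure({("A", ("X", "Y", "Z"), 1)}, productions)
print(sorted(result))
```

Terminals after the dot, and nonterminals with no productions in this fragment (such as R), contribute nothing, so the loop stops after one productive round.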
GOTO(I,X)

GOTO(I,X) is the closure of the set of items A -> β X • γ such that A -> β • X γ ∈ I.

Construction for G' (figure 4.32):

void items(G') {
    C = { CLOSURE( { S' -> • S } ) }
    repeat {
        for each set of items I ∈ C
            for each grammar symbol X ∈ (N ∪ T)
                if ( GOTO(I,X) is not empty and not already in C )
                    add GOTO(I,X) to C
    } until no new sets of items are added to C
}
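GOTO and the items construction can be sketched in Python as follows. This uses a worklist in place of the repeat-until loops, which computes the same sets. The item encoding (head, body, dot) and the names `closure`, `goto` and `canonical_collection` are assumptions for illustration, and the grammar is the augmented expression grammar used in these slides.

```python
GRAMMAR = [
    ("S'", ("E",)),
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]
SYMBOLS = ["E", "T", "F", "+", "*", "(", ")", "id"]   # N ∪ T without S'

def closure(items):
    """Add B -> • δ for every nonterminal B that appears right after a dot."""
    result, work = set(items), list(items)
    while work:
        head, body, dot = work.pop()
        if dot < len(body):
            for p_head, p_body in GRAMMAR:
                item = (p_head, p_body, 0)
                if p_head == body[dot] and item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over `symbol` wherever possible, then take the closure."""
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved)

def canonical_collection():
    """Collect every non-empty set of items reachable from the start set."""
    start = closure({("S'", ("E",), 0)})
    collection, work = [start], [start]
    while work:
        state = work.pop(0)
        for symbol in SYMBOLS:
            target = goto(state, symbol)
            if target and target not in collection:
                collection.append(target)
                work.append(target)
    return collection

C = canonical_collection()
print(len(C), "sets of items")
```

Item sets are stored as frozensets so that two sets with the same items compare equal, which is what the "not already in C" test requires.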
Grammar G        Augmented Grammar G'
                 S' -> E
E -> E + T       E -> E + T
E -> T           E -> T
T -> T * F       T -> T * F
T -> F           T -> F
F -> ( E )       F -> ( E )
F -> id          F -> id
SET OF ITEMS (I) = { S' -> • E }

i    CLOSURE_i(I)
0    { S' -> • E }
1    CLOSURE_0(I) ∪ { E -> • E + T , E -> • T }
2    CLOSURE_1(I) ∪ { T -> • T * F , T -> • F }
3    CLOSURE_2(I) ∪ { F -> • ( E ) , F -> • id }
4    CLOSURE_3(I) ∪ ∅
This gives us the first state of the finite state machine, I0
Next we compute GOTO(I0,X) ∀ X ∈ N ∪ T
N ∪ T = { E , T , F , + , * , ( , ) , id }
N.B. - augmented start symbol S' can be ignored

GOTO(I0,E) = CLOSURE( { S' -> E • , E -> E • + T } ) = { S' -> E • , E -> E • + T }
N.B. there is no non-terminal after the •, so no new items are added by CLOSURE operation

GOTO(I0,T) = CLOSURE( { E -> T • , T -> T • * F } ) = { E -> T • , T -> T • * F }
N.B. there is no non-terminal after the •, so no new items are added by CLOSURE operation

GOTO(I0,F) = CLOSURE( { T -> F • } ) = { T -> F • }
N.B. there is no non-terminal after the •, so no new items are added by CLOSURE operation

GOTO(I0, '(' ) = CLOSURE( { F -> ( • E ) } )
             = { F -> ( • E ) } ∪ { E -> • E + T , E -> • T } ∪ { T -> • T * F , T -> • F } ∪ { F -> • ( E ) , F -> • id }
N.B. there is a non-terminal after the •, so new items are added by CLOSURE operation

GOTO(I0,id) = CLOSURE( { F -> id • } ) = { F -> id • }
N.B. there is no non-terminal after the •, so no new items are added by CLOSURE operation

GOTO( I0 , ')' ) = GOTO( I0 , + ) = GOTO( I0 , * ) = GOTO( I0 , $ ) = ∅
I0:  S' -> • E
     E -> • E + T
     E -> • T
     T -> • T * F
     T -> • F
     F -> • ( E )
     F -> • id

I1:  S' -> E •
     E -> E • + T

I2:  E -> T •
     T -> T • * F

I3:  T -> F •

I4:  F -> ( • E )
     E -> • E + T
     E -> • T
     T -> • T * F
     T -> • F
     F -> • ( E )
     F -> • id

I5:  F -> id •

Transitions out of I0:
on E  go to I1
on T  go to I2
on F  go to I3
on (  go to I4
on id go to I5

The finite state machine at this point.

EXERCISE: complete the machine by computing GOTO(Ik,X) until no new states are added.