
Outline LR Parsing Review of bottom-up parsing LALR Parser - PowerPoint PPT Presentation



  1. LR Parsing
     LALR Parser Generators

     Outline
     • Review of bottom-up parsing
     • Computing the parsing DFA
     • Using parser generators

     Bottom-up Parsing (Review)
     • A bottom-up parser rewrites the input string to the start symbol
     • The state of the parser is described as α I γ
       – α is a stack of terminals and non-terminals
       – γ is the string of terminals not yet examined
     • Initially: I x1 x2 . . . xn

     The Shift and Reduce Actions (Review)
     • Recall the CFG: E → int | E + (E)
     • A bottom-up parser uses two kinds of actions:
     • Shift pushes a terminal from input on the stack
         E + ( I int ) ⇒ E + ( int I )
     • Reduce pops 0 or more symbols off of the stack (production RHS) and pushes a non-terminal on the stack (production LHS)
         E + (E + ( E ) I ) ⇒ E + ( E I )
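The two actions can be sketched on a configuration held as a pair of Python lists (stack, remaining input). This is only an illustration: the function names `shift` and `reduce` and the list representation are assumptions, not part of the slides.

```python
# A configuration is (stack, rest): the marker I sits between the two lists.
# shift/reduce are hypothetical helper names chosen for this sketch.

def shift(stack, rest):
    """Move the next input terminal onto the stack."""
    return stack + [rest[0]], rest[1:]

def reduce(stack, rest, lhs, rhs):
    """Pop the production's RHS off the stack and push its LHS."""
    if rhs and stack[-len(rhs):] != rhs:
        raise ValueError("top of stack does not match the RHS")
    return stack[:len(stack) - len(rhs)] + [lhs], rest

# E + ( I int )  =>  E + ( int I )
after_shift = shift(["E", "+", "("], ["int", ")"])

# E + ( E + ( E ) I )  =>  E + ( E I ), using E -> E + ( E )
after_reduce = reduce(["E", "+", "(", "E", "+", "(", "E", ")"], [")"],
                      "E", ["E", "+", "(", "E", ")"])
```

Neither function decides which action to take; that decision is the job of the DFA.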

  2. Key Issue: When to Shift or Reduce?
     • Idea: use a deterministic finite automaton (DFA) to decide when to shift or reduce
       – The input is the stack
       – The language consists of terminals and non-terminals
     • We run the DFA on the stack and we examine the resulting state X and the token tok after I
       – If X has a transition labeled tok then shift
       – If X is labeled with “A → β on tok” then reduce

     LR(1) Parsing: An Example
         I int + (int) + (int)$      shift
         int I + (int) + (int)$      E → int
         E I + (int) + (int)$        shift (x3)
         E + (int I ) + (int)$       E → int
         E + (E I ) + (int)$         shift
         E + (E) I + (int)$          E → E+(E)
         E I + (int)$                shift (x3)
         E + (int I )$               E → int
         E + (E I )$                 shift
         E + (E) I $                 E → E+(E)
         E I $                       accept
     [Figure: the parsing DFA, states 0 to 11. Reduce labels: state 1: E → int on $, +; state 2: accept on $; state 5: E → int on ), +; state 7: E → E + (E) on $, +; state 11: E → E + (E) on ), +]

     Representing the DFA
     • Parsers represent the DFA as a 2D table
       – Recall table-driven lexical analysis
     • Lines correspond to DFA states
     • Columns correspond to terminals and non-terminals
     • Typically columns are split into:
       – Those for terminals: the action table
       – Those for non-terminals: the goto table

     Representing the DFA: Example
     The table for a fragment of our DFA:

           int    +            (     )            $            E
       …
       3                       s4
       4   s5                                                  g6
       5          r E→int            r E→int
       6          s8                 s7
       7          r E→E+(E)                       r E→E+(E)
       …

     sk is shift and goto state k
     r X → α is reduce
     gk is goto state k
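One possible encoding of the table fragment uses two dictionaries keyed by (state, symbol). The tuple layout and the `lookup` helper are assumptions made for this sketch; only the entries for states 3 to 7 come from the fragment.

```python
# Action table: keyed by (state, terminal). Missing entries mean "error".
ACTION = {
    (3, "("): ("shift", 4),
    (4, "int"): ("shift", 5),
    (5, "+"): ("reduce", "E", ["int"]),
    (5, ")"): ("reduce", "E", ["int"]),
    (6, "+"): ("shift", 8),
    (6, ")"): ("shift", 7),
    (7, "+"): ("reduce", "E", ["E", "+", "(", "E", ")"]),
    (7, "$"): ("reduce", "E", ["E", "+", "(", "E", ")"]),
}

# Goto table: keyed by (state, non-terminal).
GOTO = {(4, "E"): 6}

def lookup(state, token):
    """Return the action for (state, token), or an error marker."""
    return ACTION.get((state, token), ("error",))
```

A sparse dictionary keeps the empty cells implicit, which matches how most of the 2D table is blank.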

  3. The LR Parsing Algorithm
     • After a shift or reduce action we rerun the DFA on the entire stack
       – This is wasteful, since most of the work is repeated
     • Remember for each stack element on which state it brings the DFA
     • LR parser maintains a stack
         〈 sym1, state1 〉 . . . 〈 symn, staten 〉
       state k is the final state of the DFA on sym1 … symk

     The LR Parsing Algorithm
         let I = w$ be initial input
         let j = 0
         let DFA state 0 be the start state
         let stack = 〈 dummy, 0 〉
         repeat
           case action[top_state(stack), I[j]] of
             shift k: push 〈 I[j++], k 〉
             reduce X → A:
               pop |A| pairs,
               push 〈 X, goto[top_state(stack), X] 〉
             accept: halt normally
             error: halt and report error

     Key Issue: How is the DFA Constructed?
     • The stack describes the context of the parse
       – What non-terminal we are looking for
       – What production RHS we are looking for
       – What we have seen so far from the RHS
     • Each DFA state describes several such contexts
       – E.g., when we are looking for non-terminal E, we might be looking either for an int or an E + (E) RHS

     LR(0) Items
     • An LR(0) item is a production with a “ I ” somewhere on the RHS
     • The items for T → (E) are
         T → I (E)
         T → ( I E)
         T → (E I )
         T → (E) I
     • The only item for X → ε is X → I
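The algorithm above might be written in Python roughly as follows, for the grammar E → int | E + (E). Only the rows for states 3 to 7 appear in the table fragment of the slides; the remaining rows are an assumed reconstruction from the grammar and the reduce labels of the DFA, and the names `parse`, `INT` and `PAREN` are hypothetical.

```python
# Productions, written as (LHS, RHS).
INT = ("E", ("int",))
PAREN = ("E", ("E", "+", "(", "E", ")"))

# Assumption: rows other than states 3-7 are reconstructed, not quoted.
ACTION = {
    (0, "int"): ("shift", 1),
    (1, "$"): ("reduce", INT), (1, "+"): ("reduce", INT),
    (2, "+"): ("shift", 3), (2, "$"): ("accept",),
    (3, "("): ("shift", 4),
    (4, "int"): ("shift", 5),
    (5, ")"): ("reduce", INT), (5, "+"): ("reduce", INT),
    (6, ")"): ("shift", 7), (6, "+"): ("shift", 8),
    (7, "$"): ("reduce", PAREN), (7, "+"): ("reduce", PAREN),
    (8, "("): ("shift", 9),
    (9, "int"): ("shift", 5),
    (10, ")"): ("shift", 11), (10, "+"): ("shift", 8),
    (11, ")"): ("reduce", PAREN), (11, "+"): ("reduce", PAREN),
}
GOTO = {(0, "E"): 2, (4, "E"): 6, (9, "E"): 10}

def parse(tokens):
    """Run the LR driver; return the reductions performed, in order."""
    inp = list(tokens) + ["$"]          # I = w$
    j = 0
    stack = [("dummy", 0)]              # pairs <symbol, state>
    reductions = []
    while True:
        state = stack[-1][1]            # top_state(stack)
        act = ACTION.get((state, inp[j]), ("error",))
        if act[0] == "shift":
            stack.append((inp[j], act[1]))
            j += 1
        elif act[0] == "reduce":
            lhs, rhs = act[1]
            del stack[len(stack) - len(rhs):]          # pop |RHS| pairs
            stack.append((lhs, GOTO[(stack[-1][1], lhs)]))
            reductions.append((lhs, rhs))
        elif act[0] == "accept":
            return reductions
        else:
            raise SyntaxError(f"unexpected {inp[j]!r} at position {j}")

trace = parse("int + ( int ) + ( int )".split())
```

Because every stack entry carries its DFA state, each step only inspects the top of the stack instead of rerunning the DFA from the bottom.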

  4. LR(0) Items: Intuition
     • An item [X → α I β] says that
       – the parser is looking for an X
       – it has an α on top of the stack
       – Expects to find a string derived from β next in the input
     • Notes:
       – [X → α I a β] means that a should follow. Then we can shift it and still have a viable prefix
       – [X → α I ] means that we could reduce X
         • But this is not always a good idea!

     LR(1) Items
     • An LR(1) item is a pair:
         X → α I β, a
       – X → αβ is a production
       – a is a terminal (the lookahead terminal)
       – LR(1) means 1 lookahead terminal
     • [X → α I β, a] describes a context of the parser
       – We are trying to find an X followed by an a, and
       – We have (at least) α already on top of the stack
       – Thus we need to see next a prefix derived from βa

     Note
     • The symbol I was used before to separate the stack from the rest of input
       – α I γ, where α is the stack and γ is the remaining string of terminals
     • In items I is used to mark a prefix of a production RHS:
         X → α I β, a
       – Here β might contain terminals as well
     • In both cases the stack is on the left of I

     Convention
     • We add to our grammar a fresh new start symbol S and a production S → E
       – Where E is the old start symbol
     • The initial parsing context contains:
         S → I E, $
       – Trying to find an S as a string derived from E$
       – The stack is empty
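An LR(1) item could be modelled as a small immutable record holding the production, the position of the marker and the lookahead. The class name `Item` and its method names are assumptions chosen for this sketch.

```python
from typing import NamedTuple, Optional, Tuple

class Item(NamedTuple):
    """An LR(1) item [lhs -> rhs with the marker before rhs[dot], lookahead]."""
    lhs: str
    rhs: Tuple[str, ...]
    dot: int
    lookahead: str

    def next_symbol(self) -> Optional[str]:
        """Symbol right after the marker, or None if the marker is at the end."""
        return self.rhs[self.dot] if self.dot < len(self.rhs) else None

    def is_complete(self) -> bool:
        """True for [X -> alpha I, a]: a reduction is possible on the lookahead."""
        return self.dot == len(self.rhs)

    def show(self) -> str:
        body = " ".join(self.rhs[:self.dot] + ("I",) + self.rhs[self.dot:])
        return f"[{self.lhs} → {body}, {self.lookahead}]"

# The initial parsing context: S -> I E, $
start = Item("S", ("E",), 0, "$")

# A complete item: E -> E + ( E ) I, +
done = Item("E", ("E", "+", "(", "E", ")"), 5, "+")
```

Items are hashable here, so a DFA state can simply be a set of them.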

  5. LR(1) Items (Cont.)
     • In context containing
         E → E + I ( E ), +
       – If ( follows then we can perform a shift to context containing
         E → E + ( I E ), +
     • In context containing
         E → E + ( E ) I , +
       – We can perform a reduction with E → E + ( E )
       – But only if a + follows

     LR(1) Items (Cont.)
     • Consider the item
         E → E + ( I E ), +
     • We expect a string derived from E ) +
     • There are two productions for E
         E → int and E → E + ( E )
     • We describe this by extending the context with two more items:
         E → I int, )
         E → I E + ( E ), )

     The Closure Operation
     • The operation of extending the context with items is called the closure operation

         Closure(Items) =
           repeat
             for each [X → α I Y β, a] in Items
               for each production Y → γ
                 for each b in First(βa)
                   add [Y → I γ, b] to Items
           until Items is unchanged

     Constructing the Parsing DFA (1)
         E → E + ( E ) | int
     • Construct the start context: Closure({S → I E, $})
         S → I E, $
         E → I E+(E), $
         E → I int, $
         E → I E+(E), +
         E → I int, +
     • We abbreviate as:
         S → I E, $
         E → I E+(E), $/+
         E → I int, $/+
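The closure operation can be sketched in Python as below. Items are plain tuples (lhs, rhs, dot, lookahead), and the sketch assumes the grammar has no ε-productions, so First(βa) is just the First set of the first symbol of βa. The names `GRAMMAR`, `first_sets` and `closure` are assumptions for illustration.

```python
GRAMMAR = {
    "S": [("E",)],
    "E": [("E", "+", "(", "E", ")"), ("int",)],
}

def first_sets(grammar):
    """First sets of all non-terminals (assumes no epsilon-productions)."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, prods in grammar.items():
            for rhs in prods:
                head = rhs[0]
                new = first[head] if head in grammar else {head}
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def closure(items, grammar, first):
    """Extend a set of LR(1) items until nothing more can be added."""
    items = set(items)
    while True:
        added = set()
        for lhs, rhs, dot, a in items:
            if dot < len(rhs) and rhs[dot] in grammar:   # marker before Y
                y = rhs[dot]
                head = (rhs[dot + 1:] + (a,))[0]         # first symbol of beta a
                lookaheads = first[head] if head in grammar else {head}
                for gamma in grammar[y]:
                    for b in lookaheads:
                        added.add((y, gamma, 0, b))
        if added <= items:
            return frozenset(items)
        items |= added

FIRST = first_sets(GRAMMAR)
start_state = closure({("S", ("E",), 0, "$")}, GRAMMAR, FIRST)
```

For the start context this yields the five items listed above, before the $/+ abbreviation is applied.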

  6. Constructing the Parsing DFA (2)
     • A DFA state is a closed set of LR(1) items
     • The start state contains [S → I E, $]
     • A state that contains [X → α I , b] is labelled with “reduce with X → α on b”
     • And now the transitions …

     The DFA Transitions
     • A state “State” that contains [X → α I y β, b] has a transition labeled y to a state that contains the items “Transition(State, y)”
       – y can be a terminal or a non-terminal

         Transition(State, y)
           Items = ∅
           for each [X → α I y β, b] in State
             add [X → α y I β, b] to Items
           return Closure(Items)

     Constructing the Parsing DFA: Example
       State 0:  S → I E, $
                 E → I E+(E), $/+
                 E → I int, $/+
       State 1 (from 0 on int):  E → int I , $/+            E → int on $, +
       State 2 (from 0 on E):    S → E I , $                accept on $
                                 E → E I +(E), $/+
       State 3 (from 2 on +):    E → E+ I (E), $/+
       State 4 (from 3 on ( ):   E → E+( I E), $/+
                                 E → I E+(E), )/+
                                 E → I int, )/+
       State 5 (from 4 on int):  E → int I , )/+            E → int on ), +
       State 6 (from 4 on E):    E → E+(E I ), $/+
                                 E → E I +(E), )/+
       and so on…

     LR Parsing Tables: Notes
     • Parsing tables (i.e., the DFA) can be constructed automatically for a CFG
     • But we still need to understand the construction to work with parser generators
       – E.g., they report errors in terms of sets of items
     • What kind of errors can we expect?
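Putting Closure and Transition together, the whole DFA can be built with a worklist. This is a minimal sketch under the same assumptions as before (tuple items, no ε-productions); `first_of`, `closure`, `transition` and `build_dfa` are hypothetical names, and the state numbers it produces are arbitrary rather than the numbering used in the slides.

```python
GRAMMAR = {"S": [("E",)], "E": [("E", "+", "(", "E", ")"), ("int",)]}

def first_of(symbol, grammar, seen=frozenset()):
    """First set of one symbol, assuming no epsilon-productions."""
    if symbol not in grammar:
        return {symbol}
    result = set()
    for rhs in grammar[symbol]:
        if rhs[0] not in seen | {symbol}:     # skip left-recursive revisits
            result |= first_of(rhs[0], grammar, seen | {symbol})
    return result

def closure(items, grammar):
    items = set(items)
    while True:
        added = set()
        for lhs, rhs, dot, a in items:
            if dot < len(rhs) and rhs[dot] in grammar:
                head = (rhs[dot + 1:] + (a,))[0]
                for gamma in grammar[rhs[dot]]:
                    for b in first_of(head, grammar):
                        added.add((rhs[dot], gamma, 0, b))
        if added <= items:
            return frozenset(items)
        items |= added

def transition(state, y, grammar):
    """Move the marker over y in every item that allows it, then close."""
    moved = {(l, r, d + 1, a) for (l, r, d, a) in state
             if d < len(r) and r[d] == y}
    return closure(moved, grammar)

def build_dfa(grammar, start_item):
    """Return (states, edges); edges maps (state index, symbol) to an index."""
    start = closure({start_item}, grammar)
    states, edges, work = [start], {}, [start]
    while work:
        state = work.pop()
        symbols = {r[d] for (_, r, d, _) in state if d < len(r)}
        for y in symbols:
            target = transition(state, y, grammar)
            if target not in states:
                states.append(target)
                work.append(target)
            edges[(states.index(state), y)] = states.index(target)
    return states, edges

STATES, EDGES = build_dfa(GRAMMAR, ("S", ("E",), 0, "$"))
```

For this grammar the construction stops at 12 states, matching the states 0 to 11 of the example DFA.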

  7. Shift/Reduce Conflicts
     • If a DFA state contains both
         [X → α I a β, b] and [Y → γ I , a]
     • Then on input “a” we could either
       – Shift into state [X → α a I β, b], or
       – Reduce with Y → γ
     • This is called a shift-reduce conflict

     Shift/Reduce Conflicts
     • Typically due to ambiguities in the grammar
     • Classic example: the dangling else
         S → if E then S | if E then S else S | OTHER
     • Will have DFA state containing
         [S → if E then S I , else]
         [S → if E then S I else S, x]
     • If else follows then we can shift or reduce
     • Default (yacc, ML-yacc, etc.) is to shift
       – Default behavior is as needed in this case

     More Shift/Reduce Conflicts
     • Consider the ambiguous grammar
         E → E + E | E * E | int
     • We will have the states containing
         [E → E * I E, +]               [E → E * E I , +]
         [E → I E + E, +]     ⇒ E      [E → E I + E, +]
         …                              …
     • Again we have a shift/reduce on input +
       – We need to reduce (* binds more tightly than +)
       – Recall solution: declare the precedence of * and +

     More Shift/Reduce Conflicts
     • In yacc declare precedence and associativity:
         %left +
         %left *
     • Precedence of a rule = that of its last terminal
       See yacc manual for ways to override this default
     • Resolve shift/reduce conflict with a shift if:
       – no precedence declared for either rule or terminal
       – input terminal has higher precedence than the rule
       – the precedences are the same and right associative
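The three resolution rules can be written down as a small decision function. This sketch only mirrors the rules listed above: it chooses reduce in every other case, which is an assumption, and it does not model declarations such as %nonassoc. The names `PRECEDENCE`, `NONTERMINALS`, `rule_precedence` and `resolve` are hypothetical.

```python
# Mirrors "%left +" then "%left *": a later declaration binds more tightly.
PRECEDENCE = {"+": (1, "left"), "*": (2, "left")}
NONTERMINALS = {"E", "S"}

def rule_precedence(rhs):
    """Precedence of a rule = that of its last terminal (None if undeclared)."""
    terminals = [s for s in rhs if s not in NONTERMINALS]
    return PRECEDENCE.get(terminals[-1]) if terminals else None

def resolve(rhs, token):
    """Choose 'shift' or 'reduce' for a conflict between rule rhs and token."""
    rule = rule_precedence(rhs)
    tok = PRECEDENCE.get(token)
    if rule is None or tok is None:
        return "shift"                    # no precedence declared
    if tok[0] > rule[0]:
        return "shift"                    # input terminal binds tighter
    if tok[0] == rule[0] and tok[1] == "right":
        return "shift"                    # same level, right associative
    return "reduce"                       # assumption: all remaining cases

# [E -> E * E I, +] against input +: the rule has the precedence of *.
choice = resolve(("E", "*", "E"), "+")
```

With no declaration for `then` or `else`, the dangling-else conflict falls into the first rule and is resolved by shifting.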
