Bottom-up parsing: LR parsing

  1. Bottom-up parsing
     (Craig Chambers, CSE 401)

     Construct parse tree for input from leaves up
     • reducing a string of tokens to single start symbol
       (inverse of deriving a string of tokens from start symbol)

     "Shift-reduce" strategy:
     • read ("shift") tokens until seen r.h.s. of "correct" production
     • reduce handle to l.h.s. nonterminal, then continue
     • done when all input read and reduced to start nonterminal

     LR parsing

     LR(k) parsing
     • Left-to-right scan of input, Rightmost derivation (constructed in reverse)
     • k tokens of lookahead

     Strictly more general than LL(k)
     • gets to look at whole rhs of production before deciding what to do, not just first k tokens of rhs
     • can handle left recursion and common prefixes fine

     Still as efficient as any top-down or bottom-up parsing method

     Complex to implement
     • need automatic tools to construct parser from grammar

     LR parsing tables

     Construct parsing tables implementing a FSA with a stack
     • rows: states of parser
     • columns: token(s) of lookahead
     • entries: action of parser
       • shift, goto state X
       • reduce production "X ::= RHS"
       • accept
       • error

     Algorithm to construct FSA similar to algorithm to build DFA from NFA
     • each state represents set of possible places in parsing

     LR(k) algorithm builds huge tables

     LALR(k) algorithm has fewer states ⇒ smaller tables
     • less general than LR(k), but still good in practice
     • size of tables acceptable in practice

     k == 1 in practice
     • most parser generators, including yacc and CUP, are LALR(1)

     LR(0) parser generation

     Example grammar:
       P ::= S $          // always add this production
       S ::= beep | { L }
       L ::= S | L ; S

     Key idea: simulate where input might be in grammar as it reads tokens

     "Where input might be in grammar" captured by set of items, which forms a state in the parser's FSA
     • LR(0) item: lhs ::= rhs production, with dot in rhs somewhere marking what's been read (shifted) so far
     • LR(k) item: also add k tokens of lookahead to each item

     Initial item:
       P ::= . S $
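One possible way to represent the example grammar and its LR(0) items is sketched below. The names `Item`, `GRAMMAR`, and `all_items` are illustrative choices, not part of the slides; an item is simply a production plus the position of the dot.

```python
from typing import NamedTuple, Tuple


class Item(NamedTuple):
    """An LR(0) item: a production with a dot somewhere in its rhs."""
    lhs: str
    rhs: Tuple[str, ...]
    dot: int  # number of rhs symbols already read (shifted)

    def __str__(self) -> str:
        symbols = list(self.rhs)
        symbols.insert(self.dot, ".")
        return f"{self.lhs} ::= {' '.join(symbols)}"


# Example grammar from the slides, including the added production P ::= S $.
GRAMMAR = [
    ("P", ("S", "$")),
    ("S", ("beep",)),
    ("S", ("{", "L", "}")),
    ("L", ("S",)),
    ("L", ("L", ";", "S")),
]


def all_items(grammar):
    """Every LR(0) item: each production with the dot at each position."""
    return [Item(lhs, rhs, dot)
            for lhs, rhs in grammar
            for dot in range(len(rhs) + 1)]


initial_item = Item(*GRAMMAR[0], 0)
print(initial_item)             # P ::= . S $
print(len(all_items(GRAMMAR)))  # 15
```

A production with n right-hand-side symbols yields n + 1 items, which is why this small grammar already has 15 of them.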

  2. Closure

     Initial state is closure of initial item
     • closure: if dot before non-terminal, add all productions for non-terminal with dot at the start
     • "epsilon transitions"

     Initial state (1):
       P ::= . S $
       S ::= . beep
       S ::= . { L }

     State transitions

     Given set of items, compute new state(s) for each symbol (terminal and non-terminal) after dot
     • state transitions correspond to shift actions

     New item derived from old item by shifting dot over symbol
     • do closure to compute new state

     Initial state (1):
       P ::= . S $
       S ::= . beep
       S ::= . { L }

     State (2) reached on transition that shifts S:
       P ::= S . $

     State (3) reached on transition that shifts beep:
       S ::= beep .

     State (4) reached on transition that shifts {:
       S ::= { . L }
       L ::= . S
       L ::= . L ; S
       S ::= . beep
       S ::= . { L }

     Accepting transitions

     If state has P ::= ... . $ item, then add transition labeled $ to the accept action

     Example:
       P ::= S . $
     has transition labeled $ to accept action

     Reducing states

     If state has lhs ::= rhs . item, then it has a reduce lhs ::= rhs action

     Example:
       S ::= beep .
     has reduce S ::= beep action

     No label; this state always reduces this production
     • what if other items in this state shift, or accept?
     • what if other items in this state reduce differently?
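The closure and state-transition steps above can be sketched in a few lines. This is a minimal illustration, assuming items are plain `(lhs, rhs, dot)` tuples and states are frozensets of items; the function names `closure`, `goto`, and `show` are hypothetical and not taken from the slides.

```python
GRAMMAR = [
    ("P", ("S", "$")),
    ("S", ("beep",)),
    ("S", ("{", "L", "}")),
    ("L", ("S",)),
    ("L", ("L", ";", "S")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """If the dot is before a non-terminal, add all of that non-terminal's
    productions with the dot at the start; repeat until nothing changes."""
    result = set(items)
    worklist = list(items)
    while worklist:
        lhs, rhs, dot = worklist.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for prod_lhs, prod_rhs in GRAMMAR:
                new_item = (prod_lhs, prod_rhs, 0)
                if prod_lhs == rhs[dot] and new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def goto(state, symbol):
    """Shift the dot over `symbol` in every item that allows it, then close."""
    moved = [(lhs, rhs, dot + 1)
             for lhs, rhs, dot in state
             if dot < len(rhs) and rhs[dot] == symbol]
    return closure(moved)


def show(state):
    """Render a state's items in the slides' notation, sorted for stability."""
    return [f"{lhs} ::= {' '.join(rhs[:dot] + ('.',) + rhs[dot:])}"
            for lhs, rhs, dot in sorted(state)]


state1 = closure([("P", ("S", "$"), 0)])
state2 = goto(state1, "S")
state3 = goto(state1, "beep")
state4 = goto(state1, "{")

for number, state in enumerate([state1, state2, state3, state4], start=1):
    print(f"State ({number}):", show(state))
```

Because states are sets, shifting `{` from state (4) produces a set equal to state (4) itself, which is how the loop in the automaton arises.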

  3. Rest of the states (part 1)

     State (4): if shift beep, goto State (3)
     State (4): if shift {, goto State (4)
     State (4): if shift S, goto State (5)
     State (4): if shift L, goto State (6)

     State (5):
       L ::= S .

     State (6):
       S ::= { L . }
       L ::= L . ; S

     State (6): if shift }, goto State (7)
     State (6): if shift ;, goto State (8)

     State (7):
       S ::= { L } .

     Rest of the states (part 2)

     State (8):
       L ::= L ; . S
       S ::= . beep
       S ::= . { L }

     State (8): if shift beep, goto State (3)
     State (8): if shift {, goto State (4)
     State (8): if shift S, goto State (9)

     State (9):
       L ::= L ; S .

     (whew)

     Building table from the states & transitions

     Create a row for each state
     Create a column for each terminal, non-terminal, and $

     For every "state (i): if shift X goto state (j)" transition:
     • if X is a terminal, put "shift, goto j" action in row i, column X
     • if X is a non-terminal, put "goto j" action in row i, column X

     For every "state (i): if $ accept" transition:
     • put "accept" action in row i, column $

     For every "state (i): reduce lhs ::= rhs" action:
     • put "reduce lhs ::= rhs" action in all columns of row i

     Table for this grammar

     State   {      }      beep   ;      S    L    $
     1       s,g4          s,g3          g2
     2                                             a!
     3       reduce S ::= beep
     4       s,g4          s,g3          g5   g6
     5       reduce L ::= S
     6              s,g7          s,g8
     7       reduce S ::= { L }
     8       s,g4          s,g3          g9
     9       reduce L ::= L ; S
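The table-building rules above can be sketched end to end. This is a minimal illustration under a few assumptions: items are `(lhs, rhs, dot)` tuples, the `SYMBOL_ORDER` list is chosen only so that states get the same numbers as on the slides, and the key `"*"` is a made-up shorthand for "all columns of the row".

```python
from collections import deque

GRAMMAR = [
    ("P", ("S", "$")),
    ("S", ("beep",)),
    ("S", ("{", "L", "}")),
    ("L", ("S",)),
    ("L", ("L", ";", "S")),
]
NONTERMINALS = {"P", "S", "L"}
# Exploration order; an assumption made to reproduce the slides' numbering.
SYMBOL_ORDER = ["S", "beep", "{", "L", "}", ";"]


def closure(items):
    """Add productions for every non-terminal that appears right after a dot."""
    result, worklist = set(items), list(items)
    while worklist:
        _, rhs, dot = worklist.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs2, rhs2 in GRAMMAR:
                item = (lhs2, rhs2, 0)
                if lhs2 == rhs[dot] and item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(state, symbol):
    """Shift the dot over `symbol`, then take the closure."""
    return closure([(lhs, rhs, dot + 1) for lhs, rhs, dot in state
                    if dot < len(rhs) and rhs[dot] == symbol])


def build_states():
    """Breadth-first exploration of all states reachable from the initial one."""
    start = closure([("P", ("S", "$"), 0)])
    numbers = {start: 1}
    transitions = {}  # (state number, symbol) -> state number
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for symbol in SYMBOL_ORDER:
            target = goto(state, symbol)
            if not target:
                continue  # no item in this state has the dot before `symbol`
            if target not in numbers:
                numbers[target] = len(numbers) + 1
                queue.append(target)
            transitions[(numbers[state], symbol)] = numbers[target]
    return numbers, transitions


def build_table(numbers, transitions):
    """Fill in shift/goto, accept, and reduce entries, one row per state."""
    table = {i: {} for i in numbers.values()}
    for (i, symbol), j in transitions.items():
        table[i][symbol] = f"g{j}" if symbol in NONTERMINALS else f"s,g{j}"
    for state, i in numbers.items():
        for lhs, rhs, dot in state:
            if dot == len(rhs):
                table[i]["*"] = f"reduce {lhs} ::= {' '.join(rhs)}"  # all columns
            elif rhs[dot] == "$":
                table[i]["$"] = "a!"
    return table


numbers, transitions = build_states()
table = build_table(numbers, transitions)
for i in sorted(table):
    print(i, table[i])
```

With this exploration order the sketch finds nine states, and its rows agree with the table on the slide.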

  4. Example

     Input: { beep ; { beep } } $

     Problems in shift-reduce parsing

     Can write grammars that cannot be handled with shift-reduce parsing

     Shift/reduce conflict:
     • state has both shift action(s) and reduce actions

     Reduce/reduce conflict:
     • state has more than one reduce action

     Shift/reduce conflicts

     LR(0) example:
       E ::= E + T | T

     State:
       E ::= E . + T
       E ::= T .

     Can shift +
     Can reduce E ::= T

     LR(k) example:
       S ::= if E then S |
             if E then S else S | ...

     State:
       S ::= if E then S .
       S ::= if E then S . else S

     Can shift else
     Can reduce S ::= if E then S

     Avoiding shift/reduce conflicts

     Can rewrite grammar to remove conflict
     • E.g. MatchedStmt vs. UnmatchedStmt

     Can resolve in favor of shift action
     • tries to find longest r.h.s. before reducing
     • works well in practice
     • yacc, CUP, et al. do this
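To see how the example input is consumed, here is a minimal sketch of a table-driven LR(0) parser. It hard-codes the table for the beep grammar as three dictionaries; the names `SHIFT`, `GOTO`, `REDUCE`, and `parse` are illustrative, and the input is assumed to be a list of tokens ending in `$`.

```python
# Table for the beep grammar, split by kind of entry.
SHIFT = {(1, "{"): 4, (1, "beep"): 3,
         (4, "{"): 4, (4, "beep"): 3,
         (6, "}"): 7, (6, ";"): 8,
         (8, "{"): 4, (8, "beep"): 3}
GOTO = {(1, "S"): 2, (4, "S"): 5, (4, "L"): 6, (8, "S"): 9}
REDUCE = {3: ("S", ["beep"]),
          5: ("L", ["S"]),
          7: ("S", ["{", "L", "}"]),
          9: ("L", ["L", ";", "S"])}
ACCEPT_STATE = 2


def parse(tokens):
    """Return the reductions performed, in order, or raise SyntaxError."""
    stack = [1]  # stack of state numbers; state 1 is the initial state
    pos = 0
    reductions = []
    while True:
        state = stack[-1]
        if state in REDUCE:
            # LR(0): a reducing state reduces without looking at the input.
            lhs, rhs = REDUCE[state]
            del stack[len(stack) - len(rhs):]         # pop one state per rhs symbol
            stack.append(GOTO[(stack[-1], lhs)])      # then follow the goto on lhs
            reductions.append(f"{lhs} ::= {' '.join(rhs)}")
            continue
        token = tokens[pos]
        if state == ACCEPT_STATE and token == "$":
            return reductions
        if (state, token) not in SHIFT:
            raise SyntaxError(f"unexpected {token!r} in state {state}")
        stack.append(SHIFT[(state, token)])
        pos += 1


for step in parse("{ beep ; { beep } } $".split()):
    print("reduce", step)
```

The sequence of reductions printed is a rightmost derivation of the input read backwards, which is the sense in which the parse tree is built from the leaves up.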

  5. Reduce/reduce conflicts

     Example:
       Stmt ::= Type id ; | LHS = Expr ; | ...
       ...
       LHS ::= id | LHS [ Expr ] | ...
       ...
       Type ::= id | Type [ ] | ...

     State:
       Type ::= id .
       LHS ::= id .

     Can reduce Type ::= id
     Can reduce LHS ::= id

     Avoiding reduce/reduce conflicts

     Can rewrite grammar to remove conflict
     • can be hard
       • e.g. C/C++ declaration vs. expression problem
       • e.g. MiniJava array declaration vs. array store problem

     Can resolve in favor of one of the reduce actions
     • but which?
     • yacc, CUP, et al. pick reduce action for production listed textually first in specification
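Both kinds of conflict can be detected by inspecting a single LR(0) state. The sketch below is a hypothetical helper, not something from the slides: `find_conflicts` takes a state as a list of `(lhs, rhs, dot)` items plus the set of terminals, and the two example states are copied from the slides as given.

```python
def find_conflicts(state, terminals):
    """Report LR(0) conflicts in one state.

    A complete item (dot at the end) asks for a reduce; an item with the dot
    before a terminal asks for a shift.
    """
    reduces = [f"reduce {lhs} ::= {' '.join(rhs)}"
               for lhs, rhs, dot in state if dot == len(rhs)]
    shifts = sorted({f"shift {rhs[dot]}"
                     for lhs, rhs, dot in state
                     if dot < len(rhs) and rhs[dot] in terminals})
    conflicts = []
    if shifts and reduces:
        conflicts.append(("shift/reduce", shifts + reduces))
    if len(reduces) > 1:
        conflicts.append(("reduce/reduce", reduces))
    return conflicts


# State from the shift/reduce slide.
expr_state = [("E", ("E", "+", "T"), 1),
              ("E", ("T",), 1)]
print(find_conflicts(expr_state, {"+"}))
# [('shift/reduce', ['shift +', 'reduce E ::= T'])]

# State from the reduce/reduce slide.
decl_state = [("Type", ("id",), 1),
              ("LHS", ("id",), 1)]
print(find_conflicts(decl_state, {"id", "[", "]", "=", ";"}))
# [('reduce/reduce', ['reduce Type ::= id', 'reduce LHS ::= id'])]
```

A generator that resolves conflicts automatically would pick the shift in the first case and the textually first production in the second, as described above.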
