Bottom-Up Syntax Analysis
Bottom-Up Syntax Analysis (Wilhelm/Maurer: Compiler Design, Chapter 8)


  1. Bottom-Up Syntax Analysis
     Wilhelm/Maurer: Compiler Design, Chapter 8
     Reinhard Wilhelm, Universität des Saarlandes, wilhelm@cs.uni-sb.de
     Mooly Sagiv, Tel Aviv University, sagiv@math.tau.ac.il

  2. Subjects
     - Functionality and Method
     - Example Parsers
     - Derivation of a Parser
     - Conflicts
     - LR(k)-Grammars
     - LR(1)-Parser Generation
     - Bison

  3. Bottom-Up Syntax Analysis
     Input: a stream of symbols (tokens)
     Output: a syntax tree or an error
     Method: until the input is consumed or an error occurs, do
     - shift the next symbol or reduce by some production
     - decide what to do by looking one symbol ahead
     Properties:
     - Constructs the syntax tree in a bottom-up manner
     - Finds the rightmost derivation (in reverse order)
     - Reports an error as soon as the part of the input read so far is not a prefix of a program (valid prefix property)

  4. Parsing aabb in the grammar G_ab with S → aSb | ε

     Stack    Input    Action            Dead ends
     $        aabb#    shift             reduce S → ε
     $a       abb#     shift             reduce S → ε
     $aa      bb#      reduce S → ε      shift
     $aaS     bb#      shift             reduce S → ε
     $aaSb    b#       reduce S → aSb    shift, reduce S → ε
     $aS      b#       shift             reduce S → ε
     $aSb     #        reduce S → aSb    reduce S → ε
     $S       #        accept            reduce S → ε

     Issues:
     - Shift vs. Reduce
     - Reduce A → β vs. Reduce B → αβ

  5. Parsing aa in the grammar S → AB, S → A, A → a, B → a

     Stack    Input    Action            Dead ends
     $        aa#      shift
     $a       a#       reduce A → a      reduce B → a, shift
     $A       a#       shift             reduce S → A
     $Aa      #        reduce B → a      reduce A → a
     $AB      #        reduce S → AB
     $S       #        accept

     Issues:
     - Shift vs. Reduce
     - Reduce A → β vs. Reduce B → αβ
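The two traces above can be reproduced with a small backtracking search. The following is only a sketch, not the deterministic parser the later slides derive: it tries every shift and reduce choice and returns the first action sequence that ends in acceptance. The function name `shift_reduce`, the stack-height bound, and the grammar encoding (pairs of left side and right-side tuple) are assumptions made for this illustration.

```python
# Grammar G_ab from slide 4: S -> a S b | epsilon (empty right side = empty tuple).
GRAMMAR_AB = [("S", ("a", "S", "b")), ("S", ())]

# Grammar from slide 5: S -> A B, S -> A, A -> a, B -> a.
GRAMMAR_AA = [("S", ("A", "B")), ("S", ("A",)), ("A", ("a",)), ("B", ("a",))]


def shift_reduce(grammar, start, tokens):
    """Search all shift/reduce choices; return the successful actions or None."""
    tokens = tuple(tokens)
    # Assumed bound on the stack height; it keeps epsilon-reductions finite.
    limit = 2 * len(tokens) + 2
    seen = set()  # configurations (stack, position) already explored

    def search(stack, pos):
        if (stack, pos) in seen or len(stack) > limit:
            return None
        seen.add((stack, pos))
        if stack == (start,) and pos == len(tokens):
            return ["accept"]
        # Try every reduction whose right side is a suffix of the stack.
        for lhs, rhs in grammar:
            n = len(rhs)
            if n <= len(stack) and stack[len(stack) - n:] == rhs:
                rest = search(stack[:len(stack) - n] + (lhs,), pos)
                if rest is not None:
                    label = " ".join(rhs) or "ε"
                    return [f"reduce {lhs} -> {label}"] + rest
        # Otherwise try to shift the next input symbol.
        if pos < len(tokens):
            rest = search(stack + (tokens[pos],), pos + 1)
            if rest is not None:
                return ["shift"] + rest
        return None  # dead end

    return search((), 0)


if __name__ == "__main__":
    for action in shift_reduce(GRAMMAR_AB, "S", "aabb"):
        print(action)
```

Every branch that returns `None` corresponds to an entry in the "Dead ends" column; the LR construction on the following slides removes the need for this search.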

  6. Shift-Reduce Parsers
     - The bottom-up parser is a shift-reduce parser. Each step is either
       a shift: consuming the next input symbol, or
       a reduction: reducing a suffix of the stack contents by some production.
     - The problem is to decide when to stop shifting and make a reduction instead.
     - The next right side to reduce is called a "handle". Reducing too early leads to a dead end; reducing too late buries the handle.

  7. LR-Parsers: Deterministic Shift-Reduce Parsers
     The parser decides whether to shift or to reduce based on
     - the contents of the stack and
     - k symbols of lookahead into the rest of the input.
     Property of the LR-parser: it suffices to consider the topmost state on the stack instead of the whole stack contents.

  8. From P_G to LR-Parsers for G
     - P_G has a non-deterministic choice of expansions,
     - LL-parsers eliminate the non-determinism by looking ahead at expansions,
     - LR-parsers follow all possibilities in parallel (this corresponds to the subset construction NFA → DFA).
     Derivation:
     1. Characteristic finite automaton of P_G, a description of P_G
     2. Make it deterministic
     3. Interpret it as the control of a pushdown automaton
     4. Check for "inadequate" states

  10. Characteristic Finite Automaton of P_G
      The NFA char(P_G) = (Q_c, V_c, Δ_c, q_c, F_c) is the characteristic finite automaton of P_G:
      - Q_c = It_G: the states are the items of G
      - V_c = V_T ∪ V_N: the input alphabet consists of the terminal and nonterminal symbols
      - q_c = [S′ → .S]: the start state
      - F_c = { [X → α.] | X → α ∈ P }: the final states are the complete items
      - Δ_c = { ([X → α.Yβ], Y, [X → αY.β]) | X → αYβ ∈ P and Y ∈ V_N ∪ V_T }
            ∪ { ([X → α.Yβ], ε, [Y → .γ]) | X → αYβ ∈ P and Y → γ ∈ P }
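The definition of char(P_G) translates almost directly into code. The sketch below builds the item set, the transition relation, the start state, and the final states for a grammar given as a list of productions. The names `Item` and `char_nfa`, the use of `None` for ε, and the convention that the first production is the augmented one (S′ → S) are assumptions of this example, not part of the slides.

```python
from typing import NamedTuple, Tuple


class Item(NamedTuple):
    """An item [lhs -> rhs[:dot] . rhs[dot:]]."""
    lhs: str
    rhs: Tuple[str, ...]
    dot: int


def char_nfa(productions):
    """Return (Q_c, Delta_c, q_c, F_c); epsilon is represented by None.

    Assumption: productions[0] is the augmented start production S' -> S.
    """
    nonterminals = {lhs for lhs, _ in productions}
    # Q_c: one item per production and dot position.
    items = {Item(lhs, rhs, d)
             for lhs, rhs in productions
             for d in range(len(rhs) + 1)}
    delta = set()
    for item in items:
        if item.dot == len(item.rhs):
            continue  # complete item, no outgoing transitions
        y = item.rhs[item.dot]
        # Reading transition under Y: move the dot over Y.
        delta.add((item, y, item._replace(dot=item.dot + 1)))
        # Epsilon transitions to every item [Y -> .gamma].
        if y in nonterminals:
            for lhs, rhs in productions:
                if lhs == y:
                    delta.add((item, None, Item(lhs, rhs, 0)))
    start = Item(productions[0][0], productions[0][1], 0)
    finals = {item for item in items if item.dot == len(item.rhs)}
    return items, delta, start, finals


# G_ab, augmented with S' -> S.
G_AB = [("S'", ("S",)), ("S", ("a", "S", "b")), ("S", ())]

if __name__ == "__main__":
    items, delta, start, finals = char_nfa(G_AB)
    print(len(items), "items,", len(delta), "transitions,", len(finals), "final states")
```

For G_ab this yields seven items, four reading transitions, and four ε-transitions, which is the automaton drawn on slide 12.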

  11. Item PDA P_{G_ab} for G_ab: S → aSb | ε

      Stack                      Input   New Stack
      [S′ → .S]                  ε       [S′ → .S] [S → .aSb]
      [S′ → .S]                  ε       [S′ → .S] [S → .]
      [S → .aSb]                 a       [S → a.Sb]
      [S → a.Sb]                 ε       [S → a.Sb] [S → .aSb]
      [S → a.Sb]                 ε       [S → a.Sb] [S → .]
      [S → aS.b]                 b       [S → aSb.]
      [S → a.Sb] [S → .]         ε       [S → aS.b]
      [S → a.Sb] [S → aSb.]      ε       [S → aS.b]
      [S′ → .S] [S → aSb.]       ε       [S′ → S.]
      [S′ → .S] [S → .]          ε       [S′ → S.]

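  12. The Characteristic NFA char(P_{G_ab})
      [Figure: transition diagram with the following edges]
      [S′ → .S]   --S-->  [S′ → S.]
      [S′ → .S]   --ε-->  [S → .aSb]
      [S′ → .S]   --ε-->  [S → .]
      [S → .aSb]  --a-->  [S → a.Sb]
      [S → a.Sb]  --S-->  [S → aS.b]
      [S → a.Sb]  --ε-->  [S → .aSb]
      [S → a.Sb]  --ε-->  [S → .]
      [S → aS.b]  --b-->  [S → aSb.]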

  13. Characteristic NFA for G_0
      Grammar G_0:
        S → E
        E → E + T | T
        T → T ∗ F | F
        F → ( E ) | id
      [Figure: transition diagram of char(P_{G_0}) with the following chains of reading transitions]
      [S → .E]      --E-->   [S → E.]
      [E → .E+T]    --E-->   [E → E.+T]   --+-->  [E → E+.T]   --T-->  [E → E+T.]
      [E → .T]      --T-->   [E → T.]
      [T → .T∗F]    --T-->   [T → T.∗F]   --∗-->  [T → T∗.F]   --F-->  [T → T∗F.]
      [T → .F]      --F-->   [T → F.]
      [F → .(E)]    --(-->   [F → (.E)]   --E-->  [F → (E.)]   --)-->  [F → (E).]
      [F → .id]     --id-->  [F → id.]
      ε-transitions lead from each item with the dot in front of a nonterminal Y to every item [Y → .γ].

  14. Interpreting char(P_G)
      The state of char(P_G) is the current state of P_G, i.e. the state on top of P_G's stack.
      Actions are added to the transitions and states of char(P_G) to describe P_G:
      - ε-transitions: push the new state of char(P_G) onto the stack of P_G; it becomes the new current state.
      - reading transitions: these are the reading transitions of P_G; replace the current state of P_G by the shifted one.
      - final state: actions in P_G:
        - pop the final state [X → α.] from the stack,
        - make a transition from the new topmost state under X,
        - push the new state onto the stack.

  15. The Handle Revisited
      - The bottom-up parser is a shift-reduce parser. Each step is either
        a shift: consuming the next input symbol, making a transition under it from the current state, and pushing the new state onto the stack, or
        a reduction: reducing a suffix of the stack contents by some production, making a transition under the left-side nonterminal from the new current state, and pushing the new state.
      - The problem is the localization of the "handle", the next right side to reduce. Reducing too early leads to a dead end; reducing too late buries the handle.

  16. Handles and Viable Prefixes
      Some abbreviations:
      RMD: rightmost derivation
      RSF: right sentential form
      Let S′ ⇒*_rm βXu ⇒_rm βαu be an RMD of the cfg G.
      - α is a handle of βαu: the part of an RSF that is to be reduced next.
      - Each prefix of βα is a viable prefix: a prefix of an RSF stretching at most up to the end of the handle, i.e. reductions, if possible at all, occur only at its end.

  17. Examples in G_0

      RSF       Handle   Viable prefixes    Reason
      E + F     F        E, E+, E+F         S ⇒_rm E ⇒_rm E + T ⇒_rm E + F
      T ∗ id    id       T, T∗, T∗id        S ⇒^3_rm T ∗ F ⇒_rm T ∗ id
      F ∗ id    F        F                  S ⇒^4_rm T ∗ id ⇒_rm F ∗ id

  18. Valid Items
      [X → α.β] is valid for the viable prefix γα if there exists an RMD
      S′ ⇒*_rm γXw ⇒_rm γαβw.
      An item valid for a viable prefix gives one interpretation of the parsing situation.

      Some viable prefixes of G_0:

      Viable prefix   Valid item      Reason                                γ      w    X    α     β
      E+              [E → E+.T]      S ⇒_rm E ⇒_rm E + T                   ε      ε    E    E+    T
      E+              [T → .F]        S ⇒*_rm E + T ⇒_rm E + F              E+     ε    T    ε     F
      E+              [F → .id]       S ⇒*_rm E + F ⇒_rm E + id             E+     ε    F    ε     id
      (E+(            [F → (.E)]      S ⇒*_rm (E + F) ⇒_rm (E + (E))        (E+    )    F    (     E)

  19. Valid Items and Parsing Situations
      Given some input string xuvw. The RMD
      S′ ⇒*_rm γXw ⇒_rm γαβw ⇒*_rm γαvw ⇒*_rm γuvw ⇒*_rm xuvw
      describes the following sequence of partial derivations:
      1. γ ⇒*_rm x
      2. α ⇒*_rm u
      3. β ⇒*_rm v
      4. X ⇒_rm αβ
      5. S′ ⇒*_rm γXw
      executed by the bottom-up parser in this order.
      The valid item [X → α.β] for the viable prefix γα describes the situation after partial derivation 2.

  20. Theorems
      char(P_G) = (Q_c, V_c, Δ_c, q_c, F_c)
      Theorem: For each viable prefix there is at least one valid item.
        Every parsing situation is described by at least one valid item.
      Theorem: Let γ ∈ (V_T ∪ V_N)* and q ∈ Q_c. Then
        (q_c, γ) ⊢*_char(P_G) (q, ε) iff γ is a viable prefix and q is a valid item for γ.
        A viable prefix brings char(P_G) from its initial state to all its valid items.
      Theorem: The language of viable prefixes of a cfg is regular.
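The second theorem suggests a direct way to compute the valid items of a candidate prefix: run char(P_G) on it and collect every state that can be reached. The sketch below does this for G_0 without materializing the automaton; ε-transitions are followed by a closure step and reading transitions by moving the dot. The function name `valid_items` and the tuple encoding of items as (left side, right side, dot position) are assumptions of this example.

```python
# G_0 from slide 13; the first production is the start production S -> E.
G0 = [
    ("S", ("E",)),
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]


def valid_items(productions, gamma):
    """Items reached by char(P_G) after reading gamma; empty if gamma is not viable."""
    nonterminals = {lhs for lhs, _ in productions}

    def closure(items):
        # Follow epsilon transitions [X -> a.Yb] --eps--> [Y -> .g] exhaustively.
        seen, todo = set(items), list(items)
        while todo:
            _, rhs, dot = todo.pop()
            if dot < len(rhs) and rhs[dot] in nonterminals:
                for lhs, new_rhs in productions:
                    candidate = (lhs, new_rhs, 0)
                    if lhs == rhs[dot] and candidate not in seen:
                        seen.add(candidate)
                        todo.append(candidate)
        return seen

    start_lhs, start_rhs = productions[0]
    current = closure({(start_lhs, start_rhs, 0)})
    for symbol in gamma:
        # Reading transitions: move the dot over the symbol just read.
        moved = {(lhs, rhs, dot + 1)
                 for lhs, rhs, dot in current
                 if dot < len(rhs) and rhs[dot] == symbol}
        current = closure(moved)
    return current


if __name__ == "__main__":
    for lhs, rhs, dot in sorted(valid_items(G0, ["E", "+"])):
        print(lhs, "->", " ".join(rhs[:dot]), ".", " ".join(rhs[dot:]))
```

For the viable prefix E+ the result contains the three items listed on slide 18, and a string such as + that is not a viable prefix yields the empty set.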

  21. Making char(P_G) deterministic
      Apply NFA → DFA to char(P_G). Result: LR-DFA(G).
      Example: char(P_{G_ab})
      [Figure: the characteristic NFA of slide 12]
      LR-DFA(G_ab):
      [Figure: transition diagram of LR-DFA(G_ab)]
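Since the diagram of LR-DFA(G_ab) is not reproduced here, the following sketch applies the standard subset construction to the transitions of char(P_{G_ab}) listed on slide 12. The dictionary `NFA_AB`, the string encoding of items, and the function names are assumptions made for illustration; `None` stands for ε.

```python
from collections import deque

EPS = None  # marker for epsilon transitions

# Transitions of char(P_G_ab): (state, symbol) -> set of successor states.
NFA_AB = {
    ("[S'->.S]", "S"): {"[S'->S.]"},
    ("[S'->.S]", EPS): {"[S->.aSb]", "[S->.]"},
    ("[S->.aSb]", "a"): {"[S->a.Sb]"},
    ("[S->a.Sb]", EPS): {"[S->.aSb]", "[S->.]"},
    ("[S->a.Sb]", "S"): {"[S->aS.b]"},
    ("[S->aS.b]", "b"): {"[S->aSb.]"},
}


def eps_closure(nfa, states):
    """All NFA states reachable from `states` via epsilon transitions."""
    result, todo = set(states), list(states)
    while todo:
        q = todo.pop()
        for r in nfa.get((q, EPS), ()):
            if r not in result:
                result.add(r)
                todo.append(r)
    return frozenset(result)


def subset_construction(nfa, start):
    """Return (start state, set of DFA states, DFA transition dict)."""
    start_state = eps_closure(nfa, {start})
    states, dfa = {start_state}, {}
    queue = deque([start_state])
    while queue:
        state = queue.popleft()
        symbols = {sym for (q, sym) in nfa if q in state and sym is not EPS}
        for sym in sorted(symbols):
            target = set()
            for q in state:
                target |= nfa.get((q, sym), set())
            target = eps_closure(nfa, target)
            dfa[(state, sym)] = target
            if target not in states:
                states.add(target)
                queue.append(target)
    return start_state, states, dfa


if __name__ == "__main__":
    q0, states, dfa = subset_construction(NFA_AB, "[S'->.S]")
    for (source, sym), target in dfa.items():
        print(sorted(source), "--" + sym + "-->", sorted(target))
```

Each resulting DFA state is a set of items; for G_ab the construction yields five states, and the state reached under a has a transition to itself under a.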
