cse443 compilers

CSE443 Compilers Dr. Carl Alphonce alphonce@buffalo.edu 343 Davis - PowerPoint PPT Presentation



  1. CSE443 Compilers Dr. Carl Alphonce alphonce@buffalo.edu 343 Davis Hall

  2. Phases of a compiler: syntactic structure (Figure 1.6, page 5 of text)

  3. TOOLS
     Lexical analysis: LEX/FLEX (regex -> lexer)
     Syntactic analysis: YACC/BISON (grammar -> parser)

  4. Top-down & bottom-up
     A top-down parser builds a parse tree from the root to the leaves.
       - easier to construct by hand
     A bottom-up parser builds a parse tree from the leaves to the root.
       - handles a larger class of grammars
       - tools (yacc/bison) build bottom-up parsers

  5. Our presentation: first top-down, then bottom-up
     Present top-down parsing first.
     Introduce necessary vocabulary and data structures.
     Move on to bottom-up parsing second.

  6. vocab: look-ahead
     The current symbol being scanned in the input is called the lookahead symbol.
     [Figure: a stream of tokens feeding the PARSER]

  7. Top-down parsing

  8. Top-down parsing
     Start from the grammar's start symbol.
     Build the parse tree so that its yield matches the input.
     Predictive parsing: a simple form of recursive-descent parsing.

  9. FIRST( 𝛽 )
     If 𝛽 ∈ (N ∪ T)* then FIRST( 𝛽 ) is "the set of terminals that appear as the first symbols of one or more strings of terminals generated from 𝛽." [p. 64]
     Ex: If A -> a 𝛾 then FIRST(A) = {a}
     Ex: If A -> a 𝛾 | B then FIRST(A) = {a} ∪ FIRST(B)

  10. FIRST( 𝛽 )
      FIRST sets are considered when there are two (or more) productions to expand A ∈ N: A -> 𝛽 | 𝛾
      Predictive parsing requires that FIRST( 𝛽 ) ∩ FIRST( 𝛾 ) = ∅
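
The disjointness requirement can be checked mechanically. The following is a minimal Python sketch, not part of the slides; the function name `is_predictive_choice` and the `stmt` alternatives are made up for illustration, while the `expr` case mirrors the left-recursive rule discussed in the slides.

```python
from itertools import combinations

def is_predictive_choice(first_sets):
    """Return True if the FIRST sets of one non-terminal's alternatives
    are pairwise disjoint, so one lookahead symbol selects an alternative."""
    return all(a.isdisjoint(b) for a, b in combinations(first_sets, 2))

# Hypothetical rule  stmt -> if ... | while ... | id ...
ok = is_predictive_choice([{"if"}, {"while"}, {"id"}])

# expr -> expr + term | term : both alternatives can start with id
clash = is_predictive_choice([{"id"}, {"id"}])

print(ok, clash)
```

With more than two alternatives every pair has to be disjoint, which is why the check runs over all combinations.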

  11. ε-productions
      If the lookahead symbol is not in the FIRST set of any other alternative, use the ε-production: do not advance the lookahead symbol, but instead "discard" the non-terminal:
          optexpr -> expr | ε
      "While parsing optexpr, if the lookahead symbol is not in FIRST(expr), then the ε production is used" [p. 66]

  12. Left recursion Grammars with left recursion are problematic for top-down parsers, as they lead to infinite regress.
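
A minimal Python sketch of the regress, assuming a naive recursive-descent procedure written directly from a left-recursive rule such as expr -> expr + term | term. The function names and token spelling are illustrative only. The procedure calls itself before consuming any input, so the recursion never bottoms out; Python reports this as a `RecursionError`.

```python
def term(tokens, pos):
    """term -> id : consume one id token and return the new position."""
    if pos < len(tokens) and tokens[pos] == "id":
        return pos + 1
    raise SyntaxError("expected id")

def expr(tokens, pos):
    """expr -> expr + term | term, trying the left-recursive alternative first."""
    # The first step of 'expr + term' is to parse an expr at the SAME
    # position, so no token is ever consumed before the next call.
    pos = expr(tokens, pos)
    if pos < len(tokens) and tokens[pos] == "+":
        return term(tokens, pos + 1)
    return pos

def regresses(tokens):
    """Return True if parsing runs out of stack instead of terminating."""
    try:
        expr(tokens, 0)
    except RecursionError:
        return True
    return False

print(regresses(["id", "+", "id"]))
```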

  13. Left recursion example
      Grammar:
          expr -> expr + term | term
          term -> id
      FIRST sets for the alternatives of the expr rule are not disjoint:
          FIRST(expr) = { id }
          FIRST(term) = { id }
      [Figure: parse tree in which expr expands repeatedly to expr + term down the left side, ending in term]

  14. Left recursion example (annotated)
      Grammar:
          expr -> expr + term | term      (𝛽 labels "+ term", 𝛾 labels "term")
          term -> id
      FIRST sets for the alternatives of the expr rule are not disjoint:
          FIRST(expr) = { id }
          FIRST(term) = { id }
      [Figure: the same parse tree, with its yield labeled 𝛾 𝛽 𝛽 𝛽]

  15. Rewriting grammar to remove left recursion
      The expr rule is of the form
          A -> A 𝛽 | 𝛾
      Rewrite as two rules
          A -> 𝛾 R
          R -> 𝛽 R | ε

  16. Back to example
      Grammar is rewritten as
          expr -> term R
          R -> + term R | ε
      [Figure: parse tree for expr -> term R in which R expands repeatedly to + term R down the right side and ends in ε; the yield is labeled 𝛾 𝛽 𝛽 𝛽 ε]
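
The rewrite of A -> A 𝛽 | 𝛾 into A -> 𝛾 R, R -> 𝛽 R | ε can be sketched as a small Python function. This is an illustrative sketch that handles only immediate left recursion on a single non-terminal; the representation (each alternative as a list of symbol names) and the function name are assumptions, not something the slides prescribe.

```python
EPSILON = "ε"

def remove_immediate_left_recursion(head, alternatives, new_name="R"):
    """Rewrite  A -> A b1 | ... | g1 | ...  as  A -> g R,  R -> b R | ε."""
    # Tails of the left-recursive alternatives (the parts after the leading A).
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == head]
    # Alternatives that do not start with A.
    others = [alt for alt in alternatives if not alt or alt[0] != head]
    if not recursive:
        return {head: alternatives}  # nothing to rewrite
    return {
        head: [[s for s in g if s != EPSILON] + [new_name] for g in others],
        new_name: [b + [new_name] for b in recursive] + [[EPSILON]],
    }

rewritten = remove_immediate_left_recursion(
    "expr", [["expr", "+", "term"], ["term"]]
)
print(rewritten)
```

For the expr rule this yields expr -> term R and R -> + term R | ε, matching the rewritten grammar on the slide.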

  17. Ambiguity
      A grammar G is ambiguous if ∃ 𝛕 ∈ 𝓛(G) that has two or more distinct parse trees.
      Example - dangling 'else':
          if <expr> then if <expr> then <stmt> else <stmt>
      can be read either as
          if <expr> then { if <expr> then <stmt> } else <stmt>
      or as
          if <expr> then { if <expr> then <stmt> else <stmt> }

  18. dangling else resolution
      Usually resolved so that else matches the closest if-then.
      We can rewrite the grammar to force this interpretation (ms = matched statement, os = open statement):
          <stmt> -> <ms> | <os>
          <ms> -> if <expr> then <ms> else <ms> | …
          <os> -> if <expr> then <stmt> | if <expr> then <ms> else <os>

  19. Left factoring
      If two (or more) rules share a prefix, then their FIRST sets do not distinguish between the rule alternatives.
      If there is a choice point later in the rule, rewrite the rule by factoring out the common prefix.
      Example: rewrite
          A -> 𝛽 𝛾1 | 𝛽 𝛾2
      as
          A -> 𝛽 A'
          A' -> 𝛾1 | 𝛾2
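
One factoring step can be sketched in Python as follows. This is a simplified illustration that only factors a prefix shared by all alternatives of a rule; the list-of-symbols representation, the function names, and the if-then-else example rule are assumptions made for the sketch.

```python
EPSILON = "ε"

def common_prefix(alternatives):
    """Longest sequence of symbols that starts every alternative."""
    prefix = []
    for symbols in zip(*alternatives):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix

def left_factor(head, alternatives):
    """Rewrite  A -> p g1 | p g2  as  A -> p A',  A' -> g1 | g2."""
    prefix = common_prefix(alternatives)
    if len(alternatives) < 2 or not prefix:
        return {head: alternatives}  # nothing to factor
    new_head = head + "'"
    # An alternative equal to the prefix leaves an empty tail, i.e. ε.
    tails = [alt[len(prefix):] or [EPSILON] for alt in alternatives]
    return {head: [prefix + [new_head]], new_head: tails}

factored = left_factor("stmt", [
    ["if", "expr", "then", "stmt", "else", "stmt"],
    ["if", "expr", "then", "stmt"],
])
print(factored)
```

After factoring, the choice between the two alternatives is postponed until the parser has seen the shared prefix, where the lookahead (else or something else) decides.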

  20. Predictive parsing: a special case of recursive-descent parsing that does not require backtracking
      Each non-terminal A ∈ N has an associated procedure:
          void A() {
            choose an A-production A -> X1 X2 … Xk
            for (i = 1 to k) {
              if (Xi ∈ N) { call Xi() }
              else if (Xi = current input symbol) { advance input to next symbol }
              else error
            }
          }

  21. Predictive parsing: a special case of recursive-descent parsing that does not require backtracking
      Each non-terminal A ∈ N has an associated procedure:
          void A() {
            choose an A-production A -> X1 X2 … Xk
            for (i = 1 to k) {
              if (Xi ∈ N) { call Xi() }
              else if (Xi = current input symbol) { advance input to next symbol }
              else error
            }
          }
      Note on "choose an A-production": there is non-determinism in the choice of production. If the "wrong" choice is made, the parser will need to revisit its choice by backtracking. A predictive parser can always make the correct choice here.
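
As an illustration of the one-procedure-per-non-terminal scheme, here is a minimal Python sketch of a predictive parser for the rewritten grammar expr -> term R, R -> + term R | ε, term -> id. The class and method names, the token spelling, and the use of "$" as an end marker are choices made for this sketch. Each procedure picks its production by looking only at the lookahead symbol, so no backtracking is needed.

```python
class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks the end of input
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, terminal):
        """Advance past the expected terminal or report an error."""
        if self.lookahead != terminal:
            raise SyntaxError(f"expected {terminal!r}, found {self.lookahead!r}")
        self.pos += 1

    def expr(self):   # expr -> term R
        self.term()
        self.rest()

    def rest(self):   # R -> + term R | ε
        if self.lookahead == "+":   # "+" begins the first alternative
            self.match("+")
            self.term()
            self.rest()
        # Otherwise use the ε-production: consume nothing.

    def term(self):   # term -> id
        self.match("id")

def accepts(tokens):
    """Return True if the whole token list is an expr."""
    parser = Parser(tokens)
    try:
        parser.expr()
        parser.match("$")
    except SyntaxError:
        return False
    return True

print(accepts(["id", "+", "id"]), accepts(["id", "+"]))
```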

  22. FIRST(X)
      if X ∈ T then FIRST(X) = { X }
      if X ∈ N and X -> Y1 Y2 … Yk ∈ P for k ≥ 1, then
        - add a ∈ T to FIRST(X) if ∃ i s.t. a ∈ FIRST(Yi) and ε ∈ FIRST(Yj) ∀ j < i (i.e. Y1 Y2 … Yi-1 ⇒* ε)
        - if ε ∈ FIRST(Yj) ∀ j ≤ k (i.e. Y1 Y2 … Yk ⇒* ε), add ε to FIRST(X)

  23. FOLLOW(X)
      Place $ in FOLLOW(S), where S is the start symbol ($ is an end marker).
      If A -> 𝛽 B 𝛾 ∈ P, then FIRST( 𝛾 ) - { ε } is in FOLLOW(B).
      If A -> 𝛽 B ∈ P, or A -> 𝛽 B 𝛾 ∈ P where ε ∈ FIRST( 𝛾 ), then everything in FOLLOW(A) is in FOLLOW(B).

  24. Table-driven predictive parsing
      Algorithm 4.32 (p. 224)
      INPUT: Grammar G = (N,T,P,S)
      OUTPUT: Parsing table M
      For each production A -> 𝛽 of G:
        1. For each terminal a ∈ FIRST( 𝛽 ), add A -> 𝛽 to M[A,a]
        2. If ε ∈ FIRST( 𝛽 ), then for each terminal b in FOLLOW(A), add A -> 𝛽 to M[A,b]
        3. If ε ∈ FIRST( 𝛽 ) and $ ∈ FOLLOW(A), add A -> 𝛽 to M[A,$]

  25. Example
      G given by its productions:
          E -> T E'
          E' -> + T E' | ε
          T -> F T'
          T' -> * F T' | ε
          F -> ( E ) | id
      For each production A -> 𝛽 of G:
        - For each terminal a ∈ FIRST( 𝛽 ), add A -> 𝛽 to M[A,a]
        - If ε ∈ FIRST( 𝛽 ), then for each terminal b in FOLLOW(A), add A -> 𝛽 to M[A,b]
        - If ε ∈ FIRST( 𝛽 ) and $ ∈ FOLLOW(A), add A -> 𝛽 to M[A,$]

  26. FIRST SETS
      Grammar:
          E -> T E'
          E' -> + T E' | ε
          T -> F T'
          T' -> * F T' | ε
          F -> ( E ) | id
      FIRST(F) = { ( , id }
      FIRST(T) = FIRST(F) = { ( , id }
      FIRST(E) = FIRST(T) = FIRST(F) = { ( , id }
      FIRST(E') = { + , ε }
      FIRST(T') = { * , ε }
      (The slide also repeats the FIRST(X) rules and the table-construction steps for reference.)
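
The FIRST sets above can be reproduced by applying the FIRST(X) rules repeatedly until nothing changes. The following Python sketch is one way to do that, not the course's implementation; the dictionary encoding of the grammar and the helper names are assumptions of the sketch.

```python
EPS = "ε"

# The expression grammar from the slides; each alternative is a list of symbols.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}

def first_of_sequence(symbols, first):
    """FIRST of a string of symbols, given the FIRST sets found so far."""
    result = set()
    for sym in symbols:
        if sym == EPS:
            continue
        # For a terminal a, FIRST(a) = {a}.
        sym_first = first[sym] if sym in first else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result       # this symbol cannot vanish, so stop here
    result.add(EPS)             # every symbol can derive ε
    return result

def compute_first(grammar):
    """Iterate the FIRST rules to a fixed point."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for alt in alternatives:
                new = first_of_sequence(alt, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

FIRST = compute_first(GRAMMAR)
for nt in GRAMMAR:
    print(nt, sorted(FIRST[nt]))
```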

  27. FOLLOW SETS
      Grammar:
          E -> T E'
          E' -> + T E' | ε
          T -> F T'
          T' -> * F T' | ε
          F -> ( E ) | id
      FOLLOW(E) = { ) , $ }
      FOLLOW(E') = FOLLOW(E) = { ) , $ }
      FOLLOW(T) = { + , ) , $ }
      FOLLOW(T') = FOLLOW(T) = { + , ) , $ }
      FOLLOW(F) = { + , * , ) , $ }
      (The slide also repeats the FOLLOW(X) rules and the table-construction steps for reference.)
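
The FOLLOW sets can be computed the same way, by iterating the FOLLOW(X) rules until no set grows. In this Python sketch the FIRST sets are copied from the FIRST SETS slide rather than recomputed; the grammar encoding and helper names are assumptions of the sketch.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}

# FIRST sets of the non-terminals, as listed on the FIRST SETS slide.
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}

def first_of_sequence(symbols):
    """FIRST of a string of symbols; the empty string gives {ε}."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})   # terminal: FIRST(a) = {a}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                  # $ goes into FOLLOW(S)
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for alt in alternatives:
                for i, sym in enumerate(alt):
                    if sym not in grammar:
                        continue            # FOLLOW is only for non-terminals
                    tail_first = first_of_sequence(alt[i + 1:])
                    new = tail_first - {EPS}
                    if EPS in tail_first:   # nothing, or only nullable symbols, after sym
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

FOLLOW = compute_follow(GRAMMAR, "E")
for nt in GRAMMAR:
    print(nt, sorted(FOLLOW[nt]))
```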

  28. Parse-table M

      | NON-TERMINAL | id        | +             | *             | (          | )       | $       |
      |--------------|-----------|---------------|---------------|------------|---------|---------|
      | E            | E -> T E' |               |               | E -> T E'  |         |         |
      | E'           |           | E' -> + T E'  |               |            | E' -> ε | E' -> ε |
      | T            | T -> F T' |               |               | T -> F T'  |         |         |
      | T'           |           | T' -> ε       | T' -> * F T'  |            | T' -> ε | T' -> ε |
      | F            | F -> id   |               |               | F -> ( E ) |         |         |

      Grammar:
          E -> T E'
          E' -> + T E' | ε
          T -> F T'
          T' -> * F T' | ε
          F -> ( E ) | id
      FIRST(E) = FIRST(T) = FIRST(F) = { ( , id }
      FIRST(E') = { + , ε }
      FIRST(T') = { * , ε }
      FOLLOW(E') = FOLLOW(E) = { ) , $ }
      FOLLOW(T') = FOLLOW(T) = { + , ) , $ }
      FOLLOW(F) = { + , * , ) , $ }
      (The slide also repeats the table-construction steps for reference.)
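
The table can be derived mechanically from the FIRST and FOLLOW sets by following the table-construction steps. The Python sketch below copies those sets from the slides; the dictionary keyed by (non-terminal, terminal) pairs and the conflict check are design choices of the sketch, not part of the algorithm as stated.

```python
EPS = "ε"

PRODUCTIONS = [
    ("E",  ("T", "E'")),
    ("E'", ("+", "T", "E'")), ("E'", (EPS,)),
    ("T",  ("F", "T'")),
    ("T'", ("*", "F", "T'")), ("T'", (EPS,)),
    ("F",  ("(", "E", ")")),  ("F",  ("id",)),
]

# FIRST and FOLLOW sets as listed on the slides.
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}
FOLLOW = {
    "E": {")", "$"}, "E'": {")", "$"},
    "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
    "F": {"+", "*", ")", "$"},
}

def first_of_sequence(symbols):
    """FIRST of a production body."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})   # terminal: FIRST(a) = {a}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def build_table(productions):
    table = {}
    for head, body in productions:
        body_first = first_of_sequence(body)
        targets = body_first - {EPS}            # step 1
        if EPS in body_first:
            targets |= FOLLOW[head]             # steps 2 and 3 ($ is in FOLLOW)
        for terminal in targets:
            if (head, terminal) in table:
                raise ValueError(f"conflict at M[{head},{terminal}]")
            table[(head, terminal)] = body
    return table

M = build_table(PRODUCTIONS)
for key in sorted(M):
    print(key, "->", " ".join(M[key]))
```

A conflict (two productions landing in one cell) would mean that one lookahead symbol is not enough to choose a production for this grammar.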

  29. Algorithm 4.34 [p. 226]
      INPUT: A string ω and a parsing table M for a grammar G = (N,T,P,S).
      OUTPUT: If ω ∈ 𝓛(G), a leftmost derivation of ω; error otherwise.
      [Figure: the parser reads the input ω $, keeps a stack holding S on top of $, consults the table M, and produces the output]

  30. Algorithm 4.34 [p. 226]
          Let a be the first symbol of ω
          Let X be the top stack symbol
          while (X ≠ $) {
            if (X == a) { pop the stack; let a be the next symbol of ω }
            else if (X is a terminal) { error }
            else if (M[X,a] is blank) { error }
            else if (M[X,a] is X -> Y1 Y2 … Yk) {
              output X -> Y1 Y2 … Yk
              pop the stack
              push Yk … Y2 Y1 onto the stack (so that Y1 is on top)
            }
            Let X be the top stack symbol
          }
          Accept if a == X == $
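
A minimal Python sketch of this parsing loop, using the parse table M from the slides written out as a dictionary. The function name `parse`, the token spelling, and the representation of the output as a list of (head, body) pairs are assumptions of the sketch.

```python
EPS, END = "ε", "$"
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

# Parse table M, keyed by (non-terminal, lookahead).
TABLE = {
    ("E", "id"): ["T", "E'"],       ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"],  ("E'", ")"): [EPS], ("E'", "$"): [EPS],
    ("T", "id"): ["F", "T'"],       ("T", "("): ["F", "T'"],
    ("T'", "+"): [EPS],             ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [EPS],             ("T'", "$"): [EPS],
    ("F", "id"): ["id"],            ("F", "("): ["(", "E", ")"],
}

def parse(tokens, start="E"):
    """Return the productions of a leftmost derivation, or raise SyntaxError."""
    remaining = list(tokens) + [END]
    pos = 0
    stack = [END, start]            # top of the stack is the end of the list
    output = []
    while stack[-1] != END:
        top, a = stack[-1], remaining[pos]
        if top == a:                # terminal on the stack matches the input
            stack.pop()
            pos += 1
        elif top not in NONTERMINALS:
            raise SyntaxError(f"expected {top!r}, found {a!r}")
        elif (top, a) not in TABLE:
            raise SyntaxError(f"no entry M[{top},{a}]")
        else:
            body = TABLE[(top, a)]
            output.append((top, body))
            stack.pop()
            # Push the body reversed so its first symbol ends up on top;
            # ε contributes nothing to the stack.
            stack.extend(sym for sym in reversed(body) if sym != EPS)
    if remaining[pos] != END:
        raise SyntaxError("input remains after the stack emptied")
    return output

derivation = parse(["id", "+", "id", "*", "id"])
for head, body in derivation:
    print(head, "->", " ".join(body))
```

The printed productions, read top to bottom, are the steps of a leftmost derivation of the input.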
