 
Syntax Analysis
Recursive Equations over Grammars
– Wilhelm/Seidl/Hack: Compiler Design, Syntactic and Semantic Analysis –
Reinhard Wilhelm
Universität des Saarlandes
wilhelm@cs.uni-saarland.de
29 October 2013

Properties of a Grammar
We sometimes need to determine properties of a grammar or of its constituents:
◮ whether the grammar has useless symbols,
◮ what can start a word for a nonterminal,
◮ what can follow a nonterminal.
These properties are expressed as recursive systems of equations.

Reachability and Productivity
A non-terminal $A$ is
  reachable: iff there exist $\varphi_1, \varphi_2 \in (V_T \cup V_N)^*$ such that $S \stackrel{*}{\Longrightarrow} \varphi_1 A \varphi_2$,
  productive: iff there exists $w \in V_T^*$ such that $A \stackrel{*}{\Longrightarrow} w$.
◮ These definitions cannot be used directly as tests, because they quantify over infinite sets.
◮ We need equivalent definitions that allow (efficient) computation.
◮ Eliminating the non-reachable and non-productive non-terminals from the grammar does not change the described language.

Two-Level Definitions
Reachability:
1. A non-terminal $Y$ is reachable through its occurrence in $X \to \varphi_1 Y \varphi_2$ iff $X$ is reachable.
2. A non-terminal is reachable iff it is reachable through at least one of its occurrences.
3. $S'$ is reachable.
$$Re(S') = \mathit{true}, \qquad Re(X) = \bigvee_{Y \to \varphi_1 X \varphi_2} Re(Y) \quad \forall X \neq S'$$
Productivity:
1. A non-terminal $X$ is productive through production $X \to \varphi$ iff all non-terminals occurring in $\varphi$ are productive.
2. A non-terminal is productive iff it is productive through at least one of its alternatives.
$$Pr(X) = \bigvee_{X \to \alpha} \bigwedge \{\, Pr(Y) \mid Y \in V_N \text{ occurs in } \alpha \,\} \quad \text{for all } X \in V_N$$

◮ These definitions translate reachability and productivity for a given grammar into (recursive) systems of equations.
◮ Each system describes a function $I : [V_N \to \mathbb{B}] \to [V_N \to \mathbb{B}]$, where the Booleans $\mathbb{B}$ are ordered by $\mathit{false} \sqsubseteq \mathit{true}$.
◮ The iteration starts with the smallest element:
  ◮ $Re(S') = \mathit{true}$, $Re(X) = \mathit{false}$ for all $X \neq S'$,
  ◮ $Pr(X) = \mathit{false}$ for all $X \in V_N$.
◮ We want the least solution, in order to eliminate as many useless non-terminals as possible.

Trees, Subtrees, Tree Fragments
[Figure: a parse tree with root S, a subtree for X, and an upper tree fragment for X]
X reachable: the set of upper tree fragments for X is not empty.
X productive: the set of subtrees for X is not empty.

Recursive System of Equations
Questions: Do these recursive systems of equations have
◮ solutions?
◮ unique solutions?
They do have solutions if
◮ the property domain $D$
  ◮ is partially ordered by some relation $\sqsubseteq$,
  ◮ has a uniquely defined smallest element $\bot$,
  ◮ has a least upper bound $d_1 \sqcup d_2$ for each two elements $d_1, d_2$, and
◮ the functions occurring in the equations are monotonic.
Our domains are finite, and all functions are monotonic.

Fixed Point Iteration
◮ Solutions are fixed points of a function $I : [V_N \to D] \to [V_N \to D]$.
◮ They are computed iteratively, starting with $\underline{\bot}$, the function which maps all non-terminals to $\bot$.
◮ Evaluate the equations until nothing changes.
◮ Termination of the iteration is guaranteed if $D$ has only finite ascending chains.
We always compute least fixed points.
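
The iteration scheme described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `least_fixed_point` and `toy_step` and the three-equation toy system are hypothetical and not taken from the slides, and the code does not check monotonicity or the finite-chain condition.

```python
from typing import Callable, Dict, TypeVar

D = TypeVar("D")
Assignment = Dict[str, D]  # maps each non-terminal to a value of the domain


def least_fixed_point(
    step: Callable[[Assignment], Assignment],
    bottom: Assignment,
) -> Assignment:
    """Apply `step` repeatedly, starting from `bottom`, until nothing changes.

    Termination relies on `step` being monotonic and on the domain having
    only finite ascending chains; neither property is verified here.
    """
    current = dict(bottom)
    while True:
        following = step(current)
        if following == current:
            return current
        current = following


# Hypothetical toy system over the Booleans (false below true):
#   A = true,   B = A,   C = C
def toy_step(env: Assignment) -> Assignment:
    return {"A": True, "B": env["A"], "C": env["C"]}


toy_solution = least_fixed_point(toy_step, {"A": False, "B": False, "C": False})
print(toy_solution)
```

The equation C = C is satisfied by both truth values; starting from the bottom element leaves C at false, which is the least solution.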

Example: Productivity
Given the following grammar:
$$G = \left(\{S', S, X, Y, Z\},\ \{a, b\},\ \left\{\begin{array}{lcl} S' &\to& S \\ S &\to& aX \\ X &\to& bS \mid aYbY \\ Y &\to& ba \mid aZ \\ Z &\to& aZX \end{array}\right\},\ S'\right)$$
Resulting system of equations:
$$\begin{array}{lcl} Pr(S) &=& Pr(X) \\ Pr(X) &=& Pr(S) \lor Pr(Y) \\ Pr(Y) &=& \mathit{true} \lor Pr(Z) = \mathit{true} \\ Pr(Z) &=& Pr(Z) \land Pr(X) \end{array}$$
Fixed-point iteration (initial assignment):
S      X      Y      Z
false  false  false  false
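
As a sketch of how the remaining rows of the iteration table could be produced, the following Python snippet evaluates the productivity equations for this grammar round by round. The encoding of productions as lists of symbols and the names `PRODUCTIONS` and `productive` are assumptions made for this illustration; the equation for S′ (Pr(S′) = Pr(S)) is included although the slide omits it.

```python
# Each non-terminal maps to its alternatives; a right-hand side is a list of symbols.
PRODUCTIONS = {
    "S'": [["S"]],
    "S": [["a", "X"]],
    "X": [["b", "S"], ["a", "Y", "b", "Y"]],
    "Y": [["b", "a"], ["a", "Z"]],
    "Z": [["a", "Z", "X"]],
}


def productive(productions):
    """Return (least solution of the Pr equations, list of all iteration rows)."""
    pr = {x: False for x in productions}  # smallest element: everything false
    rows = [dict(pr)]
    while True:
        # Pr(X) = OR over alternatives of (AND over the non-terminals in it).
        # all() over an alternative without non-terminals yields True.
        new = {
            x: any(all(pr[y] for y in rhs if y in productions) for rhs in alts)
            for x, alts in productions.items()
        }
        if new == pr:
            return pr, rows
        pr = new
        rows.append(dict(pr))


solution, rows = productive(PRODUCTIONS)
for row in rows:
    print(row)
```

In this run Z stays false: its only production mentions Z itself, so Z is never productive and can be removed from the grammar.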

Example: Reachability
Given the grammar
$$G = \left(\{S, U, V, X, Y, Z\},\ \{a, b, c, d\},\ \left\{\begin{array}{lcl} S &\to& Y \\ Y &\to& YZ \mid Ya \mid b \\ U &\to& V \\ X &\to& c \\ V &\to& Vd \mid d \\ Z &\to& ZX \end{array}\right\},\ S\right)$$
The equations:
$$\begin{array}{lcl} Re(S) &=& \mathit{true} \\ Re(U) &=& \mathit{false} \\ Re(V) &=& Re(U) \lor Re(V) \\ Re(X) &=& Re(Z) \\ Re(Y) &=& Re(S) \lor Re(Y) \\ Re(Z) &=& Re(Y) \lor Re(Z) \end{array}$$
Fixed-point iteration (initial assignment):
S     U      V      X      Y      Z
true  false  false  false  false  false
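
The reachability equations for this grammar can be iterated in the same way. The snippet below is a minimal sketch; the dictionary encoding and the names `PRODUCTIONS` and `reachable` are assumptions for illustration only.

```python
# Each non-terminal maps to its alternatives; a right-hand side is a list of symbols.
PRODUCTIONS = {
    "S": [["Y"]],
    "Y": [["Y", "Z"], ["Y", "a"], ["b"]],
    "U": [["V"]],
    "X": [["c"]],
    "V": [["V", "d"], ["d"]],
    "Z": [["Z", "X"]],
}


def reachable(productions, start):
    """Least solution of Re(start) = true, Re(X) = OR of Re(Y) over Y -> ...X..."""
    reach = {x: x == start for x in productions}  # initial assignment
    while True:
        new = {
            x: x == start or any(
                reach[y]
                for y, alts in productions.items()
                for rhs in alts
                if x in rhs  # X occurs on the right-hand side of a Y-production
            )
            for x in productions
        }
        if new == reach:
            return reach
        reach = new


reach = reachable(PRODUCTIONS, "S")
print(sorted(x for x, flag in reach.items() if not flag))
```

The non-terminals that remain false (here U and V) are the ones that can be eliminated as non-reachable.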

First and Follow Sets
Parser generators need precomputed information about sets of
◮ prefixes of words for non-terminals (words that can begin words for non-terminals),
◮ followers of non-terminals (words that can follow a non-terminal).
Use: removing non-determinism from the expand moves of $P_G$.

Another Grammar for Arithmetic Expressions
Left-factored grammar $G_2$, with left recursion removed.
$$\begin{array}{lcll} S &\to& E & \\ E &\to& TE' & E \text{ generates } T \text{ with a continuation } E' \\ E' &\to& +E \mid \varepsilon & E' \text{ generates a possibly empty sequence of } {+}T\text{s} \\ T &\to& FT' & T \text{ generates } F \text{ with a continuation } T' \\ T' &\to& *T \mid \varepsilon & T' \text{ generates a possibly empty sequence of } {*}F\text{s} \\ F &\to& \mathbf{id} \mid (E) & \end{array}$$
$G_2$ defines the same language as $G_0$ and $G_1$.

The FIRST_1 Sets
Grammar $G_2$: $S \to E$, $E \to TE'$, $E' \to +E \mid \varepsilon$, $T \to FT'$, $T' \to *T \mid \varepsilon$, $F \to \mathbf{id} \mid (E)$
A production $N \to \alpha$ is applicable for symbols that "begin" $\alpha$.
◮ Example: Arithmetic Expressions, Grammar $G_2$
  ◮ production $F \to \mathbf{id}$ is applied when the current symbol is $\mathbf{id}$,
  ◮ production $F \to (E)$ is applied when the current symbol is $($,
  ◮ production $T \to FT'$ is applied when the current symbol is $\mathbf{id}$ or $($.
◮ Formal definition:
$$FIRST_1(\alpha) = \{\, 1 : w \mid \alpha \stackrel{*}{\Longrightarrow} w,\ w \in V_T^* \,\}$$

The FOLLOW_1 Sets
Grammar $G_2$: $S \to E$, $E \to TE'$, $E' \to +E \mid \varepsilon$, $T \to FT'$, $T' \to *T \mid \varepsilon$, $F \to \mathbf{id} \mid (E)$
A production $N \to \varepsilon$ is applicable for symbols that "can follow" $N$ in some derivation.
◮ Example: Arithmetic Expressions, Grammar $G_2$
  ◮ The production $E' \to \varepsilon$ is applied for the symbols $\#$ and $)$.
  ◮ The production $T' \to \varepsilon$ is applied for the symbols $\#$, $)$ and $+$.
◮ Formal definition:
$$FOLLOW_1(N) = \{\, a \in V_T \mid \exists \alpha, \gamma : S \stackrel{*}{\Longrightarrow} \alpha N a \gamma \,\}$$

Definitions
Let $k \geq 1$.
$k$-prefix of a word $w = a_1 \ldots a_n$:
$$k : w = \begin{cases} a_1 \ldots a_n & \text{if } n \leq k \\ a_1 \ldots a_k & \text{otherwise} \end{cases}$$
$k$-concatenation $\oplus_k : V^* \times V^* \to V^{\leq k}$, defined by
$$u \oplus_k v = k : uv$$
Both are extended to languages:
$$k : L = \{\, k : w \mid w \in L \,\}, \qquad L_1 \oplus_k L_2 = \{\, x \oplus_k y \mid x \in L_1,\ y \in L_2 \,\}$$
$V^{\leq k} = \bigcup_{i=0}^{k} V^i$ ... set of words of length at most $k$
$V_{T\#}^{\leq k} = V_T^{\leq k} \cup V_T^{k-1}\{\#\}$ ... possibly terminated by $\#$
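
To make the k-prefix and k-concatenation operators concrete, here is a small Python sketch. Representing a word as a tuple of symbols (with the empty tuple as the empty word) and the function names `k_prefix`, `k_concat`, `k_prefix_lang` and `k_concat_lang` are choices made for this illustration, not notation from the slides.

```python
from typing import Set, Tuple

Word = Tuple[str, ...]  # a word is a tuple of symbols; () is the empty word


def k_prefix(k: int, w: Word) -> Word:
    """k : w, the first k symbols of w (all of w if it has at most k symbols)."""
    return w[:k]


def k_concat(k: int, u: Word, v: Word) -> Word:
    """k-concatenation of two words: the k-prefix of uv."""
    return k_prefix(k, u + v)


def k_prefix_lang(k: int, lang: Set[Word]) -> Set[Word]:
    """k : L, the k-prefixes of all words in L."""
    return {k_prefix(k, w) for w in lang}


def k_concat_lang(k: int, l1: Set[Word], l2: Set[Word]) -> Set[Word]:
    """k-concatenation of two languages, taken pairwise over their words."""
    return {k_concat(k, x, y) for x in l1 for y in l2}


EPS: Word = ()
example = k_concat_lang(1, {EPS, ("*",)}, {("id",), ("(",)})
print(example)
```

With k = 1, a word from the first language contributes its own first symbol unless it is empty, in which case the first symbol of the second word shows through.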

Properties
Let $k \geq 1$, and $L_1, L_2, L_3 \subseteq V^{\leq k}$.
(a) $L_1 \oplus_k (L_2 \oplus_k L_3) = (L_1 \oplus_k L_2) \oplus_k L_3$
(b) $L_1 \oplus_k \{\varepsilon\} = \{\varepsilon\} \oplus_k L_1 = k : L_1$
(c) $L_1 \oplus_k L_2 = \emptyset$ iff $L_1 = \emptyset \lor L_2 = \emptyset$
(d) $\varepsilon \in L_1 \oplus_k L_2$ iff $\varepsilon \in L_1 \land \varepsilon \in L_2$
(e) $k : (L_1 L_2) = k : L_1 \oplus_k k : L_2$
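
Properties (a) to (e) can be sanity-checked by brute force on a tiny alphabet. The sketch below is not a proof; it only enumerates all languages over the assumed alphabet {a, b} for k = 1. Words are plain Python strings here (the empty string plays the role of ε), and all function names are hypothetical.

```python
from itertools import combinations, product


def k_concat_lang(k, l1, l2):
    """Pairwise k-concatenation of two languages of strings."""
    return frozenset((x + y)[:k] for x in l1 for y in l2)


def k_prefix_lang(k, lang):
    return frozenset(w[:k] for w in lang)


def words_up_to(alphabet, n):
    """All words over `alphabet` of length at most n, including the empty word."""
    return ["".join(p) for i in range(n + 1) for p in product(alphabet, repeat=i)]


def subsets(items):
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def check_properties(k=1, alphabet="ab"):
    langs = subsets(words_up_to(alphabet, k))  # all L with L a subset of V^{<=k}
    eps = frozenset({""})
    for l1, l2 in product(langs, repeat=2):
        c = k_concat_lang(k, l1, l2)
        assert (not c) == (not l1 or not l2)              # (c)
        assert ("" in c) == ("" in l1 and "" in l2)       # (d)
        for l3 in langs:                                  # (a)
            assert k_concat_lang(k, l1, k_concat_lang(k, l2, l3)) == k_concat_lang(k, c, l3)
    for l1 in langs:                                      # (b)
        assert k_concat_lang(k, l1, eps) == k_concat_lang(k, eps, l1) == k_prefix_lang(k, l1)
    longer = subsets(words_up_to(alphabet, k + 1))        # (e) on languages with longer words
    for l1, l2 in product(longer, repeat=2):
        plain = frozenset(x + y for x in l1 for y in l2)
        assert k_prefix_lang(k, plain) == k_concat_lang(
            k, k_prefix_lang(k, l1), k_prefix_lang(k, l2)
        )
    return True


print(check_properties())
```

Property (e) is checked on languages with words of length up to k + 1, since it is the one property that is interesting for words longer than k.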

FIRST_k and FOLLOW_k
$FIRST_k : (V_N \cup V_T)^* \to 2^{V_T^{\leq k}}$ where
$$FIRST_k(\alpha) = \{\, k : u \mid \alpha \stackrel{*}{\Longrightarrow} u \,\}$$
set of $k$-prefixes of terminal words for $\alpha$
[Figure: a tree with a node X, marking a word $\in FIRST_k(X)$ and a word $\in FOLLOW_k(X)$]
$FOLLOW_k : V_N \to 2^{V_{T\#}^{\leq k}}$ where
$$FOLLOW_k(X) = \{\, w \mid S \stackrel{*}{\Longrightarrow} \beta X \gamma \text{ and } w \in FIRST_k(\gamma) \,\}$$
set of $k$-prefixes of terminal words that may immediately follow $X$

FIRST_k
Theorem
$$FIRST_k(Z_1 Z_2 \ldots Z_n) = FIRST_k(Z_1) \oplus_k FIRST_k(Z_2) \oplus_k \ldots \oplus_k FIRST_k(Z_n)$$
The recursive system of equations for $FIRST_k$ is
$$(Fi_k) \qquad \begin{array}{lcll} FIRST_k(X) &=& \bigcup_{\{X \to \alpha\}} FIRST_k(\alpha) & \forall X \in V_N \\ FIRST_k(a) &=& \{a\} & \forall a \in V_T \end{array}$$

FIRST_1 Example
Grammar $G_2$ below defines the same language as $G_0$ and $G_1$.
$$\begin{array}{lll} 0 : S \to E & 3 : E' \to +E & 6 : T' \to *T \\ 1 : E \to TE' & 4 : T \to FT' & 7 : F \to (E) \\ 2 : E' \to \varepsilon & 5 : T' \to \varepsilon & 8 : F \to \mathbf{id} \end{array}$$
The equations for $FIRST_1$ for grammar $G_2$:
$$\begin{array}{lcl} FIRST_1(S) &=& FIRST_1(E) \\ FIRST_1(E) &=& FIRST_1(T) \oplus_1 FIRST_1(E') \\ FIRST_1(E') &=& \{\varepsilon\} \cup \{+\} \oplus_1 FIRST_1(E) \\ FIRST_1(T) &=& FIRST_1(F) \oplus_1 FIRST_1(T') \\ FIRST_1(T') &=& \{\varepsilon\} \cup \{*\} \oplus_1 FIRST_1(T) \\ FIRST_1(F) &=& \{\mathbf{id}\} \cup \{(\} \oplus_1 FIRST_1(E) \oplus_1 \{)\} \end{array}$$

Iteration
Iterative computation of the $FIRST_1$ sets (initial assignment):
S   E   E′  T   T′  F
∅   ∅   ∅   ∅   ∅   ∅
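
The iterative computation of the FIRST_1 sets for grammar G_2 could be carried out as in the following Python sketch. It starts from the all-empty assignment and re-evaluates every equation until nothing changes. Words are tuples of symbols, and the names `GRAMMAR`, `concat1`, `first1_of_word` and `first1` are assumptions introduced for this example.

```python
EPS = ()  # the empty word, as a tuple of symbols

# Grammar G_2; a right-hand side is a list of symbols, [] stands for epsilon.
GRAMMAR = {
    "S": [["E"]],
    "E": [["T", "E'"]],
    "E'": [[], ["+", "E"]],
    "T": [["F", "T'"]],
    "T'": [[], ["*", "T"]],
    "F": [["(", "E", ")"], ["id"]],
}


def concat1(l1, l2):
    """1-concatenation of two languages of words."""
    return {(x + y)[:1] for x in l1 for y in l2}


def first1_of_word(symbols, first):
    """FIRST_1 of a symbol sequence, using FIRST_1(a) = {a} for terminals."""
    result = {EPS}
    for sym in symbols:
        result = concat1(result, first[sym] if sym in first else {(sym,)})
    return result


def first1(grammar):
    """Least solution of the FIRST_1 equations, by iteration from empty sets."""
    first = {x: set() for x in grammar}
    while True:
        new = {
            x: set().union(*(first1_of_word(rhs, first) for rhs in alts))
            for x, alts in grammar.items()
        }
        if new == first:
            return first
        first = new


FIRST1 = first1(GRAMMAR)
for nonterminal in GRAMMAR:
    print(nonterminal, sorted(FIRST1[nonterminal]))
```

Because 1-concatenation with an empty set yields the empty set, the alternative F → ( E ) contributes nothing until FIRST_1(E) has become non-empty in an earlier round.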

FOLLOW_k
The system of equations for $FOLLOW_k$ is
$$(Fo_k) \qquad \begin{array}{lcll} FOLLOW_k(X) &=& \bigcup_{\{Y \to \varphi_1 X \varphi_2\}} FIRST_k(\varphi_2) \oplus_k FOLLOW_k(Y) & \forall X \in V_N - \{S\} \\ FOLLOW_k(S) &=& \{\#\} & \end{array}$$

FOLLOW_k Example
Consider grammar $G_2$. The system of equations is:
$$\begin{array}{lcl} FOLLOW_1(S) &=& \{\#\} \\ FOLLOW_1(E) &=& FOLLOW_1(S) \cup FOLLOW_1(E') \cup \{)\} \oplus_1 FOLLOW_1(F) \\ FOLLOW_1(E') &=& FOLLOW_1(E) \\ FOLLOW_1(T) &=& \{\varepsilon, +\} \oplus_1 FOLLOW_1(E) \cup FOLLOW_1(T') \\ FOLLOW_1(T') &=& FOLLOW_1(T) \\ FOLLOW_1(F) &=& \{\varepsilon, *\} \oplus_1 FOLLOW_1(T) \end{array}$$
Iterative computation of the $FOLLOW_1$ sets (initial assignment):
S     E   E′  T   T′  F
{#}   ∅   ∅   ∅   ∅   ∅
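
The FOLLOW_1 system for G_2 can be iterated in the same style. The sketch below first computes FIRST_1 and then evaluates the FOLLOW_1 equations until they stabilize. All identifiers (`GRAMMAR`, `lfp`, `first1`, `follow1`, `END`) are hypothetical names for this illustration, and words are again tuples of symbols.

```python
EPS = ()        # the empty word
END = ("#",)    # the end marker as a one-symbol word

GRAMMAR = {
    "S": [["E"]],
    "E": [["T", "E'"]],
    "E'": [[], ["+", "E"]],
    "T": [["F", "T'"]],
    "T'": [[], ["*", "T"]],
    "F": [["(", "E", ")"], ["id"]],
}


def concat1(l1, l2):
    """1-concatenation of two languages of words."""
    return {(x + y)[:1] for x in l1 for y in l2}


def first1_of_word(symbols, first):
    result = {EPS}
    for sym in symbols:
        result = concat1(result, first[sym] if sym in first else {(sym,)})
    return result


def lfp(step, bottom):
    """Iterate `step` from `bottom` until the assignment no longer changes."""
    current = bottom
    while True:
        new = step(current)
        if new == current:
            return current
        current = new


def first1(grammar):
    def step(first):
        return {
            x: set().union(*(first1_of_word(rhs, first) for rhs in alts))
            for x, alts in grammar.items()
        }
    return lfp(step, {x: set() for x in grammar})


def follow1(grammar, start):
    first = first1(grammar)

    def step(follow):
        new = {x: set() for x in grammar}
        new[start] = {END}                      # FOLLOW_1(S) = {#}
        for y, alts in grammar.items():
            for rhs in alts:
                for i, x in enumerate(rhs):     # occurrence Y -> phi1 X phi2
                    if x in grammar and x != start:
                        rest = first1_of_word(rhs[i + 1:], first)
                        new[x] |= concat1(rest, follow[y])
        return new

    bottom = {x: set() for x in grammar}
    bottom[start] = {END}
    return lfp(step, bottom)


FOLLOW1 = follow1(GRAMMAR, "S")
for nonterminal in GRAMMAR:
    print(nonterminal, sorted(FOLLOW1[nonterminal]))
```

The resulting sets for E′ and T′ agree with the symbols named on the FOLLOW_1 slide for the productions E′ → ε and T′ → ε.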