Compiler Construction Lecture 8: Syntax Analysis IV (More on LL(1) & Bottom-Up Parsing)

Compiler Construction Lecture 8: Syntax Analysis IV (More on LL(1) & Bottom-Up Parsing). Thomas Noll, Lehrstuhl für Informatik 2 (Software Modeling and Verification), noll@cs.rwth-aachen.de


  1. Compiler Construction
     Lecture 8: Syntax Analysis IV (More on LL(1) & Bottom-Up Parsing)
     Thomas Noll, Lehrstuhl für Informatik 2 (Software Modeling and Verification)
     noll@cs.rwth-aachen.de
     http://moves.rwth-aachen.de/teaching/ss-14/cc14/
     Summer Semester 2014

  2. Outline
     1. Recap: LL(1) Parsing
     2. Transformation to LL(1)
     3. The Complexity of LL(1) Parsing
     4. Recursive-Descent Parsing
     5. Bottom-Up Parsing
     6. Nondeterministic Bottom-Up Parsing

  3. Characterization of LL(1)
     Theorem (Characterization of LL(1)): G ∈ LL(1) iff for all pairs of rules A → β | γ ∈ P (where β ≠ γ):
         la(A → β) ∩ la(A → γ) = ∅.
     Proof: on the board.
     Remark: the above theorem generally does not hold if k > 1 (cf. exercises).
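The disjointness test in the theorem can be sketched in a few lines of Python. This is a minimal sketch, not the lecture's own construction: it assumes the usual definition la(A → β) = fi(β · fo(A)), and the grammar encoding (a dict from nonterminals to tuples of symbols), the empty string as stand-in for ε, and all function names are illustrative choices.

```python
EPS = ""  # stands for ε: the empty word, also used as "end of input" in la sets


def first_of(seq, first, nonterminals):
    """fi(seq): first symbols of words derivable from seq; contains EPS if seq can derive ε."""
    out = set()
    for sym in seq:
        if sym not in nonterminals:      # terminal: it is the only possible first symbol
            out.add(sym)
            return out
        out |= first[sym] - {EPS}
        if EPS not in first[sym]:        # sym cannot vanish, so later symbols are irrelevant
            return out
    out.add(EPS)                         # every symbol of seq can derive ε
    return out


def lookahead_sets(grammar, start):
    """Compute la(A → β) for every production by fixpoint iteration over fi and fo."""
    nts = set(grammar)
    first = {a: set() for a in nts}
    follow = {a: set() for a in nts}
    follow[start].add(EPS)               # the start symbol may be followed by end of input
    changed = True
    while changed:
        changed = False
        for a, alts in grammar.items():
            for rhs in alts:
                f = first_of(rhs, first, nts)
                if not f <= first[a]:
                    first[a] |= f
                    changed = True
                for i, sym in enumerate(rhs):
                    if sym in nts:
                        rest = first_of(rhs[i + 1:], first, nts)
                        new = (rest - {EPS}) | (follow[a] if EPS in rest else set())
                        if not new <= follow[sym]:
                            follow[sym] |= new
                            changed = True
    la = {}
    for a, alts in grammar.items():
        for rhs in alts:
            f = first_of(rhs, first, nts)
            la[(a, rhs)] = (f - {EPS}) | (follow[a] if EPS in f else set())
    return la


def is_ll1(grammar, start):
    """G ∈ LL(1) iff the la sets of any two alternatives of the same nonterminal are disjoint."""
    la = lookahead_sets(grammar, start)
    for a, alts in grammar.items():
        for i in range(len(alts)):
            for j in range(i + 1, len(alts)):
                if la[(a, alts[i])] & la[(a, alts[j])]:
                    return False
    return True


# The arithmetic-expression grammar G_AE and its right-recursive variant G'_AE.
G_AE = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("a",), ("b",)],
}
G_AE_PRIME = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F": [("(", "E", ")"), ("a",), ("b",)],
}
```

Under these assumptions, `is_ll1(G_AE, "E")` reports a conflict (both alternatives of E start with the same symbols), while `is_ll1(G_AE_PRIME, "E")` succeeds.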

  4. Deterministic Top-Down Parsing
     Approach: given G ∈ CFG_Σ,
       1. Verify that G ∈ LL(1) by computing the lookahead sets and checking alternatives for disjointness.
       2. Start with the nondeterministic top-down parsing automaton NTA(G).
       3. Use 1-symbol lookahead to control the choice of expanding productions:
            (aw, Aα, z) ⊢ (aw, βα, zi)  if π_i = A → β and a ∈ la(π_i)
            (ε, Aα, z) ⊢ (ε, βα, zi)   if π_i = A → β and ε ∈ la(π_i)
          [matching steps as before: (aw, aα, z) ⊢ (w, α, z)]
     ⇒ deterministic top-down parsing automaton DTA(G)
     Remarks:
       - DTA(G) is actually not a pushdown automaton (a is read but not consumed). But: it can be simulated using the finite control.
       - The advantage of using lookahead is twofold:
           - removal of nondeterminism
           - earlier detection of syntax errors (in configurations (aw, Aα, z) where a ∉ ⋃_{A → β ∈ P} la(A → β))
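The expansion and matching steps above can be simulated directly. The following is a rough sketch for the right-recursive expression grammar G′_AE; the production numbering, the hand-computed la sets, and the function name `dta_run` are assumptions made for illustration, not taken from the lecture.

```python
EPS = ""  # ε: used as the lookahead symbol once the input is exhausted

# Assumed numbering π_1 .. π_9 of the productions of G'_AE, each with its
# left-hand side, right-hand side, and a hand-computed lookahead set.
PRODUCTIONS = {
    1: ("E", ("T", "E'"), {"(", "a", "b"}),
    2: ("E'", ("+", "T", "E'"), {"+"}),
    3: ("E'", (), {")", EPS}),
    4: ("T", ("F", "T'"), {"(", "a", "b"}),
    5: ("T'", ("*", "F", "T'"), {"*"}),
    6: ("T'", (), {"+", ")", EPS}),
    7: ("F", ("(", "E", ")"), {"("}),
    8: ("F", ("a",), {"a"}),
    9: ("F", ("b",), {"b"}),
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def dta_run(word, start="E"):
    """Simulate DTA(G) on word; return the leftmost analysis z or None on a syntax error."""
    rest = list(word)   # remaining input
    stack = [start]     # pushdown; the top is the end of the list
    z = []              # production numbers applied so far
    while stack:
        top = stack.pop()
        look = rest[0] if rest else EPS
        if top in NONTERMINALS:
            # Expansion step: the lookahead selects at most one production.
            choice = [i for i, (lhs, _, la) in PRODUCTIONS.items()
                      if lhs == top and look in la]
            if not choice:
                return None          # error detected before consuming further input
            i = choice[0]
            stack.extend(reversed(PRODUCTIONS[i][1]))
            z.append(i)
        elif rest and rest[0] == top:
            rest.pop(0)              # matching step: consume the terminal
        else:
            return None
    return z if not rest else None
```

For the input `a*b` this sketch yields the analysis `[1, 4, 8, 5, 9, 6, 3]`, and for `a+` it stops with `None` as soon as the lookahead ε fits no alternative of T.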

  8. Transformation to LL(1)
     Assume that G = ⟨N, Σ, P, S⟩ ∈ CFG_Σ \ LL(1) (i.e., there exist A → β | γ ∈ P such that la(A → β) ∩ la(A → γ) ≠ ∅).
     Two heuristics for transforming G into G′ ∈ LL(1) (used in parser-generating systems such as ANTLR):
       1. Removal of left recursion
       2. Left factorization
     Remarks:
       - Transformations generally preserve the semantics (= generated language) of CFGs but not the syntactic structure of words (different syntax trees).
       - Transformations cannot always yield an LL(1) grammar (since not every context-free language is generated by an LL grammar; details later).

  11. Left Recursion I
     Definition 8.1 (Left recursion): A grammar G = ⟨N, Σ, P, S⟩ ∈ CFG_Σ is called left recursive if there exist A ∈ N and α ∈ X* such that A ⇒⁺ Aα.
     Corollary 8.2: If G ∈ CFG_Σ is left recursive with A ⇒⁺ Aα, then there exists β ∈ X* such that A ⇒⁺_l Aβ.
     Example 8.3: The grammar (cf. Example 5.10)
         G_AE:  E → E + T | T
                T → T * F | F
                F → ( E ) | a | b
     is left recursive, and in Example 7.4 it was shown that G_AE ∉ LL(1).
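Definition 8.1 can be checked mechanically: A ⇒⁺ Aα holds exactly when A reaches itself via "leftmost symbol after skipping nullable symbols" steps. The sketch below illustrates this idea; the grammar encoding (a dict from nonterminals to tuples of symbols) and the function names are illustrative assumptions, not part of the lecture.

```python
def nullable_nonterminals(grammar):
    """Nonterminals that can derive the empty word ε."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for a, alts in grammar.items():
            if a not in nullable and any(all(s in nullable for s in rhs) for rhs in alts):
                nullable.add(a)
                changed = True
    return nullable


def left_recursive_nonterminals(grammar):
    """All A with A ⇒⁺ Aα for some α."""
    nullable = nullable_nonterminals(grammar)
    # reach[a]: nonterminals that can appear leftmost after one derivation step from a.
    reach = {a: set() for a in grammar}
    for a, alts in grammar.items():
        for rhs in alts:
            for sym in rhs:
                if sym not in grammar:   # a terminal blocks everything behind it
                    break
                reach[a].add(sym)
                if sym not in nullable:  # a non-nullable nonterminal blocks as well
                    break
    # Transitive closure: one or more derivation steps.
    changed = True
    while changed:
        changed = False
        for a in grammar:
            new = set().union(*(reach[b] for b in reach[a])) - reach[a]
            if new:
                reach[a] |= new
                changed = True
    return {a for a in grammar if a in reach[a]}


G_AE = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("a",), ("b",)],
}
```

For `G_AE` this reports E and T as left recursive; F is not, since each of its alternatives starts with a terminal.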

  15. Left Recursion II
     Lemma 8.4: If G ∈ CFG_Σ is left recursive, then G ∉ ⋃_{k ∈ ℕ} LL(k).
     Proof (for k = 1): Assume that G ∈ LL(1) is left recursive with A ⇒⁺_l Aβ. Together with the reducedness of G this implies that
         S ⇒*_l vAα ⇒⁺_l vAβα ⇒⁺_l vw
     for some v, w ∈ Σ* and α ∈ X*. The corresponding computation of DTA(G) (Def. 7.6) starts with
         (vw, S, ε) ⊢* (w, Aα, ...) ⊢⁺ (w, Aβα, ...).
     But in the last state the behaviour of DTA(G) is determined by the same input (fi(w)) and stack symbol (A). Thus it enters a loop of the form
         (w, Aα, ...) ⊢⁺ (w, Aβα, ...) ⊢⁺ (w, Aββα, ...) ⊢⁺ ...
     and will never recognize w. Contradiction.

  18. Removing Direct Left Recursion
     Direct left recursion occurs in productions of the form
         A → Aα_1 | ... | Aα_m | β_1 | ... | β_n
     where α_i ≠ ε and β_j ≠ A... (i.e., no β_j begins with A).
     Transformation: replacement by right recursion
         A  → β_1 A′ | ... | β_n A′
         A′ → α_1 A′ | ... | α_m A′ | ε
     (with a new A′ ∈ N), which preserves L(G).
     Example 8.5:
         G_AE:  E → E + T | T
                T → T * F | F
                F → ( E ) | a | b
     is transformed into
         G′_AE: E  → T E′
                E′ → + T E′ | ε
                T  → F T′
                T′ → * F T′ | ε
                F  → ( E ) | a | b
     with G′_AE ∈ LL(1) (see Example 7.5).
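The replacement by right recursion is mechanical enough to sketch in code. The version below is an illustration under assumptions: grammars are dicts from nonterminals to tuples of symbols, the fresh nonterminal is named by appending a prime to A, and every α_i is assumed to be non-empty as required above. The function name is hypothetical.

```python
def remove_direct_left_recursion(grammar):
    """Replace A → Aα_1 | ... | Aα_m | β_1 | ... | β_n by right-recursive rules."""
    result = {}
    for a, alts in grammar.items():
        alphas = [rhs[1:] for rhs in alts if rhs and rhs[0] == a]   # the α_i
        betas = [rhs for rhs in alts if not rhs or rhs[0] != a]     # the β_j
        if not alphas:
            result[a] = list(alts)       # no direct left recursion: keep as is
            continue
        fresh = a + "'"                  # assumed naming scheme for the new A′
        while fresh in grammar or fresh in result:
            fresh += "'"
        result[a] = [beta + (fresh,) for beta in betas]              # A  → β_j A′
        result[fresh] = [alpha + (fresh,) for alpha in alphas] + [()]  # A′ → α_i A′ | ε
    return result


G_AE = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("a",), ("b",)],
}
G_AE_PRIME = remove_direct_left_recursion(G_AE)
```

Applied to `G_AE`, this reproduces the productions of G′_AE from Example 8.5, with the empty tuple standing for ε.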

  20. Removing Indirect Left Recursion
     Indirect left recursion occurs in productions of the form (n ≥ 1)
         A       → A_1 α_1 | ...
         A_1     → A_2 α_2 | ...
         ...
         A_{n−1} → A_n α_n | ...
         A_n     → A β | ...
     Transformation: into Greibach Normal Form with productions of the form
         A → a B_1 ... B_n (where n ∈ ℕ and each B_i ≠ S)   or   S → ε
     (cf. Formale Systeme, Automaten, Prozesse).

  23. Left Factorization
     Applies to productions of the form A → αβ | αγ, which are problematic if α is "at least as long as the lookahead".
     Transformation: delaying the decision by left factorization
         A  → α A′
         A′ → β | γ
     (with a new A′ ∈ N), which preserves L(G).
     Example 8.6:
         Statement → if Condition then Statement else Statement fi
                   | if Condition then Statement fi
     is transformed into
         Statement → if Condition then Statement S′
         S′        → else Statement fi | fi
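Left factorization can likewise be sketched as a small grammar rewrite. This is one possible formulation, not the lecture's: alternatives are grouped by their first symbol, the longest common prefix of each group is factored out, and the fresh nonterminal is named by appending a prime (so it comes out as `Statement'` rather than S′). The grammar encoding and function names are assumptions, and the alternatives of each nonterminal are assumed to be pairwise distinct.

```python
def common_prefix(seqs):
    """Longest common prefix α of a list of symbol tuples."""
    prefix = []
    for column in zip(*seqs):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return tuple(prefix)


def left_factor(grammar):
    """Rewrite A → αβ | αγ into A → αA′, A′ → β | γ until no two alternatives share a first symbol."""
    result = {a: list(alts) for a, alts in grammar.items()}
    work = list(result)
    while work:
        a = work.pop()
        groups = {}                              # first symbol -> alternatives starting with it
        for rhs in result[a]:
            groups.setdefault(rhs[:1], []).append(rhs)
        new_alts = []
        for head, group in groups.items():
            if head == () or len(group) == 1:
                new_alts.extend(group)           # nothing to factor
                continue
            alpha = common_prefix(group)
            fresh = a + "'"                      # assumed naming scheme for the new A′
            while fresh in result:
                fresh += "'"
            result[fresh] = [rhs[len(alpha):] for rhs in group]   # A′ → β | γ
            new_alts.append(alpha + (fresh,))                     # A  → α A′
            work.append(fresh)                   # the remainders may need factoring too
        result[a] = new_alts
    return result


# Example 8.6; Condition is left without productions, as on the slide.
STATEMENT = {
    "Statement": [
        ("if", "Condition", "then", "Statement", "else", "Statement", "fi"),
        ("if", "Condition", "then", "Statement", "fi"),
    ],
}
FACTORED = left_factor(STATEMENT)
```

On Example 8.6 the common prefix `if Condition then Statement` is factored out, and the decision between `else Statement fi` and `fi` is deferred to the new nonterminal.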
