Parsing: CSCI 3130 Formal Languages and Automata Theory

Siu On CHAN, Fall 2018, Chinese University of Hong Kong


  1. Parsing. CSCI 3130 Formal Languages and Automata Theory. Siu On CHAN, Fall 2018, Chinese University of Hong Kong.

  2. Context-free versus regular. Every regular language is context-free (regular expression, NFA, DFA). Write a CFG for the language (0 + 1)∗111: S → U111, U → 0U | 1U | ε. Can you do so for every regular language?
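
As a sanity check on the grammar for (0 + 1)∗111, the following minimal sketch compares it against Python's `re` module on all short binary strings. The function names `generated_by_grammar` and `agree_up_to` are illustrative and not from the slides.

```python
import re
from itertools import product

def generated_by_grammar(s):
    """Membership test for S → U111, U → 0U | 1U | ε."""
    def derives_U(t):
        if t == "":
            return True                            # U → ε
        return t[0] in "01" and derives_U(t[1:])   # U → 0U | 1U
    # S → U111: the string must end in 111 and the rest must come from U
    return s.endswith("111") and derives_U(s[:-3])

def agree_up_to(n):
    """Check that grammar and regular expression agree on all binary strings of length <= n."""
    pattern = re.compile(r"(0|1)*111")
    for length in range(n + 1):
        for chars in product("01", repeat=length):
            s = "".join(chars)
            if generated_by_grammar(s) != bool(pattern.fullmatch(s)):
                return False
    return True
```

Calling `agree_up_to(8)` exercises every binary string of length at most 8; agreement on short strings is evidence, not a proof, that the grammar matches the expression.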

  4. From regular to context-free. Each regular expression is converted to a CFG by the rules below, where S1 and S2 are the start variables for E1 and E2, and S becomes the new start variable:
       ∅ ⇒ grammar with no rules
       ε ⇒ S → ε
       a (alphabet symbol) ⇒ S → a
       E1 + E2 ⇒ S → S1 | S2
       E1 E2 ⇒ S → S1 S2
       E1∗ ⇒ S → S S1 | ε
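
The conversion table can be sketched as a recursive function over a regular-expression syntax tree. The tuple encoding of the tree and the name `regex_to_cfg` are assumptions made for illustration; the slides only give the table itself.

```python
from itertools import count

def regex_to_cfg(regex):
    """Convert a regular-expression tree into a CFG, one case per table row.

    Hypothetical tree encoding (not from the slides):
      ("empty",)  ("eps",)  ("sym", a)
      ("union", r1, r2)  ("concat", r1, r2)  ("star", r1)
    Returns (start_variable, rules); each rule is (variable, body) and
    body is a tuple of symbols, with () standing for ε.
    """
    fresh = count()
    rules = []

    def build(node):
        var = f"S{next(fresh)}"          # start variable for this subexpression
        kind = node[0]
        if kind == "empty":              # ∅: grammar with no rules
            pass
        elif kind == "eps":              # ε: S → ε
            rules.append((var, ()))
        elif kind == "sym":              # a: S → a
            rules.append((var, (node[1],)))
        elif kind == "union":            # E1 + E2: S → S1 | S2
            s1, s2 = build(node[1]), build(node[2])
            rules.append((var, (s1,)))
            rules.append((var, (s2,)))
        elif kind == "concat":           # E1 E2: S → S1 S2
            s1, s2 = build(node[1]), build(node[2])
            rules.append((var, (s1, s2)))
        elif kind == "star":             # E1*: S → S S1 | ε
            s1 = build(node[1])
            rules.append((var, (var, s1)))
            rules.append((var, ()))
        else:
            raise ValueError(f"unknown node kind: {kind}")
        return var

    return build(regex), rules
```

For example, `regex_to_cfg(("star", ("union", ("sym", "0"), ("sym", "1"))))` builds a grammar for (0 + 1)∗ whose start variable `S0` has the two rules S0 → S0 S1 and S0 → ε.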

  6. Context-free versus regular. Is every context-free language regular? No: L = { 0^n 1^n | n ≥ 0 } is context-free but not regular. It is generated by S → 0S1 | ε. So the regular languages sit strictly inside the context-free languages.
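
A recognizer for L can mirror the grammar directly. This is a minimal sketch assuming the intended grammar is S → 0S1 | ε (the ε rule is needed for the grammar to derive any string at all); the name `in_L` is illustrative.

```python
def in_L(s):
    """Return True if s is in { 0^n 1^n | n >= 0 }, following S → 0S1 | ε."""
    if s == "":
        return True                                          # S → ε
    # S → 0S1: peel one 0 from the front and one 1 from the back
    return s[0] == "0" and s[-1] == "1" and in_L(s[1:-1])
```

The recursion matches an unbounded number of 0s against the same number of 1s, which is the kind of counting a finite automaton cannot do.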

  7. Ambiguity

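  8. Ambiguity. A CFG is ambiguous if some string has more than one parse tree. Example: E → E + E | E * E | ( E ) | N, N → 1 | 2. The string 1+2*2 has two parse trees: one with * at the root, which reads as (1+2)*2 = 6 (✗), and one with + at the root, which reads as 1+(2*2) = 5.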

  9. Example. Is S → SS | x ambiguous? Yes, because there are two ways to derive xxx: one parse tree groups the string as (xx)x, the other as x(xx).
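
To see how quickly the ambiguity of S → SS | x grows, one can count parse trees by the position where the top-level rule S → SS splits the string. This is a small sketch; the function name `count_parse_trees` is illustrative.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_parse_trees(n):
    """Number of parse trees for the string x^n under S → SS | x."""
    if n == 1:
        return 1                 # S → x
    # S → SS: the left S derives the first k x's, the right S derives the rest
    return sum(count_parse_trees(k) * count_parse_trees(n - k)
               for k in range(1, n))
```

`count_parse_trees(3)` returns 2, matching the two trees for xxx, and `count_parse_trees(4)` returns 5.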

  11. Disambiguation. Sometimes we can rewrite the grammar to remove ambiguity: S → SS | x ⇒ S → Sx | x. Under the new grammar, xxx has a single parse tree.

  12. Disambiguation. E → E + E | E * E | ( E ) | N, N → 1 | 2. In this grammar + and * have the same precedence! Decompose the expression into terms (T) and factors (F), as in 2 * ( 1 + 2 * 2 ).

  13. Disambiguation. Replace E → E + E | E * E | ( E ) | N, N → 1 | 2 with:
       An expression is a sum of one or more terms: E → T | E + T
       Each term is a product of one or more factors: T → F | T * F
       Each factor is a parenthesized expression or a number: F → ( E ) | 1 | 2
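
The term/factor grammar can be turned into a small evaluator with one function per variable. This is a sketch, not the course's parser: the left-recursive rules E → E + T and T → T * F are implemented as loops (a standard substitution that the slides do not state), and the name `evaluate` is illustrative. Input must contain no spaces.

```python
def evaluate(text):
    """Evaluate a string generated by E → T | E + T, T → F | T * F, F → ( E ) | 1 | 2."""
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else None

    def expect(ch):
        nonlocal pos
        if peek() != ch:
            raise SyntaxError(f"expected {ch!r} at position {pos}")
        pos += 1

    def expr():                  # E: a sum of one or more terms
        value = term()
        while peek() == "+":
            expect("+")
            value += term()
        return value

    def term():                  # T: a product of one or more factors
        value = factor()
        while peek() == "*":
            expect("*")
            value *= factor()
        return value

    def factor():                # F: a parenthesized expression or a number
        nonlocal pos
        if peek() == "(":
            expect("(")
            value = expr()
            expect(")")
            return value
        if peek() in ("1", "2"):
            value = int(text[pos])
            pos += 1
            return value
        raise SyntaxError(f"unexpected input at position {pos}")

    result = expr()
    if pos != len(text):
        raise SyntaxError(f"trailing input at position {pos}")
    return result
```

Because `term` is called from inside `expr`, multiplication binds tighter than addition, so `evaluate("1+2*2")` returns 5 rather than 6.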

  14. Parsing example. Grammar: E → T | E + T, T → F | T * F, F → ( E ) | 1 | 2. [Figure: parse tree for 2+(1+1+2*2)+1]

  15. Disambiguation. In programming languages, ambiguity comes from the precedence rules, and we can resolve it as in the example. In English, ambiguity is sometimes a problem: “I look at the dog with one eye”. Disambiguation is not always possible because there exist inherently ambiguous languages and there is no general procedure for disambiguation.

  17. Parsing. Grammar: S → 0S1 | 1S0S | T, T → S | ε. Input: 0011. Is 0011 ∈ L? If so, how to build a parse tree with a program?

  18. Parsing. Grammar: S → 0S1 | 1S0S | T, T → S | ε. Input: 0011. Try all derivations? [Figure: S branches to 0S1, 1S0S and T; 0S1 branches to 00S11, 01S0S1, 0T1, …; 00S11 leads to 00T11, which leads back to 00S11 and to 0011 ✓; 1S0S branches to 10S10S, …; T branches to S and ε.] This is (part of) the tree of all derivations, not the parse tree.

  22. Problems. 1. Trying all derivations may take too long. 2. If the input is not in the language, parsing will never stop. Let’s tackle the 2nd problem.

  23. When to stop. Grammar: S → 0S1 | 1S0S | T, T → S | ε. Idea: stop when |derived string| > |input|. Problems: the derived string may shrink because of “ε-productions”, as in S ⇒ 0S1 ⇒ 0T1 ⇒ 01, and the derivation may loop because of “unit productions”, as in S ⇒ T ⇒ S ⇒ T ⇒ … Solution: remove ε- and unit productions.
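
The stopping idea can be sketched as a breadth-first search over leftmost derivations that discards any derived string longer than the input. The sketch assumes a grammar with no ε-productions and no unit productions, so derived strings never shrink; it is demonstrated on S → SS | x from the earlier example. The name `derivable` and the dictionary encoding of rules are illustrative.

```python
from collections import deque

def derivable(rules, start, target):
    """Try all leftmost derivations, pruning strings longer than the target.

    rules maps each variable (a single character) to a list of bodies
    (strings). Assumes no ε-productions and no unit productions, so only
    finitely many derived strings fit under the length bound.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        # index of the leftmost variable, if there is one
        i = next((k for k, c in enumerate(form) if c in rules), None)
        if i is None:
            continue                     # all terminals, but not the target
        for body in rules[form[i]]:
            new = form[:i] + body + form[i + 1:]
            # stop when |derived string| > |input|
            if len(new) <= len(target) and new not in seen:
                seen.add(new)
                queue.append(new)
    return False
```

With `rules = {"S": ["SS", "x"]}`, `derivable(rules, "S", "xxx")` returns True and `derivable(rules, "S", "xy")` returns False after a finite search.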

  26. Removing ε-productions. Goal: remove all A → ε rules for every non-start variable A.
       If S is the start variable and the rule S → ε exists, add a new start variable T and add the rule T → S.
       For every rule A → ε where A is not the (new) start variable:
       1. Remove the rule A → ε
       2. If you see B → αAβ, add a new rule B → αβ
       Example: S → ACD, A → a, B → ε, C → ED | ε, D → BC | b, E → b
       Removing B → ε adds D → C.
       Removing C → ε adds S → AD, D → B and D → ε.
       Removing D → ε adds S → AC, C → E and S → A.
       Result: S → ACD | AD | AC | A, A → a, C → ED | E, D → BC | b | C | B, E → b
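
The procedure can be sketched in Python as follows. This is a minimal sketch under stated assumptions: rules are encoded as (head, body) pairs with body a tuple of symbols and () standing for ε, the start variable is assumed to have no ε rule initially (so the step that adds a new start variable is skipped), and the name `remove_epsilon` is illustrative.

```python
def remove_epsilon(rules, start):
    """Remove every rule A → ε for non-start A, adding the compensating rules."""
    rules = set(rules)
    removed = set()              # variables whose ε rule has already been removed
    while True:
        # pick some rule A → ε where A is not the start variable
        target = next((h for h, b in sorted(rules) if b == () and h != start), None)
        if target is None:
            return rules
        rules.discard((target, ()))      # 1. remove the rule A → ε
        removed.add(target)
        # 2. for each B → αAβ add B → αβ; repeating until nothing changes
        #    covers bodies in which A appears more than once
        changed = True
        while changed:
            changed = False
            for head, body in list(rules):
                for i, sym in enumerate(body):
                    if sym != target:
                        continue
                    new = (head, body[:i] + body[i + 1:])
                    if new[1] == () and head in removed:
                        continue         # an ε rule removed earlier is not added back
                    if new not in rules:
                        rules.add(new)
                        changed = True
```

Run on the example grammar S → ACD, A → a, B → ε, C → ED | ε, D → BC | b, E → b, the function returns the rules S → ACD | AD | AC | A, A → a, C → ED | E, D → BC | b | C | B, E → b.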

  32. Eliminating ε-productions. For every A → ε rule where A is not the start variable: 1. Remove the rule A → ε. 2. If you see B → αAβ, add a new rule B → αβ. Do 2. every time A appears: B → αAβAγ yields B → αβAγ, B → αAβγ and B → αβγ. B → A becomes B → ε; if B → ε was removed earlier, don’t add it back.
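
The "every time A appears" step amounts to keeping or deleting each occurrence of A independently. A small sketch using `itertools.product`, with the illustrative name `omit_occurrences` and bodies encoded as tuples of symbols:

```python
from itertools import product

def omit_occurrences(body, var):
    """All bodies obtained from `body` by deleting one or more occurrences of `var`."""
    # each occurrence of var can be kept or dropped; other symbols are always kept
    choices = [((sym,), ()) if sym == var else ((sym,),) for sym in body]
    results = set()
    for pick in product(*choices):
        new = tuple(s for part in pick for s in part)
        if new != tuple(body):           # skip the choice that deletes nothing
            results.add(new)
    return results
```

For the body α A β A γ written as `("α", "A", "β", "A", "γ")`, the result contains exactly the three bodies αβAγ, αAβγ and αβγ. For the body `("A",)` the result is the empty body, which corresponds to B → A becoming B → ε.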

  34. Eliminating unit productions. A unit production is a production of the form A → B. Grammar: S → 0S1 | 1S0S | T, T → S | R | ε, R → 0SR. Unit production graph: vertices S, T, R, with one edge for each unit production (S → T, T → S, T → R).
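
The unit production graph can be read off the rules mechanically. A minimal sketch, assuming rules are encoded as (head, body) pairs with body a tuple of symbols; the name `unit_graph` is illustrative.

```python
def unit_graph(rules, variables):
    """Return the set of edges (A, B), one for every unit production A → B."""
    return {(head, body[0]) for head, body in rules
            if len(body) == 1 and body[0] in variables}

# Grammar from the slide: S → 0S1 | 1S0S | T, T → S | R | ε, R → 0SR
slide_rules = [
    ("S", ("0", "S", "1")), ("S", ("1", "S", "0", "S")), ("S", ("T",)),
    ("T", ("S",)), ("T", ("R",)), ("T", ()),
    ("R", ("0", "S", "R")),
]
edges = unit_graph(slide_rules, {"S", "T", "R"})
```

Here `edges` contains S → T, T → S and T → R. The cycle between S and T is what allows the looping derivation S ⇒ T ⇒ S ⇒ T ⇒ …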
