
Parsing, Part I. Jim Royer, April 2, 2019. CIS 352.



  1. CIS 352: Parsing, Part I. Jim Royer, April 2, 2019.

  2. Miss Teen South Carolina’s Famous Answer
     https://kellblog.com/2007/09/01/parsing-the-unparseable-miss-teen-south-carolinas-answer/

  3. The Syntactic Side of Languages (Again)

     Natural languages: stream of phonemes → (via lexical analysis) → stream of words → (via parsing) → sentences

     Artificial languages: stream of characters → (via lexical analysis) → stream of tokens → (via parsing) → abstract syntax

     Tokens: variable names, numerals, operators, keywords, ...

     A program as a stream of characters:

         int main(void) {
           printf("hello, world\n");
           return 0;
         }

     The same program as a stream of tokens:

         int main ( void ) {
           printf ( "hello, world\n" ) ;
           return 0 ;
         }
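The lexical-analysis step on this slide can be sketched in a few lines of Python. This is only an illustration: the token classes (`STRING`, `NUMBER`, `NAME`, `PUNCT`) and the function name `tokenize` are assumptions made for the example, not the course's lexer, and the rules cover just enough of C to handle the hello-world program.

```python
import re

# Illustrative token classes (assumed for this sketch, not taken from the slides).
TOKEN_SPEC = [
    ("STRING", r'"(?:\\.|[^"\\])*"'),   # string literal, allowing backslash escapes
    ("NUMBER", r"\d+"),
    ("NAME",   r"[A-Za-z_]\w*"),         # keywords and identifiers alike
    ("PUNCT",  r"[(){};]"),
    ("SKIP",   r"\s+"),                  # whitespace separates tokens and is dropped
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Turn a stream of characters into a list of (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

program = 'int main(void) { printf("hello, world\\n"); return 0; }'
# Prints the token texts separated by single spaces.
print(" ".join(text for _, text in tokenize(program)))
```

The parser then works on the token list rather than on raw characters, which is why the later grammars mention tokens such as `(` and `;` but never whitespace.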

  4. Context Free Grammars, 1

     Grammars: rules for organizing
     ◮ word-streams into sentences
     ◮ token-streams into abstract syntax (parse trees)

     Context Free Grammars (CFGs)
     ◮ Terminals: concrete syntax (e.g., printf ( ... )
     ◮ Nonterminals: syntactic categories (e.g., Noun-Phrase, key-word, ...)

     Example (Palindromes over {a, b, c})

         A ::= ε | a | b | c | aAa | bAb | cAc
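The palindrome grammar can be read directly as a recursive membership test, one case per group of alternatives. The function name `in_A` is an assumption made for this sketch.

```python
def in_A(w):
    """Return True when w is derivable from A ::= ε | a | b | c | aAa | bAb | cAc."""
    if any(ch not in "abc" for ch in w):
        return False            # only the terminals a, b, c may appear
    if len(w) <= 1:
        return True             # A ::= ε | a | b | c
    # A ::= xAx for x in {a, b, c}: the outer characters must agree,
    # and what lies between them must again be in A.
    return w[0] == w[-1] and in_A(w[1:-1])

print(in_A("abcba"), in_A("abca"))
```

Each recursive call strips one matching pair of outer characters, mirroring one use of a production such as `A ::= aAa`.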

  5. CFGs Examples: LC

         P ::= C | E | B                                      Phrases
         C ::= skip | ℓ := E | C ; C
             | if B then C else C | while B do C              Commands
         E ::= n | !ℓ | E ⊛ E     (⊛ ∈ {+, −, ×, ...})        Integer Expressions
         B ::= b | E ⊛ E          (⊛ ∈ {=, <, ≥, ...})        Boolean Expressions

         n ∈ ℤ = {..., −3, −2, −1, 0, 1, 2, 3, ...}           Integers
         b ∈ 𝔹 = {true, false}                                Booleans
         ℓ ∈ 𝕃 = {x0, x1, x2, ...}                            Locations
         !ℓ ≡ the integer currently stored in ℓ

     Example program:

         x1 := 1; x2 := !x0;     // Computes factorial of !x0
         while (!x2>0) do
           x1 := (!x1*!x2); x2 := (!x2-1)
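One way to picture the abstract syntax that a parser for LC would produce is as a small family of Python dataclasses, one per production alternative. Everything below is a sketch under assumptions: the class names, the store-as-dictionary evaluator, and the decision to group both assignments inside the `while` body are choices made for illustration, since the slide gives only the grammar and the reading of `!ℓ`.

```python
from dataclasses import dataclass
import operator

@dataclass
class Num:            # E ::= n
    value: int

@dataclass
class Bool:           # B ::= b
    value: bool

@dataclass
class Deref:          # E ::= !ℓ  (the integer currently stored in ℓ)
    loc: str

@dataclass
class BinOp:          # E ::= E ⊛ E   and   B ::= E ⊛ E
    op: str
    left: object
    right: object

@dataclass
class Skip:           # C ::= skip
    pass

@dataclass
class Assign:         # C ::= ℓ := E
    loc: str
    expr: object

@dataclass
class Seq:            # C ::= C ; C
    first: object
    second: object

@dataclass
class If:             # C ::= if B then C else C
    cond: object
    then: object
    orelse: object

@dataclass
class While:          # C ::= while B do C
    cond: object
    body: object

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "=": operator.eq, "<": operator.lt, ">": operator.gt}

def eval_expr(e, store):
    """Evaluate an integer or boolean expression against a store (a dict)."""
    if isinstance(e, (Num, Bool)):
        return e.value
    if isinstance(e, Deref):
        return store[e.loc]
    return OPS[e.op](eval_expr(e.left, store), eval_expr(e.right, store))

def run(c, store):
    """Execute a command, updating and returning the store."""
    if isinstance(c, Assign):
        store[c.loc] = eval_expr(c.expr, store)
    elif isinstance(c, Seq):
        run(c.first, store)
        run(c.second, store)
    elif isinstance(c, If):
        run(c.then if eval_expr(c.cond, store) else c.orelse, store)
    elif isinstance(c, While):
        while eval_expr(c.cond, store):
            run(c.body, store)
    return store      # Skip changes nothing

# The factorial program from the slide, with the loop body assumed to be
# both assignments (the grammar alone does not settle this grouping).
factorial = Seq(
    Assign("x1", Num(1)),
    Seq(Assign("x2", Deref("x0")),
        While(BinOp(">", Deref("x2"), Num(0)),
              Seq(Assign("x1", BinOp("*", Deref("x1"), Deref("x2"))),
                  Assign("x2", BinOp("-", Deref("x2"), Num(1)))))))

print(run(factorial, {"x0": 5})["x1"])   # prints 120
```

The point of the sketch is the shape of the tree: once the grouping is fixed in the abstract syntax, no parentheses or precedence rules are needed to evaluate it.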

  6. CFGs Examples: A Fragment of English

         ⟨sentence⟩ ::= ⟨subject⟩⟨verb1⟩ | ⟨subject⟩⟨verb2⟩⟨object⟩
         ⟨subject⟩  ::= ⟨article⟩⟨noun⟩ | ⟨pronoun⟩
         ⟨object⟩   ::= that ⟨sentence⟩
         ⟨verb1⟩    ::= swims | pauses | exists
         ⟨verb2⟩    ::= believes | hopes | imagines
         ⟨article⟩  ::= a | some | the
         ⟨noun⟩     ::= lizard | truth | man
         ⟨pronoun⟩  ::= he | she | it

  7. CFGs, 2
     ◮ CFGs recursively specify a finite collection of sets of strings, called syntactic categories.
     ◮ Each syntactic category is named by a nonterminal symbol. E.g.: ⟨object⟩, ⟨verb1⟩, and ⟨noun⟩.
     ◮ One of the nonterminals is chosen to be the start symbol; its syntactic category is the language given by the grammar. E.g.: ⟨sentence⟩.
     ◮ A syntactic category (named by nonterminal N) is described by a set of productions of the form N ::= X_1 ... X_n, where each X_i is a terminal or nonterminal (and n could be 0). E.g.:

         ⟨sentence⟩ ::= ⟨subject⟩⟨verb1⟩
         ⟨sentence⟩ ::= ⟨subject⟩⟨verb2⟩⟨object⟩
         ⟨object⟩   ::= that ⟨sentence⟩
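As data, a CFG in this sense is just a start symbol plus a map from each nonterminal to its list of productions. The sketch below encodes the English fragment from slide 6 this way; the dictionary layout and the helper name `terminals` are assumptions made for illustration.

```python
# Nonterminal -> list of productions; each production is a list of symbols X_1 ... X_n.
GRAMMAR = {
    "sentence": [["subject", "verb1"], ["subject", "verb2", "object"]],
    "subject":  [["article", "noun"], ["pronoun"]],
    "object":   [["that", "sentence"]],
    "verb1":    [["swims"], ["pauses"], ["exists"]],
    "verb2":    [["believes"], ["hopes"], ["imagines"]],
    "article":  [["a"], ["some"], ["the"]],
    "noun":     [["lizard"], ["truth"], ["man"]],
    "pronoun":  [["he"], ["she"], ["it"]],
}
START = "sentence"

def terminals(grammar):
    """Symbols used on some right-hand side that do not name a syntactic category."""
    return {sym
            for productions in grammar.values()
            for rhs in productions
            for sym in rhs
            if sym not in grammar}

print(sorted(terminals(GRAMMAR)))
```

A symbol counts as a nonterminal exactly when it is a key of the dictionary, so no separate list of nonterminals is needed.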

  8. Example: Translating a regular expression to CFG

     Notation: X_{e} = the nonterminal for reg. exp. e

         For:               Add:
         e = a              X_{e} ::= a
         e = ε              X_{e} ::= ε
         e = (e1 | e2)      X_{e} ::= X_{e1} | X_{e2}
         e = (e1 e2)        X_{e} ::= X_{e1} X_{e2}
         e = (e′)*          X_{e} ::= X_{e′} X_{e} | ε

     For e = (01|10)*:

         X_{(01|10)*} ::= X_{01|10} X_{(01|10)*} | ε
         X_{01|10}    ::= X_{01} | X_{10}
         X_{01}       ::= X_{0} X_{1}
         X_{10}       ::= X_{1} X_{0}
         X_{0}        ::= 0
         X_{1}        ::= 1
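The translation table can be turned into a short recursive function. In this sketch a regular expression is a nested tuple (`("sym", "0")`, `("eps",)`, `("alt", e1, e2)`, `("cat", e1, e2)`, `("star", e)`), and the regular expression itself serves as the name of its nonterminal. The tuple encoding and the names `to_cfg`, `show`, and `show_rule` are assumptions made for the example.

```python
def to_cfg(e, rules=None):
    """Add the productions for X_e and for every subexpression of e to rules."""
    if rules is None:
        rules = {}
    if e in rules:
        return rules
    kind = e[0]
    if kind == "sym":
        rules[e] = [[e[1]]]               # X_e ::= a
    elif kind == "eps":
        rules[e] = [[]]                   # X_e ::= ε
    elif kind == "alt":
        rules[e] = [[e[1]], [e[2]]]       # X_e ::= X_e1 | X_e2
    elif kind == "cat":
        rules[e] = [[e[1], e[2]]]         # X_e ::= X_e1 X_e2
    elif kind == "star":
        rules[e] = [[e[1], e], []]        # X_e ::= X_e' X_e | ε
    for sub in e[1:]:
        if isinstance(sub, tuple):        # strings are terminals, tuples are subexpressions
            to_cfg(sub, rules)
    return rules

def show(e):
    """Render a regex for display only; the tuple, not this text, identifies X_e."""
    kind = e[0]
    if kind == "sym":
        return e[1]
    if kind == "eps":
        return "ε"
    if kind == "alt":
        return f"{show(e[1])}|{show(e[2])}"
    if kind == "cat":
        return f"{show(e[1])}{show(e[2])}"
    return f"({show(e[1])})*"             # kind == "star"

def show_rule(lhs, alternatives):
    def sym(s):
        return f"X[{show(s)}]" if isinstance(s, tuple) else s
    rhs = " | ".join(" ".join(sym(s) for s in alt) or "ε" for alt in alternatives)
    return f"X[{show(lhs)}] ::= {rhs}"

zero, one = ("sym", "0"), ("sym", "1")
e = ("star", ("alt", ("cat", zero, one), ("cat", one, zero)))
for lhs, alternatives in to_cfg(e).items():
    print(show_rule(lhs, alternatives))
```

For `(01|10)*` this yields six nonterminals, matching the six productions listed on the slide.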

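  9. A Big-Step Semantics for CFG

     Notation: N ⇓ w means w is in N’s syntactic category.

     $$\frac{N_1 \Downarrow w_1 \quad \cdots \quad N_k \Downarrow w_k}{N \Downarrow w} \quad \left(\begin{array}{l} N ::= u_0 N_1 u_1 N_2 \ldots N_k u_k \\ w = u_0 w_1 u_1 \ldots w_k u_k \end{array}\right)$$

     A dodgy grammar:

         ⟨exp⟩ ::= ⟨exp⟩ + ⟨exp⟩
                 | ⟨exp⟩ − ⟨exp⟩
                 | ⟨exp⟩ ∗ ⟨exp⟩
                 | ⟨exp⟩ / ⟨exp⟩
                 | ⟨num⟩
                 | ( ⟨exp⟩ )

     A derivation of ⟨exp⟩ ⇓ 2+3∗4:

     $$\dfrac{\dfrac{\langle num \rangle \Downarrow 2}{\langle exp \rangle \Downarrow 2} \qquad \dfrac{\dfrac{\langle num \rangle \Downarrow 3}{\langle exp \rangle \Downarrow 3} \quad \dfrac{\langle num \rangle \Downarrow 4}{\langle exp \rangle \Downarrow 4}}{\langle exp \rangle \Downarrow 3 * 4}\,(\star)}{\langle exp \rangle \Downarrow 2 + 3 * 4}\,(\dagger)$$

     (⋆) “3*4” = “3” ++ “*” ++ “4”
     (†) “2+3*4” = “2” ++ “+” ++ “3*4”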
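The big-step rule reads almost directly as a recursive membership test: to show N ⇓ w, pick a production for N and split w into pieces, one per right-hand-side symbol. The sketch below does this for the dodgy expression grammar over character strings. It assumes single-digit numerals and relies on the fact that every symbol of this grammar derives only nonempty strings, which is what guarantees termination; the names `derives` and `matches` are illustrative.

```python
from functools import lru_cache

GRAMMAR = {
    "exp": [("exp", "+", "exp"), ("exp", "-", "exp"), ("exp", "*", "exp"),
            ("exp", "/", "exp"), ("num",), ("(", "exp", ")")],
    "num": [(d,) for d in "0123456789"],   # single digits only, an assumption for brevity
}

@lru_cache(maxsize=None)
def derives(symbol, w):
    """The judgment symbol ⇓ w, for w a string of characters."""
    if symbol not in GRAMMAR:               # a terminal derives only itself
        return symbol == w
    return any(matches(rhs, w) for rhs in GRAMMAR[symbol])

def matches(rhs, w):
    """Can w be split as w1 ... wk so that rhs[i] ⇓ wi for every i?"""
    if not rhs:
        return w == ""
    head, rest = rhs[0], rhs[1:]
    # Every piece must be nonempty, so leave at least one character per remaining symbol.
    return any(derives(head, w[:i]) and matches(rest, w[i:])
               for i in range(1, len(w) - len(rest) + 1))

print(derives("exp", "2+3*4"), derives("exp", "2+"))
```

This answers only whether a derivation exists; it does not say how many there are, which is the issue the next slide raises.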

  10. Parse Trees

          ⟨exp⟩ ::= ⟨exp⟩ + ⟨exp⟩ | ⟨exp⟩ − ⟨exp⟩
                  | ⟨exp⟩ ∗ ⟨exp⟩ | ⟨exp⟩ / ⟨exp⟩
                  | ⟨num⟩ | ( ⟨exp⟩ )

      Two parses of 2 + 3 ∗ 4:

                 Exp                           Exp
               /  |  \                       /  |  \
            Exp   +   Exp                 Exp   *   Exp
             |      /  |  \             /  |  \      |
             2   Exp   *   Exp       Exp   +   Exp   4
                  |         |         |         |
                  3         4         2         3

      Definition (Ambiguity). A CFG is ambiguous when some string in the language has at least two distinct parses. (Great for lawyers, not-so-great in computing.)

      [From a newspaper discussion of a documentary on Merle Haggard.] “Among those interviewed were his two ex-wives, Kris Kristofferson and Robert Duvall.”
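Ambiguity can be checked mechanically for a given string by counting its parse trees. The sketch below counts trees under the grammar above, assuming single-digit numerals; the function name `count_parses` is illustrative. A count above 1 for any string witnesses that the grammar is ambiguous.

```python
from functools import lru_cache

OPS = "+-*/"

def count_parses(s):
    """Number of parse trees of s under exp ::= exp op exp | num | ( exp )."""
    @lru_cache(maxsize=None)
    def count(i, j):                                   # parse trees for s[i:j]
        w = s[i:j]
        total = 1 if len(w) == 1 and w.isdigit() else 0    # exp ::= num
        if len(w) >= 3 and w[0] == "(" and w[-1] == ")":
            total += count(i + 1, j - 1)                   # exp ::= ( exp )
        for k in range(i + 1, j - 1):                      # exp ::= exp op exp, op at s[k]
            if s[k] in OPS:
                total += count(i, k) * count(k + 1, j)
        return total
    return count(0, len(s))

print(count_parses("2+3*4"))     # prints 2
print(count_parses("(2+3)*4"))   # prints 1
```

Each tree is counted once because it is determined by its root production and, for an operator production, by the position of the root operator.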

  11. Grammar Repair, 1 (§3.4 in Mogensen)

      Definition. Suppose ⊕ is an operator (e.g., +, ∗, <).
      (a) ⊕ is left-associative when a ⊕ b ⊕ c = (a ⊕ b) ⊕ c. (E.g., −, /)
      (b) ⊕ is right-associative when a ⊕ b ⊕ c = a ⊕ (b ⊕ c). (E.g., :, = in C)
      (c) ⊕ is non-associative when a ⊕ b ⊕ c is illegal. (E.g., <)

      ◮ + and ∗ can be either left- or right-associative.
      ◮ To be consistent with − and /, we treat them as left-assoc.

          For               rewrite                    to
          left-assoc. ⊕     E ::= E ⊕ E | ⟨num⟩        E ::= E ⊕ E′ | E′      E′ ::= ⟨num⟩
          right-assoc. ⊕    E ::= E ⊕ E | ⟨num⟩        E ::= E′ ⊕ E | E′      E′ ::= ⟨num⟩

      [What is the parse of 1 ⊕ 2 ⊕ 3 under these two grammars?]
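The effect of the two repaired grammars can be tried out on a token list. In this sketch the left-associative grammar makes the last operator the root and the right-associative grammar makes the first operator the root. The function names, the tuple representation of trees, and the use of `-` as the concrete operator are assumptions made for the example, and the input is assumed to be a well-formed alternation of numbers and operators.

```python
def parse_left(tokens):
    """E ::= E ⊕ E' | E'  with  E' ::= num: the last operator is the root."""
    if len(tokens) == 1:
        return tokens[0]
    *front, op, last = tokens
    return (parse_left(front), op, last)

def parse_right(tokens):
    """E ::= E' ⊕ E | E'  with  E' ::= num: the first operator is the root."""
    if len(tokens) == 1:
        return tokens[0]
    first, op, *rest = tokens
    return (first, op, parse_right(rest))

def value(tree):
    """Evaluate a tree whose only operator is subtraction (for this demo)."""
    if isinstance(tree, tuple):
        left, _, right = tree
        return value(left) - value(right)
    return tree

tokens = [1, "-", 2, "-", 3]
print(parse_left(tokens), value(parse_left(tokens)))     # ((1, '-', 2), '-', 3) -4
print(parse_right(tokens), value(parse_right(tokens)))   # (1, '-', (2, '-', 3)) 2
```

With subtraction the two groupings give different values, which is why the choice of associativity matters for − and /.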

  12. Grammar Repair, 2 (§3.4 in Mogensen)

      Definition. Operators have an ordering called precedence. In an expression a ⊕ b ⊙ c:
      ◮ if precedence(⊕) > precedence(⊙), then: a ⊕ b ⊙ c = (a ⊕ b) ⊙ c.
      ◮ if precedence(⊕) < precedence(⊙), then: a ⊕ b ⊙ c = a ⊕ (b ⊙ c).
      ◮ if precedence(⊕) = precedence(⊙), then:
        ➱ if ⊕ and ⊙ are both left-assoc., then: a ⊕ b ⊙ c = (a ⊕ b) ⊙ c.
        ➱ if ⊕ and ⊙ are both right-assoc., then: a ⊕ b ⊙ c = a ⊕ (b ⊙ c).
        ➱ Otherwise, no standard answer.
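The case analysis in this definition can be written out as a small function that parenthesizes a ⊕ b ⊙ c. The precedence and associativity tables below are assumptions chosen for illustration (in particular the right-associative `^` does not appear on the slides), as is the function name `group`.

```python
def group(a, op1, b, op2, c, prec, assoc):
    """Parenthesize  a op1 b op2 c  following the three precedence cases."""
    left_grouping = f"({a} {op1} {b}) {op2} {c}"
    right_grouping = f"{a} {op1} ({b} {op2} {c})"
    if prec[op1] > prec[op2]:
        return left_grouping
    if prec[op1] < prec[op2]:
        return right_grouping
    # Equal precedence: associativity decides, when the two operators agree.
    if assoc[op1] == assoc[op2] == "left":
        return left_grouping
    if assoc[op1] == assoc[op2] == "right":
        return right_grouping
    raise ValueError("no standard answer")

# Illustrative tables (assumed, not from the slides).
PREC = {"+": 1, "-": 1, "*": 2, "/": 2, "^": 3}
ASSOC = {"+": "left", "-": "left", "*": "left", "/": "left", "^": "right"}

print(group("a", "+", "b", "*", "c", PREC, ASSOC))   # a + (b * c)
print(group("a", "-", "b", "+", "c", PREC, ASSOC))   # (a - b) + c
print(group("a", "^", "b", "^", "c", PREC, ASSOC))   # a ^ (b ^ c)
```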

  13. Grammar Repair, 3 (§3.4 in Mogensen)

          ⟨exp⟩ ::= ⟨exp⟩ + ⟨exp⟩ | ⟨exp⟩ − ⟨exp⟩      (level 1 precedence)
                  | ⟨exp⟩ ∗ ⟨exp⟩ | ⟨exp⟩ / ⟨exp⟩      (level 2 precedence)
                  | ⟨num⟩ | ( ⟨exp⟩ )                  (level 3 precedence)

      ◮ Handle left- and right-associativity as before.
      ◮ Each level gets its own nonterminal.
      ◮ Go from lowest to highest precedence levels.

          ⟨exp⟩₁ ::= ⟨exp⟩₁ + ⟨exp⟩₂ | ⟨exp⟩₁ − ⟨exp⟩₂ | ⟨exp⟩₂
          ⟨exp⟩₂ ::= ⟨exp⟩₂ ∗ ⟨exp⟩₃ | ⟨exp⟩₂ / ⟨exp⟩₃ | ⟨exp⟩₃
          ⟨exp⟩₃ ::= ⟨num⟩ | ( ⟨exp⟩₁ )

      [More problems and repairs in the next homework.]
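A hand-written recursive-descent parser for the repaired grammar might look like the sketch below, with one function per precedence level. Note the swapped-in technique: the left-recursive productions are realized as `while` loops that fold to the left, because a recursive-descent function cannot call itself on the same input without looping forever. The function names, the tuple trees, and the regex-based tokenizer (which silently skips characters it does not recognize) are assumptions made for the example.

```python
import re

def parse(text):
    """Parse an arithmetic expression into nested (left, op, right) tuples."""
    tokens = re.findall(r"\d+|[-+*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def exp1():                        # exp1 ::= exp1 (+|-) exp2 | exp2
        tree = exp2()
        while peek() in ("+", "-"):
            op = advance()
            tree = (tree, op, exp2())  # fold left: left-associative
        return tree

    def exp2():                        # exp2 ::= exp2 (*|/) exp3 | exp3
        tree = exp3()
        while peek() in ("*", "/"):
            op = advance()
            tree = (tree, op, exp3())
        return tree

    def exp3():                        # exp3 ::= num | ( exp1 )
        tok = peek()
        if tok == "(":
            advance()
            tree = exp1()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return tree
        if tok is not None and tok.isdigit():
            return int(advance())
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = exp1()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse("2+3*4"))     # (2, '+', (3, '*', 4))
print(parse("2-3-4"))     # ((2, '-', 3), '-', 4)
```

Unlike the original ambiguous grammar, each input now has exactly one tree: multiplication binds tighter than addition, and operators at the same level group to the left.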

  14. A Small-Steps Semantics for CFGs

      Warning: Greek letters!

      Notation:
      (a) α N β ⇒ α γ β means α N β rewrites to α γ β by applying the production N ::= γ.

      $$\frac{}{G \vdash \alpha N \beta \Rightarrow \alpha \gamma \beta} \quad \left( N ::= \gamma \ \text{is in}\ G \right)$$

      (b) ⇒∗ = the reflexive-transitive closure of ⇒.

          ⟨sentence⟩ ⇒ ⟨subject⟩ ⟨verb2⟩ ⟨object⟩
                     ⇒ ⟨article⟩ ⟨noun⟩ ⟨verb2⟩ ⟨object⟩
                     ⇒ the ⟨noun⟩ ⟨verb2⟩ ⟨object⟩
                     ⇒ the man ⟨verb2⟩ ⟨object⟩
                     ⇒ the man believes ⟨object⟩
                     ⇒ the man believes that ⟨sentence⟩
                     ⇒ the man believes that ⟨subject⟩ ⟨verb1⟩
                     ⇒ the man believes that ⟨article⟩ ⟨noun⟩ ⟨verb1⟩
                     ⇒ the man believes that some ⟨noun⟩ ⟨verb1⟩
                     ⇒ the man believes that some lizard ⟨verb1⟩
                     ⇒ the man believes that some lizard exists
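The rewriting relation ⇒ can be sketched as a function on lists of symbols: replace one nonterminal by the right-hand side of one of its productions. The derivation on this slide always rewrites the leftmost nonterminal, so the helper `derive_leftmost` below does the same. The names `step` and `derive_leftmost`, and the list-of-strings encoding of the grammar, are assumptions made for the example.

```python
GRAMMAR = {
    "sentence": [["subject", "verb1"], ["subject", "verb2", "object"]],
    "subject":  [["article", "noun"], ["pronoun"]],
    "object":   [["that", "sentence"]],
    "verb1":    [["swims"], ["pauses"], ["exists"]],
    "verb2":    [["believes"], ["hopes"], ["imagines"]],
    "article":  [["a"], ["some"], ["the"]],
    "noun":     [["lizard"], ["truth"], ["man"]],
    "pronoun":  [["he"], ["she"], ["it"]],
}

def step(form, index, gamma):
    """One step  α N β ⇒ α γ β,  where N = form[index] and N ::= γ must be in the grammar."""
    n = form[index]
    if list(gamma) not in GRAMMAR.get(n, []):
        raise ValueError(f"{n} ::= {' '.join(gamma)} is not a production")
    return form[:index] + list(gamma) + form[index + 1:]

def derive_leftmost(start, choices):
    """Apply each chosen right-hand side, in order, to the leftmost nonterminal."""
    form = [start]
    trace = [form]
    for gamma in choices:
        index = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        form = step(form, index, gamma)
        trace.append(form)
    return trace

choices = [
    ["subject", "verb2", "object"], ["article", "noun"], ["the"], ["man"],
    ["believes"], ["that", "sentence"], ["subject", "verb1"],
    ["article", "noun"], ["some"], ["lizard"], ["exists"],
]
for form in derive_leftmost("sentence", choices):
    print(" ".join(form))
```

The printed trace follows the slide's derivation line by line, ending in "the man believes that some lizard exists". A sequence of forms like this is a witness for `sentence ⇒∗ w`.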

  15. Digression

      See Graham Hutton’s slides for Chapter 8 of his “Programming in Haskell” text:
      http://www.cs.nott.ac.uk/~gmh/chapter8.ppt

      Also:
      ◮ Hutton’s “Programming in Haskell, 2/e” homepage: http://www.cs.nott.ac.uk/~gmh/book.html
      ◮ Hutton’s Example Parsing Library (from the 1st edition; not GHC 8.0.1 compliant): http://www.cs.nott.ac.uk/~gmh/Parsing.lhs
      ◮ Erik Meijer’s video lecture based on Hutton’s Chapter 8: http://channel9.msdn.com/Series/C9-Lectures-Erik-Meijer-Functional-Programming-Fundamentals/C9-Lectures-Dr-Erik-Meijer-Functional-Programming-Fundamentals-Chapter- ... (Skip to time 6:05 for the beginning of the discussion of parsers.)
