
CSCI-2320 Syntactic Analysis (Ch 3 & Wikipedia for CYK)



  1. CSCI-2320 Syntactic Analysis (Ch 3 & Wikipedia for CYK)
     Mohammad T. Irfan, 10/3/17

     Syntactic analysis, AKA "parser"
     • Diagram: Stream of tokens → Parser → Parse tree / syntax error
     • Question: What is the grammar?

  2. Parsing algorithms
     • Predictive parser (as opposed to backtracking)
     • Recursive descent (RD) parser: each nonterminal is a function that recognizes input derivable from that nonterminal
     • Top-down
     • LL(1): left-to-right scan, leftmost derivation, and 1 token of look-ahead

     RD parser for assignment stmt
         Assignment → Id = Expr ;
         Expr → Term { AddOp Term }
         AddOp → + | -
         Term → Factor { MulOp Factor }
         MulOp → * | /
         Factor → [ UnaryOp ] Primary
         UnaryOp → -
         Primary → Id | IntLiteral | FloatLiteral | ( Expr )
         ... <Lexical syntax for Id, IntLiteral, FloatLiteral> ...

  3. Python code for smaller version
         Expr → Term { (+|-) Term }
         Term → Factor { (*|/) Factor }
         Factor → IntLiteral
         ... <Lexical syntax for IntLiteral> ...
     • Code is available on Blackboard under Assignment 2
     • parser_v1.py: only checks for syntactic correctness (expression evaluation comes later, when we do semantics)

     Requirements for RD parser
     1. Remove left recursion (why?)
     2. Do "left factoring"
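The "each nonterminal is a function" idea can be sketched for the smaller grammar as follows. This is a minimal illustration, not the course's parser_v1.py (which is not reproduced here); the names `tokenize`, `Parser`, and `is_valid_expression` are hypothetical. Like parser_v1.py is described as doing, it only checks syntactic correctness.

```python
import re

# One token is either an integer literal or one of the four operators.
TOKEN_RE = re.compile(r"\s*(\d+|[-+*/])")

def tokenize(text):
    """Split text into IntLiteral and operator tokens; raise on anything else."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input near position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    """Recursive descent recognizer: one method per nonterminal."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        """Return the single look-ahead token, or None at end of input."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expr(self):
        # Expr -> Term {(+|-) Term}
        self.term()
        while self.peek() in ("+", "-"):
            self.pos += 1
            self.term()

    def term(self):
        # Term -> Factor {(*|/) Factor}
        self.factor()
        while self.peek() in ("*", "/"):
            self.pos += 1
            self.factor()

    def factor(self):
        # Factor -> IntLiteral
        token = self.peek()
        if token is None or not token.isdigit():
            raise SyntaxError(f"expected IntLiteral, got {token!r}")
        self.pos += 1

def is_valid_expression(text):
    """Return True iff the whole input is derivable from Expr."""
    try:
        parser = Parser(tokenize(text))
        parser.expr()
        return parser.peek() is None  # all tokens must be consumed
    except SyntaxError:
        return False
```

Each decision in `expr` and `term` looks at one token only (`peek`), which is the "1 token look-ahead" in LL(1); the `{ ... }` repetition in the grammar becomes a `while` loop, so no left-recursive call is ever made.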

  4. Removing left recursion
     • Example
     • Algorithm (assume no cycle; i.e., no A ⇒+ A)
         Nonterminals: A_1, A_2, ..., A_n (ordered arbitrarily)
         For each i:
             For each j < i:
                 Let A_j → δ_1 | δ_2 | ... | δ_k    (no left recursion here)
                 Replace each A_i → A_j γ by
                     A_i → δ_1 γ | δ_2 γ | ... | δ_k γ
             Eliminate left recursion from all A_i productions

     Left factoring
     • IfStmt → if Expr then Stmt
     • IfStmt → if Expr then Stmt else Stmt
     • Why can't an RD parser deal with it?
     • Solution
       • Find the largest common prefix α and factor it out:
             A → α β_1 | α β_2
         becomes
             A → α A'
             A' → β_1 | β_2
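Both grammar transformations can be sketched in Python. The slide does not spell out the "eliminate left recursion from all A_i productions" step, so the first function uses the standard textbook rewrite (A → A α | β becomes A → β A', A' → α A' | ε); treat that choice, the function names, and the list-of-symbols grammar encoding as assumptions made for illustration. The second function handles only the simple case where all alternatives share one prefix, as in the IfStmt example.

```python
def remove_immediate_left_recursion(nonterminal, productions):
    """Rewrite A -> A a | b as A -> b A', A' -> a A' | epsilon.

    Each production is a list of symbols; the empty list stands for epsilon.
    """
    recursive = [p[1:] for p in productions if p and p[0] == nonterminal]
    others = [p for p in productions if not p or p[0] != nonterminal]
    if not recursive:
        return {nonterminal: productions}  # nothing to do
    fresh = nonterminal + "'"              # hypothetical name for the new nonterminal
    return {
        nonterminal: [p + [fresh] for p in others],
        fresh: [p + [fresh] for p in recursive] + [[]],
    }

def left_factor(nonterminal, productions):
    """Factor out the longest prefix shared by ALL alternatives.

    A -> a b1 | a b2 becomes A -> a A', A' -> b1 | b2.
    """
    prefix = []
    for symbols in zip(*productions):      # walk the alternatives in lockstep
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    if not prefix or len(productions) < 2:
        return {nonterminal: productions}  # no common prefix to factor
    fresh = nonterminal + "'"
    return {
        nonterminal: [prefix + [fresh]],
        fresh: [p[len(prefix):] for p in productions],
    }

expr_rules = remove_immediate_left_recursion(
    "Expr", [["Expr", "+", "Term"], ["Expr", "-", "Term"], ["Term"]]
)
if_rules = left_factor(
    "IfStmt",
    [["if", "Expr", "then", "Stmt"],
     ["if", "Expr", "then", "Stmt", "else", "Stmt"]],
)
```

Here `expr_rules` describes Expr → Term Expr' with Expr' → + Term Expr' | - Term Expr' | ε, and `if_rules` describes IfStmt → if Expr then Stmt IfStmt' with IfStmt' → ε | else Stmt.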

  5. Literature review
     • NP-hard: Given a CFG, is there an LL(1) parser?
     • Impossibility example: $L_G = \{a^n 0 b^n \mid n \ge 1\} \cup \{a^n 1 b^{2n} \mid n \ge 1\}$
     • Why is an LL(1) parser impossible?

     Literature review
     • Is there always a parser (not necessarily LL(1)) for any CFG?
     • CYK algorithm: Cocke & Younger (1967) and Kasami (1965)
       • First parser for any CFG
       • Bottom-up parser
     • Frost (2007): first top-down parser for any CFG; improved by Ridge (2014)

  6. CYK Parsing Algorithm
     https://en.wikipedia.org/wiki/CYK_algorithm

     What it does
     • Given (1) a CFG and (2) a string, verifies whether the string can be derived by this grammar
     • Example
       • Detects syntactic errors in a given C program

  7. Requirements
     • CFG must be in Chomsky Normal Form (CNF)
           A → BC
           A → a
     • No ε in any production
     • OK to have left recursion!
     • Left factoring is out of the question (why?)

     Idea
     • Bottom-up approach + dynamic programming
     • Start with individual symbols of the input string
     • Combine multiple symbols together
       • 2 symbols
       • 3 symbols
       • ...
     • Climb up the grammar hierarchy
     • Yes answer to parsing ⇔ we can get to the start symbol

  8. CYK example
     • Input CFG
           Expr → Expr + Term | Expr - Term | Term
           Term → Term * Factor | Term / Factor | Factor
           Factor → 0 | 1 | ... | 9
     • CNF
           Expr → Expr X
           X → AddOp Term
           AddOp → + | -
           Expr → Term Y          # Avoid bypassing Expr → Term → ...
           Term → Term Y
           Y → MultOp Factor
           MultOp → * | /
           Factor → 0 | 1 | ... | 9
           Term → 0 | 1 | ... | 9
           Expr → 0 | 1 | ... | 9

     CYK example (cont...)
     • Input string: 2 - 3 * 4 (same CNF grammar as above)
     • Table: the cell for length i and start index j lists the nonterminals that derive the i tokens starting at token j

         Length | j = 1              | j = 2 | j = 3              | j = 4  | j = 5
         5      | Expr               |       |                    |        |
         4      |                    | X     |                    |        |
         3      | Expr               |       | Expr, Term         |        |
         2      |                    | X     |                    | Y      |
         1      | Expr, Term, Factor | AddOp | Expr, Term, Factor | MultOp | Expr, Term, Factor
         Token  | 2                  | -     | 3                  | *      | 4
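Whether a grammar meets the CNF requirement can be checked mechanically. The sketch below encodes the two grammars of the CYK example as Python dictionaries (an encoding assumed here for illustration, with "0 | 1 | ... | 9" expanded to the ten digits) and tests every production against the two allowed shapes; `is_cnf`, `ORIGINAL`, and `CNF` are hypothetical names.

```python
def is_cnf(grammar):
    """Return True iff every production is A -> B C or A -> a (no epsilon)."""
    nonterminals = set(grammar)
    for productions in grammar.values():
        for rhs in productions:
            if len(rhs) == 1 and rhs[0] not in nonterminals:
                continue  # A -> a: a single terminal
            if len(rhs) == 2 and all(s in nonterminals for s in rhs):
                continue  # A -> B C: exactly two nonterminals
            return False  # anything else (epsilon, unit rule, longer rule)
    return True

DIGITS = [[d] for d in "0123456789"]

# The input CFG: not in CNF (three-symbol rules and unit rules).
ORIGINAL = {
    "Expr": [["Expr", "+", "Term"], ["Expr", "-", "Term"], ["Term"]],
    "Term": [["Term", "*", "Factor"], ["Term", "/", "Factor"], ["Factor"]],
    "Factor": DIGITS,
}

# The converted grammar from the example.
CNF = {
    "Expr": [["Expr", "X"], ["Term", "Y"]] + DIGITS,
    "X": [["AddOp", "Term"]],
    "AddOp": [["+"], ["-"]],
    "Term": [["Term", "Y"]] + DIGITS,
    "Y": [["MultOp", "Factor"]],
    "MultOp": [["*"], ["/"]],
    "Factor": DIGITS,
}
```

With this encoding, `is_cnf(ORIGINAL)` is False and `is_cnf(CNF)` is True. Note that the check says nothing about whether the two grammars generate the same language; it only validates the shape of the rules.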

  9. CYK Algorithm
     Inputs: CNF grammar and n tokens
         Fill in the row for length 1
         For each length i from 2 to n:
             For each index j from 1 to n-i+1:
                 # A → BC?
                 For k = length of B from 1 to i-1:
                     If there's a production A → BC s.t.
                        B is in cell (j, k) and
                        C is in cell (j+k, i-k):
                         Add A to cell (j, i)
         Return True iff cell (1, n) contains the start symbol.

     Negative example
     • Input string: 2 + 3 * /
           Expr → Expr X
           X → AddOp Term
           AddOp → + | -
           Term → Term Y
           Y → MultOp Factor
           MultOp → * | /
           Factor → 0 | 1 | ... | 9
           Term → 0 | 1 | ... | 9
           Expr → 0 | 1 | ... | 9
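The pseudocode translates almost line for line into Python. The sketch below keeps the slide's 1-based (start index j, length i) cell naming; the dictionary encoding of the grammar and the names `cyk_table`, `cyk_accepts`, and `EXPR_CNF` are assumptions for illustration. The grammar is the CNF expression grammar from the CYK example, including Expr → Term Y.

```python
def cyk_table(grammar, tokens):
    """Build the CYK table: table[(j, i)] is the set of nonterminals that
    derive the i tokens starting at 1-based index j. Grammar must be in CNF."""
    n = len(tokens)
    table = {}
    # Row for length 1: rules of the form A -> a.
    for j in range(1, n + 1):
        table[(j, 1)] = {a for a, prods in grammar.items()
                         if [tokens[j - 1]] in prods}
    for i in range(2, n + 1):              # substring length
        for j in range(1, n - i + 2):      # start index
            cell = set()
            for k in range(1, i):          # length of the part derived from B
                left = table[(j, k)]
                right = table[(j + k, i - k)]
                for a, prods in grammar.items():
                    for rhs in prods:      # rules of the form A -> B C
                        if len(rhs) == 2 and rhs[0] in left and rhs[1] in right:
                            cell.add(a)
            table[(j, i)] = cell
    return table

def cyk_accepts(grammar, tokens, start):
    """Return True iff cell (1, n) contains the start symbol."""
    if not tokens:
        return False                       # CNF as used here has no epsilon
    return start in cyk_table(grammar, tokens)[(1, len(tokens))]

DIGITS = [[d] for d in "0123456789"]
EXPR_CNF = {
    "Expr": [["Expr", "X"], ["Term", "Y"]] + DIGITS,
    "X": [["AddOp", "Term"]],
    "AddOp": [["+"], ["-"]],
    "Term": [["Term", "Y"]] + DIGITS,
    "Y": [["MultOp", "Factor"]],
    "MultOp": [["*"], ["/"]],
    "Factor": DIGITS,
}

positive = cyk_accepts(EXPR_CNF, "2 - 3 * 4".split(), "Expr")
negative = cyk_accepts(EXPR_CNF, "2 + 3 * /".split(), "Expr")
```

Here `positive` is True and `negative` is False. In the negative case the trailing `* /` never combines into a Y, so every cell that would have to cover the last token stays empty. The three nested loops over i, j, and k give the familiar cubic dependence on the number of tokens, times the number of productions.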

  10. Class Participation 4
      • CNF grammar
            S → AX | AB
            X → SB
            A → 0
            B → 1
      • Parse the following strings using the CYK algorithm
        • 0011 ✔
        • 01010 ✗
      • Collaboration level: 0 (work freely in groups)
