
LR(0) Parsers: CSCI 3130 Formal Languages and Automata Theory, Siu On CHAN



  1. 1/31 LR(0) Parsers
     CSCI 3130 Formal Languages and Automata Theory
     Siu On CHAN, Chinese University of Hong Kong, Fall 2016

  2. 2/31 Parsing computer programs
     if (n == 0) { return x; }
     First phase of the javac compiler: lexical analysis
     if ( ID == INT_LIT ) { return ID ; }
     The alphabet of the Java CFG consists of tokens like
     Σ = { if, return, (, ), {, }, ;, ==, ID, INT_LIT, ... }
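
The lexical-analysis step above can be sketched in a few lines of Python. This is only an illustration, not how javac does it: the regular expression, the `KEYWORDS` set and the function name `tokenize` are assumptions, and only the handful of tokens shown on the slide are covered.

```python
import re

# Hypothetical token classes; only the tokens shown on the slide are covered.
KEYWORDS = {"if", "return"}
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(==|[(){};]))")

def tokenize(src):
    """Map source text to a list of token names (the terminals of the CFG)."""
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}")
        number, word, symbol = m.groups()
        if number is not None:
            tokens.append("INT_LIT")          # integer literal
        elif word is not None:
            tokens.append(word if word in KEYWORDS else "ID")
        else:
            tokens.append(symbol)             # punctuation or operator
        pos = m.end()
    return tokens

print(" ".join(tokenize("if (n == 0) { return x; }")))
# prints: if ( ID == INT_LIT ) { return ID ; }
```

The parser then works on this token sequence rather than on individual characters.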

  3. 3/31 Parsing computer programs
     Parse tree of a Java statement: if (n == 0) { return x; }
     [Figure: parse tree with internal nodes Statement, ParExpression, Expression, ExpressionRest, Infixop, Primary, Identifier, Literal, Block, BlockStatements, BlockStatement and leaves if ( ID == INT_LIT ) { return ID ; }]

  4. 4/31 CFG of the Java programming language
     Identifier:
         IdentifierChars but not a Keyword or BooleanLiteral or NullLiteral
     Literal:
         IntegerLiteral
         FloatingPointLiteral
         BooleanLiteral
         CharacterLiteral
         StringLiteral
         NullLiteral
     Expression:
         LambdaExpression
         AssignmentExpression
     AssignmentOperator: (one of)
         = *= /= %= += -= <<= >>= >>>= &= ^= |=
     from http://java.sun.com/docs/books/jls/second_edition/html/syntax.doc.html#52996

  5. 5/31 Parsing Java programs
     Simple Java program: about 1000 tokens
         class Point2d {
             /* The X and Y coordinates of the point--instance variables */
             private double x;
             private double y;
             private boolean debug;    // A trick to help with debugging

             public Point2d (double px, double py) {    // Constructor
                 x = px;
                 y = py;
                 debug = false;    // turn off debugging
             }

             public Point2d () {    // Default constructor
                 this (0.0, 0.0);    // Invokes 2 parameter Point2D constructor
             }
             // Note that a this() invocation must be the BEGINNING of
             // statement body of constructor

             public Point2d (Point2d pt) {    // Another constructor
                 x = pt.getX();
                 y = pt.getY();
             }
             ...
         }

  6. 6/31 Parsing algorithms
     How long would it take to parse this program?
         try all parse trees: ≥ 10^80 years
         CYK algorithm: hours
     Can we parse faster?
     CYK is the fastest known general-purpose parsing algorithm for CFGs.
     Luckily, some CFGs can be rewritten to allow for a faster parsing algorithm!

  7. 7/31 Hierarchy of context-free grammars
     context-free grammars ⊃ LR(∞) grammars ⊃ LR(1) grammars ⊃ LR(0) grammars
     Java, Python, etc. have LR(1) grammars.
     We will describe the LR(0) parsing algorithm.
     A grammar is LR(0) if the LR(0) parser works correctly for it.

  8. 8/31 LR(0) parser: overview
     S → SA | A
     A → (S) | ()
     input: ()()
     Position of the marker • after each step (symbols to its left have been read or reduced):
         1  •()()
         2  (•)()
         3  ()•()
         4  A•()
         5  S•()
         6  S(•)
         7  S()•
         8  SA•
         9  S•
     [Figure: the partial parse tree built at each step]

  9. 9/31 LR(0) parser: overview
     S → SA | A
     A → (S) | ()
     input: ()()
         3  ()•()  ⇒  4  A•()  ⇒  5  S•()
     Features of the LR(0) parser:
     ◮ Greedily reduce the recently completed rule into a variable
     ◮ Unique choice of reduction at any time

  10. 10/31 LR(0) parsing using a PDA
     To speed up parsing, keep track of partially completed rules in a PDA P.
     In fact, the PDA will be a simple modification of an NFA N.
     The NFA accepts if a rule B → β has just been completed, and the PDA will reduce β to B.
         … ⇒ 2 (•)() ⇒ 3 ()•() ✓ ⇒ 4 A•() ✓ ⇒ 5 S•() ⇒ …
     ✓: NFA N accepts

  11. 11/31 Example: NFA acceptance condition
     S → SA | A
     A → (S) | ()
     A rule B → β has just been completed if
     Case 1: the input/buffer so far is exactly β
         Examples: 3 ()•() and 4 A•()
     Case 2: or the buffer so far is αβ and there is another rule C → αBγ
         Example: 7 S()•
         This case can be chained.

  12. 12/31 Designing NFA for Case 1
     S → SA | A
     A → (S) | ()
     Design an NFA N′ to accept the right-hand side of some rule B → β
     [Figure: from the start state q, one chain of states for each right-hand side SA, A, (S), ()]

  13. 12/31 Designing NFA for Case 1
     S → SA | A
     A → (S) | ()
     Design an NFA N′ to accept the right-hand side of some rule B → β
         q0 --ε--> S → •SA --S--> S → S•A --A--> S → SA•
         q0 --ε--> S → •A --A--> S → A•
         q0 --ε--> A → •(S) --(--> A → (•S) --S--> A → (S•) --)--> A → (S)•
         q0 --ε--> A → •() --(--> A → (•) --)--> A → ()•

  14. 13/31 Designing NFA for Cases 1 & 2
     S → SA | A
     A → (S) | ()
     Design an NFA N to accept αβ for some rules C → αBγ, B → β, and for longer chains
     For every rule C → αBγ, B → β, add an ε-transition from C → α•Bγ to B → •β
     All blue arrows are ε-transitions
     [Figure: the chains for SA, A, (S), () from the start state q, joined by the added ε-transitions]

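  15. 13/31 Designing NFA for Cases 1 & 2
     S → SA | A
     A → (S) | ()
     Design an NFA N to accept αβ for some rules C → αBγ, B → β, and for longer chains
     For every rule C → αBγ, B → β, add an ε-transition from C → α•Bγ to B → •β
         q0 --ε--> S → •SA --S--> S → S•A --A--> S → SA•
         q0 --ε--> S → •A --A--> S → A•
         q0 --ε--> A → •(S) --(--> A → (•S) --S--> A → (S•) --)--> A → (S)•
         q0 --ε--> A → •() --(--> A → (•) --)--> A → ()•
     Added ε-transitions (all blue arrows are ε-transitions):
         from S → •SA and from A → (•S) to each of S → •SA, S → •A
         from S → S•A and from S → •A to each of A → •(S), A → •()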

  16. 14/31 Summary of the NFA
     The NFA N will accept whenever a rule has just been completed.
     ◮ For every rule B → β, add q0 --ε--> B → •β
     ◮ For every rule B → αXβ (X may be a terminal or a variable), add B → α•Xβ --X--> B → αX•β
     ◮ Every completed rule B → β is accepting: B → β•
     ◮ For every rule C → αBγ, B → β, add C → α•Bγ --ε--> B → •β
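
The four construction rules above translate fairly directly into code. The following Python sketch builds the transition set of N for the running grammar; the item encoding `(lhs, rhs, dot)` and the names `build_nfa`, `show` and `START` are assumptions made for illustration and do not come from the slides.

```python
# Grammar from the slides: S -> SA | A,  A -> (S) | ()
GRAMMAR = [("S", ("S", "A")), ("S", ("A",)),
           ("A", ("(", "S", ")")), ("A", ("(", ")"))]
START = "q0"  # the extra start state of the NFA

def build_nfa(grammar):
    """Return (transitions, accepting).

    An item (B, rhs, dot) stands for the rule B -> rhs with the dot placed
    before rhs[dot]. Transitions are triples (source, label, target);
    the label None marks an epsilon-transition.
    """
    variables = {lhs for lhs, _ in grammar}
    transitions, accepting = set(), set()
    for lhs, rhs in grammar:
        # q0 --eps--> B -> .beta
        transitions.add((START, None, (lhs, rhs, 0)))
        for dot, symbol in enumerate(rhs):
            item = (lhs, rhs, dot)
            # B -> alpha . X beta  --X-->  B -> alpha X . beta
            transitions.add((item, symbol, (lhs, rhs, dot + 1)))
            if symbol in variables:
                # C -> alpha . B gamma  --eps-->  B -> .beta
                for lhs2, rhs2 in grammar:
                    if lhs2 == symbol:
                        transitions.add((item, None, (lhs2, rhs2, 0)))
        # every completed rule B -> beta . is accepting
        accepting.add((lhs, rhs, len(rhs)))
    return transitions, accepting

def show(item):
    """Render an item as text, using '.' for the dot."""
    if item == START:
        return START
    lhs, rhs, dot = item
    return f"{lhs} -> {''.join(rhs[:dot])}.{''.join(rhs[dot:])}"

transitions, accepting = build_nfa(GRAMMAR)
for source, label, target in sorted(transitions, key=str):
    print(f"{show(source):12} --{label or 'eps'}--> {show(target)}")
```

For this grammar the sketch yields 20 transitions (12 of them ε-transitions) and 4 accepting items.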

  17. 15/31 Equivalent DFA D for the NFA N
     States of D (each state is a set of rules with a marker):
         { S → •SA, S → •A, A → •(S), A → •() }    (start state)
         { S → S•A, A → •(S), A → •() }
         { S → SA• }
         { S → A• }
         { A → (•S), A → (•), S → •SA, S → •A, A → •(S), A → •() }
         { A → (S•), S → S•A, A → •(S), A → •() }
         { A → (S)• }
         { A → ()• }
     Dead state (empty set) not shown for clarity
     Observation: every accepting state contains only one rule: a completed rule B → β•, and such rules appear only in accepting states
     [Figure: transition diagram of D with edges labeled S, A, ( and )]

  18. 16/31 LR(0) grammars
     A grammar G is LR(0) if its corresponding D_G satisfies:
     Every accepting state contains only one rule: a completed rule of the form B → β•, and completed rules appear only in accepting states
     Shift state: no completed rule, e.g. { S → S•A, A → •(S), A → •() }
     Reduce state: has a (unique) completed rule, e.g. { A → (S)• }
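
The LR(0) condition above can be checked mechanically. The sketch below builds the states of D directly with the usual closure/goto formulation of the subset construction (rather than constructing N first) and then tests the condition; the function names `closure`, `build_dfa` and `is_lr0` and the item encoding `(lhs, rhs, dot)` are assumptions for illustration.

```python
# Grammar from the slides: S -> SA | A,  A -> (S) | ()
GRAMMAR = [("S", ("S", "A")), ("S", ("A",)),
           ("A", ("(", "S", ")")), ("A", ("(", ")"))]

def closure(items, grammar):
    """Follow the epsilon-transitions C -> alpha . B gamma  ==>  B -> . beta."""
    items, todo = set(items), list(items)
    while todo:
        _, rhs, dot = todo.pop()
        if dot < len(rhs):
            for lhs2, rhs2 in grammar:
                if lhs2 == rhs[dot] and (lhs2, rhs2, 0) not in items:
                    items.add((lhs2, rhs2, 0))
                    todo.append((lhs2, rhs2, 0))
    return frozenset(items)

def build_dfa(grammar):
    """Return (start, states, delta); the dead state (empty set) is omitted."""
    # q0 has an epsilon-transition to every B -> . beta
    start = closure({(lhs, rhs, 0) for lhs, rhs in grammar}, grammar)
    states, delta, todo = {start}, {}, [start]
    while todo:
        state = todo.pop()
        symbols = {rhs[dot] for _, rhs, dot in state if dot < len(rhs)}
        for x in symbols:
            moved = {(l, r, d + 1) for l, r, d in state
                     if d < len(r) and r[d] == x}
            target = closure(moved, grammar)
            delta[state, x] = target
            if target not in states:
                states.add(target)
                todo.append(target)
    return start, states, delta

def is_lr0(grammar):
    """A state holding a completed rule must contain nothing else."""
    _, states, _ = build_dfa(grammar)
    for state in states:
        completed = [item for item in state if item[2] == len(item[1])]
        if completed and len(state) > 1:
            return False
    return True

start, states, delta = build_dfa(GRAMMAR)
print(len(states), "states,", len(delta), "transitions, LR(0):", is_lr0(GRAMMAR))
# prints: 8 states, 12 transitions, LR(0): True
```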

  19. 17/31 Simulating DFA D
     Our parser P simulates state transitions in DFA D.
         (()•)  ⇒  (A•)
     After reducing () to A, what is the new state?
     Solution: keep track of previous states in a stack; go back to the correct state by looking at the stack

  20. 18/31 Let’s label D’s states
         q1: S → •SA, S → •A, A → •(S), A → •()
         q2: S → S•A, A → •(S), A → •()
         q3: S → SA•
         q4: S → A•
         q5: A → (•S), A → (•), S → •SA, S → •A, A → •(S), A → •()
         q6: A → (S•), S → S•A, A → •(S), A → •()
         q7: A → (S)•
         q8: A → ()•
     Transitions:
         q1: S → q2, A → q4, ( → q5
         q2: A → q3, ( → q5
         q5: S → q6, A → q4, ( → q5, ) → q8
         q6: A → q3, ( → q5, ) → q7

  21. 19/31 LR(0) parser: a “PDA” P simulating DFA D
     P’s stack contains labels of D’s states to remember progress of partially completed rules
     At D’s non-accepting state q_i:
         1. P simulates D’s transition upon reading terminal or variable X
         2. P pushes current state label q_i onto its stack
     At D’s accepting state with completed rule B → X_1 ... X_k:
         1. P pops k labels q_k, ..., q_1 from its stack
         2. constructs part of the parse tree (a node B with children X_1, X_2, ..., X_k)
         3. P goes to state q_1 (last label popped earlier), pretend next input symbol is B
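
The two cases above can be sketched as a small driver loop. In this Python sketch the transition table `DELTA` and the reduce table `REDUCE` are hard-coded for the grammar S → SA | A, A → (S) | () using the state labels q1..q8; the names `DELTA`, `REDUCE`, `parse`, the tuple encoding of tree nodes and the acceptance test at the end are assumptions for illustration, not part of the slides.

```python
# Transition table of DFA D for S -> SA | A, A -> (S) | (), states q1..q8.
DELTA = {
    (1, "S"): 2, (1, "A"): 4, (1, "("): 5,
    (2, "A"): 3, (2, "("): 5,
    (5, "S"): 6, (5, "A"): 4, (5, "("): 5, (5, ")"): 8,
    (6, "A"): 3, (6, "("): 5, (6, ")"): 7,
}
# Accepting (reduce) states with their unique completed rule B -> X1...Xk.
# Note: no rule here has an empty right-hand side, so k >= 1 below.
REDUCE = {3: ("S", ["S", "A"]), 4: ("S", ["A"]),
          7: ("A", ["(", "S", ")"]), 8: ("A", ["(", ")"])}

def parse(tokens, start_symbol="S"):
    """Return the parse tree as nested (variable, children) tuples."""
    stack = []             # labels of previously visited states
    trees = []             # one parse-tree node per symbol read so far
    state = 1
    pending = list(tokens)
    while True:
        if state in REDUCE:
            lhs, rhs = REDUCE[state]
            k = len(rhs)
            popped = stack[-k:]          # labels q1, ..., qk
            del stack[-k:]
            children = trees[-k:]
            del trees[-k:]
            state = popped[0]            # q1, the last label popped
            symbol, node = lhs, (lhs, children)   # pretend next input is B
        elif pending:
            symbol = pending.pop(0)      # read the next terminal
            node = symbol
        else:
            break
        if (state, symbol) not in DELTA:
            raise ValueError(f"no transition from q{state} on {symbol!r}")
        stack.append(state)              # push the current state label
        trees.append(node)
        state = DELTA[state, symbol]
    # Assumed acceptance test: everything was reduced to one start symbol.
    if len(trees) == 1 and isinstance(trees[0], tuple) and trees[0][0] == start_symbol:
        return trees[0]
    raise ValueError("input ended before a complete parse")

print(parse(list("()()")))
```

On the input ()() the stack contents go through $1, $15, $1, $1, $12, $125, $12, $1, and the returned tree has root S with children S and A.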

  22. 20/31 Example
         step   buffer • input   stack   state
         1      •()()            $       q1
         2      (•)()            $1      q5
         3      ()•()            $15     q8
         3      •A()             $       q1
         4      A•()             $1      q4
         4      •S()             $       q1
         5      S•()             $1      q2
         6      S(•)             $12     q5
     [Figure: the partial parse tree at each step]

  23. 21/31 Example
         step   buffer • input   stack   state
         7      S()•             $125    q8
         7      S•A              $1      q2
         8      SA•              $12     q3
         8      •S               $       q1
         9      S•               $1      q2
     The parser’s output is the parse tree.
     [Figure: the partial parse tree at each step]

  24. 22/31 Another LR(0) grammar
     L = { w#w^R | w ∈ {a, b}* }
     C → aCa | bCb | #
     NFA N:
         q0 --ε--> C → •aCa --a--> C → a•Ca --C--> C → aC•a --a--> C → aCa•
         q0 --ε--> C → •# --#--> C → #•
         q0 --ε--> C → •bCb --b--> C → b•Cb --C--> C → bC•b --b--> C → bCb•
         ε-transitions from C → a•Ca and from C → b•Cb to each of C → •aCa, C → •#, C → •bCb

  25. 23/31 Another LR(0) grammar
     C → aCa | bCb | #
     States of the DFA:
         1: C → •aCa, C → •bCb, C → •#
         2: C → #•
         3: C → a•Ca, C → •aCa, C → •bCb, C → •#
         4: C → b•Cb, C → •aCa, C → •bCb, C → •#
         5: C → aC•a
         6: C → bC•b
         7: C → aCa•
         8: C → bCb•
     input: ba#ab
         stack    state   action
         $        1       S
         $1       4       S
         $14      3       S
         $143     2       R
         $143     5       S
         $1435    7       R
         $14      6       S
         $146     8       R
     (S = shift, R = reduce)

  26. 24/31 Deterministic PDAs
     The PDA for LR(0) parsing is deterministic.
     Some CFLs require non-deterministic PDAs, such as
     L = { ww^R | w ∈ {a, b}* }
     What goes wrong when we do LR(0) parsing on L?

  27. 25/31 Example 2
     L = { ww^R | w ∈ {a, b}* }
     C → aCa | bCb | ε
     NFA N:
         q0 --ε--> C → •aCa --a--> C → a•Ca --C--> C → aC•a --a--> C → aCa•
         q0 --ε--> C → •
         q0 --ε--> C → •bCb --b--> C → b•Cb --C--> C → bC•b --b--> C → bCb•
         ε-transitions from C → a•Ca and from C → b•Cb to each of C → •aCa, C → •, C → •bCb
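
One way to see the problem with this grammar is to compute a couple of DFA states and test them against the LR(0) condition (a state holding a completed rule must contain no other rule). The Python sketch below does this for the start state and the state reached on a; the helper names `closure`, `goto`, `show` and `has_conflict` and the item encoding `(lhs, rhs, dot)` are assumptions for illustration.

```python
# Grammar for { w w^R }: C -> aCa | bCb | (empty)
GRAMMAR = [("C", ("a", "C", "a")), ("C", ("b", "C", "b")), ("C", ())]

def closure(items):
    """Follow the epsilon-transitions C -> alpha . B gamma  ==>  B -> . beta."""
    items, todo = set(items), list(items)
    while todo:
        _, rhs, dot = todo.pop()
        if dot < len(rhs):
            for lhs2, rhs2 in GRAMMAR:
                if lhs2 == rhs[dot] and (lhs2, rhs2, 0) not in items:
                    items.add((lhs2, rhs2, 0))
                    todo.append((lhs2, rhs2, 0))
    return frozenset(items)

def goto(state, x):
    """DFA transition: advance the dot over x, then take the closure."""
    return closure({(l, r, d + 1) for l, r, d in state
                    if d < len(r) and r[d] == x})

def show(item):
    """Render an item as text, using '.' for the dot."""
    lhs, rhs, dot = item
    return f"{lhs} -> {''.join(rhs[:dot])}.{''.join(rhs[dot:])}"

def has_conflict(state):
    """True if a completed rule shares its state with other rules."""
    completed = [item for item in state if item[2] == len(item[1])]
    return bool(completed) and len(state) > 1

start = closure({(lhs, rhs, 0) for lhs, rhs in GRAMMAR})
after_a = goto(start, "a")
for name, state in [("start", start), ("after a", after_a)]:
    print(name, sorted(show(i) for i in state), "conflict:", has_conflict(state))
```

Both states contain the completed rule C → • next to rules that still expect a or b, so the parser cannot tell whether to reduce ε to C or to keep shifting.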
