
Predictive Parsers: LL(k) Parsing



  1. 10/17/2012

     Predictive Parsers

     Can we avoid backtracking? Yes, if for a given input symbol and a given non-terminal we can always choose the right alternative.

     This is possible if the first terminal of every alternative in a production is unique:

         A → a B D | b B B
         B → c | b c e
         D → d

     Parsing the input "abced" then needs no backtracking.

     Left factoring to enable prediction: change

         A → αβ | αγ

     to

         A → α A'
         A' → β | γ

     Predictive parsers also require that left recursion be eliminated.

     LL(k) Parsing

     • L: left-to-right scan
     • L: leftmost derivation
     • k: k symbols of lookahead (in practice, k = 1)

     LL(k) parsing is table-driven and efficient.

     LL(k) Parser Structure

     [Figure: the parser driver reads the input tokens (terminated by $) through a read head, consults the syntax stack and the parse table, and produces the output.]

     • Syntax stack: holds the right-hand sides (RHS) of grammar rules
     • Parse table: M[A, b] is an entry containing a rule "A → …" or error
     • Parser driver: chooses the next action based on (current token, stack top)

     Sample Parse Table

              int          *            +            (            )            $
         E    E → T X                                E → T X
         X                              X → + E                   X → ε        X → ε
         T    T → int Y                              T → ( E )
         Y                 Y → * T      Y → ε                     Y → ε        Y → ε

     Implementation with a 2-D parse table:
     • A row for each non-terminal
     • A column for every terminal and for $ (the end-of-input marker)
     • Every table entry contains at most one production
     • This is required for a grammar to be LL(1)
     • No backtracking
     • A fixed action for each (non-terminal, input symbol) combination

     LL(1) Parsing Algorithm

     X: symbol at the top of the syntax stack
     a: current input symbol

     Parsing is based on (X, a):
     • If X = a = $, the parser halts with "success".
     • If X = a ≠ $, pop X from the stack and advance the input head.
     • If X ≠ a:
         Case (a): if X ∈ T, the parser halts with "failed"; the input is rejected.
         Case (b): if X ∈ N and M[X, a] = "X → RHS", pop X and push RHS onto the stack in reverse order. An error entry in M[X, a] also rejects the input.

     Push RHS in Reverse Order

     If M[X, a] = "X → B c D", the stack (top on the left) changes from X … $ to B c D … $. Pushing D, then c, then B leaves B on top, so the leftmost symbol of the RHS is expanded first.
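The LL(1) parsing algorithm on this page can be sketched as a small table-driven driver. This is a minimal illustration, not the slides' own implementation: the names `PARSE_TABLE`, `NONTERMINALS`, and `ll1_parse` are made up for the example, the table is a hand-transcribed copy of the sample parse table, and an empty list stands for ε.

```python
# Minimal sketch of a table-driven LL(1) driver (illustrative names only).
# Grammar: E -> T X, X -> + E | eps, T -> int Y | ( E ), Y -> * T | eps.

NONTERMINALS = {"E", "X", "T", "Y"}

# M[A, a] -> right-hand side as a list of symbols; [] stands for epsilon.
# A missing key plays the role of an "error" entry.
PARSE_TABLE = {
    ("E", "int"): ["T", "X"],
    ("E", "("): ["T", "X"],
    ("X", "+"): ["+", "E"],
    ("X", ")"): [],
    ("X", "$"): [],
    ("T", "int"): ["int", "Y"],
    ("T", "("): ["(", "E", ")"],
    ("Y", "*"): ["*", "T"],
    ("Y", "+"): [],
    ("Y", ")"): [],
    ("Y", "$"): [],
}


def ll1_parse(tokens, start="E"):
    """Return the productions applied, in order, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]      # append the end-of-input marker
    stack = ["$", start]               # the top of the stack is the list's end
    pos = 0                            # read head
    applied = []
    while True:
        top, current = stack[-1], tokens[pos]
        if top == current == "$":
            return applied             # X = a = $: success
        if top == current:
            stack.pop()                # X = a != $: match the terminal
            pos += 1
        elif top in NONTERMINALS:
            rhs = PARSE_TABLE.get((top, current))
            if rhs is None:
                raise SyntaxError(f"no rule in M[{top}, {current}]")
            stack.pop()
            stack.extend(reversed(rhs))  # push RHS in reverse order
            applied.append((top, rhs))
        else:
            raise SyntaxError(f"expected {top}, got {current}")
```

Calling `ll1_parse(["int", "*", "int"])` applies E → T X, T → int Y, Y → * T, T → int Y, Y → ε, X → ε in that order. Using a dictionary keyed by (non-terminal, terminal) keeps the lookup a single step per stack symbol, which is the point of the 2-D table.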

  2. LL(1) Grammars

     Remove left recursion and perform left factoring. Given the grammar:

         E → T + E | T
         T → int * T | int | ( E )

     The grammar has no left recursion but requires left factoring. After rewriting the grammar, we have:

         E → T X
         X → + E | ε
         T → int Y | ( E )
         Y → * T | ε

     LL(1) Parsing

     [Slide frames: parsing the input int * int $ with the sample parse table. The stack (top on the left) goes from E $ to T X $ to int Y X $.]
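The left-factoring rewrite on this page (E → T + E | T becoming E → T X, X → + E | ε) can be sketched in a few lines. This is a hedged illustration under stated assumptions: the grammar is a dictionary from non-terminal to a list of alternatives, `[]` stands for ε, and the helper names `common_prefix` and `left_factor` are hypothetical. The sketch names the new non-terminals E' and T' where the slides use X and Y, and it performs only one pass, so tails that still share a prefix would need another call.

```python
def common_prefix(sequences):
    """Longest list of symbols that every sequence starts with."""
    prefix = []
    for symbols in zip(*sequences):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix


def left_factor(grammar):
    """One pass of left factoring.

    grammar maps a non-terminal to a list of alternatives; each
    alternative is a list of symbols and [] stands for epsilon.
    """
    result = {}
    for head, alternatives in grammar.items():
        # Group the alternatives by their first symbol.
        groups = {}
        for alt in alternatives:
            key = alt[0] if alt else None
            groups.setdefault(key, []).append(alt)
        result[head] = []
        for key, group in groups.items():
            if key is None or len(group) == 1:
                result[head].extend(group)   # nothing to factor
                continue
            prefix = common_prefix(group)
            fresh = head + "'"               # new non-terminal A'
            while fresh in grammar or fresh in result:
                fresh += "'"
            result[head].append(prefix + [fresh])          # A  -> alpha A'
            result[fresh] = [alt[len(prefix):] for alt in group]  # A' -> tails
    return result


grammar = {
    "E": [["T", "+", "E"], ["T"]],
    "T": [["int", "*", "T"], ["int"], ["(", "E", ")"]],
}
factored = left_factor(grammar)
```

Here `factored["E"]` is `[["T", "E'"]]` and `factored["E'"]` is `[["+", "E"], []]`, which is the slides' E → T X, X → + E | ε up to renaming.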

  3. LL(1) Parsing

     [Slide frames: the parse of int * int $ continues with the same parse table. The stack (top on the left) goes from Y X $ to * T X $ to T X $ to int Y X $.]

  4. LL(1) Parsing

     [Slide frames: the parse of int * int $ finishes. The stack (top on the left) goes from Y X $ to X $ to $, at which point the input is also at $ and the parser accepts.]

     Action List

         Stack         Input           Action
         E $           int * int $     E → T X
         T X $         int * int $     T → int Y
         int Y X $     int * int $     terminal
         Y X $         * int $         Y → * T
         * T X $       * int $         terminal
         T X $         int $           T → int Y
         int Y X $     int $           terminal
         Y X $         $               Y → ε
         X $           $               X → ε
         $             $               Halt and accept

     Constructing the Parse Table

     We need to know under which terminals to place each production in the table.

     We have restricted our grammars so that left recursion is eliminated and they have been left factored. That means each production is uniquely recognizable by the first terminal it would derive.

     Thus, we can construct our table from 2 sets:
     • For each symbol A, the set of terminals that can begin a string derived from A. This set is called the FIRST set of A.
     • For each non-terminal A, the set of terminals that can appear immediately after a string derived from A. This set is called the FOLLOW set of A.

  5. First(α)

     First(α) = the set of terminals that can start a string of terminals derived from α.

     Apply the following rules until no terminal or ε can be added:
     1. If t ∈ T, then First(t) = { t }. For example, First(+) = { + }.
     2. If X ∈ N and X → ε exists (X is nullable), then add ε to First(X). For example, First(Y) = { *, ε }.
     3. If X ∈ N and X → Y1 Y2 Y3 … Ym, where Y1, Y2, Y3, … Ym are grammar symbols, then:
            for each i from 1 to m
                if Y1 … Yi-1 are all nullable (or if i = 1)
                    First(X) = First(X) ∪ First(Yi)

     Follow(α)

     Follow(α) = { t | S ⇒* β α t γ }

     Intuition: if X → A B, then First(B) ⊆ Follow(A). However, B may be nullable, i.e., B ⇒* ε.

     Apply the following rules until nothing more can be added:
     1. $ ∈ Follow(S), where S is the start symbol. E.g., Follow(E) = { $, … }.
     2. Look at each occurrence of a non-terminal on the right-hand side of a production that is followed by something:
            if A → α B β, then First(β) − { ε } ⊆ Follow(B)
     3. Look at each non-terminal on the RHS that is followed by nothing, or only by something nullable:
            if (A → α B) or (A → α B β and ε ∈ First(β)), then Follow(A) ⊆ Follow(B)

     Algorithm to Compute FIRST, FOLLOW, and nullable

     Initialize FIRST and FOLLOW to all empty sets, and nullable to all false.

         foreach terminal symbol Z
             FIRST[Z] ← {Z}
         do
             foreach production X → Y1 Y2 … Yk
                 if Y1 … Yk are all nullable (or if k = 0)
                     then nullable[X] ← true
                 foreach i from 1 to k, each j from i + 1 to k
                     if Y1 … Yi−1 are all nullable (or if i = 1)
                         then FIRST[X] ← FIRST[X] ∪ FIRST[Yi]
                     if Yi+1 … Yk are all nullable (or if i = k)
                         then FOLLOW[Yi] ← FOLLOW[Yi] ∪ FOLLOW[X]
                     if Yi+1 … Yj−1 are all nullable (or if i + 1 = j)
                         then FOLLOW[Yi] ← FOLLOW[Yi] ∪ FIRST[Yj]
         until FIRST, FOLLOW, and nullable did not change in this iteration.

     Example

     Grammar:

         E → T X
         X → + E | ε
         T → int Y | ( E )
         Y → * T | ε

         Symbol    First      Follow
         (         (
         )         )
         +         +
         *         *
         int       int
         Y         *, ε       $, ), +
         X         +, ε       $, )
         T         (, int     $, ), +
         E         (, int     $, )

     Constructing LL(1) Parse Table

     To construct the parse table, we check each production A → α:
     • For each terminal a ∈ First(α), add A → α to M[A, a].
     • If ε ∈ First(α), then for each terminal b ∈ Follow(A), add A → α to M[A, b].
     • If ε ∈ First(α) and $ ∈ Follow(A), then add A → α to M[A, $].

     For the example grammar, the first rule alone fills in these entries:

              int          *            +            (            )            $
         E    E → T X                                E → T X
         X                              X → + E
         T    T → int Y                              T → ( E )
         Y                 Y → * T
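The fixed-point computation of FIRST, FOLLOW, and nullable, and the table-construction rules that use them, can be sketched together as follows. This is an illustrative sketch under a few assumptions that are not in the slides: the grammar is a list of (head, right-hand side) pairs with `[]` for ε, nullable is tracked as a separate set so the FIRST sets here never contain ε, `$` is placed in FOLLOW of the start symbol up front, and the names `analyze` and `build_table` are hypothetical.

```python
GRAMMAR = [
    ("E", ["T", "X"]),
    ("X", ["+", "E"]),
    ("X", []),                 # X -> epsilon
    ("T", ["int", "Y"]),
    ("T", ["(", "E", ")"]),
    ("Y", ["*", "T"]),
    ("Y", []),                 # Y -> epsilon
]


def analyze(grammar, start):
    """Iterate to a fixed point; returns (first, follow, nullable)."""
    nonterminals = {head for head, _ in grammar}
    symbols = nonterminals | {s for _, rhs in grammar for s in rhs}
    first = {s: (set() if s in nonterminals else {s}) for s in symbols}
    follow = {n: set() for n in nonterminals}
    follow[start].add("$")
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, rhs in grammar:
            if head not in nullable and all(s in nullable for s in rhs):
                nullable.add(head)
                changed = True
            for i, sym in enumerate(rhs):
                # FIRST: sym contributes if everything before it is nullable.
                if all(s in nullable for s in rhs[:i]) and not first[sym] <= first[head]:
                    first[head] |= first[sym]
                    changed = True
                if sym not in nonterminals:
                    continue
                # FOLLOW: collect what can come right after sym in this rule.
                trailer = set()
                for nxt in rhs[i + 1:]:
                    trailer |= first[nxt]
                    if nxt not in nullable:
                        break
                else:                      # the rest is nullable (or empty)
                    trailer |= follow[head]
                if not trailer <= follow[sym]:
                    follow[sym] |= trailer
                    changed = True
    return first, follow, nullable


def build_table(grammar, first, follow, nullable):
    """Fill M[A, a]; raise ValueError if a cell would get two productions."""
    table = {}
    for head, rhs in grammar:
        lookahead = set()
        for sym in rhs:
            lookahead |= first[sym]
            if sym not in nullable:
                break
        else:                              # rhs can derive epsilon
            lookahead |= follow[head]
        for terminal in lookahead:
            if (head, terminal) in table:
                raise ValueError(f"not LL(1): conflict at M[{head}, {terminal}]")
            table[head, terminal] = rhs
    return table


FIRST, FOLLOW, NULLABLE = analyze(GRAMMAR, "E")
TABLE = build_table(GRAMMAR, FIRST, FOLLOW, NULLABLE)
```

For the example grammar this gives FOLLOW sets that agree with the table above (for instance `FOLLOW["Y"]` is `{"$", ")", "+"}`) and a `TABLE` with 11 entries. Because `$` is already a member of the FOLLOW sets, the second and third table-construction rules collapse into the single `lookahead |= follow[head]` step.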
