
SableCC



  1. SableCC
     Compilation 2007
     Michael I. Schwartzbach
     BRICS, University of Aarhus

     The SableCC Tool

     • The input is:
         • a sequence of token definitions
         • a context-free grammar
     • The output is:
         • a LALR(1) parser for the defined language
         • available as a Java class

     Our Favorite Grammar in SableCC

         Helpers
           tab = 9;
           cr = 13;
           lf = 10;

         Tokens
           eol = cr | lf | cr lf;
           blank = ' ' | tab;
           star = '*';
           slash = '/';
           plus = '+';
           minus = '-';
           lpar = '(';
           rpar = ')';
           id = 'x' | 'y' | 'z';

         Ignored Tokens
           blank, eol;

         Productions
           start  = {plus} start plus term |
                    {minus} start minus term |
                    {term} term;
           term   = {mult} term star factor |
                    {div} term slash factor |
                    {factor} factor;
           factor = {id} id |
                    {paren} lpar start rpar;

     Generated Classes

         drwxr-xr-x 2 mis users 4096 Sep 7 09:28 analysis/
         drwxr-xr-x 2 mis users 4096 Sep 7 09:28 lexer/
         drwxr-xr-x 2 mis users 4096 Sep 7 09:32 node/
         drwxr-xr-x 2 mis users 4096 Sep 7 09:28 parser/
         -rw-r--r-- 1 mis users  536 Sep 7 09:32 xyz.sablecc

     We never need to look at this output.
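To make the shape of this grammar concrete, here is a minimal hand-written sketch in Java. It is not SableCC output and does not use the generated classes: the class `ExprShape` and its `shape` method are hypothetical names introduced only for illustration. The left recursion in `start` and `term` is replaced by loops, which gives the same left-associative grouping, and the result is printed with explicit parentheses so that precedence and associativity become visible.

```java
public class ExprShape {
    private final String in;
    private int pos = 0;

    private ExprShape(String input) {
        // blank and eol are ignored tokens in the grammar, so whitespace is dropped here
        this.in = input.replaceAll("\\s+", "");
    }

    /** Returns the input with every binary operation fully parenthesized. */
    public static String shape(String input) {
        ExprShape p = new ExprShape(input);
        String result = p.start();
        if (p.pos != p.in.length()) {
            throw new IllegalArgumentException("unexpected input at position " + p.pos);
        }
        return result;
    }

    private char peek() {
        return pos < in.length() ? in.charAt(pos) : '\0';
    }

    // start = {plus} start plus term | {minus} start minus term | {term} term
    private String start() {
        String left = term();
        while (peek() == '+' || peek() == '-') {
            char op = in.charAt(pos++);
            left = "(" + left + op + term() + ")";
        }
        return left;
    }

    // term = {mult} term star factor | {div} term slash factor | {factor} factor
    private String term() {
        String left = factor();
        while (peek() == '*' || peek() == '/') {
            char op = in.charAt(pos++);
            left = "(" + left + op + factor() + ")";
        }
        return left;
    }

    // factor = {id} id | {paren} lpar start rpar
    private String factor() {
        char c = peek();
        if (c == 'x' || c == 'y' || c == 'z') {
            pos++;
            return String.valueOf(c);
        }
        if (c == '(') {
            pos++;
            String inner = start();
            if (peek() != ')') {
                throw new IllegalArgumentException("expected ')' at position " + pos);
            }
            pos++;
            return inner;
        }
        throw new IllegalArgumentException("unexpected input at position " + pos);
    }

    public static void main(String[] args) {
        System.out.println(shape("x - y - z")); // ((x-y)-z)
        System.out.println(shape("x + y * z")); // (x+(y*z))
    }
}
```

The two-level split into `start` and `term` is what makes `*` and `/` bind tighter than `+` and `-`; the left recursion is what makes `x - y - z` group to the left.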

  2. The Main Application

         import parser.*;
         import lexer.*;
         import node.*;
         import java.io.*;

         class Main {
           public static void main(String[] args) {
             try {
               Parser p =
                 new Parser(
                   new Lexer(
                     new PushbackReader(new InputStreamReader(System.in))));
               Start tree = p.parse(); /* parse the input */
             } catch (Exception e) {
               System.out.println(e);
             }
           }
         }

     An Ambiguous Grammar

         X → Λ | a X | a a X

     Any string in this language has exponentially many different parse trees.

     The string a a ... a (n occurrences of a) has exactly Fib(n) parse trees.

     The SableCC Version

         Tokens
           a = 'a';

         Productions
           x = {empty} |
               {one} a x |
               {two} [first]:a [second]:a x;

     • Note that all symbols must have unique names
     • The default name for foo is [foo]:

     SableCC is Unhappy

         reduce/reduce conflict in state [stack: TA TA PX *] on EOF in {
           [ PX = TA PX * ] followed by EOF (reduce),
           [ PX = TA TA PX * ] followed by EOF (reduce)
         }

     The LALR(1) table contains conflicting actions.
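The Fibonacci claim for the ambiguous grammar can be checked with a short count. A parse tree for n copies of `a` starts with either `a X` (leaving n-1) or `a a X` (leaving n-2), so the number of trees satisfies trees(n) = trees(n-1) + trees(n-2). The sketch below assumes the indexing Fib(0) = Fib(1) = 1, which the slides do not state explicitly; `ParseTreeCount` and `trees` are hypothetical names.

```java
import java.math.BigInteger;

public class ParseTreeCount {
    /** Number of parse trees for a string of n a's under X -> Λ | a X | a a X. */
    public static BigInteger trees(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        BigInteger prev = BigInteger.ONE; // trees(0): only X -> Λ applies
        BigInteger cur = BigInteger.ONE;  // trees(1): a X followed by X -> Λ
        for (int i = 2; i <= n; i++) {
            // first step is either "a X" (n-1 left) or "a a X" (n-2 left)
            BigInteger next = cur.add(prev);
            prev = cur;
            cur = next;
        }
        return cur; // for n = 0 and n = 1 both values are 1
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 10; n++) {
            System.out.println(n + " -> " + trees(n));
        }
    }
}
```

The count grows by a factor of roughly 1.6 per extra token, which is the sense in which the number of parse trees is exponential in the length of the string.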

  3. Solution: Less Stupid Grammar

         Tokens
           a = 'a';

         Productions
           x = {empty} |
               {one} a x ;

     A Grammar for If-Statements

         Tokens
           eol = cr | lf | cr lf;
           blank = ' ' | tab;
           exp = 'exp';
           if = 'if';
           then = 'then';
           else = 'else';
           assign = 'assign';

         Ignored Tokens
           blank, eol;

         Productions
           stm = {one} if exp then stm |
                 {both} if exp then [thenbranch]:stm else [elsebranch]:stm |
                 {assign} assign;

     SableCC is Unhappy

         shift/reduce conflict in state [stack: TIf TExp TThen PStm *] on TElse in {
           [ PStm = TIf TExp TThen PStm * TElse PStm ] (shift),
           [ PStm = TIf TExp TThen PStm * ] followed by TElse (reduce)
         }

     But the grammar does not appear to be stupid...

     Solution: Less Natural Grammar

         Productions
           stm  = {one} if exp then stm |
                  {both} if exp then [thenbranch]:stm2 else [elsebranch]:stm |
                  {assign} assign;
           stm2 = {both} if exp then [thenbranch]:stm2 else [elsebranch]:stm2 |
                  {assign} assign;
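The difference between the two if-statement grammars can be made measurable by counting parse trees. The following is a rough sketch, not part of SableCC: `DanglingElseCount` and its methods are hypothetical names, and the counter is deliberately limited to grammars in which every alternative is non-empty and starts with a token (true for both grammars here). It requires Java 9 or later for `Map.of` and `List.of`.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DanglingElseCount {
    // Illustrative encodings; any symbol that is not a map key is treated as a token.
    static Map<String, List<List<String>>> ambiguous() {
        return Map.of("stm", List.of(
            List.of("if", "exp", "then", "stm"),
            List.of("if", "exp", "then", "stm", "else", "stm"),
            List.of("assign")));
    }

    static Map<String, List<List<String>>> rewritten() {
        return Map.of(
            "stm", List.of(
                List.of("if", "exp", "then", "stm"),
                List.of("if", "exp", "then", "stm2", "else", "stm"),
                List.of("assign")),
            "stm2", List.of(
                List.of("if", "exp", "then", "stm2", "else", "stm2"),
                List.of("assign")));
    }

    /** Counts the parse trees that derive the whitespace-separated input from stm. */
    static long count(Map<String, List<List<String>>> g, String input) {
        List<String> in = Arrays.asList(input.trim().split("\\s+"));
        return countSym(g, "stm", in, 0, in.size(), new HashMap<>());
    }

    // Number of parse trees deriving in[i..j) from the single symbol sym.
    private static long countSym(Map<String, List<List<String>>> g, String sym,
                                 List<String> in, int i, int j, Map<String, Long> memo) {
        if (!g.containsKey(sym)) {
            return (j == i + 1 && in.get(i).equals(sym)) ? 1 : 0; // token
        }
        String key = sym + ":" + i + ":" + j;
        Long cached = memo.get(key);
        if (cached != null) {
            return cached;
        }
        long total = 0;
        for (List<String> alt : g.get(sym)) {
            total += countSeq(g, alt, 0, in, i, j, memo);
        }
        memo.put(key, total);
        return total;
    }

    // Number of ways the symbols alt[k..] derive in[i..j).
    private static long countSeq(Map<String, List<List<String>>> g, List<String> alt, int k,
                                 List<String> in, int i, int j, Map<String, Long> memo) {
        if (k == alt.size()) {
            return i == j ? 1 : 0;
        }
        long total = 0;
        for (int m = i + 1; m <= j; m++) { // every symbol covers at least one token
            long head = countSym(g, alt.get(k), in, i, m, memo);
            if (head != 0) {
                total += head * countSeq(g, alt, k + 1, in, m, j, memo);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        String s = "if exp then if exp then assign else assign";
        System.out.println("first grammar:     " + count(ambiguous(), s));
        System.out.println("rewritten grammar: " + count(rewritten(), s));
    }
}
```

For the nested statement in `main`, the first grammar yields two parse trees and the stm/stm2 grammar yields one, which is the ambiguity that the shift/reduce conflict reports.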

  4. Dangling Else Problem

     • An example statement:

           if exp then if exp then assign else assign

     • To which if does the else belong?
     • The first grammar is ambiguous
     • Our modified grammar parses the string as:

           if exp then ( if exp then assign else assign )

     The Palindrome Grammar

         Tokens
           zero = '0';
           one = '1';

         Productions
           pal = {empty} |
                 {one} one |
                 {zero} zero |
                 {oneone} [first]:one pal [second]:one |
                 {zerozero} [first]:zero pal [second]:zero;

     SableCC is Unhappy

         shift/reduce conflict in state [stack: TZero *] on TZero in {
           [ PPal = * TZero PPal TZero ] (shift),
           [ PPal = * TZero ] (shift),
           [ PPal = * ] followed by TZero (reduce),
           [ PPal = TZero * ] followed by TZero (reduce)
         }

         shift/reduce conflict in state [stack: TZero *] on TZero in {
           [ PPal = * TZero PPal TZero ] (shift),
           [ PPal = * TZero ] (shift),
           [ PPal = * ] followed by TZero (reduce),
           [ PPal = TZero * ] followed by TZero (reduce)
         }

         shift/reduce conflict in state [stack: TZero *] on TOne in {
           [ PPal = * TOne PPal TOne ] (shift),
           [ PPal = * TOne ] (shift),
           [ PPal = TZero * ] followed by TOne (reduce)
         }

     No Solution!

     • There is no LALR(1) grammar for this language
     • Some grammars are not LALR(1)
     • And some languages are not LALR(1)
     • Some grammars are ambiguous
     • And some languages are ambiguous
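The nearest-if reading of the dangling else can also be illustrated with a hand-written recursive-descent sketch in Java. This is not how SableCC works internally and it does not use the generated classes; `NearestIf` and `bracket` are hypothetical names. The parser consumes an `else` as soon as it sees one, which attaches it to the innermost open `if`, and it prints the resulting structure with parentheses.

```java
import java.util.Arrays;
import java.util.List;

public class NearestIf {
    private final List<String> toks;
    private int pos = 0;

    private NearestIf(String input) {
        this.toks = Arrays.asList(input.trim().split("\\s+"));
    }

    /** Returns the statement with every if-statement wrapped in parentheses. */
    public static String bracket(String input) {
        NearestIf p = new NearestIf(input);
        String result = p.stm();
        if (p.pos != p.toks.size()) {
            throw new IllegalArgumentException("trailing input at token " + p.pos);
        }
        return result;
    }

    private String peek() {
        return pos < toks.size() ? toks.get(pos) : "<EOF>";
    }

    private void expect(String token) {
        if (!peek().equals(token)) {
            throw new IllegalArgumentException("expected " + token + " but found " + peek());
        }
        pos++;
    }

    private String stm() {
        if (peek().equals("assign")) {
            pos++;
            return "assign";
        }
        expect("if");
        expect("exp");
        expect("then");
        String thenBranch = stm();
        if (peek().equals("else")) {
            // Taking the else immediately binds it to the innermost open if.
            pos++;
            String elseBranch = stm();
            return "(if exp then " + thenBranch + " else " + elseBranch + ")";
        }
        return "(if exp then " + thenBranch + ")";
    }

    public static void main(String[] args) {
        System.out.println(bracket("if exp then if exp then assign else assign"));
        // (if exp then (if exp then assign else assign))
    }
}
```

On the example statement this produces the same grouping as the stm/stm2 grammar: the `else` belongs to the second `if`.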

  5. Language Containments

     [Diagram: nested language classes, Context-Free ⊃ Unambiguous ⊃ LALR(1).
      { a^i b^j c^k | i=j or j=k } is context-free but not unambiguous;
      palindromes are unambiguous but not LALR(1).]

     EBNF Features

     SableCC allows right-hand side abbreviations:
     • Optional: x = y?
     • List: x = y*
     • Non-empty list: x = y+

     This has many benefits:
     • shorter
     • less error-prone
     • fewer names must be invented

     EBNF Example

         block = lbrace decl* stm+ rbrace ;
         decl = type id init? semicolon ;
         init = equals exp;

     EBNF Expansion

     • x = y?

           x = {some} y | {none} ;

     • x = y*

           x = {zero} | {more} y x ;

     • x = y+

           x = {one} y | {more} y x ;
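The three expansion rules are mechanical enough to write down as a small rewriting function. The Java sketch below only reproduces the textual scheme from the slide; it makes no claim about how SableCC implements the operators internally, and `EbnfExpand`, `expand`, and the production name `decl_list` are hypothetical names used for illustration.

```java
public class EbnfExpand {
    /**
     * Expands the abbreviation "lhs = elem op" (op is '?', '*' or '+')
     * into plain alternatives, following the scheme on the slide.
     */
    public static String expand(String lhs, String elem, char op) {
        switch (op) {
            case '?':
                return lhs + " = {some} " + elem + " | {none} ;";
            case '*':
                return lhs + " = {zero} | {more} " + elem + " " + lhs + " ;";
            case '+':
                return lhs + " = {one} " + elem + " | {more} " + elem + " " + lhs + " ;";
            default:
                throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    public static void main(String[] args) {
        System.out.println(expand("x", "y", '?'));
        System.out.println(expand("x", "y", '*'));
        System.out.println(expand("x", "y", '+'));
        // A list of declarations, as in "decl*" from the block example:
        System.out.println(expand("decl_list", "decl", '*'));
    }
}
```

Each use of `*` or `+` written out by hand would need a fresh helper production such as `decl_list`, which is the "fewer names must be invented" benefit.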
