HW8: Use Lex/Yacc to Turn This Into This



  1. HW8: Use Lex/Yacc to turn this into this

     Turn this:

         Here's a list:
           * This is item one of a list
           * This is item two. Lists should be
             indented four spaces, with each item
             marked by a "*" two spaces left of
             four-space margin. Lists may contain
             nested lists, like this:
               * Hi, I'm item one of an inner list.
               * Me two.
               * Item 3, inner.
           * Item 3, outer list.
         This is outside both lists; should be back
         to no indent.

         Final suggestions:

     Into this:

         <P>
         Here's a list:
         <UL>
         <LI> This is item one of a list
         <LI>This is item two. Lists should be
         indented four spaces, with each item
         marked by a "*" two spaces left of
         four-space margin. Lists may contain
         nested lists, like this:<UL><LI> Hi, I'm
         item one of an inner list. <LI>Me two.
         <LI> Item 3, inner. </UL><LI> Item 3,
         outer list.</UL> This is outside both lists; should be
         back to no indent.
         <P><P> Final suggestions

     Lex and Yacc: A Quick Tour

     Pipeline for the input "if myVar == 6.02e23**2 then f( ..":

         char stream -> LEX -> token stream -> YACC -> parse tree

         if-stmt
           ==
             var
             **
               float-lit
               int-lit
           fun call
             Arg 1
             Arg 2
             . . .

     Lex / Yacc History
       - Origin: early 1970's at Bell Labs
       - Many versions & many similar tools
           - Lex, flex, jflex, posix, …
           - Yacc, bison, byacc, CUP, posix, …
       - Targets C, C++, C#, Python, Ruby, ML, …
       - We'll use jflex & byacc/j, targeting java (but for simplicity, I usually just say lex/yacc)
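The "char stream -> LEX -> token stream" step in the pipeline above can be sketched by hand in plain Java. This is only an illustration, not what jflex generates: the token kind names (KEYWORD, ID, FLOAT, INT, OP) and the class name are invented for this sketch, and java.util.regex picks the first alternative that matches, whereas lex prefers the longest match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration: token kinds and names are assumptions, not part of the slides.
class TokenStreamSketch {
    // Group names in the order they appear in the pattern below.
    private static final String[] KINDS = {"WS", "FLOAT", "INT", "KEYWORD", "ID", "OP"};

    // One alternative per token kind; at a given position the first alternative that matches wins.
    private static final Pattern TOKEN = Pattern.compile(
          "(?<WS>\\s+)"
        + "|(?<FLOAT>\\d+\\.\\d*(?:[eE][+-]?\\d+)?)"
        + "|(?<INT>\\d+)"
        + "|(?<KEYWORD>(?:if|then)\\b)"
        + "|(?<ID>[A-Za-z_]\\w*)"
        + "|(?<OP>==|\\*\\*|[()])");

    // Turns a character stream into a list of "KIND:text" tokens, dropping whitespace.
    static List<String> tokenize(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("unexpected character at offset " + pos);
            }
            for (String kind : KINDS) {
                if (m.group(kind) != null) {
                    if (!kind.equals("WS")) {
                        out.add(kind + ":" + m.group());
                    }
                    break;
                }
            }
            pos = m.end();
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [KEYWORD:if, ID:myVar, OP:==, FLOAT:6.02e23, OP:**, INT:2, KEYWORD:then, ID:f, OP:(]
        System.out.println(tokenize("if myVar == 6.02e23**2 then f("));
    }
}
```

A parser would then consume these tokens one at a time instead of raw characters, which is the division of labor the diagram shows.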

  2. Lex: A Lexical Analyzer Generator
       - Input:
           - Regular exprs defining "tokens"
           - Fragments of declarations & code
       - Output:
           - A java program "yylex.java"
       - Use:
           - Compile & link with your main()
           - Calls to yylex() read chars & return successive tokens.
       - Diagram: my.flex -> jflex -> yylex.java

     Uses
       - "Front end" of many real compilers
           - E.g., gcc
       - "Little languages":
           - Many special purpose utilities evolve some clumsy, ad hoc, syntax
           - Often easier, simpler, cleaner and more flexible to use lex/yacc or similar tools from the start

     yacc: A Parser Generator
       - Input:
           - A context-free grammar
           - Fragments of declarations & code
       - Output:
           - A java program & some "header" files
       - Use:
           - Compile & link it with your main()
           - Call yyparse() to parse the entire input
           - yyparse() calls yylex() to get successive tokens
       - Diagram: my.y -> byaccj -> Parser.java, ParserVal.java

     Lex Input: "mylexer.flex"

         // java stuff
         %%                                      %%: Lex section delims
         %byaccj
         %{                                      Declarations & code: most
         public foo()…                           copied verbatim to java pgm
         %}
         %%
         [a-zA-Z]+  {foo(); return(42); }        Rules: regexps + {Actions}; 42 is the token code
         [ \t\n]    {; /* skip whitespace */}    No action
         …
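To make the yylex() contract concrete, here is a minimal hand-written sketch of what a scanner for the two rules in "mylexer.flex" does on each call. It is not the code jflex generates. The class name, the body of foo(), and the choice to throw on an unmatched character are assumptions made for this sketch; returning 0 at end of input follows the convention used by the lexers elsewhere in these slides.

```java
// Hypothetical stand-in for a generated scanner; names other than yylex() and foo() are assumptions.
class HandScanner {
    static final int WORD = 42;   // token code returned by the [a-zA-Z]+ rule on the slide

    private final String input;
    private int pos = 0;
    int wordsSeen = 0;            // side effect of foo(), only here so the action is observable

    HandScanner(String input) {
        this.input = input;
    }

    // Stand-in for the slide's foo(); the real body is not shown there.
    void foo() {
        wordsSeen++;
    }

    private static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Each call reads characters and returns the next token code, or 0 at end of input.
    int yylex() {
        while (pos < input.length()) {
            char c = input.charAt(pos);
            if (c == ' ' || c == '\t' || c == '\n') {
                pos++;            // rule [ \t\n]: the action returns nothing, so keep scanning
                continue;
            }
            if (isLetter(c)) {
                // rule [a-zA-Z]+: consume the longest run of letters, then run the action
                while (pos < input.length() && isLetter(input.charAt(pos))) {
                    pos++;
                }
                foo();
                return WORD;
            }
            throw new IllegalStateException("no rule matches '" + c + "'");
        }
        return 0;
    }

    public static void main(String[] args) {
        HandScanner s = new HandScanner("ab  cd\n e");
        int code;
        while ((code = s.yylex()) != 0) {
            System.out.println("token code " + code);
        }
    }
}
```

The point of the sketch is the calling pattern: whitespace never reaches the caller, and every return value is a token code that a parser can switch on.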

  3. Yacc Input: "expr.y"
       Grammar: S -> E;  E -> E+n | E-n | n
       (The actions below are C code; a java example comes later. byaccj turns this file into Parser.java.)

         %{                                      Java decls
         import java.io.*;…
         %}
         %token NUM VAR                          Yacc decls
         %%
         stmt: exp { printf("%d\n",$1);}         Rules and {Actions}
             ;
         exp : exp '+' NUM { $$ = $1 + $3; }
             | exp '-' NUM { $$ = $1 - $3; }
             | NUM         { $$ = $1; }
             ;
         %%
         public static void main(…               Java code

     Lex Regular Expressions
       - Letters & numbers match themselves
       - Ditto \n, \t, \r
       - Punctuation often has special meaning
           - But can be escaped: \* matches "*"
       - Union, Concatenation and Star
           - r|s, rs, r*; also r+, r?; parens for grouping
       - Character groups
           - [ab*c] == [*cab], [a-z2648AEIOU], [^abc]
           - "^" for "not" only in char groups, not complementation

     Expression lexer: "expr.l"

         %{
         #include "y.tab.h"
         %}
         %%
         [0-9]+  { yylval = atoi(yytext); return NUM;}
         [ \t]   { /* ignore whitespace */ }
         \n      { return 0; /* logical EOF */ }
         .       { return yytext[0]; /* +-*, etc. */ }
         %%
         yyerror(char *msg){printf("%s,%s\n",msg,yytext);}
         int yywrap(){return 1;}

       y.tab.h:

         #define NUM 258
         #define VAR 259
         #define YYSTYPE int
         extern YYSTYPE yylval;

     Lex/Yacc Interface: Compile Time

         my.y    -> byaccj -> Parser.java, ParserVal.java
         my.flex -> jflex  -> Yylex.java
         Parser.java, ParserVal.java, Yylex.java, more.java -> javac -> Parser.class
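The regular-expression rules listed above can be tried out quickly with java.util.regex, which treats these particular constructs the same way. This is only a convenience sketch: the helper name full() is invented here, and JFlex's own pattern syntax differs from java.util.regex in other respects, so it is not a substitute for testing a real .flex file.

```java
import java.util.regex.Pattern;

// Hypothetical helper class for experimenting with the constructs on the slide.
class RegexNotes {
    // True when the whole string s matches the regular expression.
    static boolean full(String regex, String s) {
        return Pattern.matches(regex, s);
    }

    public static void main(String[] args) {
        // Inside a character group, "*" is just a character, and order does not matter.
        System.out.println(full("[ab*c]", "*"));          // true
        System.out.println(full("[*cab]", "*"));          // true

        // Ranges and single characters can be mixed in one group.
        System.out.println(full("[a-z2648AEIOU]", "q"));  // true
        System.out.println(full("[a-z2648AEIOU]", "B"));  // false

        // "^" negates only at the start of a group.
        System.out.println(full("[^abc]", "d"));          // true
        System.out.println(full("[^abc]", "a"));          // false

        // Outside a group, punctuation must be escaped to match itself.
        System.out.println(full("\\*", "*"));             // true

        // Union, concatenation, star, plus, optional, and grouping.
        System.out.println(full("a|b", "b"));             // true
        System.out.println(full("ab*", "abbb"));          // true
        System.out.println(full("(ab)+", "abab"));        // true
        System.out.println(full("ab?", "a"));             // true
    }
}
```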

  4. Lex/Yacc Interface: Run Time
       - main() calls yyparse(); yyparse() calls yylex().
       - yylex() hands back the token code via its return value and the token value via yylval.
       - Inside a lexer action ("Myaction"):

         ...
         yylval = ...       // token value
         ...
         return(code)       // token code

     Parser "Value" class

         public class ParserVal
         {
           public int ival;
           public double dval;
           public String sval;
           public Object obj;
           public ParserVal(int val)
             { ival=val; }
           public ParserVal(double val)
             { dval=val; }
           public ParserVal(String val)
             { sval=val; }
           public ParserVal(Object val)
             { obj=val; }
         }//end class

         //then do
         yylval = new ParserVal(3.14);
         yylval = new ParserVal(42);
         // ...or something like...
         yylval = new ParserVal(new myTypeOfObject());

         // in yacc actions, e.g.:
         $$.ival = $1.ival + $2.ival;
         $$.dval = $1.dval - $2.dval;

     "Calculator" example
       From http://byaccj.sourceforge.net/
       On this & next 3 slides, some details may be missing or wrong, but the big picture is OK.

         %{
         import java.lang.Math;
         import java.io.*;
         import java.util.StringTokenizer;
         %}
         /* YACC Declarations; mainly op prec & assoc */
         %token NUM
         %left '-' '+'
         %left '*' '/'
         %left NEG     /* negation--unary minus */
         %right '^'    /* exponentiation */
         /* Grammar follows */
         %%
         ...

     More Yacc Declarations

         %token BHTML BHEAD BTITLE BBODY P BR LI     Token names & types
         %token EHTML EHEAD ETITLE EBODY
         %token <sval> TEXT                          Type of yylval (if any)
         %type <obj> page head title                 Nonterm names & types
         %type <obj> words list item items
         %start page                                 Start sym
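One consequence of the value class shown above is easy to miss: each constructor fills in exactly one field, so an action that reads a different field than the lexer wrote sees a default value rather than an error. The sketch below uses a local copy of the class as it appears on the slide, nested inside a demo class so it compiles on its own; the demo class and the add() helper are invented for illustration.

```java
// Hypothetical demo; ParserVal here is a local copy of the class shown on the slide.
class ParserValDemo {
    static class ParserVal {
        public int ival;
        public double dval;
        public String sval;
        public Object obj;
        public ParserVal(int val)    { ival = val; }
        public ParserVal(double val) { dval = val; }
        public ParserVal(String val) { sval = val; }
        public ParserVal(Object val) { obj = val; }
    }

    // Mirrors a yacc action of the form: $$ = new ParserVal($1.dval + $3.dval);
    static ParserVal add(ParserVal left, ParserVal right) {
        return new ParserVal(left.dval + right.dval);
    }

    public static void main(String[] args) {
        ParserVal asDouble = new ParserVal(42.0); // double constructor: dval is set
        ParserVal asInt = new ParserVal(42);      // int constructor: only ival is set

        System.out.println(add(asDouble, new ParserVal(2.0)).dval); // 44.0
        System.out.println(add(asInt, new ParserVal(2.0)).dval);    // 2.0, because asInt.dval is still 0.0
    }
}
```

In practice this means the lexer and the grammar actions have to agree on which field carries each token's value, for example always dval for NUM in the calculator.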

  5. "Calculator" example: grammar
       - Input is one expression per line; output is its value.
       - Ambiguous grammar; prec/assoc decls are a (smart) hack to fix that.

         ...
         /* Grammar follows */
         %%
         input: /* empty string */
              | input line
              ;

         line: '\n'
             | exp '\n' { System.out.println(" " + $1.dval + " "); }
             ;

         exp: NUM                { $$ = $1; }
            | exp '+' exp        { $$ = new ParserVal($1.dval + $3.dval); }
            | exp '-' exp        { $$ = new ParserVal($1.dval - $3.dval); }
            | exp '*' exp        { $$ = new ParserVal($1.dval * $3.dval); }
            | exp '/' exp        { $$ = new ParserVal($1.dval / $3.dval); }
            | '-' exp %prec NEG  { $$ = new ParserVal(-$2.dval); }
            | exp '^' exp        { $$=new ParserVal(Math.pow( $1.dval, $3.dval ));}
            | '(' exp ')'        { $$ = $2; }
            ;
         %%
         ...

     "Calculator" example: lexer
       - NOT using lex; barehanded lexer with same interface.
       - Token code via return; value via yylval. See slide 20 (Parser "Value" class).

         %%
         ...
         String ins;
         StringTokenizer st;
         void yyerror(String s){
           System.out.println("par:"+s);
         }
         boolean newline;
         int yylex(){
           String s; int tok; Double d;
           if (!st.hasMoreTokens())
             if (!newline) {
               newline=true;
               return '\n'; //As in classic YACC example
             } else return 0;
           s = st.nextToken();
           try {
             d = Double.valueOf(s); /*this may fail*/
             yylval = new ParserVal(d.doubleValue());
             tok = NUM; }
           catch (Exception e) {
             tok = s.charAt(0);/*if not float, return char*/
           }
           return tok;
         }

     "Calculator" example: driver

         void dotest(){
           BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
           System.out.println("BYACC/J Calculator Demo");
           System.out.println("Note: Since this example uses the StringTokenizer");
           System.out.println("for simplicity, you will need to separate the items");
           System.out.println("with spaces, i.e.: '( 3 + 5 ) * 2'");
           while (true) {
             System.out.print("expression:");
             try {
               ins = in.readLine();
             }
             catch (Exception e) { }
             st = new StringTokenizer(ins);
             newline=false;
             yyparse();
           }
         }
         public static void main(String args[]){
           Parser par = new Parser(false);
           par.dotest();
         }

     Lex and Yacc: More Details
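To see what the precedence and associativity declarations in the calculator example buy you, it can help to write the same disambiguation by hand. The sketch below is a different technique from the table-driven parser byacc/j generates: it is a small precedence-climbing evaluator in plain Java whose levels mirror the declaration order (+ and - lowest, then * and /, then unary minus as NEG, then right-associative ^). The class and method names are invented for this sketch, and like the slide's demo it expects tokens separated by spaces.

```java
import java.util.StringTokenizer;

// Hypothetical hand-written evaluator; not the parser that byacc/j would generate.
class PrecClimb {
    private static final int NEG = 3;   // unary minus binds tighter than * and /, looser than ^

    private final StringTokenizer st;
    private String look;                // one-token lookahead, null at end of input

    PrecClimb(String line) {
        st = new StringTokenizer(line);
        advance();
    }

    private void advance() {
        look = st.hasMoreTokens() ? st.nextToken() : null;
    }

    // Binding strength of a binary operator, or -1 if the token is not one.
    private static int prec(String op) {
        if (op == null) return -1;
        switch (op) {
            case "+": case "-": return 1;
            case "*": case "/": return 2;
            case "^": return 4;
            default: return -1;
        }
    }

    // Parses operators whose precedence is at least minPrec.
    double expr(int minPrec) {
        double lhs = primary();
        while (prec(look) >= minPrec) {
            String op = look;
            int p = prec(op);
            advance();
            // Left-associative operators require a strictly tighter right side; "^" does not.
            double rhs = expr(op.equals("^") ? p : p + 1);
            lhs = apply(op, lhs, rhs);
        }
        return lhs;
    }

    private double primary() {
        if (look == null) throw new IllegalArgumentException("unexpected end of input");
        String t = look;
        advance();
        if (t.equals("-")) {
            return -expr(NEG + 1);      // operand may contain only operators tighter than NEG
        }
        if (t.equals("(")) {
            double v = expr(1);
            if (!")".equals(look)) throw new IllegalArgumentException("expected )");
            advance();
            return v;
        }
        return Double.parseDouble(t);
    }

    private static double apply(String op, double a, double b) {
        switch (op) {
            case "+": return a + b;
            case "-": return a - b;
            case "*": return a * b;
            case "/": return a / b;
            default:  return Math.pow(a, b);   // only "^" remains
        }
    }

    static double eval(String line) {
        PrecClimb p = new PrecClimb(line);
        double v = p.expr(1);
        if (p.look != null) throw new IllegalArgumentException("unexpected token " + p.look);
        return v;
    }

    public static void main(String[] args) {
        System.out.println(eval("( 3 + 5 ) * 2"));  // 16.0
        System.out.println(eval("2 - 3 - 4"));      // -5.0, left associative
        System.out.println(eval("2 ^ 3 ^ 2"));      // 512.0, right associative
        System.out.println(eval("- 2 ^ 2"));        // -4.0, ^ binds tighter than NEG
    }
}
```

With yacc the same choices are stated in four declaration lines and the grammar stays in its short, ambiguous form, which is the trade the slide calls a "(smart) hack".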
