Plan for Lexical Analysis with JLex and One Pass Code Gen

SLIDE 1

CS453 Lecture Lexical Analysis with JLex 1

Plan for Lexical Analysis with JLex and One Pass Code Gen

Overview of the MeggyJava Assignments

PA2: Lexer/scanner in MJPA2.jar

Expressing tokens with regular expressions

– regular expression syntax for JLex
– using JLex with JavaCup

How do lexer generators work?

– Convert regular expressions to NFA
– Converting an NFA to DFA
– Implementing the DFA

PA2: Syntax-directed code generation (MJ.jar)


Structure of the MeggyJava Compiler

Analysis: character stream -> lexical analysis -> tokens ("words") -> syntactic analysis -> AST ("sentences") -> semantic analysis -> AST and symbol table

Synthesis: AST and symbol table -> code gen -> Atmel assembly code

PA1: Write test cases in MeggyJava, and AVR warmup
PA2: MeggyJava scanner and setPixel
PA3: add exps and control flow (AST)
PA4: add methods (symbol table)
PA5: add variables and objects
PA6: add arrays and register allocation

PA2 Scanner/Lexer

Look at the assignment writeup and point out the tarball.
Look at the input files.
Look at the output files.
Look at MJPA2Driver.java.
Look at mj.lex.
Look at the Makefile.


Specifying Tokens with JLex

JLex example input file:

package mjparser;
import java_cup.runtime.Symbol;
%%
%line
%char
%cup
%public
%eofval{
  return new Symbol(sym.EOF, new TokenValue("EOF", yyline, yychar));
%eofval}
LETTER=[A-Za-z]
DIGIT=[0-9]
UNDERSCORE="_"
LETT_DIG_UND={LETTER}|{DIGIT}|{UNDERSCORE}
ID={LETTER}({LETT_DIG_UND})*
...
%%
"&&" {return new Symbol(sym.AND, new TokenValue(yytext(), yyline, yychar)); }
"+" {return new Symbol(sym.PLUS, ...); }
"if" {return new Symbol(sym.IF,...); }
{ID} {return new Symbol(sym.ID, new ...
{EOL} { /* reset yychar */ … }
{WS} { /* ignore */ }
SLIDE 2

Nondeterministic Finite Acceptor (NFA)

Alphabet = {a}

[Figure: NFA state diagram with states q0, q1, q2, q3 and three transitions, each labeled a.]

Two choices

[Figure: the same NFA; on input a there are two possible transitions out of the start state.]

First Choice

[Figure: the NFA reading the input aa along the first choice, one frame per input symbol.]
SLIDE 3

First Choice

[Figure: the first choice continued. All input is consumed and the NFA is in a final state: “accept”.]

Second Choice

[Figure: the NFA reading the input aa along the second choice.]
SLIDE 4

Second Choice

[Figure: the second choice continued. After the first a there is no transition for the remaining a.]

No transition: the automaton hangs. Input cannot be consumed. Should we reject aa?

When To Accept a String

An NFA accepts a string when there is a computation of the NFA that accepts the string: all the input is consumed AND the automaton is in a final state.

Example

aa is accepted by the NFA:

[Figure: the two computations on aa. The first choice ends in “accept”; the second choice ends in “reject??”.]

aa is accepted because the first computation accepts aa. The “reject??” only tells us that the second choice didn’t work.
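The acceptance rule can be sketched as a search over choices: try each possible transition and accept as soon as one computation consumes all the input and ends in a final state. The Java below is a minimal sketch, not part of the original slides. The class name `NfaChoices` is hypothetical, and the transitions (q0 to q1 and q0 to q3 on a, q1 to q2 on a, with q2 final) are an assumption chosen to be consistent with the two choices and the accepted string described above.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

class NfaChoices {
    // Assumed transition relation: q0 -a-> q1, q0 -a-> q3, q1 -a-> q2.
    // Keys are "state,symbol"; state numbers are the subscripts of q.
    static final Map<String, List<Integer>> DELTA = Map.of(
            "0,a", List.of(1, 3),
            "1,a", List.of(2));

    // Assumed final state: q2.
    static final Set<Integer> FINAL = Set.of(2);

    // True if SOME computation starting in `state` consumes input[pos..]
    // completely and ends in a final state.
    static boolean accepts(int state, String input, int pos) {
        if (pos == input.length()) {
            return FINAL.contains(state);      // all input is consumed
        }
        String key = state + "," + input.charAt(pos);
        for (int next : DELTA.getOrDefault(key, List.of())) {
            if (accepts(next, input, pos + 1)) {
                return true;                   // one accepting choice is enough
            }
        }
        // Either no transition exists (the automaton hangs)
        // or every choice from here failed.
        return false;
    }

    static boolean accepts(String input) {
        return accepts(0, input, 0);
    }

    public static void main(String[] args) {
        System.out.println("aa:  " + accepts("aa"));   // prints "aa:  true"
        System.out.println("a:   " + accepts("a"));    // prints "a:   false"
        System.out.println("aaa: " + accepts("aaa"));  // prints "aaa: false"
    }
}
```

A failed choice only causes the search to back up and try the next transition; the string is rejected only after every choice has failed.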

SLIDE 5

Rejection example

[Figure: the same NFA with input a.]

First Choice

[Figure: the NFA reading a along the first choice. All input is consumed in a non-final state: “reject??”.]

Second Choice

[Figure: the NFA reading a along the second choice.]
SLIDE 6

Second Choice

[Figure: the second choice continued. All input is consumed in a non-final state: “reject??”.]

An NFA rejects a string when there is NO computation of the NFA that accepts the string. For every computation, either:

  • All the input is consumed and the automaton is in a non-final state

OR

  • The input cannot be consumed

Example

a is rejected by the NFA:

[Figure: the two computations on a, each ending in “reject??”.]

All possible computations lead to rejection
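Instead of trying one choice at a time, all computations can be followed in lockstep by tracking the set of states the NFA could be in after each symbol. The string is rejected exactly when that set contains no final state once the input ends, including the case where the set has become empty because every computation hung. The Java below is a minimal sketch under the same assumed NFA as before (q0 to q1 and q0 to q3 on a, q1 to q2 on a, q2 final); the class name `NfaStateSets` and its helpers are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

class NfaStateSets {
    // Assumed transitions: q0 -a-> {q1, q3}, q1 -a-> {q2}; nothing else.
    static Set<Integer> targets(int state, char c) {
        if (c != 'a') {
            return Set.of();
        }
        switch (state) {
            case 0:  return Set.of(1, 3);
            case 1:  return Set.of(2);
            default: return Set.of();   // no transition: this computation hangs
        }
    }

    // Assumed final state: q2.
    static boolean isFinal(int state) {
        return state == 2;
    }

    // States reachable after reading the whole input, over all computations.
    static Set<Integer> run(String input) {
        Set<Integer> current = new TreeSet<>();
        current.add(0);                               // start in q0
        for (char c : input.toCharArray()) {
            Set<Integer> next = new TreeSet<>();
            for (int s : current) {
                next.addAll(targets(s, c));
            }
            current = next;
        }
        return current;
    }

    static boolean accepts(String input) {
        for (int s : run(input)) {
            if (isFinal(s)) {
                return true;
            }
        }
        return false;   // all possible computations lead to rejection
    }

    public static void main(String[] args) {
        System.out.println(run("a") + " " + accepts("a"));     // prints "[1, 3] false"
        System.out.println(run("aa") + " " + accepts("aa"));   // prints "[2] true"
        System.out.println(run("aaa") + " " + accepts("aaa")); // prints "[] false"
    }
}
```

Each distinct set of NFA states that shows up in such a run corresponds to a single state of an equivalent DFA, which is the idea behind converting an NFA to a DFA.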

SLIDE 7

[Figure: the NFA state diagram with states q0, q1, q2, q3 and three transitions, each labeled a.]

Language accepted: L = {aa}


Specifying Tokens with JLex

JLex example input file:

package mjparser;
import java_cup.runtime.Symbol;
%%
%line
%char
%cup
%public
%eofval{
  return new Symbol(sym.EOF, new TokenValue("EOF", yyline, yychar));
%eofval}
LETTER=[A-Za-z]
DIGIT=[0-9]
UNDERSCORE="_"
EOL=(\n|\r|\r\n)
LETT_DIG_UND={LETTER}|{DIGIT}|{UNDERSCORE}
ID={LETTER}({LETT_DIG_UND})*
%%
"&&" {return new Symbol(sym.AND, new TokenValue(yytext(), yyline, yychar)); }
"+" {return new Symbol(sym.PLUS, ...); }
"if" {return new Symbol(sym.IF,...); }
{ID} {return new Symbol(sym.ID, new ...
{EOL} { /* reset yychar */ … }
{WS} { /* ignore */ }


Example NFA for Multiple Tokens
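When several token patterns are combined into one recognizer, the scanner needs a rule for choosing among them. Generated lexers in the lex family, JLex included, take the longest match and break ties in favor of the rule listed first. The Java below is a rough sketch of that policy using `java.util.regex` rather than a combined automaton, so it illustrates the behavior and not how JLex is implemented. The class name `MaximalMunch`, the token printing format, and the `WS` pattern are assumptions; the other patterns follow the example specification.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MaximalMunch {
    // Rules in priority order, mirroring the rule order of the example spec.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("AND", Pattern.compile("&&"));
        RULES.put("PLUS", Pattern.compile("\\+"));
        RULES.put("IF", Pattern.compile("if"));
        RULES.put("ID", Pattern.compile("[A-Za-z][A-Za-z0-9_]*"));
        RULES.put("WS", Pattern.compile("[ \\t]+"));   // assumed definition of WS
    }

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // lookingAt() anchors the match at pos. Only a strictly
                // longer match replaces the current best, so on a tie
                // the earlier rule wins.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestName = rule.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("illegal character at offset " + pos);
            }
            if (!bestName.equals("WS")) {               // whitespace is ignored
                tokens.add(bestName + "(" + input.substring(pos, bestEnd) + ")");
            }
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [IF(if), ID(iffy), PLUS(+), ID(x_1), AND(&&), ID(y)]
        System.out.println(tokenize("if iffy + x_1 && y"));
    }
}
```

The input `if` matches both the IF and ID rules with the same length, so the earlier IF rule wins; `iffy` is an ID because the ID match is longer.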


DFA from IF and ID NFAs (Do in class)

SLIDE 8


DFA from IF and ID NFAs (Answer)

Implementing DFAs?
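One common way to implement a DFA is a transition table indexed by state and character class, plus a per-state record of which token, if any, that state accepts. The Java below is a hand-built sketch for the IF and ID tokens. The state numbering, the table, and the class name `IfIdDfa` are assumptions made for illustration; they are not the answer diagram from the lecture and not the table JLex would generate.

```java
class IfIdDfa {
    // Character classes (columns of the transition table).
    static final int C_I = 0, C_F = 1, C_LETTER = 2, C_DIGIT_UND = 3, C_OTHER = 4;
    static final int ERR = -1;

    // Rows are states:
    //   0 = start, 1 = seen "i", 2 = seen "if", 3 = any other identifier.
    static final int[][] NEXT = {
            //  i    f   letter digit/_ other
            {   1,   3,   3,    ERR,   ERR },   // state 0
            {   3,   2,   3,     3,    ERR },   // state 1
            {   3,   3,   3,     3,    ERR },   // state 2
            {   3,   3,   3,     3,    ERR },   // state 3
    };

    // Token recognized if the input ends in each state (null = not accepting).
    static final String[] ACCEPT = { null, "ID", "IF", "ID" };

    static int classify(char c) {
        if (c == 'i') return C_I;
        if (c == 'f') return C_F;
        if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) return C_LETTER;
        if ((c >= '0' && c <= '9') || c == '_') return C_DIGIT_UND;
        return C_OTHER;
    }

    // Runs the DFA over the whole string and returns "IF", "ID", or null.
    static String recognize(String lexeme) {
        int state = 0;
        for (int i = 0; i < lexeme.length(); i++) {
            state = NEXT[state][classify(lexeme.charAt(i))];
            if (state == ERR) {
                return null;          // no transition: not a token
            }
        }
        return ACCEPT[state];
    }

    public static void main(String[] args) {
        System.out.println(recognize("if"));    // prints "IF"
        System.out.println(recognize("iffy"));  // prints "ID"
        System.out.println(recognize("x_1"));   // prints "ID"
        System.out.println(recognize("1x"));    // prints "null"
    }
}
```

A real scanner would also remember the last accepting state it passed through, so that it can back up to the longest match when the DFA gets stuck in the middle of the input.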


PA2 Syntax Directed Code Generation

Look at the assignment writeup and point out usage of MJ.jar.
Input files are MeggyJava files that fit the PA2 grammar.
Look at the current output file. It will be a .s file that can go through the simulator.
Look at MJDriver.java.
Look at mj.cup.
Look at the Makefile.


Recall Doing Syntax-Directed Interpretation

Grammar

(1) exp --> exp * exp
(2) exp --> exp + exp
(3) exp --> NUM

String

42 + 7 * 6
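Syntax-directed interpretation of this grammar can be sketched with a value stack: each reduction pops the values of the right-hand side and pushes the value of the left-hand side. The Java below is a hand-written sketch, not a CUP specification. The grammar as written is ambiguous, so the sketch assumes that * binds tighter than + and that both are left-associative; the class name `ExpInterpreter` is hypothetical and the input is assumed to be well-formed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ExpInterpreter {
    // Assumed precedence: * binds tighter than +.
    static int precedence(char op) {
        return op == '*' ? 2 : 1;
    }

    // Reduce by exp --> exp op exp: the action computes the value
    // of the left-hand side from the values of the right-hand side.
    static void reduce(Deque<Integer> values, Deque<Character> ops) {
        char op = ops.pop();
        int right = values.pop();
        int left = values.pop();
        values.push(op == '*' ? left * right : left + right);
    }

    static int eval(String input) {
        Deque<Integer> values = new ArrayDeque<>();
        Deque<Character> ops = new ArrayDeque<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < input.length() && Character.isDigit(input.charAt(i))) {
                    i++;
                }
                // Reduce by exp --> NUM: the value of exp is the number.
                values.push(Integer.parseInt(input.substring(start, i)));
            } else if (c == '+' || c == '*') {
                // Reduce first while the operator already on the stack
                // binds at least as tightly as the incoming one.
                while (!ops.isEmpty() && precedence(ops.peek()) >= precedence(c)) {
                    reduce(values, ops);
                }
                ops.push(c);
                i++;
            } else {
                throw new IllegalArgumentException("unexpected character: " + c);
            }
        }
        while (!ops.isEmpty()) {
            reduce(values, ops);
        }
        return values.pop();
    }

    public static void main(String[] args) {
        System.out.println(eval("42 + 7 * 6"));   // prints 84
    }
}
```

For 42 + 7 * 6 the multiplication is reduced first (7 * 6 = 42) and the addition second (42 + 42 = 84).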

SLIDE 9


Semantic Rules for Expression Example

Code Generation versus Interpretation

When interpreting . . .

– Each action in the .cup file associates a value with the nonterminal on the left-hand side of the production.
– Each nonterminal on the right-hand side has a value associated with it.
– This approach will also be useful when we are building the Abstract Syntax Tree (AST) in PA3.

When doing one pass compilation . . .

– Actions output the target code (in this case AVR assembly)
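The difference shows up in what each reduction does: an interpreter computes a value, while a one-pass compiler prints instructions that will compute the value later on the run-time stack. The Java below is a rough sketch of the second style for expressions built from NUM, + and *. It is not the output of MJ.jar. The register choices, the byte-sized operands, the truncation of the product to its low byte, the assumed precedence of * over +, and the class name `ExpCodeGen` are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ExpCodeGen {
    // Assumed precedence: * binds tighter than +.
    static int precedence(char op) {
        return op == '*' ? 2 : 1;
    }

    // Action for exp --> exp op exp: emit code instead of computing a value.
    static void reduce(char op, StringBuilder out) {
        out.append("    pop  r18\n");            // right operand
        out.append("    pop  r24\n");            // left operand
        if (op == '*') {
            out.append("    muls r24, r18\n");   // signed product lands in r1:r0
            out.append("    push r0\n");         // keep only the low byte
        } else {
            out.append("    add  r24, r18\n");
            out.append("    push r24\n");
        }
    }

    static String compile(String input) {
        StringBuilder out = new StringBuilder();
        Deque<Character> ops = new ArrayDeque<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < input.length() && Character.isDigit(input.charAt(i))) {
                    i++;
                }
                // Action for exp --> NUM: load the constant and push it.
                // Assumes the constant fits in one byte.
                out.append("    ldi  r24, ").append(input, start, i).append("\n");
                out.append("    push r24\n");
            } else if (c == '+' || c == '*') {
                while (!ops.isEmpty() && precedence(ops.peek()) >= precedence(c)) {
                    reduce(ops.pop(), out);
                }
                ops.push(c);
                i++;
            } else {
                throw new IllegalArgumentException("unexpected character: " + c);
            }
        }
        while (!ops.isEmpty()) {
            reduce(ops.pop(), out);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(compile("42 + 7 * 6"));
    }
}
```

Nothing is evaluated at compile time: the emitted instructions leave the result on the stack when they are run, and the multiply sequence appears before the add sequence because that reduction happens first.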


Parse Tree for An Empty MeggyJava Program
