Compiler Construction
Lecture 2: Compiler Structure and Lexical Analysis 2020-01-10 Michael Engel
Includes material by Jan Christian Meyer
Theoretical and practical exercises
TA: Lahiru Rasnayake
Six problem sets, one every ...
Compiler Construction 02: Compiler Structure, Scanning
[Figure: Source code → Frontend → Backend → Target program]
[Figure: Source code → Frontend → IR → Optimizer → IR → Backend → Target program]
[Figure: source languages (Java, ML, Pascal, C, C++) and target machines (Sparc, MIPS, Pentium, Itanium), shown once connected directly and once connected through a common IR]
[Figure: compiler phases from source code to machine-level program: Lexical analysis, Syntax analysis, Semantic analysis, Code generation]
Lexical analysis: character stream → token sequence
Example: x = y + 42 → id(x) id(y) number(42)
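The step from character stream to token sequence can be sketched in C as follows. This is a minimal illustration, not the scanner developed in the lecture: the names `token_kind`, `struct token` and `next_token` are hypothetical, and the extra `TOK_SYMBOL` kind for characters such as `=` and `+` is an assumption, since the slide only shows id and number tokens.

```c
#include <ctype.h>
#include <stddef.h>

/* Token kinds: id and number as on the slide; TOK_SYMBOL and TOK_END
   are assumptions added for this sketch. */
enum token_kind { TOK_ID, TOK_NUMBER, TOK_SYMBOL, TOK_END };

struct token {
    enum token_kind kind;
    char text[32];              /* lexeme, truncated after 31 characters */
};

/* Read one token starting at *p and advance *p past it. */
struct token next_token(const char **p)
{
    struct token t = { TOK_END, "" };
    size_t n = 0;

    while (isspace((unsigned char)**p))      /* skip whitespace */
        (*p)++;

    if (**p == '\0')                         /* end of input */
        return t;

    if (isalpha((unsigned char)**p)) {       /* identifier: letter (letter|digit)* */
        t.kind = TOK_ID;
        while (isalnum((unsigned char)**p)) {
            if (n < sizeof t.text - 1) t.text[n++] = **p;
            (*p)++;
        }
    } else if (isdigit((unsigned char)**p)) { /* number: digit+ */
        t.kind = TOK_NUMBER;
        while (isdigit((unsigned char)**p)) {
            if (n < sizeof t.text - 1) t.text[n++] = **p;
            (*p)++;
        }
    } else {                                 /* any other single character */
        t.kind = TOK_SYMBOL;
        t.text[n++] = **p;
        (*p)++;
    }
    t.text[n] = '\0';
    return t;
}
```

Calling `next_token` repeatedly on the string "x = y + 42" yields id(x), a symbol for `=`, id(y), a symbol for `+`, and number(42), followed by the end marker.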
[Figure: compiler phases, syntax analysis step]
Syntax analysis: token sequence id(x) id(y) number(42) → syntax tree
[Figure: compiler phases, semantic analysis step]
Semantic analysis: syntax tree → IR
[Figure: compiler phases; the highlighted step transforms IR → IR]
[Figure: compiler phases, code generation step]
Code generation: IR → machine code
Lexical analysis ASCII encoding
#include <stdio.h>

enum {error = 0, success};

int scan_real_number(void) {
  int c;                        // int, not char: getchar() returns int, so EOF stays distinguishable
  enum states {s1, s2, s3};
  enum states cur = s1;

  while (1) {
    c = getchar();              // get next char
    if (c==EOF) break;          // end?
    switch(cur) {
      case s1: if (c>='0' && c<='9') cur = s2;
               else return error;
               break;
      case s2: if (c>='0' && c<='9') cur = s2;
               else if (c=='.') cur = s3;
               else return error;
               break;
      case s3: if (c>='0' && c<='9') cur = s3;
               else return error;
               break;
    } // switch
  } // while

  // check for accepting state
  if (cur != s2 && cur != s3) return error;
  else return success;
}
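To see which inputs this state machine accepts, the same three states can be run over a string instead of standard input. The following is a minimal sketch under that assumption; the function name `accepts_real_number` and the upper-case state names are hypothetical and not part of the lecture code.

```c
#include <stdbool.h>

enum state { S1, S2, S3 };

/* Returns true if the whole string s is accepted by the automaton:
   S1 --digit--> S2 --digit--> S2 --'.'--> S3 --digit--> S3,
   with S2 and S3 as accepting states. */
bool accepts_real_number(const char *s)
{
    enum state cur = S1;

    for (; *s != '\0'; s++) {
        bool digit = (*s >= '0' && *s <= '9');
        switch (cur) {
        case S1:
            if (digit) cur = S2;
            else return false;
            break;
        case S2:
            if (digit) cur = S2;
            else if (*s == '.') cur = S3;
            else return false;
            break;
        case S3:
            if (digit) cur = S3;
            else return false;
            break;
        }
    }
    /* check for accepting state */
    return cur == S2 || cur == S3;
}
```

With these transitions, "42", "3.14" and "12." are accepted, while the empty string, ".5" and "1.2.3" are rejected. Note that "12." is accepted because S3 is entered on the '.' and is itself an accepting state.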
enum {error = 0, success};
enum states {s1, s2, s3, er};
enum states cur = s1;

char alphabet[] = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '.' };

// next state for each char in alphabet (columns)
struct scanner {
  enum states next[sizeof(alphabet)];
};

// rows of the transition table
struct scanner delta[sizeof(enum states)] = {
  // 0   1   2   3   4   5   6   7   8   9   .
  {s2, s2, s2, s2, s2, s2, s2, s2, s2, s2, er}, // s1
  {s2, s2, s2, s2, s2, s2, s2, s2, s2, s2, s3}, // s2
  {s3, s3, s3, s3, s3, s3, s3, s3, s3, s3, er}, // s3
  {er, er, er, er, er, er, er, er, er, er, er}, // er
};

int scan_real_number(void) {
  char c;
  while (1) {
    c = getchar();              // get next char
    if (c==EOF) break;          // end?
    cur = delta[cur].next[lookup(c)];
  } // while
  // check for accepting state
  if (cur!=s2 && cur!=s3) return error;
  else return success;
}

δ    0   1   2   3   4   5   6   7   8   9   .
s1   s2  s2  s2  s2  s2  s2  s2  s2  s2  s2  er
s2   s2  s2  s2  s2  s2  s2  s2  s2  s2  s2  s3
s3   s3  s3  s3  s3  s3  s3  s3  s3  s3  s3  er
What is the task of the function call lookup(c) here and how would you implement it?
Beware: there's a subtle but potentially dangerous bug in the code! Can you find it?
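One possible answer to the lookup(c) question is sketched below: the function maps an input character to its column index in the transition table by searching the alphabet array. This is only one way to do it, not necessarily the solution intended in the lecture. The return value -1 for characters outside the alphabet is an assumption of this sketch; a caller would have to check for it (for example by moving to the er state) before indexing the table.

```c
#include <stddef.h>

/* Same alphabet as in the table-driven scanner: one entry per table column. */
static const char alphabet[] = {
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '.'
};

/* Return the column index of c in alphabet, or -1 if c is not in it.
   The -1 convention is an assumption of this sketch. */
int lookup(int c)
{
    for (size_t i = 0; i < sizeof alphabet; i++) {
        if (alphabet[i] == c)
            return (int)i;
    }
    return -1;
}
```

A linear search is enough for eleven symbols. For larger alphabets, a 256-entry array indexed by the character value would give the column in constant time.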