Lexical Analyzer — Scanner
ASU Textbook Chapter 3.1, 3.3, 3.4, 3.6, 3.7, 3.5 Tsan-sheng Hsu
tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu
Main tasks
⊲ Read the input characters and produce as output a sequence of tokens that
⊲ identifier (variable): starts with a letter, followed by letters, digits, or “_”;
⊲ floating point number: a string of digits, then a dot, then another string of digits.
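The two informal token descriptions above can be written as regular expressions. The following is a minimal sketch using Python's `re` module; it assumes the character lost from the identifier rule is an underscore, and the names `IDENTIFIER`, `FLOAT`, and `classify` are hypothetical, chosen only for this illustration.

```python
import re

# Identifier: a letter followed by letters, digits, or underscores.
# (The underscore is an assumption; the source character is unreadable.)
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

# Floating point number: a string of digits, a dot, another string of digits.
FLOAT = re.compile(r"[0-9]+\.[0-9]+")

def classify(lexeme):
    """Return the token class of a whole lexeme, or None if neither rule matches."""
    if IDENTIFIER.fullmatch(lexeme):
        return "id"
    if FLOAT.fullmatch(lexeme):
        return "float"
    return None
```

Under these rules `3.` and `.5` are not floating point numbers, because both digit strings must be non-empty.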
Compiler notes #2, Tsan-sheng Hsu, IIS 2
⊲ $s^0 \equiv \epsilon$;
⊲ $s^i \equiv s^{i-1}s$, $i > 0$.
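The recursive definition of string exponentiation translates directly into code. This is a small sketch; the function name `power` is hypothetical, and the empty Python string stands in for the empty string ε.

```python
def power(s, i):
    """Return s^i, where s^0 is the empty string and s^i = s^(i-1) s for i > 0."""
    if i == 0:
        return ""                  # base case: s^0 is the empty string
    return power(s, i - 1) + s     # inductive case: append one more copy of s
```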
⊲ $L^* = \bigcup_{i=0}^{\infty} L^i$;
⊲ $L^+ = \bigcup_{i=1}^{\infty} L^i$.
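The Kleene closure (union of all powers of L from 0 upward) and the positive closure (from 1 upward) are infinite in general, so they cannot be enumerated in full. The sketch below truncates the union at a chosen maximum power; the names `concat`, `lang_power`, and `closure`, and the `max_power` cutoff, are assumptions made for illustration only.

```python
def concat(L1, L2):
    """Concatenation of two finite languages given as sets of strings."""
    return {x + y for x in L1 for y in L2}

def lang_power(L, i):
    """L^i: L^0 is the set containing only the empty string."""
    result = {""}
    for _ in range(i):
        result = concat(result, L)
    return result

def closure(L, max_power, positive=False):
    """Union of L^i for i from 0 (or 1 if positive) up to max_power."""
    start = 1 if positive else 0
    out = set()
    for i in range(start, max_power + 1):
        out |= lang_power(L, i)
    return out
```

With `positive=False` the result always contains the empty string; with `positive=True` it contains the empty string only if L itself does.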
[Figures: constructions that combine the NFA for r and the NFA for s using ε-transitions. Labels: start, start state for r, start state for s, accepting states for r.]
⊲ Convert all accepting states in r into non-accepting states and add ε-transitions.
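The way smaller NFAs are glued together with ε-transitions can be sketched in code. This is one possible construction, not necessarily the exact one drawn on the slides: it assumes each NFA has a single start state and a set of accepting states, that the two operands share no states, and that `None` labels an ε-transition. All names (`NFA`, `symbol`, `union`, `concat`, `star`, `accepts`) are hypothetical.

```python
import itertools

EPS = None                  # label for ε-transitions (assumption of this sketch)
_ids = itertools.count()    # source of fresh state numbers

class NFA:
    def __init__(self, start, accepting, edges):
        self.start = start                # single start state
        self.accepting = set(accepting)   # set of accepting states
        self.edges = edges                # list of (source, label, target)

def symbol(a):
    """NFA accepting exactly the one-character string a."""
    s, t = next(_ids), next(_ids)
    return NFA(s, {t}, [(s, a, t)])

def union(r, s):
    """r|s: a new start state with ε-edges to the start states of r and s."""
    start = next(_ids)
    edges = r.edges + s.edges + [(start, EPS, r.start), (start, EPS, s.start)]
    return NFA(start, r.accepting | s.accepting, edges)

def concat(r, s):
    """rs: accepting states of r become non-accepting and get ε-edges to s's start."""
    edges = r.edges + s.edges + [(f, EPS, s.start) for f in r.accepting]
    return NFA(r.start, s.accepting, edges)

def star(r):
    """r*: a new accepting start state; accepting states of r loop back to it."""
    start = next(_ids)
    edges = r.edges + [(start, EPS, r.start)] + [(f, EPS, start) for f in r.accepting]
    return NFA(start, {start}, edges)

def accepts(nfa, text):
    """Simulate the NFA on text, tracking the set of reachable states."""
    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for src, label, dst in nfa.edges:
                if src == q and label is EPS and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen
    current = closure({nfa.start})
    for ch in text:
        moved = {dst for src, label, dst in nfa.edges if src in current and label == ch}
        current = closure(moved)
    return bool(current & nfa.accepting)
```

Each call to `symbol` creates fresh states, so an operand should be built once and used once; reusing the same fragment in two places would make the operands share states and break the construction.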
[Figure: example NFA with a start arrow, states numbered up to 12, and transitions labeled a, b, a, b, b.]
⊲ mark the state with the label T
⊲ for each input symbol a do
⊲     U ← ε-closure(move(T, a))
⊲     if U is a subset of states that has not been seen before
⊲     then add an unmarked state with the label U
⊲ end for
⊲ S ← ε-closure(move(S, a))
Keyword
⊲ def: a word that has a well-defined meaning in a certain context.
⊲ example: FORTRAN, PL/1, . . .
      if if then else = then ;
  Here the second "if", the "else", and the last "then" are identifiers (id).
⊲ Makes the compiler work harder!
Reserved word
⊲ def: regardless of context, the word cannot be used for other purposes.
⊲ example: COBOL, ALGOL, PASCAL, C, ADA, . . .
⊲ The task of the compiler is simpler.
⊲ Reserved words cannot be used as identifiers.
⊲ Listing the reserved words in the scanner is tedious and also makes the scanner large.
⊲ Solution: treat them as identifiers, and use a table to check whether each one is a reserved word.
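The table-lookup solution can be sketched as follows: the scanner recognizes every word with the single identifier pattern and only afterwards checks a table to decide whether the lexeme is reserved. The contents of `RESERVED`, the identifier pattern, and the function name `scan_word` are hypothetical and chosen only for this illustration.

```python
import re

# Hypothetical reserved-word table; a real language defines its own list.
RESERVED = frozenset({"if", "then", "else", "while", "begin", "end"})

# One pattern covers both identifiers and reserved words.
WORD = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

def scan_word(text, pos=0):
    """Scan one word at pos; return (kind, lexeme, next_pos) or None."""
    m = WORD.match(text, pos)
    if m is None:
        return None
    lexeme = m.group()
    # Table lookup decides the token class after the lexeme is recognized.
    kind = "reserved" if lexeme in RESERVED else "id"
    return kind, lexeme, m.end()
```

Because the whole word is matched before the lookup, `thenx` is scanned as one identifier rather than as the reserved word `then` followed by `x`.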