INF5110 – Compiler Construction
Scanning Spring 2016
1 / 102
Outline
1. Scanning
   Intro
   Regular expressions
   DFA
   Implementation of DFA
   NFA
   From regular expressions to DFAs
   Thompson's construction
   Determinization
   Minimization
   Scanner
a. The argument of a scanner is often a file name, an input stream, or similar.
1. Characters are language-independent, but perhaps the encoding may vary,
2. There are large commonalities across many languages, though.
3. No theoretical necessity, but that’s also how humans consume or “scan” a
4. Very deep down, if one still has a magnetic disk (as opposed to SSD) the
5. There was no computer science as a profession or university curriculum.
6. It’s mostly a question of language pragmatics. The lexers/parsers would
7. Sometimes, the part of a lexer/parser which removes whitespace (and
8. Maybe even prepare useful error messages if scanning (not scanner
9. To be careful, we will (later) distinguish between context-free languages on
10. Sometimes confusingly “the same” notation.
11. Historically, design of electronic circuitry (not yet chip-based, though) was
12. That means, for each pair (q, a) from Q × Σ, δ(q, a) is defined. Some people
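The totality condition in this footnote can be made concrete with a small sketch. It assumes a DFA is stored as a Python dictionary keyed by (state, symbol) pairs; the helper names `is_total`, `complete`, and `accepts`, the sink state name `q_err`, and the example automaton are all hypothetical and not taken from the slides. A partial transition table is made total by routing every missing pair to a non-accepting sink state.

```python
from itertools import product

def is_total(states, alphabet, delta):
    """Return True if delta has an entry for every (state, symbol) pair."""
    return all((q, a) in delta for q, a in product(states, alphabet))

def complete(states, alphabet, delta, sink="q_err"):
    """Make a partial table total by sending missing pairs to a sink state.

    Assumes the name given for `sink` is not already used as a state.
    """
    total = dict(delta)
    all_states = set(states) | {sink}
    for q, a in product(all_states, alphabet):
        total.setdefault((q, a), sink)
    return all_states, total

def accepts(delta, start, accepting, word):
    """Run a DFA with a total transition function on `word`."""
    q = start
    for a in word:
        q = delta[(q, a)]  # never fails when delta is total
    return q in accepting

# Hypothetical example: words over {a, b} consisting of one or more a's.
STATES = {"q0", "q1"}
ALPHABET = {"a", "b"}
PARTIAL = {("q0", "a"): "q1", ("q1", "a"): "q1"}
ALL_STATES, TOTAL = complete(STATES, ALPHABET, PARTIAL)
```

With the total table, running the automaton never needs a "no transition" check: a word such as `ab` simply ends in the sink state and is rejected.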
13. It does not matter much anyhow, as we will see.
14. Does not matter much, though.
15. For some forms of automata, non-deterministic versions are strictly more
$Q_i^1, \ldots, Q_i^k$ s.t. the above situation is repaired for each $Q_i^l$ (but don’t split more than necessary).
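The splitting step can be sketched as a simple partition refinement. This is a minimal sketch, not Hopcroft's n log n algorithm: it assumes a total transition function stored as a dictionary keyed by (state, symbol), starts from the partition into accepting and non-accepting states, and splits a block only when two of its states disagree, for some symbol, on the block they move to. The function name `refine` and the example automaton are hypothetical.

```python
def refine(states, alphabet, delta, accepting):
    """Split blocks until all states in a block agree, per symbol, on the target block.

    Assumes delta is total: every (state, symbol) pair has an entry.
    """
    states = set(states)
    accepting = set(accepting) & states
    # Initial partition: accepting vs. non-accepting states (empty blocks dropped).
    partition = [b for b in (accepting, states - accepting) if b]
    symbols = sorted(alphabet)
    while True:
        block_of = {q: i for i, block in enumerate(partition) for q in block}
        new_partition = []
        for block in partition:
            # States with the same signature stay together, so a block is
            # split no more than necessary.
            groups = {}
            for q in block:
                signature = tuple(block_of[delta[(q, a)]] for a in symbols)
                groups.setdefault(signature, set()).add(q)
            new_partition.extend(groups.values())
        if len(new_partition) == len(partition):
            return new_partition  # no block was split: the partition is stable
        partition = new_partition

# Hypothetical example: words over {a, b} ending in a; q1 and q2 are equivalent.
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q2", ("q1", "b"): "q0",
    ("q2", "a"): "q1", ("q2", "b"): "q0",
}
BLOCKS = refine({"q0", "q1", "q2"}, {"a", "b"}, DELTA, {"q1", "q2"})
```

Each block of the returned partition corresponds to one state of the minimized automaton; in the example, `q1` and `q2` end up in the same block and can be merged.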
16. Tokens and actions of a parser will be covered later. For example,
[Hopcroft, 1971] Hopcroft, J. E. (1971). An n log n algorithm for minimizing the states in a finite automaton. In Kohavi, Z., editor, The Theory of Machines and Computations, pages 189–196. Academic Press, New York.
[Kleene, 1956] Kleene, S. C. (1956). Representation of events in nerve nets and finite automata. In Automata Studies, pages 3–42. Princeton University Press.
[Rabin and Scott, 1959] Rabin, M. and Scott, D. (1959). Finite automata and their decision problems. IBM Journal of Research and Development, 3:114–125.
[Thompson, 1968] Thompson, K. (1968). Programming techniques: Regular expression search algorithm. Communications of the ACM, 11(6):419.