COMP20121 The Implementation and Power of Computer Languages Power - - PowerPoint PPT Presentation



slide-1
SLIDE 1

COMP20121 The Implementation and Power of Computer Languages ‘Power’ Part

http://www.cs.man.ac.uk/~petera/2121/index.html

Peter Aczel, room: CS2.52, tel: 56155, email: petera@cs.man.ac.uk

School of Computer Science, University of Manchester

COMP20121 - Section 0 – p.1/307

slide-2
SLIDE 2

COMP20121, ‘power’ part: section 1

lecture 1: Regular languages

LECTURE TWO

Finite automata for regular languages

COMP20121 - Section 1 – p.36/307

slide-6
SLIDE 6

Match, or no match?

It can be hard to work out whether a given word matches a pattern. Consider the pattern: ab∗(a|b)b∗(a|b)∗, and the word: abbababababbbababab. Does it match, or doesn’t it? We want to match letters in the target word with letters in the pattern. Answering ‘yes’ or ‘no’ is subject to a

COMP20121 - Section 1 – p.37/307
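For this particular pattern the question can also be put to a regular-expression engine. A minimal sketch, assuming Python's `re` module and the slide's pattern transcribed into Python syntax (`∗` written as `*`); the helper name `matches` is made up for illustration.

```python
import re

# The slide's pattern ab*(a|b)b*(a|b)*, transcribed into Python regex syntax.
PATTERN = re.compile(r"ab*(a|b)b*(a|b)*")

def matches(word: str) -> bool:
    """Return True if the whole word (not just a prefix) matches the pattern."""
    return PATTERN.fullmatch(word) is not None

print(matches("abbababababbbababab"))  # the word from the slide
print(matches("b"))                    # a word that does not start with a
```

The engine answers the question, but it does not explain how the answer is found; that is what the automata in this lecture are for.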

slide-8
SLIDE 8

Automata

Instead of dealing with syntax, human beings often do better with pictures. For example, when working with the pattern (((0∗)|1)2|(0|(1∗))2), you might draw:

[Figure: automaton for this pattern]

COMP20121 - Section 1 – p.38/307

slide-14
SLIDE 14

Accepting a word

We can follow any word through the automaton. Consider 0002.

[Figure: the automaton, tracing the word 0002]

The double circle is an accepting state.

COMP20121 - Section 1 – p.39/307

slide-20
SLIDE 20

Rejecting a word-I

We can follow any word through the automaton. Consider 0001.

[Figure: the automaton, tracing the word 0001]

Reject this word because we are stuck.

COMP20121 - Section 1 – p.40/307

slide-26
SLIDE 26

Rejecting a word-II

We can follow any word through the automaton. Consider 000.

[Figure: the automaton, tracing the word 000]

Reject: we end in a non-accepting state.

COMP20121 - Section 1 – p.41/307

slide-29
SLIDE 29

DFA—Definition

Definition 7 Let Σ be a finite alphabet. A (deterministic) finite automaton (DFA) over Σ consists of (Q, q•, F, δ) where:
  • Q is a finite set of states;
  • q• ∈ Q is the start state;
  • F ⊆ Q is the set of accepting states;

and

COMP20121 - Section 1 – p.42/307

slide-30
SLIDE 30

DFA—Definition (continued)

  • δ is a transition function which for every state q ∈ Q and every symbol x ∈ Σ returns the next state δ(q, x) ∈ Q.
  • So δ is a function from Q × Σ to Q.
  • When δ(q, x) = q′ we often write $q \xrightarrow{x} q'$.
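Definition 7 translates almost directly into a data structure. A minimal sketch, assuming δ is stored as a Python dict; the class name `DFA`, the method `is_well_formed` and the example automaton are illustrative and not taken from the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DFA:
    alphabet: frozenset   # Σ
    states: frozenset     # Q
    start: str            # q•
    accepting: frozenset  # F
    delta: dict           # δ stored as a dict: (state, symbol) -> next state

    def is_well_formed(self) -> bool:
        """Check q• ∈ Q, F ⊆ Q, and that δ is a function from Q × Σ to Q."""
        total = all(
            self.delta.get((q, x)) in self.states
            for q in self.states
            for x in self.alphabet
        )
        return self.start in self.states and self.accepting <= self.states and total

# Hypothetical example: words over {0, 1} containing an even number of 1s.
even_ones = DFA(
    alphabet=frozenset("01"),
    states=frozenset({"even", "odd"}),
    start="even",
    accepting=frozenset({"even"}),
    delta={
        ("even", "0"): "even", ("even", "1"): "odd",
        ("odd", "0"): "odd", ("odd", "1"): "even",
    },
)
print(even_ones.is_well_formed())
```

A dict with a missing entry for some pair (q, x) fails the check, which is exactly the situation of a picture with an arrow left out.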

COMP20121 - Section 1 – p.43/307

slide-33
SLIDE 33

NFA—Definition

Definition 8 A non-deterministic finite automaton, (NFA), consists of (Q, q•, F, δ) where:

  • Q is a finite set of states;
  • q• ∈ Q is the start state;
  • F ⊆ Q is the set of accepting states;

and

COMP20121 - Section 1 – p.44/307

slide-35
SLIDE 35

NFA—Definition (continued)

  • δ is a transition relation which relates a pair consisting of a state and a letter with a state. When (q, x) is δ-related to q′ we often write $q \xrightarrow{x} q'$.
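The difference from a DFA is only in δ: a relation may offer several successors for the same state and letter, or none. A small sketch, assuming the relation is stored as a set of triples; the state names and the helper `successors` are hypothetical.

```python
# δ as a set of (state, letter, next_state) triples: a relation, not a function.
delta = {
    ("p", "0", "p"),
    ("p", "0", "q"),   # same state and letter, two possible successors
    ("q", "1", "r"),
}

def successors(delta, state, letter):
    """All states q' such that (state, letter) is δ-related to q'."""
    return {q2 for (q1, x, q2) in delta if q1 == state and x == letter}

print(successors(delta, "p", "0"))  # two choices
print(successors(delta, "q", "0"))  # no successor at all
```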

COMP20121 - Section 1 – p.45/307

slide-37
SLIDE 37

Variation of DFA: Dump states

Strictly speaking, our pictures do not quite fit the definition.

[Figure: the automaton from the previous slides, with one state marked in red]

There is no arrow labelled 1 from the red state. So reject 001!

COMP20121 - Section 1 – p.46/307

slide-38
SLIDE 38

Variation of DFA: Dump states

We do not always draw all the states. We can leave out ‘dump states’; i.e. states from which no accepting state can be reached. If we add a dump state to this DFA, it would look like this:

[Figure: the DFA with a dump state added]

COMP20121 - Section 1 – p.46/307

slide-40
SLIDE 40

Variation of DFA: Dump states

We obtain this new picture by adding in one new state, and arrows to this new state whenever there is an arrow missing in the original picture.

[Figure: the DFA with the new state and the arrows leading to it]

COMP20121 - Section 1 – p.46/307

slide-41
SLIDE 41

Variation of DFA: Dump states

Further we add an arrow from the new state to itself labelled with all the letters of the underlying alphabet Σ = {0, 1, 2, 3}.

[Figure: the complete DFA, with a loop labelled 0, 1, 2, 3 on the new state]

COMP20121 - Section 1 – p.46/307

slide-42
SLIDE 42

Variation of DFA: Dump states

If we draw all states, then we can follow a word through the automaton along a unique path without getting stuck. The word is accepted if and only if the state reached when the word ends is an accepting state.

COMP20121 - Section 1 – p.46/307

slide-43
SLIDE 43

Variation of DFA: Dump states

If we leave out some dump states, then we can follow a word through the automaton along a unique path, but we might get stuck. The word is accepted if and only if we do not get stuck and the state reached when the word ends is an accepting state.
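Adding the dump state is a mechanical step. A minimal sketch, assuming the partial picture is given as a dict with the missing arrows simply absent; the function name `add_dump_state` and the small example automaton are made up for illustration.

```python
def add_dump_state(states, alphabet, delta, dump="dump"):
    """Return (states, delta) of a complete DFA obtained by adding one dump state.

    `delta` is a partial transition function given as a dict
    (state, letter) -> state; absent entries are the missing arrows.
    """
    if dump in states:
        raise ValueError("dump state name already in use")
    full_states = set(states) | {dump}
    full_delta = dict(delta)
    for q in full_states:
        for x in alphabet:
            # Every missing arrow, including every arrow out of the dump
            # state itself, now leads to the dump state.
            full_delta.setdefault((q, x), dump)
    return full_states, full_delta

# Hypothetical partial DFA over {0, 1}: some 0s followed by a single 1.
states = {"s", "t"}
alphabet = {"0", "1"}
delta = {("s", "0"): "s", ("s", "1"): "t"}

full_states, full_delta = add_dump_state(states, alphabet, delta)
print(len(full_delta))  # 3 states times 2 letters
```

The existing arrows are untouched, so the completed automaton accepts exactly the same words as the partial one.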

COMP20121 - Section 1 – p.46/307

slide-44
SLIDE 44

Accepting a word

Definition 9 A word $\alpha = x_1 \cdots x_n$ is accepted by a DFA if

$\delta(q_\bullet, x_1) = q_1,\ \delta(q_1, x_2) = q_2,\ \ldots,\ \delta(q_{n-1}, x_n) = q_n \in F$

COMP20121 - Section 1 – p.47/307

slide-45
SLIDE 45

Accepting a word

This means that we have a sequence of states $q_\bullet, q_1, \ldots, q_n$ such that

$q_\bullet \xrightarrow{x_1} q_1 \xrightarrow{x_2} q_2 \longrightarrow \cdots \xrightarrow{x_n} q_n \in F$
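Definition 9 amounts to a simple loop over the letters of the word. A sketch under the assumption that δ is a dict and that missing arrows mean "stuck"; the function name `dfa_accepts` and the example DFA for the pattern 0∗2 are hypothetical (it is not the automaton drawn on the slides).

```python
def dfa_accepts(delta, start, accepting, word):
    """Follow `word` through a DFA; delta maps (state, letter) -> state.

    If an arrow is missing (a dump state was left out of the picture),
    we are stuck and the word is rejected.
    """
    state = start
    for letter in word:
        if (state, letter) not in delta:
            return False           # stuck
        state = delta[(state, letter)]
    return state in accepting      # is q_n in F?

# Hypothetical DFA for the pattern 0*2, with its dump state left out.
delta = {("s", "0"): "s", ("s", "2"): "t"}
print(dfa_accepts(delta, "s", {"t"}, "0002"))  # accepted
print(dfa_accepts(delta, "s", {"t"}, "0001"))  # rejected: stuck on the 1
print(dfa_accepts(delta, "s", {"t"}, "000"))   # rejected: ends in a non-accepting state
```

The three calls mirror the three cases seen earlier: acceptance, rejection by getting stuck, and rejection by ending in a non-accepting state.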

COMP20121 - Section 1 – p.47/307

slide-46
SLIDE 46

Acceptance for NFAs

When we consider an NFA, there may be more than one way of following a word through the automaton. Consider the word 002.

[Figure: an NFA]

COMP20121 - Section 1 – p.48/307

slide-48
SLIDE 48

Acceptance for NFAs

First attempt: Consider the word 002.

[Figure: the NFA, showing the first attempt at following 002]

We get stuck and this is not an accepting path.

COMP20121 - Section 1 – p.48/307

slide-51
SLIDE 51

Acceptance for NFAs

Second attempt: Consider the word 002.

[Figure: the NFA, showing the second attempt at following 002]

This is an accepting path and the word is accepted.

COMP20121 - Section 1 – p.48/307

slide-54
SLIDE 54

Acceptance for NFAs

Third attempt: Consider the word 002.

[Figure: the NFA, showing the third attempt at following 002]

This is an accepting path and the word is accepted.

COMP20121 - Section 1 – p.48/307

slide-55
SLIDE 55

Accepting a word—NFAs

Definition 9 ctd A word $\alpha = x_1 \cdots x_n$ is accepted by an NFA if there are states $q_0 = q_\bullet, q_1, \ldots, q_n$ such that for all $0 \le i < n$, δ relates $(q_i, x_{i+1})$ with $q_{i+1}$, and $q_n$ is an accepting state.

COMP20121 - Section 1 – p.49/307

slide-56
SLIDE 56

Accepting a word—NFAs

In other words, we have a sequence of states $q_\bullet, q_1, \ldots, q_n$ such that

$q_\bullet \xrightarrow{x_1} q_1 \xrightarrow{x_2} q_2 \longrightarrow \cdots \xrightarrow{x_n} q_n$

and $q_n$ is an accepting state.

COMP20121 - Section 1 – p.49/307

slide-57
SLIDE 57

Accepting a word—NFAs

For a DFA, it is easy to decide whether or not a given word is accepted. Just follow the unique path through the automaton given by the word.

COMP20121 - Section 1 – p.49/307

slide-58
SLIDE 58

Accepting a word—NFAs

For an NFA, there will typically be several paths given by the word. We have to investigate all of those in order to find out whether the word is accepted.

  • We will see later why NFAs are nonetheless useful.
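One way to investigate all paths without trying them one at a time is to track the set of states that can be reached after each letter. A sketch for NFAs without τ-moves, assuming δ is a set of triples; the function name `nfa_accepts` and the example NFA are hypothetical (not the automaton from the slides).

```python
def nfa_accepts(delta, start, accepting, word):
    """Decide acceptance for an NFA without τ-moves.

    `delta` is a set of (state, letter, next_state) triples. After each
    letter, `current` holds every state reachable along some path, so all
    paths are covered at once.
    """
    current = {start}
    for letter in word:
        current = {q2 for (q1, x, q2) in delta if q1 in current and x == letter}
    return any(q in accepting for q in current)

# Hypothetical NFA: reading 0 in state "s" we may stay in "s" or move to "t".
delta = {("s", "0", "s"), ("s", "0", "t"), ("t", "2", "u")}
print(nfa_accepts(delta, "s", {"u"}, "002"))  # some path is accepting
print(nfa_accepts(delta, "s", {"u"}, "20"))   # every path gets stuck
```

If the set of reachable states becomes empty, every path has got stuck and the word is rejected.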

COMP20121 - Section 1 – p.49/307

slide-60
SLIDE 60

A DFA for a regular expression

Proposition 1.2 For every regular expression over some alphabet Σ there is a DFA that accepts precisely those words which match the regular expression. However, this isn’t as easy as you might first think!

COMP20121 - Section 1 – p.50/307

slide-62
SLIDE 62

From patterns to DFAs,1

The obvious proof of Proposition 1.2 is to proceed by induction over the way a pattern is built. And indeed, the start (base cases) looks promising.

COMP20121 - Section 1 – p.51/307

slide-65
SLIDE 65

From patterns to DFAs,2

We can define automata for the patterns ∅ (which matches no word at all), ε (which matches precisely the word ε) and x (which matches precisely the word x), where x ∈ Σ.

[Figures: three automata. The first accepts no word at all; the second accepts precisely ε; the third accepts precisely x.]
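The three base cases can be written out as complete DFAs. A sketch under the assumption that dump states are drawn in; the function names and state names are made up, and the exact shape of the slide's pictures is not recoverable here.

```python
def dfa_accepts(delta, start, accepting, word):
    """Follow `word` through a complete DFA given as a dict."""
    state = start
    for letter in word:
        state = delta[(state, letter)]
    return state in accepting

def empty_language(alphabet):
    """DFA for ∅: a single non-accepting state looping on all of Σ."""
    delta = {("s", x): "s" for x in alphabet}
    return delta, "s", set()

def empty_word(alphabet):
    """DFA for ε: accepting start state; every letter leads to a dump state."""
    delta = {(q, x): "dump" for q in ("s", "dump") for x in alphabet}
    return delta, "s", {"s"}

def single_letter(alphabet, letter):
    """DFA for the pattern x: accept after reading exactly one x."""
    delta = {(q, x): "dump" for q in ("s", "t", "dump") for x in alphabet}
    delta[("s", letter)] = "t"
    return delta, "s", {"t"}

sigma = {"a", "b"}
print(dfa_accepts(*empty_word(sigma), ""))
print(dfa_accepts(*single_letter(sigma, "a"), "aa"))
```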

COMP20121 - Section 1 – p.52/307

slide-66
SLIDE 66

The other forms of pattern

  • concatenation: (p1p2)
  • alternative: (p1|p2)
  • Kleene star: p∗

COMP20121 - Section 1 – p.53/307

slide-67
SLIDE 67

Concatenation

We start running into problems with concatenation:

[Figure: A1 + A2 = ? (how should the two automata be joined?)]

COMP20121 - Section 1 – p.54/307

slide-68
SLIDE 68

Concatenation

Here is an example which shows why. The pattern a∗b∗ is the concatenation of a∗ with b∗.

[Figure: the automaton for a∗ and the automaton for b∗, each with a dump state]

COMP20121 - Section 1 – p.54/307

slide-70
SLIDE 70

Concatenation

If we treat this according to our original idea, we get

[Figure: the automata for a∗ and b∗, and the automaton obtained by joining them directly]

But that allows the string aba to be accepted, which doesn’t match a∗b∗.

COMP20121 - Section 1 – p.54/307

slide-71
SLIDE 71

τ-transitions,1

The problem with the example clearly was that we couldn’t separate the two automata sufficiently.

[Figure: the automata for a∗ and b∗, and the automaton obtained by joining them directly]

COMP20121 - Section 1 – p.55/307

slide-73
SLIDE 73

τ-transitions,1

What we would have liked to do is something like this:

[Figure: the automata for a∗ and b∗ joined by an unlabelled arrow]

We have drawn an arrow from every accepting state in the first automaton to the start state of the second automaton. Note that this arrow is not labelled!

COMP20121 - Section 1 – p.55/307

slide-74
SLIDE 74

τ-transitions,2

[Figure: the automata for a∗ and b∗ joined by an arrow labelled τ]

The idea is that we should be able to take this transition whenever we like, without matching it against a letter of the word. We call such transitions τ-transitions and label them in this way to avoid confusion.
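Taking a τ-transition "whenever we like" can be simulated by always adding every state reachable through τ-moves alone. A sketch assuming τ is represented by `None` in the transition triples; the names `TAU`, `tau_closure` and `accepts`, and the two-state automaton for a∗b∗ (dump states omitted), are illustrative choices.

```python
TAU = None  # marker for a τ-transition (an assumption of this sketch)

def tau_closure(delta, states):
    """All states reachable from `states` using only τ-transitions."""
    closure = set(states)
    todo = list(states)
    while todo:
        q = todo.pop()
        for (q1, x, q2) in delta:
            if q1 == q and x is TAU and q2 not in closure:
                closure.add(q2)
                todo.append(q2)
    return closure

def accepts(delta, start, accepting, word):
    """Acceptance for an NFA with τ-moves, tracking all reachable states."""
    current = tau_closure(delta, {start})
    for letter in word:
        step = {q2 for (q1, x, q2) in delta if q1 in current and x == letter}
        current = tau_closure(delta, step)
    return bool(current & set(accepting))

# a*b*: state "A" loops on a, a τ-move leads to "B", which loops on b.
delta = {("A", "a", "A"), ("A", TAU, "B"), ("B", "b", "B")}
for word in ["", "aab", "bbb", "aba"]:
    print(repr(word), accepts(delta, "A", {"B"}, word))
```

With the τ-move in place the word aba is rejected, because once state "B" has been entered there is no way back to "A".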

COMP20121 - Section 1 – p.56/307

slide-76
SLIDE 76

Concatenation revisited

This allows us to deal with concatenation like this.

[Figure: A1 and A2 combine to a single automaton, joined by a τ-transition]

Note that we need to append the automaton A2 (via a τ-move) for every accepting state of A1.
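The concatenation rule can be sketched in code. This version is an assumption-laden variant: it uses one shared copy of A2 with a τ-move from every accepting state of A1, rather than a separate copy of A2 per accepting state, which accepts the same words. An NFA is taken to be a triple (delta, start, accepting); all names are hypothetical.

```python
TAU = None  # marker for τ-transitions (an assumption of this sketch)

def tau_closure(delta, states):
    """All states reachable from `states` using only τ-transitions."""
    closure, todo = set(states), list(states)
    while todo:
        q = todo.pop()
        for (q1, x, q2) in delta:
            if q1 == q and x is TAU and q2 not in closure:
                closure.add(q2)
                todo.append(q2)
    return closure

def accepts(nfa, word):
    """Acceptance for an NFA with τ-moves."""
    delta, start, accepting = nfa
    current = tau_closure(delta, {start})
    for letter in word:
        step = {q2 for (q1, x, q2) in delta if q1 in current and x == letter}
        current = tau_closure(delta, step)
    return bool(current & accepting)

def tag(nfa, label):
    """Rename every state q to (label, q) so that two automata stay disjoint."""
    delta, start, accepting = nfa
    new_delta = {((label, q1), x, (label, q2)) for (q1, x, q2) in delta}
    return new_delta, (label, start), {(label, q) for q in accepting}

def concatenate(nfa1, nfa2):
    """NFA for p1p2: a τ-move from every accepting state of A1 to the start of A2."""
    d1, s1, f1 = tag(nfa1, 1)
    d2, s2, f2 = tag(nfa2, 2)
    bridge = {(q, TAU, s2) for q in f1}
    return d1 | d2 | bridge, s1, f2

a_star = ({("s", "a", "s")}, "s", {"s"})   # dump states omitted
b_star = ({("s", "b", "s")}, "s", {"s"})
a_star_b_star = concatenate(a_star, b_star)
print(accepts(a_star_b_star, "aab"), accepts(a_star_b_star, "aba"))
```

Only the accepting states of A2 remain accepting in the combined automaton; the accepting states of A1 serve purely as exits into A2.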

COMP20121 - Section 1 – p.57/307

slide-77
SLIDE 77

Concatenation revisited

So maybe we should be using this picture instead.

[Figure: A1 + A2 = A1 with a τ-transition from each of its accepting states to a copy of A2]

COMP20121 - Section 1 – p.57/307

slide-79
SLIDE 79

Alternative

Using τ-moves makes it easy to deal with alternative.

[Figure: A1 and A2 combine to an automaton with a τ-transition into each of A1 and A2]

Convince yourself that the new automaton recognizes the intended language!
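One way to do the convincing is to build the automaton and test a few words. A sketch assuming the combined automaton has a fresh start state with a τ-move to the start state of each of A1 and A2; NFAs are triples (delta, start, accepting), and the names and the example for a|ab are hypothetical.

```python
TAU = None  # marker for τ-transitions (an assumption of this sketch)

def tau_closure(delta, states):
    """All states reachable from `states` using only τ-transitions."""
    closure, todo = set(states), list(states)
    while todo:
        q = todo.pop()
        for (q1, x, q2) in delta:
            if q1 == q and x is TAU and q2 not in closure:
                closure.add(q2)
                todo.append(q2)
    return closure

def accepts(nfa, word):
    """Acceptance for an NFA with τ-moves."""
    delta, start, accepting = nfa
    current = tau_closure(delta, {start})
    for letter in word:
        step = {q2 for (q1, x, q2) in delta if q1 in current and x == letter}
        current = tau_closure(delta, step)
    return bool(current & accepting)

def tag(nfa, label):
    """Rename every state q to (label, q) so that two automata stay disjoint."""
    delta, start, accepting = nfa
    new_delta = {((label, q1), x, (label, q2)) for (q1, x, q2) in delta}
    return new_delta, (label, start), {(label, q) for q in accepting}

def alternative(nfa1, nfa2):
    """NFA for (p1|p2): a fresh start state with a τ-move into each automaton."""
    d1, s1, f1 = tag(nfa1, 1)
    d2, s2, f2 = tag(nfa2, 2)
    start = "new-start"  # cannot clash: all other states are tagged tuples
    return d1 | d2 | {(start, TAU, s1), (start, TAU, s2)}, start, f1 | f2

just_a = ({("s", "a", "t")}, "s", {"t"})
a_then_b = ({("s", "a", "t"), ("t", "b", "u")}, "s", {"u"})
a_or_ab = alternative(just_a, a_then_b)
print(accepts(a_or_ab, "a"), accepts(a_or_ab, "ab"), accepts(a_or_ab, "b"))
```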

COMP20121 - Section 1 – p.58/307

slide-80
SLIDE 80

Kleene star,1

The Kleene star leads to the most complicated looking automaton. This is because to be general enough to apply to all automata, we really do need all these τ-transitions.

[Figure: A becomes an automaton containing A and four τ-transitions]

COMP20121 - Section 1 – p.59/307

slide-82
SLIDE 82

Kleene star,2

[Figure: A becomes an automaton containing A and four τ-transitions]

Convince yourself that the new automaton recognizes the intended language! For many examples we can make do with fewer states and τ-transitions than those given in the general rule!
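A sketch of one standard general rule for the Kleene star, given here as an assumption since the slide's picture is not reproduced: add a fresh start state and a fresh accepting state, with τ-moves from the new start to the old start and to the new accepting state, and from every old accepting state back to the old start and on to the new accepting state. NFAs are triples (delta, start, accepting); all names are hypothetical.

```python
TAU = None  # marker for τ-transitions (an assumption of this sketch)

def tau_closure(delta, states):
    """All states reachable from `states` using only τ-transitions."""
    closure, todo = set(states), list(states)
    while todo:
        q = todo.pop()
        for (q1, x, q2) in delta:
            if q1 == q and x is TAU and q2 not in closure:
                closure.add(q2)
                todo.append(q2)
    return closure

def accepts(nfa, word):
    """Acceptance for an NFA with τ-moves."""
    delta, start, accepting = nfa
    current = tau_closure(delta, {start})
    for letter in word:
        step = {q2 for (q1, x, q2) in delta if q1 in current and x == letter}
        current = tau_closure(delta, step)
    return bool(current & accepting)

def star(nfa):
    """NFA for p*, built with a fresh start state and a fresh accepting state."""
    delta, start, accepting = nfa
    new_start, new_final = "star-start", "star-final"  # assumed to be unused names
    extra = {(new_start, TAU, start), (new_start, TAU, new_final)}
    for q in accepting:
        extra |= {(q, TAU, start), (q, TAU, new_final)}
    return delta | extra, new_start, {new_final}

# An automaton for a|ab: "t" accepts a, "u" accepts ab.
a_or_ab = ({("s", "a", "t"), ("t", "b", "u")}, "s", {"t", "u"})
starred = star(a_or_ab)
for word in ["", "a", "abab", "aba", "bb", "abb"]:
    print(repr(word), accepts(starred, word))
```

The fresh states are what keep words such as bb out: there is no way to re-enter the middle of the original automaton, only its start state.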

COMP20121 - Section 1 – p.60/307

slide-83
SLIDE 83

Example 1

Consider the following automaton which corresponds to the pattern a|ab.

[Figure: automaton with transitions labelled a and b]

COMP20121 - Section 1 – p.61/307

slide-84
SLIDE 84

Example 1

If we just try to insert τ-moves to get an automaton for (a|ab)∗, we get

[Figure: the same automaton with four τ-transitions added]

which accepts the word bb, which it shouldn’t.

COMP20121 - Section 1 – p.61/307

slide-85
SLIDE 85

Example 1

Let us now at least deal with the accepting states, as suggested in our rule.

[Figure: the automaton with six τ-transitions]

COMP20121 - Section 1 – p.62/307

slide-86
SLIDE 86

Example 1

[Figure: the automaton with six τ-transitions]

This does work, but if the start state was an accepting state as well, we might have to add another state.

COMP20121 - Section 1 – p.63/307

slide-88
SLIDE 88

Example 2

Let us assume we now want to find an NFA with τ-moves for the pattern a∗b∗. We start with the pattern a, which gives us the automaton

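[Figure: automaton with a single transition labelled a]

We apply the Kleene star to get an automaton for the pattern a∗.

[Figure: the automaton for a∗, with four τ-transitions]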

COMP20121 - Section 1 – p.64/307

slide-89
SLIDE 89

Example 2

Let us assume we now want to find an NFA with τ-moves for the pattern a∗b∗. We apply the Kleene star to get an automaton for the pattern a∗. We can remove some superfluous τ-moves.

[Figure: a smaller automaton for a∗, with two τ-transitions]

COMP20121 - Section 1 – p.64/307

slide-90
SLIDE 90

Example 2

Let us assume we now want to find an NFA with τ-moves for the pattern a∗b∗. We get an identical automaton for b∗, and putting the two together we obtain the solution to the overall problem.

[Figure: the resulting automaton for a∗b∗, with five τ-transitions]

COMP20121 - Section 1 – p.64/307

slide-92
SLIDE 92

We have shown. . .

So what have we done now? We have constructed automata from patterns, but those automata mention a new kind of move (τ-transitions), and they are non-deterministic. In other words we have convinced ourselves that the following holds.

COMP20121 - Section 1 – p.65/307

slide-93
SLIDE 93

We have shown. . .

Proposition 1.3 For every regular expression over some alphabet Σ there is an NFA with τ-transitions that accepts precisely those words which match the regular expression.

COMP20121 - Section 1 – p.66/307

slide-94
SLIDE 94

From NFAs to DFAs,1

Our original aim was, of course, to construct a DFA (without any fancy moves like τ-transitions). We will therefore next concentrate on taking an NFA with such transitions and turning it into a DFA.

COMP20121 - Section 1 – p.67/307

slide-95
SLIDE 95

From NFAs to DFAs,2

In other words, we want to show the following result.

Proposition 1.4 For every NFA with τ-moves there exists a DFA that accepts precisely the same words.
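A common way to obtain such a DFA is the subset construction, sketched here as an assumption about where the argument is heading rather than as the lecture's own proof. Each DFA state is a set of NFA states; τ is represented by `None`, and all function names are hypothetical.

```python
TAU = None  # marker for τ-transitions (an assumption of this sketch)

def tau_closure(delta, states):
    """All states reachable from `states` using only τ-transitions."""
    closure, todo = set(states), list(states)
    while todo:
        q = todo.pop()
        for (q1, x, q2) in delta:
            if q1 == q and x is TAU and q2 not in closure:
                closure.add(q2)
                todo.append(q2)
    return closure

def to_dfa(delta, start, accepting, alphabet):
    """Subset construction: DFA states are frozensets of NFA states."""
    dfa_start = frozenset(tau_closure(delta, {start}))
    dfa_delta, dfa_accepting = {}, set()
    todo, seen = [dfa_start], {dfa_start}
    while todo:
        current = todo.pop()
        if current & set(accepting):
            dfa_accepting.add(current)
        for letter in alphabet:
            step = {q2 for (q1, x, q2) in delta if q1 in current and x == letter}
            target = frozenset(tau_closure(delta, step))
            dfa_delta[(current, letter)] = target  # the empty set acts as dump state
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return dfa_delta, dfa_start, dfa_accepting

def dfa_accepts(dfa, word):
    """Follow the unique path through the (complete) DFA."""
    delta, state, accepting = dfa
    for letter in word:
        state = delta[(state, letter)]
    return state in accepting

# NFA with a τ-move for a*b*, as in the earlier example.
nfa_delta = {("A", "a", "A"), ("A", TAU, "B"), ("B", "b", "B")}
dfa = to_dfa(nfa_delta, "A", {"B"}, "ab")
print(dfa_accepts(dfa, "aab"), dfa_accepts(dfa, "aba"))
```

The resulting DFA has no τ-moves and a total transition function; since an NFA with n states has at most 2^n subsets, the construction always terminates.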

COMP20121 - Section 1 – p.68/307