Lecture 4: Regular Expressions


SLIDE 1

The University of Melbourne
Dept. of Computer Science and Software Eng.

433–330 Theory of Computation
Harald Søndergaard

Lecture 4

Regular Expressions

SLIDE 2

DFAs vs NFAs

Surprisingly, for finite automata, adding non-determinism does not result in more computing power. The class of languages recognised by NFAs is exactly the class of regular languages.

Theorem: Every NFA has an equivalent DFA.

The proof rests on the so-called subset construction. Given an NFA N, we construct a DFA M, each of whose states is a set of N-states. If N has k states then M may have up to 2^k states (but it will often have far fewer than that).

SLIDE 3

DFAs vs NFAs (cont.)

Consider the NFA

[Figure: NFA with states 1, 2 and 3; arc labels a,b; a; b; ε; a]

We can systematically construct an equivalent DFA. Its start state is {1, 3}. From this state an a will take us back to {1, 3}. From {1, 3}, b can only take us to {2}. Continuing thus gives the DFA. Any state S which contains an accept state from the NFA will be an accept state for the DFA.

SLIDE 4

More Formally . . .

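Let N = (Q, Σ, δ, q₀, F). Let E(S) be the “ε closure” of S ⊆ Q, that is, S together with all states reachable from S using only ε steps:

$E(S) = \bigcup_{s \in S} \{\, s' \in Q \mid s \stackrel{\varepsilon}{\to}{}^{*} s' \,\}$

Here $\stackrel{\varepsilon}{\to}{}^{*}$ stands for zero or more ε steps. We construct $M = (\mathcal{P}(Q), \Sigma, \delta', q_0', F')$ as follows.

  • $q_0' = E(\{q_0\})$.
  • $F' = \{\, S \subseteq Q \mid S \cap F \neq \emptyset \,\}$.
  • $\delta'(S, a) = \bigcup_{s \in S} E(\delta(s, a))$.

Note: This construction may include some unreachable states.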
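
As a minimal sketch of the subset construction, the Python below builds only the reachable part of M, so it sidesteps the unreachable states mentioned in the note. The dictionary encoding of δ (with '' standing for ε) and the names `eps_closure` and `subset_construction` are illustrative choices. The example transitions are an assumption, chosen to agree with the three-state NFA discussed earlier (state 1 is taken to be both the start state and the only accept state).

```python
from itertools import chain

def eps_closure(states, delta):
    """E(S): S together with every state reachable by ε steps (label '')."""
    closure, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, ""), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def subset_construction(alphabet, delta, start, accept):
    """Return (states, delta', q0', F') for the reachable part of the DFA."""
    q0 = eps_closure({start}, delta)
    dfa_delta, todo, seen = {}, [q0], {q0}
    while todo:
        S = todo.pop()
        for a in alphabet:
            # Union of δ(s, a) over s in S, followed by the ε closure.
            moved = set(chain.from_iterable(delta.get((s, a), ()) for s in S))
            T = eps_closure(moved, delta)
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    dfa_accept = {S for S in seen if S & accept}  # S ∩ F ≠ ∅
    return seen, dfa_delta, q0, dfa_accept

# Assumed example NFA (hypothetical encoding): maps (state, label) to a set of states.
delta = {
    (1, "b"): {2},
    (1, ""): {3},
    (2, "a"): {2, 3},
    (2, "b"): {3},
    (3, "a"): {1},
}
states, dfa_delta, q0, dfa_accept = subset_construction("ab", delta, 1, {1})
print(sorted(q0))
```

Exploring from the start state is a design choice: it keeps the result small, at the cost of not producing the full power set P(Q) that the formal definition uses.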

SLIDE 5

Closure Results

Theorem: The class of regular languages is closed under union.

Proof: Let A and B be regular languages. An NFA that recognises A ∪ B is easily constructed:

[Figure: ε arcs lead into the machine for A and into the machine for B]

SLIDE 6

Closure Results (cont.)

Theorem: The class of regular languages is closed under concatenation.

Proof: Let A and B be regular languages with these recognisers, respectively:

[Figure: recognisers for A and B]

From these we can easily construct an NFA that recognises A ◦ B:

[Figure: the two recognisers joined by two ε arcs]

SLIDE 7

Closure Results (cont.)

Theorem: The class of regular languages is closed under Kleene star.

Proof: Let A be a regular language with recogniser

[Figure: recogniser for A]

Here is how we construct an NFA to recognise A∗:

[Figure: the recogniser for A with three added ε arcs]

SLIDE 8

Closure Results (cont.)

Regular languages have several other closure properties. They are closed under

  • intersection,
  • complement, $\overline{A}$,
  • difference, as $A \setminus B = A \cap \overline{B}$,
  • reversal.
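
The slide lists these closure properties without constructions. One standard route, sketched here in Python under the assumption that both machines are DFAs with total transition functions over the same alphabet, is to swap accepting and non-accepting states for the complement and to run two DFAs in lockstep (the product construction) for the intersection. The tuple layout `(states, alphabet, delta, start, accept)` and the two example DFAs are hypothetical.

```python
from itertools import product

def complement(dfa):
    """Swap accepting and non-accepting states (needs a total transition function)."""
    states, alphabet, delta, start, accept = dfa
    return states, alphabet, delta, start, states - accept

def intersection(d1, d2):
    """Product construction: track a pair of states, accept when both accept."""
    s1, alphabet, t1, q1, f1 = d1
    s2, _, t2, q2, f2 = d2
    states = set(product(s1, s2))
    delta = {((p, q), a): (t1[(p, a)], t2[(q, a)])
             for (p, q) in states for a in alphabet}
    accept = {(p, q) for (p, q) in states if p in f1 and q in f2}
    return states, alphabet, delta, (q1, q2), accept

def difference(d1, d2):
    """A \\ B, computed as A intersected with the complement of B."""
    return intersection(d1, complement(d2))

def accepts(dfa, word):
    _, _, delta, state, accept = dfa
    for ch in word:
        state = delta[(state, ch)]
    return state in accept

# Hypothetical example DFAs over {0, 1}.
even_zeros = ({"e", "o"}, {"0", "1"},
              {("e", "0"): "o", ("e", "1"): "e", ("o", "0"): "e", ("o", "1"): "o"},
              "e", {"e"})
ends_in_one = ({"n", "y"}, {"0", "1"},
               {("n", "0"): "n", ("n", "1"): "y", ("y", "0"): "n", ("y", "1"): "y"},
               "n", {"y"})

print(accepts(intersection(even_zeros, ends_in_one), "001"))
print(accepts(difference(even_zeros, ends_in_one), "00"))
```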

SLIDE 9

Regular Expressions

Regular expressions are a notation for languages. You are probably familiar with similar notation in Unix, Awk or Perl.

Example: 0 ∪ 1 ∪ (0(0 ∪ 1)∗0) ∪ (1(0 ∪ 1)∗1) denotes the set of non-empty binary strings that begin and end with the same symbol.

We can avoid excessive parentheses if we agree that the star binds tighter than concatenation, which in turn binds tighter than union.
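
The example can be checked mechanically. The sketch below transcribes it into Python's `re` syntax, where `|` plays the role of ∪, and compares it against a direct test on all binary strings of length at most 6. The bound and the names `PATTERN` and `same_ends` are arbitrary choices for illustration; a bounded check is a sanity test, not a proof.

```python
import re
from itertools import product

# 0 ∪ 1 ∪ (0(0 ∪ 1)*0) ∪ (1(0 ∪ 1)*1), written with | for union.
PATTERN = re.compile(r"0|1|(0(0|1)*0)|(1(0|1)*1)")

def same_ends(w):
    """Non-empty and beginning and ending with the same symbol."""
    return len(w) > 0 and w[0] == w[-1]

words = ["".join(p) for n in range(7) for p in product("01", repeat=n)]
mismatches = [w for w in words if bool(PATTERN.fullmatch(w)) != same_ends(w)]
print(mismatches)
```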

SLIDE 10

Regular Expressions (cont.)

Syntax: The regular expressions over an alphabet Σ = {a₁, . . . , aₙ} are given by the grammar

re → a₁ | · · · | aₙ | ε | ∅ | re ∪ re | re ◦ re | re∗

(Sometimes we leave out the ◦.)

Semantics:

L(a) = {a}
L(ε) = {ε}
L(∅) = ∅
L(R₁ ∪ R₂) = L(R₁) ∪ L(R₂)
L(R₁ ◦ R₂) = L(R₁) ◦ L(R₂)
L(R∗) = L(R)∗
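
The semantic equations translate almost line by line into a recursive function. Since L(R) may be infinite, this sketch computes only the strings of length at most n. The tuple encoding of expressions (`("sym", a)`, `("eps",)`, `("empty",)`, `("union", r1, r2)`, `("cat", r1, r2)`, `("star", r)`) and the name `lang` are assumptions made for the illustration.

```python
def lang(r, n):
    """Strings of length <= n in L(r), one case per semantic equation."""
    tag = r[0]
    if tag == "sym":                       # L(a) = {a}
        return {r[1]} if n >= 1 else set()
    if tag == "eps":                       # L(ε) = {ε}
        return {""}
    if tag == "empty":                     # L(∅) = ∅
        return set()
    if tag == "union":                     # L(R1 ∪ R2) = L(R1) ∪ L(R2)
        return lang(r[1], n) | lang(r[2], n)
    if tag == "cat":                       # L(R1 ◦ R2) = L(R1) ◦ L(R2)
        return {u + v for u in lang(r[1], n) for v in lang(r[2], n)
                if len(u + v) <= n}
    if tag == "star":                      # L(R*) = L(R)*
        base = lang(r[1], n) - {""}
        result, frontier = {""}, {""}
        while frontier:                    # append one more factor per round
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= n} - result
            result |= frontier
        return result
    raise ValueError(f"unknown node {tag!r}")

zero, one, eps = ("sym", "0"), ("sym", "1"), ("eps",)
sigma = ("union", zero, one)

# (0 ∪ ε)(ε ∪ 1)
print(sorted(lang(("cat", ("union", zero, eps), ("union", eps, one)), 4)))
# (ΣΣ)* over Σ = {0, 1}, restricted to length <= 4
even = lang(("star", ("cat", sigma, sigma)), 4)
print(all(len(w) % 2 == 0 for w in even))
```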

SLIDE 11

Regular Expressions – Examples

110 : {110}
(ΣΣ)∗ : all strings of even length
(0 ∪ ε)(ε ∪ 1) : {ε, 0, 1, 01}
1∗ : all sequences of 1s
ε ∪ 1 ∪ (ε ∪ 1)∗(ε ∪ 1) : all sequences of 1s

SLIDE 12

Regular Expressions vs Automata

Theorem: A language is regular iff it can be described by a regular expression.

Let us first show the ‘if’ direction, by showing how to convert a regular expression R into an NFA that recognises L(R). The proof is by structural induction over the form of R.

  • Case R = a: [Figure: NFA with an arc labelled a]
  • Case R = ε: [Figure]
  • Case R = ∅: [Figure]
  • Case R = R₁ ∪ R₂, R = R₁ ◦ R₂, or R = R₁∗: We already gave the constructions when we showed that regular languages were closed under the regular operations.

SLIDE 13

NFAs from Regular Expressions

Let us construct an NFA for (a ∪ b)∗bc. Start from the innermost expressions and work outwards:

[Figure: NFAs for a and b]

So a ∪ b yields:

[Figure: NFA with arcs labelled a, ε, ε, b]

SLIDE 14

Then (a ∪ b)∗ yields:

[Figure: NFA with arcs labelled a and b and five ε arcs]

Finally (a ∪ b)∗bc yields:

[Figure: NFA with arcs labelled a, b, b, c and eight ε arcs]

Of course there are simpler, equivalent automata.
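
The inside-out construction can be sketched as a recursive function over the structure of the expression. The version below is a Thompson-style variant in which every fragment has exactly one accept state, so the automata it builds need not coincide with the ones drawn on the slides, although they recognise the same languages. The tuple encoding of expressions and the names `build` and `accepts` are assumptions for the illustration.

```python
import itertools

_ids = itertools.count()  # supply of fresh state names

def build(r):
    """Return (start, accept, arcs); an arc is (src, label, dst), label '' means ε."""
    tag = r[0]
    if tag == "cat":                  # link the accept of R1 to the start of R2
        s1, f1, a1 = build(r[1])
        s2, f2, a2 = build(r[2])
        return s1, f2, a1 + a2 + [(f1, "", s2)]
    s, f = next(_ids), next(_ids)     # fresh start and accept for the other cases
    if tag == "sym":
        return s, f, [(s, r[1], f)]
    if tag == "eps":
        return s, f, [(s, "", f)]
    if tag == "empty":
        return s, f, []               # accept state is unreachable
    if tag == "union":
        s1, f1, a1 = build(r[1])
        s2, f2, a2 = build(r[2])
        return s, f, a1 + a2 + [(s, "", s1), (s, "", s2), (f1, "", f), (f2, "", f)]
    if tag == "star":
        s1, f1, a1 = build(r[1])
        return s, f, a1 + [(s, "", s1), (f1, "", s1), (s, "", f), (f1, "", f)]
    raise ValueError(f"unknown node {tag!r}")

def accepts(nfa, word):
    """Simulate the NFA by tracking the set of currently possible states."""
    start, accept, arcs = nfa

    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            q = stack.pop()
            for (p, label, t) in arcs:
                if p == q and label == "" and t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen

    current = closure({start})
    for ch in word:
        current = closure({t for (p, label, t) in arcs
                           if p in current and label == ch})
    return accept in current

a, b, c = ("sym", "a"), ("sym", "b"), ("sym", "c")
nfa = build(("cat", ("cat", ("star", ("union", a, b)), b), c))  # (a ∪ b)*bc
print(accepts(nfa, "abbc"), accepts(nfa, "ab"))
```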

SLIDE 15

Regular Expressions from NFAs

We now show the ‘only if’ direction of the theorem. We sketch how an NFA can be turned into a regular expression in a systematic process of “state elimination”. In the process, arcs are labelled with regular expressions.

Since we only eliminate states that are neither start nor accept states, the process produces either

[Figure: automaton with arcs labelled R₁, R₂, R₃, R₄]

or

[Figure: automaton with a single arc labelled R]

The corresponding regular expression is (R₁ ∪ R₂R₃∗R₄)∗R₂R₃∗ in the first case, and R∗ in the second. Note that the Rs could well be ε or ∅.

SLIDE 16

The State Elimination Process

Consider a node

[Figure: a node with an incoming arc labelled R₁, a self-loop labelled R₂ and an outgoing arc labelled R₃]

Any such pair of incoming/outgoing arcs gets replaced by a single arc that bypasses the node. The new arc gets the label R₁R₂∗R₃.

If there are n accept states, we eliminate non-accept states first, then apply the process for each accepting state, giving n regular expressions. Then we form the union. Let us illustrate this process.

SLIDE 17

State Elimination Example

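[Figure: NFA with states A, B, C, D; A has a self-loop labelled 0,1; arcs A → B labelled 1, B → C labelled 0,1 and C → D labelled 0,1]

First turn annotations into regular expressions:

[Figure: the same NFA with each label 0,1 replaced by 0∪1]

Then eliminate B:

[Figure: A with a self-loop labelled 0∪1; arcs A → C labelled 1(0∪1) and C → D labelled 0∪1]

Here we branch, eliminating C and D separately.

[Figure: A with a self-loop labelled 0∪1 and an arc A → C labelled 1(0∪1)]

[Figure: A with a self-loop labelled 0∪1 and an arc A → D labelled 1(0∪1)(0∪1)]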

SLIDE 18

State Elimination Example (cont.)

The resulting regular expression is

(0 ∪ 1)∗1(0 ∪ 1) ∪ (0 ∪ 1)∗1(0 ∪ 1)(0 ∪ 1)

That language could also be written

(0 ∪ 1)∗1(0 ∪ 1)(ε ∪ 0 ∪ 1)

Sipser provides all the details of this kind of translation.
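
State elimination is mechanical enough to sketch in code. The version below is a variant of the process described above: instead of branching on each accept state, it adds a fresh start state "S" and a fresh accept state "F" joined to the automaton by ε arcs, so that every original state can be eliminated in a single pass. Arc labels are strings in Python `re` syntax (`|` for ∪, `(?:...)` for grouping), with `None` standing for ∅ and the empty string for ε. The function names, this encoding, and the assumption that C and D are the accept states of the example (inferred from the resulting expression) are all illustrative.

```python
import re
from itertools import product

def union(r, s):
    if r is None:
        return s
    if s is None:
        return r
    return r if r == s else f"(?:{r}|{s})"

def concat(r, s):
    return None if r is None or s is None else r + s

def star(r):
    return "" if r in (None, "") else f"(?:{r})*"

def eliminate(arcs, q):
    """Bypass q: each in/out pair R1, R3 with self-loop R2 becomes R1 R2* R3."""
    loop = star(arcs.get((q, q)))
    ins = [(p, r) for (p, t), r in arcs.items() if t == q and p != q]
    outs = [(t, r) for (p, t), r in arcs.items() if p == q and t != q]
    rest = {k: r for k, r in arcs.items() if q not in k}
    for (p, r_in), (t, r_out) in product(ins, outs):
        rest[(p, t)] = union(rest.get((p, t)), concat(concat(r_in, loop), r_out))
    return rest

def to_regex(arcs, order, start, accepting):
    """Eliminate the states in `order`; assumes no state is named 'S' or 'F'."""
    g = dict(arcs)
    g[("S", start)] = ""
    for f in accepting:
        g[(f, "F")] = ""
    for q in order:
        g = eliminate(g, q)
    return g.get(("S", "F"))

# The example automaton, with annotations already turned into regular expressions.
arcs = {("A", "A"): "(?:0|1)", ("A", "B"): "1",
        ("B", "C"): "(?:0|1)", ("C", "D"): "(?:0|1)"}
regex = to_regex(arcs, ["B", "C", "D", "A"], "A", {"C", "D"})
print(regex)
print(bool(re.fullmatch(regex, "0110")), bool(re.fullmatch(regex, "100")))
```

The expression produced this way is grouped differently from the one on the slide, but the two should denote the same language; a bounded comparison against the intended language is a reasonable sanity check.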

SLIDE 19

Some Useful Laws for Regexps

A ∪ A = A
A ∪ B = B ∪ A
(A ∪ B) ∪ C = A ∪ (B ∪ C)
(A ◦ B) ◦ C = A ◦ (B ◦ C)
∅ ∪ A = A
ε ◦ A = A ◦ ε = A
∅ ◦ A = A ◦ ∅ = ∅
(A ∪ B) ◦ C = A ◦ C ∪ B ◦ C
A ◦ (B ∪ C) = A ◦ B ∪ A ◦ C
(A∗)∗ = A∗
∅∗ = ε∗ = ε
(ε ∪ A)∗ = A∗
(A ∪ B)∗ = (A∗B∗)∗
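
A few of these laws can be sanity-checked on concrete languages. The sketch below represents a language by its set of strings of length at most `N`, which is enough to compare both sides of a law up to that length; it is a spot check on sample languages, not a proof. The bound `N`, the sample languages `A`, `B`, `C` and the helper names are arbitrary choices for the illustration.

```python
N = 6  # compare languages on strings of length <= N only

def cat(X, Y):
    """Concatenation, truncated to length N."""
    return {u + v for u in X for v in Y if len(u + v) <= N}

def star(X):
    """Kleene star, truncated to length N."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = cat(frontier, X) - result
        result |= frontier
    return result

EPS, EMPTY = {""}, set()
A, B, C = {"ab"}, {"b", "ba"}, {"", "a"}  # sample languages

laws = {
    "∅ ◦ A = ∅": cat(EMPTY, A) == EMPTY,
    "ε ◦ A = A ◦ ε = A": cat(EPS, A) == cat(A, EPS) == A,
    "(A ∪ B) ◦ C = A ◦ C ∪ B ◦ C": cat(A | B, C) == cat(A, C) | cat(B, C),
    "(A*)* = A*": star(star(A)) == star(A),
    "∅* = ε* = ε": star(EMPTY) == star(EPS) == EPS,
    "(ε ∪ A)* = A*": star(EPS | A) == star(A),
    "(A ∪ B)* = (A*B*)*": star(A | B) == star(cat(star(A), star(B))),
}
for name, holds in laws.items():
    print(name, holds)
```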
