Nonregular Languages Z. Sawa (TU Ostrava) Theoretical Computer - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Nonregular Languages

  • Z. Sawa (TU Ostrava)

Theoretical Computer Science October 22, 2020 1 / 18

slide-2
SLIDE 2

Nonregular Languages

Not all languages are regular: there are languages for which no finite automaton accepting them exists.

Examples of nonregular languages:

  • $L_1 = \{a^n b^n \mid n \ge 0\}$
  • $L_2 = \{ww \mid w \in \{a, b\}^*\}$
  • $L_3 = \{ww^R \mid w \in \{a, b\}^*\}$

Remark: The existence of nonregular languages is already apparent from the fact that there are only countably many (nonisomorphic) automata working over some alphabet $\Sigma$, but there are uncountably many languages over the alphabet $\Sigma$.
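To make the three example languages concrete, here is a minimal Python sketch of membership tests for them. The function names `in_L1`, `in_L2`, and `in_L3` are chosen here for illustration and do not come from the slides. Note that these tests compare whole substrings, so they use memory that grows with the input, which is exactly what a finite automaton cannot do.

```python
def in_L1(s: str) -> bool:
    """True iff s = a^n b^n for some n >= 0."""
    n, rem = divmod(len(s), 2)
    return rem == 0 and s == "a" * n + "b" * n


def in_L2(s: str) -> bool:
    """True iff s = ww for some word w over {a, b}."""
    half, rem = divmod(len(s), 2)
    return rem == 0 and set(s) <= {"a", "b"} and s[:half] == s[half:]


def in_L3(s: str) -> bool:
    """True iff s = w w^R, i.e. s is an even-length palindrome over {a, b}."""
    return len(s) % 2 == 0 and set(s) <= {"a", "b"} and s == s[::-1]
```

For instance, `abab` belongs to $L_2$ but not to $L_3$, while `abba` belongs to $L_3$ but not to $L_2$.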


slide-4
SLIDE 4

Nonregular Languages

How can we prove that a language L is not regular? A language is not regular if there is no automaton accepting it, i.e., if it is not possible to construct such an automaton. But how do we prove that something does not exist?

The answer: by contradiction. For example, we can assume that there is some automaton A accepting the language L and show that this assumption leads to a contradiction.


slide-8
SLIDE 8

Nonregular Languages

We show that the language $L = \{a^n b^n \mid n \ge 0\}$ is not regular. The proof is by contradiction. Let us assume there exists a DFA $A = (Q, \Sigma, \delta, q_0, F)$ such that $L(A) = L$. Let $|Q| = n$ and consider the word $z = a^n b^n$. Since $z \in L$, there must be an accepting computation of the automaton A

$$q_0 \xrightarrow{a} q_1 \xrightarrow{a} q_2 \xrightarrow{a} \cdots \xrightarrow{a} q_{n-1} \xrightarrow{a} q_n \xrightarrow{b} q_{n+1} \xrightarrow{b} \cdots \xrightarrow{b} q_{2n-1} \xrightarrow{b} q_{2n}$$

where $q_0$ is the initial state and $q_{2n} \in F$.


slide-10
SLIDE 10

Nonregular Languages

Consider now the first n + 1 states of the computation

$$q_0 \xrightarrow{a} q_1 \xrightarrow{a} q_2 \xrightarrow{a} \cdots \xrightarrow{a} q_{n-1} \xrightarrow{a} q_n \xrightarrow{b} q_{n+1} \xrightarrow{b} \cdots \xrightarrow{b} q_{2n-1} \xrightarrow{b} q_{2n}$$

i.e., the sequence of states $q_0, q_1, \ldots, q_n$. The states in this sequence cannot all be pairwise different, since $|Q| = n$ and the sequence has $n + 1$ elements. This means that there exists a state $q \in Q$ which occurs (at least) twice in the sequence. This is an application of the so-called pigeonhole principle.

Pigeonhole principle

If we have n + 1 pigeons in n holes, then there is at least one hole containing at least two pigeons.

In other words, there are indices i, j such that $0 \le i < j \le n$ and $q_i = q_j$, which means that the automaton A must go through a cycle while reading the symbols a in the word $z = a^n b^n$.
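The search for the repeated state can be sketched in Python. The three-state DFA below (`DELTA`, `START`, `NUM_STATES`) is a hypothetical example made up for illustration; it is not claimed to accept $\{a^n b^n\}$, and the pigeonhole argument works the same way for any DFA.

```python
from typing import Dict, List, Tuple

# Hypothetical DFA with 3 states over {a, b}, used only for illustration.
DELTA: Dict[Tuple[int, str], int] = {
    (0, "a"): 1, (0, "b"): 2,
    (1, "a"): 0, (1, "b"): 1,
    (2, "a"): 2, (2, "b"): 2,
}
START = 0
NUM_STATES = 3


def run(word: str) -> List[int]:
    """Return the state sequence q_0, q_1, ..., q_|word| of the computation."""
    states = [START]
    for symbol in word:
        states.append(DELTA[(states[-1], symbol)])
    return states


def first_repeat(states: List[int]) -> Tuple[int, int]:
    """Return indices i < j with states[i] == states[j], with j as small as possible."""
    seen: Dict[int, int] = {}  # state -> index of its first occurrence
    for j, q in enumerate(states):
        if q in seen:
            return seen[q], j
        seen[q] = j
    raise ValueError("no repeated state in the sequence")


n = NUM_STATES
prefix_states = run("a" * n + "b" * n)[: n + 1]  # q_0, ..., q_n
i, j = first_repeat(prefix_states)
```

Because `prefix_states` has n + 1 entries drawn from only n states, `first_repeat` cannot raise here; the pigeonhole principle guarantees a repetition.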

slide-12
SLIDE 12

Nonregular Languages

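[Figure: the computation of A on $z = a^n b^n$ drawn as a path through the states $q_0, q_1, \ldots, q_{2n}$, where $q_i = q_j$ is a single state, so the part of the path between $q_i$ and $q_j$ forms a cycle. The parts of z read before the cycle, on the cycle, and after the cycle are labelled u, v, and w.]

The word $z = a^n b^n$ can be divided into three parts u, v, w such that $z = uvw$:

$$u = a^i, \qquad v = a^{j-i}, \qquad w = a^{n-j} b^n$$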

slide-13
SLIDE 13

Nonregular Languages

For the words $u = a^i$, $v = a^{j-i}$, and $w = a^{n-j} b^n$ we have

$$q_0 \xrightarrow{u} q_i, \qquad q_i \xrightarrow{v} q_j, \qquad q_j \xrightarrow{w} q_{2n}$$

Let r be the length of the word v, i.e., $r = j - i$ (obviously $r > 0$, since $i < j$). Since $q_i = q_j$, the automaton accepts the word $uw = a^{n-r} b^n$, which does not belong to L:

$$q_0 \xrightarrow{u} q_i \xrightarrow{w} q_{2n}$$

The word $uvvw = a^{n+r} b^n$, which also does not belong to L, is accepted too:

$$q_0 \xrightarrow{u} q_i \xrightarrow{v} q_i \xrightarrow{v} q_i \xrightarrow{w} q_{2n}$$

slide-14
SLIDE 14

Nonregular Languages

Similarly, we can show that every word of the form $uvv \cdots vw$, i.e., of the form $uv^k w$ for some $k \ge 0$, is accepted by the automaton A:

$$q_0 \xrightarrow{u} q_i \xrightarrow{v} q_i \xrightarrow{v} q_i \xrightarrow{v} \cdots \xrightarrow{v} q_i \xrightarrow{v} q_i \xrightarrow{w} q_{2n}$$

A word of the form $uv^k w$ looks as follows: $a^{n-r+rk} b^n$. Since $r > 0$, the equality $n - r + rk = n$ holds only for $k = 1$. This means that if $k \ne 1$ then $uv^k w$ does not belong to the language L. However, the automaton A accepts each such word, which contradicts the assumption that $L(A) = \{a^n b^n \mid n \ge 0\}$.
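As a short check of the claim that the numbers of symbols a and b in $uv^k w$ agree only for $k = 1$ (using $r > 0$, where r is the length of v):

$$n - r + rk = n \iff r\,(k - 1) = 0 \iff k = 1$$

So for $k = 0$ the word has fewer symbols a than b, and for $k \ge 2$ it has more.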

slide-15
SLIDE 15

Pumping Lemma

Let us assume that a language L is accepted by some particular automaton A, i.e., $L = L(A)$. Let us consider an arbitrary word $z \in L$ where $z = a_1 a_2 \cdots a_k$. Since the automaton A accepts the word z, there must be some accepting computation of the automaton, i.e., a sequence of states

$$q_0, q_1, q_2, \ldots, q_{k-1}, q_k$$

of length $k + 1$ where

  • $q_0$ is an initial state,
  • $q_{i-1} \xrightarrow{a_i} q_i$ for each $i \in \{1, 2, \ldots, k\}$,
  • $q_k$ is an accepting state.

slide-16
SLIDE 16

Pumping Lemma

Let us assume that A has n states (i.e., $|Q| = n$), and that $|z| \ge n$. Since $|z| = k$, the computation of the automaton A over the word z forms a sequence of length at least $n + 1$ that contains at most n different states:

$$q_0, q_1, q_2, \ldots, q_{k-1}, q_k$$

It follows that there must be at least one state q that occurs at least twice in this sequence (recall the pigeonhole principle).

slide-17
SLIDE 17

Pumping Lemma

Let us say that the repeated state occurs at positions i and j, i.e., $q_i = q_j$ where $i < j$:

$$q_0, \ldots, q_i, \ldots, q_j, \ldots, q_k$$

Remark: In fact we can always find i and j such that $i < j \le n$, since already the first $n + 1$ states $q_0, \ldots, q_n$ must contain a repetition.

The word z can be divided into three parts:

$$z = \underbrace{a_1 \cdots a_i}_{u}\;\underbrace{a_{i+1} \cdots a_j}_{v}\;\underbrace{a_{j+1} \cdots a_k}_{w}$$

For these parts we have

  • $q_0 \xrightarrow{u} q_i$,
  • $q_i \xrightarrow{v} q_j$ (and so also $q_i \xrightarrow{v} q_i$, since $q_j = q_i$),
  • $q_j \xrightarrow{w} q_k$ (and so also $q_i \xrightarrow{w} q_k$, since $q_j = q_i$).

slide-18
SLIDE 18

Pumping Lemma

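Consider now the words

$$uw = \underbrace{a_1 \cdots a_i}_{u}\;\underbrace{a_{j+1} \cdots a_k}_{w}$$

$$uvvw = \underbrace{a_1 \cdots a_i}_{u}\;\underbrace{a_{i+1} \cdots a_j}_{v}\;\underbrace{a_{i+1} \cdots a_j}_{v}\;\underbrace{a_{j+1} \cdots a_k}_{w}$$

$$uvvvw = \underbrace{a_1 \cdots a_i}_{u}\;\underbrace{a_{i+1} \cdots a_j}_{v}\;\underbrace{a_{i+1} \cdots a_j}_{v}\;\underbrace{a_{i+1} \cdots a_j}_{v}\;\underbrace{a_{j+1} \cdots a_k}_{w}$$

$$\cdots$$

The automaton A accepts all of them because

$$q_0 \xrightarrow{u} q_i, \qquad q_i \xrightarrow{v} q_i, \qquad q_i \xrightarrow{w} q_k, \qquad \text{where } q_k \in F$$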
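The construction of u, v, w from a repeated state can be sketched in Python for a language that really is regular. The two-state DFA below (words over {a, b} with an even number of symbols a) and the names `accepts` and `decompose` are assumptions made for this illustration; they do not appear in the slides.

```python
from typing import Dict, Tuple

# Illustrative DFA (an assumption): accepts words with an even number of a's.
# State 0 = even number of a's read so far, state 1 = odd.
DELTA: Dict[Tuple[int, str], int] = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 0, (1, "b"): 1,
}
START, ACCEPTING, NUM_STATES = 0, {0}, 2


def accepts(word: str) -> bool:
    """Run the DFA on word and report whether it ends in an accepting state."""
    state = START
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state in ACCEPTING


def decompose(z: str) -> Tuple[str, str, str]:
    """Split z (with |z| >= NUM_STATES) into u, v, w at the first repeated state."""
    assert len(z) >= NUM_STATES
    seen = {START: 0}  # state -> position at which it was first visited
    state = START
    for pos, symbol in enumerate(z, start=1):
        state = DELTA[(state, symbol)]
        if state in seen:
            i, j = seen[state], pos  # q_i == q_j with i < j
            return z[:i], z[i:j], z[j:]
        seen[state] = pos
    raise AssertionError("unreachable by the pigeonhole principle")


z = "abba"
u, v, w = decompose(z)
pumped = [u + v * k + w for k in range(4)]  # u w, u v w, u v v w, u v v v w
```

Since the repetition is found among the first `NUM_STATES + 1` states, the returned split satisfies $|uv| \le n$ and $|v| \ge 1$, and every word in `pumped` is accepted by this DFA.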

slide-19
SLIDE 19

Pumping Lemma

Pumping Lemma

If a language L is regular, then there exists $n \in \mathbb{N}$ such that every word $z \in L$ with $|z| \ge n$ can be divided into subwords u, v, w such that $z = uvw$, $|uv| \le n$, $|v| \ge 1$, and for every $i \ge 0$ it holds that $uv^i w \in L$.

Formally: if L is regular, then

$$(\exists n \in \mathbb{N})(\forall z \in L \text{ s.t. } |z| \ge n)(\exists u, v, w \text{ s.t. } z = uvw,\ |uv| \le n,\ |v| \ge 1)(\forall i \ge 0) : uv^i w \in L$$

slide-20
SLIDE 20

Pumping Lemma

We can take the contrapositive of the pumping lemma. ($A \Rightarrow B$ is equivalent to $\neg B \Rightarrow \neg A$.)

If

$$(\forall n \in \mathbb{N})(\exists z \in L \text{ s.t. } |z| \ge n)(\forall u, v, w \text{ s.t. } z = uvw,\ |uv| \le n,\ |v| \ge 1)(\exists i \ge 0) : uv^i w \notin L,$$

then L is not regular.

So if we want to show that a language L is not regular, it is sufficient to show that L satisfies this condition.
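The condition in the contrapositive is the negation of the conclusion of the pumping lemma. As a sketch of how it is obtained, each negation step below turns one quantifier into its dual while keeping its side conditions. The abbreviation $D(n, z, u, v, w)$ for "$z = uvw$, $|uv| \le n$, $|v| \ge 1$" is introduced here only for brevity.

$$
\begin{aligned}
&\neg\,(\exists n \in \mathbb{N})(\forall z \in L,\ |z| \ge n)(\exists u, v, w : D)(\forall i \ge 0) : uv^i w \in L \\
\equiv\ &(\forall n \in \mathbb{N})\,\neg\,(\forall z \in L,\ |z| \ge n)(\exists u, v, w : D)(\forall i \ge 0) : uv^i w \in L \\
\equiv\ &(\forall n \in \mathbb{N})(\exists z \in L,\ |z| \ge n)\,\neg\,(\exists u, v, w : D)(\forall i \ge 0) : uv^i w \in L \\
\equiv\ &(\forall n \in \mathbb{N})(\exists z \in L,\ |z| \ge n)(\forall u, v, w : D)\,\neg\,(\forall i \ge 0) : uv^i w \in L \\
\equiv\ &(\forall n \in \mathbb{N})(\exists z \in L,\ |z| \ge n)(\forall u, v, w : D)(\exists i \ge 0) : uv^i w \notin L
\end{aligned}
$$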

slide-21
SLIDE 21

Pumping Lemma

Example: Let us consider the language $L = \{a^i b^i \mid i \ge 0\}$. Let us assume that L is accepted by some automaton with n states, and consider the word $z = a^n b^n$. Let us consider all possible ways to divide z into three subwords u, v, w satisfying the conditions $|uv| \le n$ and $|v| \ge 1$. Since the first n symbols of z are all a, the words u and v contain only symbols a. For every particular division there are some j and k such that $j + k \le n$, $k \ge 1$, and

$$u = a^j, \qquad v = a^k, \qquad w = a^{n-(j+k)} b^n$$

If we choose $i = 0$, we obtain $uv^i w = uw = a^{n-k} b^n$. Since $n - k < n$, we have $uv^i w \notin L$.
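The case analysis in this example can be checked mechanically for small n. The following Python sketch enumerates every division with $|uv| \le n$ and $|v| \ge 1$ and confirms that omitting v leaves the language. The helper names `in_L`, `splits`, and `pumping_down_always_fails` are made up for this illustration, and a finite check like this is only a sanity test, not a replacement for the proof.

```python
from typing import Iterator, Tuple


def in_L(s: str) -> bool:
    """True iff s = a^i b^i for some i >= 0."""
    half, rem = divmod(len(s), 2)
    return rem == 0 and s == "a" * half + "b" * half


def splits(z: str, n: int) -> Iterator[Tuple[str, str, str]]:
    """Yield every (u, v, w) with z = uvw, |uv| <= n and |v| >= 1."""
    for j in range(0, n):              # j = |u|
        for k in range(1, n - j + 1):  # k = |v|, so that j + k <= n
            yield z[:j], z[j:j + k], z[j + k:]


def pumping_down_always_fails(n: int) -> bool:
    """Check that uw is outside L for every admissible split of a^n b^n."""
    z = "a" * n + "b" * n
    return all(not in_L(u + w) for u, v, w in splits(z, n))
```

For example, for n = 3 there are six admissible divisions of `aaabbb`, and in each of them v consists only of symbols a.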

slide-22
SLIDE 22

Pumping Lemma

Remark: Proving a first-order logic formula with alternating universal and existential quantifiers can be viewed as a game played by two players, Player A and Player B. Player A chooses the values of variables bound by existential quantifiers, and Player B chooses the values of variables bound by universal quantifiers. If we want to refute the given claim, it is sufficient to find a winning strategy for Player B.

slide-23
SLIDE 23

Pumping Lemma

If L is regular, then

$$(\exists n \in \mathbb{N})(\forall z \in L \text{ s.t. } |z| \ge n)(\exists u, v, w \text{ s.t. } z = uvw,\ |uv| \le n,\ |v| \ge 1)(\forall i \ge 0) : uv^i w \in L.$$

The game for the Pumping Lemma looks as follows:

  1. Player A chooses some $n \in \mathbb{N}$.
  2. Player B chooses a word z such that $z \in L$ and $|z| \ge n$.
  3. Player A chooses words u, v, w such that $z = uvw$, $|uv| \le n$, $|v| \ge 1$.
  4. Player B chooses $i \ge 0$.
  5. If $uv^i w \in L$ then Player A wins. If $uv^i w \notin L$ then Player B wins.

If Player B has a winning strategy in this game, then L is not regular.

slide-24
SLIDE 24

Pumping Lemma

Example: $L = \{a^i b^i \mid i \ge 0\}$

  1. Player A chooses $n > 0$.
  2. Player B chooses $z = a^n b^n$.
  3. Player A chooses words u, v, w such that $z = uvw$, $|uv| \le n$, $|v| \ge 1$.
  4. Player B chooses $i = 0$.
  5. Player B wins: no matter what Player A does, we always have $uv^i w \notin L$, because the non-empty word v occurs in the part of the word z consisting only of symbols a, and when we omit it, we obtain a word of the form $a^k b^n$ where $k < n$, which does not belong to L.
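The game can also be sketched as a small Python harness in which Player B's strategy is given by two functions and all legal moves of Player A in step 3 are tried exhaustively. The names `player_a_moves`, `player_b_wins`, `choose_z`, and `choose_i` are hypothetical and introduced only for this sketch. Since only finitely many values of n can be tried, this is a sanity check of a strategy, not a proof.

```python
from typing import Callable, Iterator, Tuple

Split = Tuple[str, str, str]


def in_L(s: str) -> bool:
    """True iff s = a^i b^i for some i >= 0."""
    half, rem = divmod(len(s), 2)
    return rem == 0 and s == "a" * half + "b" * half


def player_a_moves(z: str, n: int) -> Iterator[Split]:
    """All legal moves of Player A: z = uvw with |uv| <= n and |v| >= 1."""
    for end_uv in range(1, min(n, len(z)) + 1):
        for start_v in range(end_uv):
            yield z[:start_v], z[start_v:end_uv], z[end_uv:]


def player_b_wins(
    n: int,
    choose_z: Callable[[int], str],
    choose_i: Callable[[str, str, str], int],
    in_lang: Callable[[str], bool],
) -> bool:
    """Check Player B's strategy against every move of Player A for this n."""
    z = choose_z(n)
    assert in_lang(z) and len(z) >= n  # step 2 must be a legal move
    return all(
        not in_lang(u + v * choose_i(u, v, w) + w)
        for u, v, w in player_a_moves(z, n)
    )


def strategy_z(n: int) -> str:
    return "a" * n + "b" * n  # Player B's word from the example


def strategy_i(u: str, v: str, w: str) -> int:
    return 0  # Player B always omits v
```

Calling `player_b_wins(n, strategy_z, strategy_i, in_L)` for small n > 0 returns `True` each time, in line with the example. If `in_lang` is replaced by a test for the regular language of all words of the form $a^j b^k$, the same strategy no longer wins, since omitting v then stays inside the language.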