INF2080 2. Regular Expressions and Nonregular languages, Daniel Lupp



slide-1
SLIDE 1

INF2080

  • 2. Regular Expressions and Nonregular languages

Daniel Lupp

Universitetet i Oslo

25th January 2018

Department of Informatics University of Oslo

INF2080 Lecture :: 25th January 1 / 39

slide-5
SLIDE 5

Last week

Deterministic finite automata (DFA)

[Figure: example DFA state diagram]

Regular languages are those languages accepted by DFAs.

Nondeterministic automata (NFA)

[Figure: example NFA state diagrams, with transitions labelled 0, 1, ε, a and b]

DFA ↔ NFA

slide-12
SLIDE 12

Regular Expressions

Definition (Regular Expression) Given an alphabet Σ, a regular expression is

  • a for some a ∈ Σ,
  • ε,
  • ∅,
  • (R1 ∪ R2) for regular expressions R1, R2,
  • (R1R2) for regular expressions R1, R2,
  • R1∗ for a regular expression R1.

→ Regular expressions represent languages!
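The inductive definition can be read as a recursive data type, and the language an expression represents can be computed by recursion over its six cases. The following is a minimal Python sketch, not part of the lecture: the tuple encoding and the function name `language` are assumptions made for illustration, symbols are assumed to be single characters, and the result is cut off at `max_len` because languages such as L(0∗) are infinite.

```python
# Hypothetical tuple encoding of the six cases of the definition:
#   ("sym", a), ("eps",), ("empty",),
#   ("union", R1, R2), ("concat", R1, R2), ("star", R1)

def language(regex, max_len):
    """Return the set of strings of length <= max_len in L(regex)."""
    kind = regex[0]
    if kind == "sym":
        return {regex[1]} if max_len >= 1 else set()
    if kind == "eps":
        return {""}
    if kind == "empty":
        return set()
    if kind == "union":
        return language(regex[1], max_len) | language(regex[2], max_len)
    if kind == "concat":
        left = language(regex[1], max_len)
        right = language(regex[2], max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if kind == "star":
        # Repeatedly append nonempty words of L(R1) until nothing new fits.
        base = language(regex[1], max_len) - {""}
        result = {""}
        frontier = {""}
        while frontier:
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown regular expression node: {kind!r}")
```

With this encoding, 10∗1 becomes `("concat", ("sym", "1"), ("concat", ("star", ("sym", "0")), ("sym", "1")))`, and calling `language` on it with `max_len=4` gives the three strings 11, 101 and 1001.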

slide-15
SLIDE 15

Regular Expressions - Examples

What languages do the following regular expressions (RE) represent?

  • 0∗
  • 10∗1
  • (1(0 ∪ 1)∗1) ∪ (0(0 ∪ 1)∗0) ∪ 0 ∪ 1
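One way to explore these three questions is to test candidate strings mechanically. The sketch below uses Python's `re` module under the assumption that the slide's ∪ can be written as `|` and ∗ as `*`; the dictionary keys and the helper name `in_language` are made up for illustration. It is consistent with reading the expressions as: all strings of 0s (including ε); strings that start and end with 1 and have only 0s in between; and nonempty strings over {0, 1} whose first and last symbols agree.

```python
import re

# Assumed translation of the slide notation into Python regex syntax.
examples = {
    "zeros": r"0*",
    "one_zeros_one": r"10*1",
    "same_ends": r"(1(0|1)*1)|(0(0|1)*0)|0|1",
}

def in_language(name, word):
    """True if the whole word (not just a prefix) matches the named expression."""
    return re.fullmatch(examples[name], word) is not None
```

`re.fullmatch` is used rather than `re.match` because membership in the language requires the entire string to match.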

slide-18
SLIDE 18

Regular Expressions - Automata

What is the connection between RE and DFA/NFA?

Language 0(0 ∪ 1)∗0:

[Figure: automaton with states 1 (start), 2 and 3, including a transition labelled 0, 1]

slide-20
SLIDE 20

Regular Expressions and Automata

What is the connection between RE and DFA/NFA?

  • Can all RE be represented using DFA/NFA?
  • Can all DFA/NFA be described by RE?

Yes!

slide-21
SLIDE 21

Regular Expressions and Automata

Proposition: Every language described by an RE is regular.

Proof: based on the inductive definition of RE!

slide-27
SLIDE 27

Definition (Regular Expression) Given an alphabet Σ, a regular expression is

  • a for some a ∈ Σ,
  • ε,
  • ∅,
  • (R1 ∪ R2) for regular expressions R1, R2,
  • (R1R2) for regular expressions R1, R2,
  • R1∗ for a regular expression R1.

Base cases:

  • If R = a for a ∈ Σ, then L(R) = {a} is accepted by [Figure: automaton with a start state and a single a-transition]
  • If R = ε, then L(R) = {ε} is accepted by [Figure: a single start state that is also accepting]
  • If R = ∅, then L(R) = ∅ is accepted by [Figure: a single start state and no accepting state]

The rest is union, concatenation and Kleene star of regular languages, as discussed last week! (Recall: the union/concatenation/Kleene star of regular languages is itself regular.)
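The inductive proof is constructive: each case of the definition yields an automaton. The sketch below is one possible rendering in Python. It uses a Thompson-style ε-NFA construction, which is a swapped-in variant and not necessarily the exact construction from last week's lecture; the tuple encoding of expressions (`("sym", a)`, `("eps",)`, `("empty",)`, `("union", R1, R2)`, `("concat", R1, R2)`, `("star", R1)`) and all function names are assumptions made for illustration.

```python
from itertools import count

_ids = count()  # source of fresh state names

def build(regex):
    """Return (start, accept, edges) of an ε-NFA for a tuple-encoded regex.

    edges is a list of (source, label, target); label None stands for ε.
    """
    kind = regex[0]
    s, t = next(_ids), next(_ids)
    if kind == "sym":                       # base case R = a
        return s, t, [(s, regex[1], t)]
    if kind == "eps":                       # base case R = ε
        return s, t, [(s, None, t)]
    if kind == "empty":                     # base case R = ∅: accept is unreachable
        return s, t, []
    if kind == "union":
        s1, t1, e1 = build(regex[1])
        s2, t2, e2 = build(regex[2])
        extra = [(s, None, s1), (s, None, s2), (t1, None, t), (t2, None, t)]
        return s, t, e1 + e2 + extra
    if kind == "concat":
        s1, t1, e1 = build(regex[1])
        s2, t2, e2 = build(regex[2])
        return s1, t2, e1 + e2 + [(t1, None, s2)]
    if kind == "star":
        s1, t1, e1 = build(regex[1])
        extra = [(s, None, s1), (t1, None, s1), (s, None, t), (t1, None, t)]
        return s, t, e1 + extra
    raise ValueError(f"unknown regular expression node: {kind!r}")

def closure(states, edges):
    """All states reachable from the given states via ε-edges."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for a, label, b in edges:
            if a == q and label is None and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def accepts(regex, word):
    """Simulate the constructed ε-NFA on word."""
    start, accept, edges = build(regex)
    current = closure({start}, edges)
    for symbol in word:
        moved = {b for a, label, b in edges if a in current and label == symbol}
        current = closure(moved, edges)
    return accept in current
```

Each branch of `build` mirrors one case of the definition, so the recursion has the same shape as the induction in the proof.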

slide-29
SLIDE 29

Regular Expressions and Automata

So we’ve just proven:

Proposition: Every language described by an RE is regular.

Next:

Proposition: Every regular language can be described using an RE.

slide-35
SLIDE 35

GNFA

Generalized Nondeterministic Finite Automaton (GNFA): NFA where the transitions are RE, not only symbols from Σ.

Some other assumptions for convenience:

  • The start state goes to every other state, but has no incoming arrows.
  • Every state goes to the unique accepting state, which is different from the starting state. The accepting state does not have any outgoing arrows.
  • All other states have one transition to all other states, including themselves.

[Figure: two example GNFAs, with transitions labelled 1, (0 ∪ 1)∗, 1∗ and ((ab∗) ∪ (ba∗))∗]

slide-39
SLIDE 39

GNFA: Convenient assumptions

Generalized Nondeterministic Finite Automaton (GNFA): NFA where the transitions are RE, not only symbols from Σ.

Some other assumptions for convenience:

  • The start state goes to every other state, but has no incoming arrows.
  • Every state goes to the unique accepting state, which is different from the starting state. The accepting state does not have any outgoing arrows.
  • All other states have one transition to all other states, including themselves.

[Figure: example GNFA with states A, B, C and D and transitions labelled 1, (0 ∪ 1)∗ and 1∗, extended step by step with ε and ∅ transitions until it satisfies the assumptions]

For the last point, add ∅ transitions between any two non-accepting/starting states that were not previously connected (e.g., (B, D)).

slide-40
SLIDE 40

Generalized Nondeterministic Finite Automata

Definition A generalized nondeterministic finite automaton (GNFA) is a 5-tuple (Q, Σ, δ, qstart, qaccept) where

  1. Q is a finite set of states,
  2. Σ is the input alphabet,
  3. δ : (Q \ {qaccept}) × (Q \ {qstart}) → R is the transition function, where R is the set of all REs over Σ,
  4. qstart is the start state, and
  5. qaccept is the accept state.

slide-42
SLIDE 42

Regular Expressions and Automata

Proposition: Every regular language can be described using an RE.

Proof idea: Take a DFA and transform it into a GNFA that accepts the same language. Iteratively remove (non-starting and non-accepting) states so that the same language is accepted, until only the starting and accepting states remain. Then the RE on the transition between the two states describes the regular language.

slide-43
SLIDE 43

Regular Expressions and Automata

Proposition: Every regular language can be described using an RE.

Proof: Given a DFA M, we construct an equivalent GNFA G by

  • adding a new start state qstart with an ε transition to the old start state q0,
  • adding a new accepting state qaccept, with ε transitions from all old accept states,
  • replacing several transitions between the same pair of states by a single transition labelled with the union of their labels, and
  • adding ∅ transitions for all state pairs that do not have a transition in M.

slide-46
SLIDE 46

Regular Expressions and Automata

Recall the “convenient” properties of GNFA:

  • The start state goes to every other state, but has no incoming arrows.
  • Every state goes to the unique accepting state, which is different from the starting state. The accepting state does not have any outgoing arrows.
  • All other states have one transition to all other states, including themselves.

⇒ When removing X, we only need to consider situations like this:

[Figure: two states joined directly by a transition R1 and indirectly through X, with R2 entering X, R3 looping on X and R4 leaving X]

After X is removed, the direct transition is relabelled R1 ∪ (R2R3∗R4).

slide-50
SLIDE 50

Regular Expressions and Automata

Let’s formalize this! Let us define a procedure CONVERT(G), where k is the number of states of G:

  1. If k = 2, then G has only a start state and an accept state, so return the regular expression R on the transition connecting them.
  2. If k > 2, select a state q′ ∈ Q \ {qaccept, qstart}. Define G′ = (Q′, Σ, δ′, qstart, qaccept) with Q′ = Q \ {q′} and

     δ′(qi, qj) = R1 ∪ R2R3∗R4

     for every qi ∈ Q′ \ {qaccept} and qj ∈ Q′ \ {qstart}, where R1 = δ(qi, qj), R2 = δ(qi, q′), R3 = δ(q′, q′), R4 = δ(q′, qj).
  3. Return the result of CONVERT(G′).

Correctness still remains to be shown! See book for details! (Claim 1.65)
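The DFA-to-GNFA step and CONVERT can be sketched together in Python. This is an illustrative sketch under several assumptions, not the lecture's reference implementation: regular expressions are stored as strings in Python `re` syntax so the result can be tested, ∅ is represented by `None`, ε by the empty string, the loop form replaces the recursion of CONVERT, and the names `dfa_to_regex`, `q_start` and `q_accept` are made up (the two new state names are assumed not to clash with existing states).

```python
import re

def union(a, b):
    """R1 ∪ R2, where None stands for ∅."""
    if a is None:
        return b
    if b is None:
        return a
    return f"(?:{a}|{b})"

def concat(*parts):
    """Concatenation; anything concatenated with ∅ is ∅."""
    if any(p is None for p in parts):
        return None
    return "".join(parts)

def star(a):
    """Kleene star; ∅∗ and ε∗ both equal ε."""
    if a is None or a == "":
        return ""
    return f"(?:{a})*"

def dfa_to_regex(states, delta, start, accepts):
    """delta maps (state, symbol) to a state. Returns a regex string, or None for ∅."""
    q_start, q_accept = "q_start", "q_accept"   # assumed fresh names
    # Build the GNFA: a missing entry in label means an ∅ transition.
    label = {(q_start, start): ""}
    for q in accepts:
        label[(q, q_accept)] = ""
    for (p, symbol), q in delta.items():
        label[(p, q)] = union(label.get((p, q)), re.escape(symbol))
    # CONVERT as a loop: remove one non-start, non-accept state per round.
    remaining = list(states)
    while remaining:
        rip = remaining.pop()
        loop = star(label.get((rip, rip)))                      # R3∗
        for qi in [q_start] + remaining:
            for qj in remaining + [q_accept]:
                via = concat(label.get((qi, rip)), loop, label.get((rip, qj)))
                new = union(label.get((qi, qj)), via)           # R1 ∪ R2R3∗R4
                if new is not None:
                    label[(qi, qj)] = new
        label = {pair: r for pair, r in label.items() if rip not in pair}
    return label.get((q_start, q_accept))
```

The order in which states are removed does not affect the language of the result, only the shape and size of the expression.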

slide-56
SLIDE 56

Regular Expressions and Automata

Proposition: Every regular language can be described using an RE.

Example:

DFA: [Figure: example DFA; the transition labels that survive are all 1]

GNFA: [Figure: the same automaton as a GNFA, with added ∅ transitions]

Remove state X: [Figure: the GNFA before and after removing X; the new transition is labelled ∅ ∪ (10∗1)]

Remove state Y: [Figure: the GNFA before and after removing Y; the resulting transition is labelled (10∗1) ∪ (01∗0)]

slide-59
SLIDE 59

Summary

So RE = GNFA = DFA = NFA = Regular languages...

But when is a language nonregular? How can we check?

⇒ Pumping Lemma!

slide-62
SLIDE 62

Pumping Lemma

  • DFAs only have finite memory, aka states.
  • The pumping lemma gives a pumping length: if a string in the language is at least as long as the pumping length, it can be pumped, i.e., it contains a nonempty substring that can be repeated arbitrarily often such that the string remains in the language.
  • If a DFA has p states and an accepted string has length ≥ p, then the accepting path in the DFA passes through a sequence of at least p + 1 states. In other words, at least one state appears twice. ⇒ loop! This loop can be repeated while staying in the language.

slide-67
SLIDE 67

Pumping Lemma - Example

[Figure: DFA with states a (start), b, c and d]

  • Language (10∗1) ∪ (01∗0)
  • DFA has 4 states
  • Consider the string 10001, length 5
  • ⇒ the path must contain a loop (in this case, at node b)

slide-73
SLIDE 73

Pumping Lemma - Example

[Figure: DFA with states a (start), b, c, d and e]

  • Language 1(010)∗1
  • DFA has 5 states
  • Consider the string 10101, length 5
  • ⇒ the path must contain a loop (in this case, at nodes b, d, e)
  • ⇒ 10100101 is also a word!

slide-86
SLIDE 86

Pumping Lemma

Lemma (Pumping Lemma) If A is a regular language, then there is a number p, called the pumping length, such that if w is a word in A of length ≥ p, then w can be divided into three parts, w = xyz, such that

  1. xy^i z ∈ A for every i ≥ 0,
  2. |y| > 0,
  3. |xy| ≤ p.

Proof: We formalize our intuition. Let M = (Q, Σ, δ, q0, F) be a DFA, A = L(M) the language accepted by M, and p the number of states in M.

Let w = w1 · · · wn be a word in A of length n ≥ p. Since w ∈ A, it is accepted by M, i.e., there exists a sequence of states s1, s2, . . . , sn+1 of length n + 1, where s1 = q0, sn+1 is an accept state and δ(si, wi) = si+1 for 1 ≤ i ≤ n.

Since n + 1 ≥ p + 1, one state must occur twice within the first p + 1 elements of the sequence (pigeonhole principle). Let these occurrences be sj and sl with j < l. Since these occur in the first p + 1 elements of the sequence, we have l ≤ p + 1.

Define x = w1 · · · wj−1, y = wj · · · wl−1, z = wl · · · wn. Then |y| > 0 (since j < l) and |xy| = l − 1 ≤ p.

x takes M from s1 to sj, y takes M from sj to sl, and z takes M from sl to sn+1. Thus the word xy^i z takes M from the start state s1 to sj, follows the path from sj to sl i times (recall that sj = sl), then takes M from sl to sn+1. Thus M accepts any word xy^i z for i ≥ 0.
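The proof is effectively an algorithm: run the DFA on w, stop at the first state that repeats, and cut w at the two visits. A minimal Python sketch of that idea follows. The function names are made up for illustration, and the transition table used in the usage note is a reconstruction of the 1(010)∗1 example, assumed from the language and the loop b, d, e named on the example slide (the original figure is not recoverable).

```python
def pumping_split(delta, start, word):
    """Split an accepted word into (x, y, z) at the first repeated state of its run.

    delta maps (state, symbol) to the next state. The word is assumed to be
    accepted and at least as long as the number of states.
    """
    seen = {start: 0}              # state -> number of symbols read at first visit
    state = start
    for pos, symbol in enumerate(word, start=1):
        state = delta[(state, symbol)]
        if state in seen:
            j = seen[state]        # x is read before the first visit, y between visits
            return word[:j], word[j:pos], word[pos:]
        seen[state] = pos
    raise ValueError("no state repeats: the word is shorter than the number of states")

def accepts(delta, start, accepting, word):
    """Run a (possibly partial) DFA; a missing transition rejects."""
    state = start
    for symbol in word:
        state = delta.get((state, symbol))
        if state is None:
            return False
    return state in accepting
```

With the assumed table `{("a","1"): "b", ("b","1"): "c", ("b","0"): "d", ("d","1"): "e", ("e","0"): "b"}`, start state `a` and accepting state `c`, the word 10101 splits into x = 1, y = 010, z = 1, and pumping y twice gives 10100101.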

slide-90
SLIDE 90

Pumping Lemma

  • Very useful for determining if a language is nonregular
    → find a string in the language of length ≥ p that cannot be pumped, i.e., for which no division w = xyz satisfies all three conditions
  • Not very useful for proving a language is regular
    → not an if and only if statement!

slide-91
SLIDE 91

Pumping Lemma - Applied

Lemma (Pumping Lemma) If A is a regular language, then there is a number p, called the pumping length, where if w is a word in A of length ≥ p then w can be divided into three parts, w = xyz, such that

1 xyiz ∈ A for every i ≥ 0, 2 |y| > 0, 3 |xy| ≤ p.

Let A = {0n1n | n ≥ 0}.

INF2080 Lecture :: 25th January 33 / 39

slide-92
SLIDE 92

Pumping Lemma - Applied

Lemma (Pumping Lemma) If A is a regular language, then there is a number p, called the pumping length, where if w is a word in A of length ≥ p then w can be divided into three parts, w = xyz, such that

1 xyiz ∈ A for every i ≥ 0, 2 |y| > 0, 3 |xy| ≤ p.

Let A = {0n1n | n ≥ 0}. Is A regular?

INF2080 Lecture :: 25th January 33 / 39

slide-93
SLIDE 93

Pumping Lemma - Applied

Lemma (Pumping Lemma) If A is a regular language, then there is a number p, called the pumping length, where if w is a word in A of length ≥ p then w can be divided into three parts, w = xyz, such that

1 xyiz ∈ A for every i ≥ 0, 2 |y| > 0, 3 |xy| ≤ p.

Let A = {0n1n | n ≥ 0}. Is A regular? If it is, then the pumping lemma gives us a pumping length p. Let s = 0p1p.

INF2080 Lecture :: 25th January 33 / 39

slide-94
SLIDE 94

Pumping Lemma - Applied

Lemma (Pumping Lemma) If A is a regular language, then there is a number p, called the pumping length, where if w is a word in A of length ≥ p then w can be divided into three parts, w = xyz, such that

1 xyiz ∈ A for every i ≥ 0, 2 |y| > 0, 3 |xy| ≤ p.

Let A = {0n1n | n ≥ 0}. Let s = 0p1p.

INF2080 Lecture :: 25th January 34 / 39

slide-95
SLIDE 95

Pumping Lemma - Applied

Lemma (Pumping Lemma) If A is a regular language, then there is a number p, called the pumping length, where if w is a word in A of length ≥ p then w can be divided into three parts, w = xyz, such that

1 xyiz ∈ A for every i ≥ 0, 2 |y| > 0, 3 |xy| ≤ p.

Let A = {0n1n | n ≥ 0}. Let s = 0p1p. Condition 3 tells us that y consists of only 0s.

INF2080 Lecture :: 25th January 34 / 39

slide-96
SLIDE 96

Pumping Lemma - Applied

Lemma (Pumping Lemma) If A is a regular language, then there is a number p, called the pumping length, where if w is a word in A of length ≥ p then w can be divided into three parts, w = xyz, such that

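1. xy^i z ∈ A for every i ≥ 0,
2. |y| > 0,
3. |xy| ≤ p.

Let A = {0^n 1^n | n ≥ 0}. Let s = 0^p 1^p. Since the first p symbols of s are all 0s, condition 3 tells us that y consists of only 0s, and condition 2 tells us that y is not empty.
⇒ xy^i z for i ≥ 2 has more 0s than 1s, so it is not in A. This contradicts condition 1!
⇒ A is nonregular.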
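The argument above can be checked by brute force for small values of p. The following is a minimal sketch, not part of the lecture: the helper names `in_A`, `valid_splits`, and `pumpable` are made up for illustration, and only finitely many exponents i are tested. A finite test is enough to show that a split fails, but it could never prove that a split pumps for every i.

```python
def in_A(w: str) -> bool:
    """Membership in A = {0^n 1^n | n >= 0}."""
    n = len(w) // 2
    return w == "0" * n + "1" * n


def valid_splits(s: str, p: int):
    """Yield every split s = xyz with |y| > 0 (condition 2) and |xy| <= p (condition 3)."""
    for xy_len in range(1, min(p, len(s)) + 1):
        for x_len in range(xy_len):
            yield s[:x_len], s[x_len:xy_len], s[xy_len:]


def pumpable(s: str, p: int, member, max_i: int = 3) -> bool:
    """True if some valid split keeps x y^i z in the language for i = 0..max_i.

    A result of False is conclusive: every valid split violates condition 1.
    """
    return any(
        all(member(x + y * i + z) for i in range(max_i + 1))
        for x, y, z in valid_splits(s, p)
    )


# For each candidate pumping length p, the word s = 0^p 1^p admits no valid split.
for p in range(1, 8):
    s = "0" * p + "1" * p
    assert not pumpable(s, p, in_A)
```

Because every candidate p fails on its own word 0^p 1^p, no pumping length can exist, which is exactly the contradiction used on the slide.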

INF2080 Lecture :: 25th January 34 / 39

slide-98
SLIDE 98

Pumping Lemma - Applied

Even if a language is nonregular, it might contain words that can be pumped, i.e., words that satisfy all three conditions of the pumping lemma! We have to be careful when choosing the word.

INF2080 Lecture :: 25th January 35 / 39

slide-101
SLIDE 101

Pumping Lemma - Applied

Lemma (Pumping Lemma) |w| ≥ p, w = xyz, s.t.

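1. xy^i z ∈ A for every i ≥ 0,
2. |y| > 0,
3. |xy| ≤ p.

Let B = {ω | ω contains an equal number of 0s and 1s}. Let w = (01)^p. With x = ε, y = 01, z = (01)^{p−1}, all conditions are met (provided p ≥ 2, so that |xy| = 2 ≤ p). This w can be pumped, so it yields no contradiction.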
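To see concretely why w = (01)^p is a poor choice, the split from the slide can be tested directly. This is a small illustrative sketch: `in_B` and `split_pumps` are hypothetical helper names, p = 4 is an arbitrary sample value, and condition 1 is only checked for i up to `max_i`.

```python
def in_B(w: str) -> bool:
    """Membership in B: words over {0, 1} with equally many 0s and 1s."""
    return set(w) <= {"0", "1"} and w.count("0") == w.count("1")


def split_pumps(x: str, y: str, z: str, p: int, max_i: int = 10) -> bool:
    """Check conditions 2 and 3 exactly, and condition 1 for i = 0..max_i only."""
    conditions_2_and_3 = len(y) > 0 and len(x + y) <= p
    condition_1 = all(in_B(x + y * i + z) for i in range(max_i + 1))
    return conditions_2_and_3 and condition_1


p = 4  # arbitrary sample pumping length
print(split_pumps("", "01", "01" * (p - 1), p))  # the split from the slide
```

Each copy of y = 01 adds one 0 and one 1, so the counts stay balanced for every i; the finite check merely illustrates this.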

INF2080 Lecture :: 25th January 36 / 39

slide-107
SLIDE 107

Pumping Lemma - Applied

Lemma (Pumping Lemma)

1. xy^i z ∈ A for every i ≥ 0,
2. |y| > 0,
3. |xy| ≤ p.

Let B = {ω | ω contains an equal number of 0s and 1s}. Let s = 0^p 1^p. The split x = ε, y = 0^p 1^p, z = ε looks like it can be pumped, but are all conditions met? No: |xy| = 2p > p violates condition 3.
Condition 3 ⇒ y must contain only 0s, so pumping changes the number of 0s but not the number of 1s, and s cannot be pumped.
⇒ B is nonregular!
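The same reasoning can be spelled out mechanically for a small p. The sketch below is only an illustration under assumptions: `in_B` and `failing_exponents` are hypothetical names, and p = 3 is an arbitrary sample value. It lists every split allowed by conditions 2 and 3 and records the smallest exponent i at which condition 1 breaks.

```python
def in_B(w: str) -> bool:
    """Membership in B: words over {0, 1} with equally many 0s and 1s."""
    return set(w) <= {"0", "1"} and w.count("0") == w.count("1")


def failing_exponents(s: str, p: int, max_i: int = 3) -> dict:
    """Map each split with |y| > 0 and |xy| <= p to the smallest i in 0..max_i
    for which x y^i z is not in B (None if no such i was found)."""
    result = {}
    for xy_len in range(1, min(p, len(s)) + 1):
        for x_len in range(xy_len):
            x, y, z = s[:x_len], s[x_len:xy_len], s[xy_len:]
            result[(x, y, z)] = next(
                (i for i in range(max_i + 1) if not in_B(x + y * i + z)), None
            )
    return result


p = 3  # arbitrary sample pumping length
s = "0" * p + "1" * p

# The tempting split x = ε, y = s, z = ε is not allowed: |xy| = 2p > p.
assert len("" + s) > p

# Every allowed split already fails at i = 0, since removing 0s unbalances the word.
witnesses = failing_exponents(s, p)
assert all(i == 0 for i in witnesses.values())
```

The slide uses i ≥ 2 (too many 0s); the check finds i = 0 (too few 0s) first, and either exponent gives the contradiction.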

INF2080 Lecture :: 25th January 37 / 39

slide-112
SLIDE 112

Pumping Lemma - Applied

A = {0^n 1^n | n ≥ 0}. B = {ω | ω contains an equal number of 0s and 1s}.
Another way of showing that B is nonregular is to reduce it to the nonregularity of A:
regular languages are closed under intersection, and A = B ∩ 0*1*
if B were regular, then, since 0*1* is regular, A would have to be regular as well: contradiction!
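The identity A = B ∩ 0*1* that drives this reduction can be sanity-checked on all short words. This is a brute-force sketch, not a proof: `in_A`, `in_B`, and `identity_holds` are illustrative names, and Python's `re.fullmatch` stands in for membership in the language of the regular expression 0*1*.

```python
import re
from itertools import product

ZEROS_THEN_ONES = re.compile(r"0*1*")


def in_A(w: str) -> bool:
    """Membership in A = {0^n 1^n | n >= 0}."""
    n = len(w) // 2
    return w == "0" * n + "1" * n


def in_B(w: str) -> bool:
    """Membership in B: equally many 0s and 1s."""
    return w.count("0") == w.count("1")


def identity_holds(max_len: int) -> bool:
    """Check w in A  <=>  (w in B and w in 0*1*) for all binary words up to max_len."""
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            w = "".join(symbols)
            in_intersection = in_B(w) and ZEROS_THEN_ONES.fullmatch(w) is not None
            if in_A(w) != in_intersection:
                return False
    return True


assert identity_holds(10)
```

The check only covers words up to a fixed length; the general statement follows because a word in 0*1* with equally many 0s and 1s must have the form 0^n 1^n.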

INF2080 Lecture :: 25th January 38 / 39

slide-118
SLIDE 118

Summary

regular expressions are shorthand notations for languages
RE = GNFA = DFA = NFA, i.e., regular expressions are shorthand for regular languages
the proof involved transforming a DFA into a GNFA, then reducing the number of states to 2 while accepting the same language
→ the regular expressions describe the paths in the DFA
every regular language has a pumping length
→ useful for showing that a language is nonregular

INF2080 Lecture :: 25th January 39 / 39