SLIDE 1

Notes to Closure Properties

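Simplification of the automata design:

$L.\emptyset = \emptyset.L = \emptyset$
$\{\lambda\}.L = L.\{\lambda\} = L$
$(L^*)^* = L^*$
$(L_1 \cup L_2)^* = L_1^*(L_2.L_1^*)^* = L_2^*(L_1.L_2^*)^*$
$(L_1.L_2)^R = L_2^R.L_1^R$
$\partial_w(L_1 \cup L_2) = \partial_w(L_1) \cup \partial_w(L_2)$
$\partial_w(\Sigma^* - L) = \Sigma^* - \partial_w L$

Proof of non-regularity:

$L = \{w \mid w \in \{0,1\}^*, |w|_0 = |w|_1\}$ is not regular: regular languages are closed under intersection, yet $L \cap \{0^i1^j \mid i, j \ge 0\} = \{0^i1^i \mid i \ge 0\}$ is not regular (pumping lemma).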
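The pumping-lemma step cited for $\{0^i1^i \mid i \ge 0\}$ can be spelled out roughly as follows. This is a sketch of the standard argument, not taken from the slide; the pumping constant $n$ and the decomposition $w = xyz$ are names introduced here.

```latex
\begin{align*}
&\text{Assume } \{0^i1^i \mid i \ge 0\} \text{ is regular with pumping constant } n, \text{ and take } w = 0^n1^n.\\
&\text{For every split } w = xyz \text{ with } |xy| \le n,\ y \ne \lambda: \quad y = 0^m \text{ for some } m \ge 1.\\
&\text{Pumping down gives } xy^0z = xz = 0^{n-m}1^n \notin \{0^i1^i \mid i \ge 0\}, \text{ a contradiction.}
\end{align*}
```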

Automata and Grammars Regular Expressions,Kleene Theorem, Subst., Homom. 4 August 9, 2019 75 / 75 - 95

SLIDE 2

Regular Expressions (RegE)

Definition 4.1 (Regular Expression (RegE), value of a RegE L(α))
Regular expressions α, β ∈ RegE(Σ) over a finite non-empty alphabet Σ = {x_1, x_2, ..., x_n} and their values L(α) are defined by induction.

Basis:
expression α   for                value L(α) ≡ [α]
∅              empty expression   L(∅) = {} ≡ ∅
λ              empty string       L(λ) = {λ}
a              a ∈ Σ              L(a) = {a}

Induction:
expression   value                      remark
α + β        L(α + β) = L(α) ∪ L(β)
αβ           L(αβ) = L(α)L(β)           the dot . may be used
α∗           L(α∗) = L(α)∗
(α)          L((α)) = L(α)              brackets do not change the value

The class RegE(Σ) of regular expressions over Σ is the smallest class closed under the operations above.

SLIDE 3

Examples, Precedence

Example 4.1 (Regular Expressions)
The language of alternating 0's and 1's may be written either as (01)∗ + (10)∗ + 1(01)∗ + 0(10)∗ or as (λ + 1)(01)∗(λ + 0).
The language L((0∗10∗10∗1)∗0∗) = {w | w ∈ {0, 1}∗, |w|_1 = 3k, k ≥ 0}.

Definition 4.2 (Precedence)
The star ∗ is the operator with the highest precedence, followed by concatenation (.); union (+) has the lowest precedence.

Theorem 4.1 (RegE and DFA: Kleene theorem (a variant))
Any language recognizable by a DFA can be expressed by a regular expression. Any language of a regular expression can be recognized by a λ-NFA (and therefore also by a DFA).
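The expressions of Example 4.1 can be checked by brute force over short words. The sketch below translates the slide notation into Python's `re` syntax (`+` becomes `|`, λ becomes an empty alternative); the helper names `all_words`, `is_alternating` and `check` are made up for this illustration.

```python
import re
from itertools import product

# Slide notation -> Python re: "+" becomes "|", lambda becomes an empty alternative.
ALTERNATING_A = re.compile(r"(01)*|(10)*|1(01)*|0(10)*")
ALTERNATING_B = re.compile(r"(|1)(01)*(|0)")
ONES_MULTIPLE_OF_3 = re.compile(r"(0*10*10*1)*0*")

def all_words(max_len, alphabet="01"):
    """Yield every word over the alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def is_alternating(w):
    """True if no two neighbouring symbols of w are equal."""
    return all(a != b for a, b in zip(w, w[1:]))

def check(max_len=8):
    """Compare each expression with its set description on all short words."""
    for w in all_words(max_len):
        assert bool(ALTERNATING_A.fullmatch(w)) == is_alternating(w)
        assert bool(ALTERNATING_B.fullmatch(w)) == is_alternating(w)
        assert bool(ONES_MULTIPLE_OF_3.fullmatch(w)) == (w.count("1") % 3 == 0)
    return True
```

Calling `check()` only confirms agreement up to the chosen length; it is a sanity check, not a proof.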

SLIDE 4

Example

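(Figure: DFA with states 1 and 2; a loop labelled 1 on state 1, an edge labelled 0 from state 1 to state 2, and a loop labelled 0,1 on state 2. Result: $R^{(2)}_{12} = 1^*0(0 + 1)^*$.)

$R^{(0)}_{11} = \lambda + 1$
$R^{(0)}_{12} = 0$
$R^{(0)}_{21} = \emptyset$
$R^{(0)}_{22} = \lambda + 0 + 1$

$R^{(1)}_{11} = \lambda + 1 + (\lambda + 1)(\lambda + 1)^*(\lambda + 1) = 1^*$
$R^{(1)}_{12} = 0 + (\lambda + 1)(\lambda + 1)^*0 = 1^*0$
$R^{(1)}_{21} = \emptyset + \emptyset(\lambda + 1)^*(\lambda + 1) = \emptyset$
$R^{(1)}_{22} = \lambda + 0 + 1 + \emptyset(\lambda + 1)^*0 = \lambda + 0 + 1$

$R^{(2)}_{11} = 1^* + 1^*0(\lambda + 0 + 1)^*\emptyset = 1^*$
$R^{(2)}_{12} = 1^*0 + 1^*0(\lambda + 0 + 1)^*(\lambda + 0 + 1) = 1^*0(0 + 1)^*$
$R^{(2)}_{21} = \emptyset + (\lambda + 0 + 1)(\lambda + 0 + 1)^*\emptyset = \emptyset$
$R^{(2)}_{22} = \lambda + 0 + 1 + (\lambda + 0 + 1)(\lambda + 0 + 1)^*(\lambda + 0 + 1) = (0 + 1)^*$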

SLIDE 5

From a DFA to RegE

Let A be a DFA with n states, $Q_A = \{1, \ldots, n\}$. Let $R^{(k)}_{ij}$ be a regular expression with $L(R^{(k)}_{ij}) = \{w \mid \delta^*_{\le k}(i, w) = j\}$, the set of words that take A from state i to state j along a path on which no intermediate state has an index higher than k. We construct $R^{(k)}_{ij}$ iteratively for k = 0, ..., n.

k = 0, i ≠ j: $R^{(0)}_{ij} = a_1 + a_2 + \ldots + a_m$, where $a_1, a_2, \ldots, a_m$ are the symbols on the edges from i to j (in particular, $R^{(0)}_{ij} = \emptyset$ for m = 0 and $R^{(0)}_{ij} = a$ for m = 1).

k = 0, i = j: loops, $R^{(0)}_{ii} = \lambda + a_1 + a_2 + \ldots + a_m$, where $a_1, a_2, \ldots, a_m$ are the symbols on the loops at i.

SLIDE 6
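Induction. Suppose we have $R^{(k)}_{ij}$ for all i, j ∈ Q. We construct $R^{(k+1)}_{ij}$.

(Figure: paths from i to j, either directly via $R^{(k)}_{i,j}$, or via $R^{(k)}_{i,(k+1)}$ into state k+1, around the loop $R^{(k)}_{(k+1),(k+1)}$, and via $R^{(k)}_{(k+1),j}$ on to j.)

$R^{(k+1)}_{ij} = R^{(k)}_{ij} + R^{(k)}_{i(k+1)} (R^{(k)}_{(k+1)(k+1)})^* R^{(k)}_{(k+1)j}$

Paths from i to j that do not pass through k+1 are already in $R^{(k)}_{ij}$.
Paths from i to j through k+1, with possible loops, are expressed by $R^{(k)}_{i(k+1)} (R^{(k)}_{(k+1)(k+1)})^* R^{(k)}_{(k+1)j}$.

Finally, $RegE = \bigoplus_{j \in F_A} R^{(n)}_{1j}$, the union over all accepting states j.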
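The recurrence can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's code: the function name `dfa_to_regex`, the encoding of ∅ as `None` and of λ as the empty string, and the use of Python's `re` syntax (`|` for `+`) are all assumptions made here, and no simplification of the resulting expression is attempted.

```python
import re
from itertools import product

def dfa_to_regex(n, delta, start, accepting):
    """Build a Python-re pattern for a DFA with states 1..n via the R^(k)_ij recurrence.

    delta maps (state, symbol) -> state. None encodes the empty set,
    "" encodes lambda, and "|" plays the role of "+".
    """
    def union(r, s):
        if r is None:
            return s
        if s is None:
            return r
        return r if r == s else f"(?:{r}|{s})"

    def concat(r, s):
        return None if r is None or s is None else r + s

    def star(r):
        return "" if r is None or r == "" else f"(?:{r})*"

    states = range(1, n + 1)
    # k = 0: symbols on direct edges; lambda is added on the diagonal.
    R = {(i, j): ("" if i == j else None) for i in states for j in states}
    for (p, a), q in delta.items():
        R[p, q] = union(R[p, q], re.escape(a))

    # Induction step: additionally allow state k as an intermediate state.
    for k in states:
        loop = star(R[k, k])
        R = {(i, j): union(R[i, j], concat(concat(R[i, k], loop), R[k, j]))
             for i in states for j in states}

    # Union over all accepting states; None means the language is empty.
    result = None
    for j in sorted(accepting):
        result = union(result, R[start, j])
    return result

# The two-state DFA of the preceding example: it accepts words containing a 0.
EXAMPLE_DELTA = {(1, "1"): 1, (1, "0"): 2, (2, "0"): 2, (2, "1"): 2}
EXAMPLE_PATTERN = re.compile(dfa_to_regex(2, EXAMPLE_DELTA, 1, {2}))
```

The unsimplified pattern is much longer than 1∗0(0 + 1)∗, which illustrates why the hand computation simplifies after every step.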

SLIDE 7

DFA to RegE by Successive State Elimination

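The previous method may generate expressions of up to $4^n$ symbols. The following algorithm sometimes avoids the duplication. We allow the edges of the graph to be annotated with regular expressions (a transformation of the automaton).

State s selected for elimination (figure): predecessors $q_1, q_2, \ldots, q_k$ with edges $Q_1, Q_2, \ldots, Q_k$ into s; successors $p_1, \ldots, p_m$ with edges $P_1, \ldots, P_m$ out of s; a loop S on s; direct edges $R_{ij}$ from $q_i$ to $p_j$.

After s is eliminated (figure): the edge from $q_i$ to $p_j$ is labelled $R_{ij} + Q_i S^* P_j$, for i = 1, ..., k and j = 1, ..., m.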

SLIDE 8

A RegE from a DFA

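For every accepting state q ∈ F we eliminate all states p ∈ Q \ {q, q0}.
For q ≠ q0 we are left with two states (figure: loop R on q0, edge S from q0 to q, loop U on q, edge T from q back to q0) and take RegE(q) = (R + SU∗T)∗SU∗.
For q = q0 we are left with a single state (figure: loop R on q0) and take RegE(q) = R∗.
Finally, take the union (addition) over all accepting states: RegE(DFA) = ⊕_{q∈F} RegE(q).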

SLIDE 9

Example

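Example 4.2
An automaton that accepts words with a 1 at the second-last or the third-last position.

The original automaton (figure): states A, B, C, D; a loop labelled 0,1 on A; edges A → B on 1, B → C on 0,1 and C → D on 0,1.
We replace the symbol lists by regular expressions: loop 0+1 on A; A → B on 1, B → C on 0+1, C → D on 0+1.
We eliminate B: loop 0+1 on A; A → C on 1(0+1), C → D on 0+1.
We eliminate C: loop 0+1 on A; A → D on 1(0+1)(0+1).
We get the RegE (0 + 1)∗1(0 + 1) + (0 + 1)∗1(0 + 1)(0 + 1), one summand for each of the accepting states C and D.

[Elimination order] We start with the states that are neither accepting nor initial: q ∉ F, q ≠ q0.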
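The expression obtained in Example 4.2 can be compared with the informal description of the language on all short words. A small sketch in Python's `re` syntax (`|` for `+`); the names `PATTERN`, `has_one_second_or_third_last` and `agrees_up_to` are chosen here for illustration.

```python
import re
from itertools import product

# "+" on the slide corresponds to "|" in Python's re syntax.
PATTERN = re.compile(r"(0|1)*1(0|1)|(0|1)*1(0|1)(0|1)")

def has_one_second_or_third_last(w):
    """Direct statement of the language: a 1 at position -2 or -3."""
    return (len(w) >= 2 and w[-2] == "1") or (len(w) >= 3 and w[-3] == "1")

def agrees_up_to(max_len):
    """True if the expression and the description agree on all words up to max_len."""
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            w = "".join(letters)
            if bool(PATTERN.fullmatch(w)) != has_one_second_or_third_last(w):
                return False
    return True
```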

SLIDE 10

From a RegE to λ–NFA

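By induction on the structure of R. In each step we construct a λ-NFA E that recognizes the same language, L(R) = L(E), with three additional properties:

  • 1. Exactly one accepting state.
  • 2. No edges into the initial state.
  • 3. No edges from the accepting state.

Basis (figures):
Empty string λ: a single λ-edge from the initial state to the accepting state.
Empty set ∅: an initial state and an accepting state with no edge between them.
A single symbol a: a single edge labelled a from the initial state to the accepting state.

Induction (figures):
Addition R + S: a new initial state with λ-edges into the automata for R and S, and λ-edges from their accepting states into a new accepting state.
Concatenation RS: a λ-edge from the accepting state of R to the initial state of S.
Iteration R∗: a new initial state and a new accepting state, with λ-edges from the new initial state into R and to the new accepting state, and from the accepting state of R back to the initial state of R and on to the new accepting state.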
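The inductive construction can be sketched as a recursive function over an expression tree. Everything about the encoding is an assumption made for this illustration: expressions are nested tuples, states are integers drawn from a counter, and a λ-edge is an edge labelled `None`. The simulation at the end follows the usual λ-closure approach.

```python
from itertools import count

def build(expr, fresh=None):
    """Return (start, accept, edges) of a lambda-NFA for an expression tree.

    expr is ("empty",), ("lam",), ("sym", a), ("+", r, s), (".", r, s) or ("*", r).
    Edges are (source, label, target) triples; label None marks a lambda-edge.
    """
    if fresh is None:
        fresh = count()
    kind = expr[0]
    start, accept = next(fresh), next(fresh)
    if kind == "empty":
        return start, accept, []
    if kind == "lam":
        return start, accept, [(start, None, accept)]
    if kind == "sym":
        return start, accept, [(start, expr[1], accept)]
    s1, a1, e1 = build(expr[1], fresh)
    if kind == "*":
        extra = [(start, None, s1), (start, None, accept),
                 (a1, None, s1), (a1, None, accept)]
        return start, accept, e1 + extra
    s2, a2, e2 = build(expr[2], fresh)
    if kind == "+":
        extra = [(start, None, s1), (start, None, s2),
                 (a1, None, accept), (a2, None, accept)]
        return start, accept, e1 + e2 + extra
    if kind == ".":
        # Concatenation adds no states; the two fresh numbers stay unused.
        return s1, a2, e1 + e2 + [(a1, None, s2)]
    raise ValueError(f"unknown node kind: {kind}")

def closure(states, edges):
    """All states reachable from the given ones along lambda-edges."""
    stack, seen = list(states), set(states)
    while stack:
        p = stack.pop()
        for src, label, dst in edges:
            if src == p and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(expr, word):
    """Simulate the constructed lambda-NFA on word."""
    start, accept, edges = build(expr)
    current = closure({start}, edges)
    for symbol in word:
        moved = {dst for src, label, dst in edges
                 if src in current and label == symbol}
        current = closure(moved, edges)
    return accept in current

# (lambda + 1)(01)*(lambda + 0): the alternating words over {0, 1}.
LAM, ZERO, ONE = ("lam",), ("sym", "0"), ("sym", "1")
ALTERNATING = (".", (".", ("+", LAM, ONE), ("*", (".", ZERO, ONE))), ("+", LAM, ZERO))
```

Each constructor adds at most two states and four edges, which is where the linear size of the resulting λ-NFA comes from.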

SLIDE 11

Pattern search in the text

Static text: we create indexes rather than using RegE. RegE are useful for dynamic text (like news).

Example 4.3 (Search for streets in addresses on the web)
Street identifier: Street|St\.|Avenue|Ave\.|Road|Rd\.
The name before it: '[A-Z][a-z]*( [A-Z][a-z]*)*'
House number: [0-9]+[A-Z]?
All together: '[0-9]+[A-Z]? [A-Z][a-z]*( [A-Z][a-z]*)* (Street|St\.|Avenue|Ave\.|Road|Rd\.)'
The identifier alternatives are grouped in parentheses; without them, | would split the whole pattern.

We are missing:
Boulevard, Place, Way
Streets without any identifier (almost all Czech streets)
Street names with numbers
...
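The combined pattern of Example 4.3 can be tried out with Python's `re` module. This is an illustrative sketch: the names `STREET_ADDRESS` and `find_addresses` and the sample sentence in the usage note are invented here, and the identifier alternatives are grouped in parentheses so that `|` does not split the whole pattern.

```python
import re

STREET_ADDRESS = re.compile(
    r"[0-9]+[A-Z]? "                         # house number with optional letter
    r"[A-Z][a-z]*( [A-Z][a-z]*)* "           # one or more capitalized name words
    r"(Street|St\.|Avenue|Ave\.|Road|Rd\.)"  # street identifier
)

def find_addresses(text):
    """Return every substring of text that matches the address pattern."""
    return [m.group(0) for m in STREET_ADDRESS.finditer(text)]
```

For instance, `find_addresses("Deliver to 123A Main Street or 45 Old Mill Rd. today")` finds both addresses, while an address such as "Dlouhá 12" has no identifier and is missed, as the slide points out.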

SLIDE 12

Converting Among Representations

Converting NFA to DFA
λ-closure in $O(n^3)$: search n states multiplied by $n^2$ arcs for λ-transitions.
Subset construction: a DFA with possibly $2^n$ states. For each state, $O(n^3)$ time to compute the transitions.

(Figure: conversions among λ-NFA, NFA, RegE and DFA, with edge costs $O(n^3 2^n)$, $O(n^3 2^n)$, $O(n)$, $O(n^3 4^n)$, $O(n)$, $O(n)$.)

Converting DFA to NFA
Just modify the transition table by putting set brackets around the states and, in the case of a λ-NFA, adding a column for λ.

Automaton to regular expression conversion: $O(n^3 4^n)$.
RegE to automaton conversion: a λ-NFA in time $O(n)$.
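The subset construction mentioned above can be sketched as a breadth-first exploration of the reachable subsets. The sketch assumes an NFA without λ-edges given as a dictionary from (state, symbol) to a set of states; the function names and the example NFA (a 1 at the second-last or third-last position, states A to D) are illustrative choices.

```python
from collections import deque

def subset_construction(alphabet, delta, start, accepting):
    """Return (states, dfa_delta, start_set, dfa_accepting) of the subset DFA.

    delta maps (state, symbol) -> set of NFA states; missing keys mean no move.
    Only subsets reachable from {start} are generated.
    """
    start_set = frozenset([start])
    dfa_delta, seen, queue = {}, {start_set}, deque([start_set])
    while queue:
        subset = queue.popleft()
        for a in alphabet:
            target = frozenset(q for p in subset for q in delta.get((p, a), ()))
            dfa_delta[subset, a] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    dfa_accepting = {s for s in seen if s & set(accepting)}
    return seen, dfa_delta, start_set, dfa_accepting

def dfa_accepts(dfa, word):
    """Run the subset DFA on word."""
    _, dfa_delta, current, dfa_accepting = dfa
    for a in word:
        current = dfa_delta[current, a]
    return current in dfa_accepting

# NFA for "1 at the second-last or third-last position".
NFA_DELTA = {
    ("A", "0"): {"A"}, ("A", "1"): {"A", "B"},
    ("B", "0"): {"C"}, ("B", "1"): {"C"},
    ("C", "0"): {"D"}, ("C", "1"): {"D"},
}
DFA = subset_construction("01", NFA_DELTA, "A", {"C", "D"})
```

For this four-state NFA the construction reaches 8 subsets, well below the worst case of $2^4 = 16$.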

SLIDE 13

String Substitution, (String) Homomorphism

Definition 4.3 (String Substitution, (String) Homomorphism)
Let Σ be a finite alphabet. For each x ∈ Σ let σ(x) be a language over an alphabet $Y_x$. Further, we define:
$\sigma(\lambda) = \{\lambda\}$
$\sigma(u.v) = \sigma(u).\sigma(v)$
The mapping $\sigma : \Sigma^* \to P(Y^*)$, where $Y = \bigcup_{x \in \Sigma} Y_x$, is a substitution. For a language L we define
$\sigma(L) = \bigcup_{w \in L} \sigma(w)$
An e-free (λ-free) substitution is a substitution where no σ(x) contains λ.
For $w = a_1 \ldots a_n \in \Sigma^n$ we have $\sigma(w) = \sigma(a_1) \ldots \sigma(a_n)$.

Example 4.4 (substitution)
$\sigma(0) = \{a^i b^j \mid i, j \ge 0\}$, $\sigma(1) = \{cd\}$
$\sigma(010) = \{a^i b^j c d a^k b^l \mid i, j, k, l \ge 0\}$

SLIDE 14

(String) Homomorphism

Definition 4.4 ((String) Homomorphism)
A homomorphism h is a special case of a substitution where each h(x) is a single word: $h(x) = w_x$ for all x ∈ Σ. If $w_x \ne \lambda$ for all x, then h is an e-free (λ-free) homomorphism.
Inverse homomorphism: $h^{-1}(L) = \{w \mid h(w) \in L\}$.

Example 4.5 (homomorphism)
The function h defined by h(0) = ab and h(1) = λ is a homomorphism. For example, h(0011) = abab. For L = 10∗1 we have h(L) = (ab)∗.

Theorem (Closure under homomorphism)
If the language L and the languages σ(x) for all x ∈ Σ are regular, then so are σ(L), h(L) and $h^{-1}(L)$.
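Example 4.5 is easy to replay in code: a homomorphism is determined by its values on single symbols and extends to words by concatenation. A minimal sketch; the names `H` and `apply_h` are chosen here.

```python
# The homomorphism of Example 4.5: h(0) = ab, h(1) = lambda (the empty word).
H = {"0": "ab", "1": ""}

def apply_h(word, h=H):
    """Extend h from symbols to words: h(a1 ... an) = h(a1) ... h(an)."""
    return "".join(h[x] for x in word)

def image_of_sample(k_max):
    """Images of the words 1 0^k 1 from L = 10*1 for k = 0..k_max."""
    return [apply_h("1" + "0" * k + "1") for k in range(k_max + 1)]
```

`apply_h("0011")` gives `"abab"`, and the images of 1 0^k 1 are exactly the words (ab)^k, in line with h(L) = (ab)∗.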

SLIDE 15

Homomorphism preserves regularity

Theorem 4.2
If L is a regular language over an alphabet Σ and h is a homomorphism on Σ, then h(L) is also regular.

Proof. Let L = L(R) for some regular expression R. Let h(E) denote the expression obtained from E by replacing each symbol a with h(a). The proof is by structural induction on the sub-expressions E of R: we claim L(h(E)) = h(L(E)).
Basis: h({λ}) = {λ}, h(∅) = ∅. If E = a then L(E) = {a}, so h(L(E)) = {h(a)}. Thus, L(h(E)) = {h(a)}.
Induction:
Union: L(h(F + G)) = L(h(F) + h(G)) = L(h(F)) ∪ L(h(G)) and h(L(F + G)) = h(L(F) ∪ L(G)) = h(L(F)) ∪ h(L(G)). The right-hand sides are equal by the inductive hypothesis, therefore the left-hand sides are equal as well.
The proofs for concatenation and closure are similar.

SLIDE 16

Inverse Homomorphism

Definition 4.5 (Inverse homomorphism)
Suppose h is a homomorphism from some alphabet Σ to strings in another alphabet T, and L is a language over T. Then $h^{-1}(L)$, 'h inverse of L', is the set of strings w in Σ∗ such that h(w) is in L.

Example 4.6
Let L = (00 + 1)∗, h(a) = 01 and h(b) = 10. We claim $h^{-1}(L) = (ba)^*$.
Proof: h((ba)∗) ⊆ L is easy to see, since h(ba) = 1001. Any other w generates an isolated 0 (4 cases to consider).

(Figure: a homomorphism applied in the forward direction, from L to h(L), and in the inverse direction, from L to $h^{-1}(L)$.)

SLIDE 17

Inverse Homomorphism DFA

Theorem 4.3
If h is a homomorphism from alphabet Σ to alphabet T, and L is a regular language over T, then $h^{-1}(L)$ is also a regular language.

Proof. The proof starts with a DFA A for L. We construct a DFA for $h^{-1}(L)$. For $A = (Q, T, \delta, q_0, F)$ we define $B = (Q, \Sigma, \delta_B, q_0, F)$, where $\delta_B(q, a) = \delta^*(q, h(a))$ ($\delta^*$ operates on strings).
By induction on |w|, $\delta_B^*(q_0, w) = \delta^*(q_0, h(w))$.
Therefore, B accepts exactly those strings w that are in $h^{-1}(L)$.

(Figure: the DFA for $h^{-1}(L)$ applies h to its input w and then simulates the DFA A for L on h(w), which accepts or rejects.)
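The construction of B can be sketched directly from the definition of its transition function. The concrete DFA below for L = (00 + 1)∗ and its state names are assumptions made for this illustration (the slides do not give this automaton); the homomorphism is the one of Example 4.6.

```python
def run(delta, state, word):
    """Extended transition function delta*: feed a whole word to the DFA."""
    for symbol in word:
        state = delta[state, symbol]
    return state

def inverse_homomorphism_dfa(states, delta, h):
    """Transition table of B: delta_B(q, a) = delta*(q, h(a))."""
    return {(q, a): run(delta, q, image) for q in states for a, image in h.items()}

# Hypothetical DFA A for L = (00 + 1)*: "even" is initial and accepting,
# "odd" means one unpaired 0 was just read, "dead" is the trap state.
STATES = {"even", "odd", "dead"}
DELTA = {
    ("even", "0"): "odd",  ("even", "1"): "even",
    ("odd", "0"): "even",  ("odd", "1"): "dead",
    ("dead", "0"): "dead", ("dead", "1"): "dead",
}
H = {"a": "01", "b": "10"}  # homomorphism of Example 4.6
DELTA_B = inverse_homomorphism_dfa(STATES, DELTA, H)

def in_inverse_image(word):
    """True if word is accepted by B, i.e. h(word) is in L."""
    return run(DELTA_B, "even", word) == "even"
```

B keeps the states, initial state and accepting states of A; only the alphabet and the transition table change. On short words over {a, b} it accepts exactly the words of the form (ba)^k, matching the claim of Example 4.6.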

SLIDE 18

Visit every state example

Example 4.7
Suppose A = (Q, Σ, δ, q0, F) is a DFA. Let L be the language of all strings w in Σ∗ such that δ∗(q0, w) is in F and for every state q ∈ Q there is some prefix $x_q$ of w such that $\delta^*(q_0, x_q) = q$. This language is regular.

M: M = L(A), the language accepted by the DFA A in the usual way.
T: We define a new alphabet T of triples {[paq]; p, q ∈ Q, a ∈ Σ, δ(p, a) = q}.
h: We define the homomorphism h([paq]) = a for all p, q, a.
L1: The language $L_1 = h^{-1}(M)$ is regular since M is regular (DFA and inverse homomorphism). For the two-state automaton in the figure, $h^{-1}(101)$ includes $2^3 = 8$ strings, like [p1p][q0q][p1p] ∈ {[p1p], [q1q]}{[p0q], [q0q]}{[p1p], [q1q]}.
We construct L from L1 (next slide).

(Figure: DFA with states p and q; a loop labelled 1 on p, an edge labelled 0 from p to q, and a loop labelled 0,1 on q.)

SLIDE 19

L2: Enforce the start at q0. Define $E_1 = \bigcup_{a \in \Sigma, q \in Q} \{[q_0 a q]\} = \{[q_0a_1q_0], [q_0a_2q_1], \ldots, [q_0a_mq_n]\}$. Then $L_2 = L_1 \cap L(E_1.T^*)$.

L3: Adjacent states must be equal. Define the non-matching pairs $E_2 = \bigcup_{q \ne r;\ p,q,r,s \in Q;\ a,b \in \Sigma} \{[paq][rbs]\}$. Define $L_3 = L_2 - L(T^*.E_2.T^*)$. Every word of $L_3$ ends in an accepting state, since we started from M, the language of accepting computations of the DFA A.

L4: All states. For each state q ∈ Q, let $E_q$ be the regular expression that is the sum of all the symbols in T in which q appears in neither the first nor the last position. We subtract $L(E_q^*)$ from $L_3$:
$L_4 = L_3 - \bigcup_{q \in Q} L(E_q^*)$

L: Remove the states, leave the symbols: $L = h(L_4)$. We conclude that L is regular.

In brief:
operation                 language   description
                          M          L(A)
Inverse homomorphism      L1         $h^{-1}(M) \subseteq \{[qap]\}^*$
Intersection with a RL    L2         + start at q0
Difference with a RL      L3         + adjacent states equal
Difference with a RL      L4         + all states on the path
Homomorphism              L          h([qap]) = a

SLIDE 20

Decision Properties of Regular Languages

Lemma (Testing Emptiness of Regular Languages)
For finite automata, it is a question of graph reachability of any final state from the initial one. Reachability is $O(n^2)$.

Lemma
For a regular expression, we can convert it to a λ-NFA in O(n) time and then check reachability. It can also be done by direct inspection:
Basis: ∅ denotes the empty language; λ and a are not empty.
Induction:
R = R1 + R2 is empty iff both L(R1) and L(R2) are empty.
R = R1R2 is empty iff either L(R1) or L(R2) is empty.
R = R1∗ is never empty, it includes λ.
R = (R1) is empty iff R1 is empty, since they are the same language.
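The direct inspection translates into a short recursive function. The tuple encoding of expressions used here is an assumption of this sketch: `("empty",)`, `("lam",)`, `("sym", a)`, `("+", r, s)`, `(".", r, s)` and `("*", r)`; brackets need no node of their own because the tree structure already records them.

```python
def is_empty(expr):
    """Decide whether the language of a regular-expression tree is empty."""
    kind = expr[0]
    if kind == "empty":
        return True                      # the empty-set expression
    if kind in ("lam", "sym"):
        return False                     # {lambda} and {a} are non-empty
    if kind == "+":
        return is_empty(expr[1]) and is_empty(expr[2])
    if kind == ".":
        return is_empty(expr[1]) or is_empty(expr[2])
    if kind == "*":
        return False                     # a star always contains lambda
    raise ValueError(f"unknown node kind: {kind}")
```

The function visits every node once, so it runs in time linear in the size of the expression.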

SLIDE 21

Testing Membership in a Regular Language

Given a string w with |w| = n and a regular language L, is w ∈ L?
DFA: run the automaton; with a suitable representation and constant-time transitions, it takes O(n).
NFA with s states: running time $O(ns^2)$. Each input symbol is processed by taking the previous set of states, which numbers at most s states, and following the transitions from each of them.
λ-NFA: first compute the λ-closure of the initial state. Then, for each symbol, process it and compute the λ-closure of the result.
Regular expression of size s: we convert it to a λ-NFA with at most 2s states and then simulate it, taking $O(ns^2)$.
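The NFA case can be sketched by keeping the current set of states and updating it once per input symbol. The function name `nfa_accepts`, the dictionary encoding of the transition function and the example NFA (a 1 at the second-last or third-last position) are assumptions of this illustration; λ-edges are not handled.

```python
def nfa_accepts(delta, start, accepting, word):
    """Simulate an NFA without lambda-edges on word.

    delta maps (state, symbol) -> set of states; missing keys mean no move.
    Each step follows the moves of at most s current states.
    """
    current = {start}
    for symbol in word:
        current = {q for p in current for q in delta.get((p, symbol), ())}
    return bool(current & accepting)

# NFA with states A..D accepting words with a 1 at the second-last
# or third-last position; C and D are accepting.
NFA_DELTA = {
    ("A", "0"): {"A"}, ("A", "1"): {"A", "B"},
    ("B", "0"): {"C"}, ("B", "1"): {"C"},
    ("C", "0"): {"D"}, ("C", "1"): {"D"},
}

def example_accepts(word):
    return nfa_accepts(NFA_DELTA, "A", {"C", "D"}, word)
```

Unlike the subset construction, this simulation never builds more than one subset at a time, which is why the cost stays polynomial in s.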
