CS 301 Lecture 10: Chomsky Normal Form (Stephen Checkoway, February 19, 2018)



slide-1
SLIDE 1

CS 301

Lecture 10 – Chomsky Normal Form Stephen Checkoway February 19, 2018

1 / 23

slide-2
SLIDE 2

More CFLs

  • A = {a^i b^j c^k ∣ i ≤ j or i = k}
  • B = {w ∣ w ∈ {a, b, c}∗ contains the same number of a's as b's and c's combined}
  • C = {1^m + 1^n = 1^(m+n) ∣ m, n ≥ 1}; Σ = {1, +, =}
  • D = (abb∗ ∣ bbaa)∗
  • E = {w ∣ w ∈ {0, 1}∗ and w^R is a binary number not divisible by 5}

2 / 23

slide-8
SLIDE 8

Another proof that regular languages are context-free

We can encode the computation of a DFA on a string using a CFG.

Given a DFA M = (Q, Σ, δ, q0, F), we can construct an equivalent CFG G = (V, Σ, R, S) where

  • states of M are variables in G,
  • q0 is the start variable, and
  • transitions δ(q, t) = r become rules q → tr

If on input w = w1w2⋯wn, M goes through states r0, r1, . . . , rn, then

r0 ⇒ w1r1 ⇒ w1w2r2 ⇒ ⋯ ⇒ w1w2⋯wnrn

So G has derived the string wrn, but this still contains a variable.

What additional rules should we add to end up with a string of terminals?

For each state q ∈ F, add a rule q → ε

3 / 23

slide-10
SLIDE 10

Formally

Proof.

Given a DFA M = (Q, Σ, δ, q0, F), we can construct an equivalent CFG G = (V, Σ, R, S) where

V = Q
S = q0
R = {q → tr ∶ δ(q, t) = r} ∪ {q → ε ∶ q ∈ F}

If r0, r1, . . . , rn is the computation of M on input w = w1w2⋯wn, then r0 = q0 and δ(ri−1, wi) = ri for 1 ≤ i ≤ n

By construction, r0 ⇒ w1r1 ⇒ w1w2r2 ⇒∗ w1w2⋯wnrn

Therefore, w ∈ L(M) iff rn ∈ F iff rn ⇒ ε iff q0 ⇒∗ w iff w ∈ L(G)

4 / 23

slide-12
SLIDE 12

Returning to our language

E = {w ∣ w ∈ {0,1}∗ and w^R is a binary number not divisible by 5}

[Figure: DFA with states Q0, Q1, Q2, Q3, Q4]

Q0 → 0Q0 ∣ 1Q2
Q1 → 0Q3 ∣ 1Q0 ∣ ε
Q2 → 0Q1 ∣ 1Q3 ∣ ε
Q3 → 0Q4 ∣ 1Q1 ∣ ε
Q4 → 0Q2 ∣ 1Q4 ∣ ε
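The DFA-to-CFG construction can be sketched in a few lines of Python. This is an illustration rather than part of the lecture: the function names (`dfa_to_cfg`, `dfa_accepts`, `cfg_derives`) and the tuple encoding of right-hand sides are assumptions, and the transition table is simply read back off the rules for E (Q0 is taken to be the start state and the only non-accepting one).

```python
from itertools import product

def dfa_to_cfg(delta, accepting):
    """Return rules as {variable: set of right-hand sides}; each RHS is a tuple."""
    rules = {}
    for (q, t), r in delta.items():
        rules.setdefault(q, set()).add((t, r))  # δ(q, t) = r becomes q → t r
        rules.setdefault(r, set())
    for q in accepting:
        rules.setdefault(q, set()).add(())      # q ∈ F becomes q → ε
    return rules

def dfa_accepts(delta, start, accepting, w):
    q = start
    for t in w:
        q = delta[(q, t)]
    return q in accepting

def cfg_derives(rules, start, w):
    """Membership test that only handles grammars of the shape built above."""
    current = {start}
    for t in w:
        current = {rhs[1] for q in current for rhs in rules[q]
                   if rhs and rhs[0] == t}
    return any(() in rules[q] for q in current)

# Transition table assumed from the grammar for E.
DELTA = {
    ("Q0", "0"): "Q0", ("Q0", "1"): "Q2",
    ("Q1", "0"): "Q3", ("Q1", "1"): "Q0",
    ("Q2", "0"): "Q1", ("Q2", "1"): "Q3",
    ("Q3", "0"): "Q4", ("Q3", "1"): "Q1",
    ("Q4", "0"): "Q2", ("Q4", "1"): "Q4",
}
ACCEPTING = {"Q1", "Q2", "Q3", "Q4"}
RULES = dfa_to_cfg(DELTA, ACCEPTING)

for q in sorted(RULES):
    alternatives = sorted("".join(rhs) or "ε" for rhs in RULES[q])
    print(q, "→", " | ".join(alternatives))
```

Comparing `dfa_accepts` and `cfg_derives` on all short strings is a cheap sanity check that the grammar and the DFA agree.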

5 / 23

slide-13
SLIDE 13

Chomsky Normal Form (CNF)

A CFG G = (V, Σ, R, S) is in Chomsky Normal Form if all rules have one of these forms

  • S → ε, where S is the start variable
  • A → BC, where A ∈ V and B, C ∈ V ∖ {S}
  • A → t, where A ∈ V and t ∈ Σ

Note

  • The only rule with ε on the right has the start variable on the left
  • The start variable doesn’t appear on the right hand side of any rule
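The three allowed rule shapes translate directly into a checker. The sketch below is only an illustration: `is_cnf` and the two small grammars are hypothetical, and rules are encoded as a dictionary from each variable to a set of tuples.

```python
def is_cnf(rules, start, terminals):
    """rules maps each variable to a set of right-hand sides (tuples of symbols)."""
    for lhs, alternatives in rules.items():
        for rhs in alternatives:
            if len(rhs) == 0:
                ok = lhs == start                   # only S → ε is allowed
            elif len(rhs) == 1:
                ok = rhs[0] in terminals            # A → t
            elif len(rhs) == 2:
                # A → BC with B, C variables other than the start variable
                ok = all(x in rules and x != start for x in rhs)
            else:
                ok = False
            if not ok:
                return False
    return True

IN_CNF = {"S": {("A", "B"), ()}, "A": {("a",)}, "B": {("b",)}}
NOT_CNF = {"S": {("a", "S", "b"), ()}}  # RHS too long, and S appears on the right

print(is_cnf(IN_CNF, "S", {"a", "b"}))   # True
print(is_cnf(NOT_CNF, "S", {"a", "b"}))  # False
```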

6 / 23

slide-23
SLIDE 23

CNF example

Let A = {w ∣ w ∈ {a, b}∗ and w = w^R}.

CFG in CNF

S → AU ∣ BV ∣ a ∣ b ∣ ε
T → AU ∣ BV ∣ a ∣ b
U → TA ∣ a
V → TB ∣ b
A → a
B → b

Derivation of baaab

S ⇒ BV ⇒ bV ⇒ bTB ⇒ bAUB ⇒ baUB ⇒ baTAB ⇒ baaAB ⇒ baaaB ⇒ baaab
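A derivation like the one for baaab can be checked mechanically, one step at a time. The following is a small sketch, not part of the slides: `one_step` is a hypothetical helper, sentential forms are plain strings with one character per symbol, and the dictionary lists only the rules this particular derivation uses.

```python
def one_step(rules, before, after):
    """True if `after` results from `before` by rewriting exactly one variable."""
    for i, symbol in enumerate(before):
        for rhs in rules.get(symbol, ()):
            if before[:i] + rhs + before[i + 1:] == after:
                return True
    return False

# Only the rules needed for this derivation (a subset of the grammar).
RULES = {
    "S": ["BV"],
    "T": ["AU", "a"],
    "U": ["TA"],
    "V": ["TB"],
    "A": ["a"],
    "B": ["b"],
}

DERIVATION = ["S", "BV", "bV", "bTB", "bAUB", "baUB",
              "baTAB", "baaAB", "baaaB", "baaab"]

VALID = all(one_step(RULES, x, y) for x, y in zip(DERIVATION, DERIVATION[1:]))
print(VALID)
```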

7 / 23

slide-24
SLIDE 24

Converting to CNF

Theorem

Every context-free language A is generated by some CFG in CNF.

Proof.

Given a CFG G = (V, Σ, R, S) generating A, we construct a new CFG G′ = (V ′, Σ, R′, S′) in CNF generating A. There are five steps.

START  Add a new start variable
BIN    Replace rules with RHS longer than two with multiple rules, each of which has a RHS of length two
DEL-ε  Remove all ε-rules (A → ε)
UNIT   Remove all unit-rules (A → B)
TERM   Add a variable and rule for each terminal (T → t) and replace terminals on the RHS of rules

8 / 23

slide-33
SLIDE 33

Proof continued

In the following x ∈ V ∪ Σ and u ∈ (Σ ∪ V )+

START  Add a new start variable S′ and a rule S′ → S

BIN    Replace each rule A → xu with the rules A → xA1 and A1 → u and repeat until the RHS of every rule has length at most two

DEL-ε  For each rule of the form A → ε other than S′ → ε, remove A → ε and update all rules with A in the RHS

  • B → A. Add rule B → ε unless B → ε has already been removed
  • B → AA. Add rule B → A and, if B → ε has not already been removed, add it
  • B → xA or B → Ax. Add rule B → x

UNIT   For each rule A → B, remove it and add rules A → u for each B → u unless A → u is a unit rule already removed

TERM   For each t ∈ Σ, add a new variable T and a rule T → t; replace each t in the RHS of nonunit rules with T

Each of the five steps preserves the language generated by the grammar so L(G′) = A.
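As an illustration of the BIN step alone, here is a minimal Python sketch. The function name `binarize` and the scheme for naming fresh variables (`A_1`, `A_2`, ...) are assumptions made for this example; the sketch presumes those names do not clash with existing variables.

```python
def binarize(rules):
    """rules: {variable: list of RHS tuples}. Returns rules with every RHS of length ≤ 2."""
    out = {}
    counter = 0
    for lhs, alternatives in rules.items():
        for rhs in alternatives:
            head = lhs
            while len(rhs) > 2:
                counter += 1
                fresh = f"{lhs}_{counter}"  # hypothetical name for the new variable
                # A → x u becomes A → x A_1, and the loop continues with A_1 → u
                out.setdefault(head, []).append((rhs[0], fresh))
                head, rhs = fresh, rhs[1:]
            out.setdefault(head, []).append(rhs)
    return out

EXAMPLE = {"A": [("B", "A", "B"), ("B",), ()], "B": [("0", "0"), ()]}
BINNED = binarize(EXAMPLE)
for lhs, alternatives in BINNED.items():
    print(lhs, "→", " | ".join(" ".join(rhs) or "ε" for rhs in alternatives))
```

Rules that are already short pass through unchanged, so the step can be run on any grammar without a prior length check.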

9 / 23

slide-45
SLIDE 45

Example

Convert to CNF

A → BAB ∣ B ∣ ε
B → 00 ∣ ε

START:

S → A
A → BAB ∣ B ∣ ε
B → 00 ∣ ε

BIN: Replace A → BAB:

S → A
A → BA1 ∣ B ∣ ε
B → 00 ∣ ε
A1 → AB

DEL-ε: Remove A → ε:

S → A ∣ ε
A → BA1 ∣ B
B → 00 ∣ ε
A1 → AB ∣ B

Remove B → ε:

S → A ∣ ε
A → BA1 ∣ B ∣ A1
B → 00
A1 → AB ∣ B ∣ A ∣ ε

Don’t add A → ε because we already removed it

Remove A1 → ε:

S → A ∣ ε
A → BA1 ∣ B ∣ A1
B → 00
A1 → AB ∣ B ∣ A

Don’t add A → ε because we already removed it

UNIT: Remove S → A

S → BA1 ∣ B ∣ A1 ∣ ε
A → BA1 ∣ B ∣ A1
B → 00
A1 → AB ∣ B ∣ A

10 / 23

slide-56
SLIDE 56

Example continued

From previous slide

S → BA1 ∣ B ∣ A1 ∣ ε
A → BA1 ∣ B ∣ A1
B → 00
A1 → AB ∣ B ∣ A

Remove S → B

S → BA1 ∣ A1 ∣ ε ∣ 00
A → BA1 ∣ B ∣ A1
B → 00
A1 → AB ∣ B ∣ A

Remove S → A1

S → BA1 ∣ ε ∣ 00 ∣ AB
A → BA1 ∣ B ∣ A1
B → 00
A1 → AB ∣ B ∣ A

Don’t add S → B or S → A because we removed them

Remove A → B

S → BA1 ∣ ε ∣ 00 ∣ AB
A → BA1 ∣ A1 ∣ 00
B → 00
A1 → AB ∣ B ∣ A

Remove A → A1

S → BA1 ∣ ε ∣ 00 ∣ AB
A → BA1 ∣ 00 ∣ AB
B → 00
A1 → AB ∣ B ∣ A

Don’t add A → B because we removed it
Don’t add A → A because it’s useless

Remove A1 → B

S → BA1 ∣ ε ∣ 00 ∣ AB
A → BA1 ∣ 00 ∣ AB
B → 00
A1 → AB ∣ A ∣ 00

11 / 23

slide-60
SLIDE 60

Example continued

Copied from the previous slide

S → BA1 ∣ ε ∣ 00 ∣ AB
A → BA1 ∣ 00 ∣ AB
B → 00
A1 → AB ∣ A ∣ 00

Remove A1 → A

S → BA1 ∣ ε ∣ 00 ∣ AB
A → BA1 ∣ 00 ∣ AB
B → 00
A1 → AB ∣ 00 ∣ BA1

TERM: Add Z → 0

S → BA1 ∣ ε ∣ ZZ ∣ AB
A → BA1 ∣ ZZ ∣ AB
B → ZZ
A1 → AB ∣ ZZ ∣ BA1
Z → 0
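The worked example can be cross-checked with a short Python sketch of the whole pipeline (START, BIN, DEL-ε, UNIT, TERM). It is not the lecture's bookkeeping: DEL-ε is done here with a nullable set and UNIT with a unit-reachability closure, a common equivalent formulation. The name `to_cnf`, the fresh-variable names, and the use of `T_0` in place of Z are assumptions of this example.

```python
from itertools import product

def to_cnf(rules, start, terminals, new_start="S0"):
    """rules: {variable: set of RHS tuples}. Returns (cnf_rules, new_start)."""
    # START: add a new start variable (its name is assumed to be unused).
    g = {v: set(alts) for v, alts in rules.items()}
    g[new_start] = {(start,)}

    # BIN: split every right-hand side longer than two.
    binned, fresh = {}, 0
    for lhs, alts in g.items():
        for rhs in alts:
            head = lhs
            while len(rhs) > 2:
                fresh += 1
                name = f"{lhs}{fresh}"
                binned.setdefault(head, set()).add((rhs[0], name))
                head, rhs = name, rhs[1:]
            binned.setdefault(head, set()).add(rhs)
    g = binned

    # DEL-ε: find nullable variables, then add each variant with them omitted.
    nullable, changed = set(), True
    while changed:
        changed = False
        for lhs, alts in g.items():
            if lhs not in nullable and any(all(x in nullable for x in rhs) for rhs in alts):
                nullable.add(lhs)
                changed = True
    no_eps = {}
    for lhs, alts in g.items():
        no_eps[lhs] = set()
        for rhs in alts:
            choices = [((x,), ()) if x in nullable else ((x,),) for x in rhs]
            for pick in product(*choices):
                variant = tuple(s for part in pick for s in part)
                if variant:
                    no_eps[lhs].add(variant)
    if new_start in nullable:
        no_eps[new_start].add(())
    g = no_eps

    # UNIT: A inherits the non-unit rules of every variable it unit-reaches.
    def is_unit(rhs):
        return len(rhs) == 1 and rhs[0] not in terminals

    def unit_closure(v):
        seen, stack = {v}, [v]
        while stack:
            for rhs in g.get(stack.pop(), ()):
                if is_unit(rhs) and rhs[0] not in seen:
                    seen.add(rhs[0])
                    stack.append(rhs[0])
        return seen

    g = {v: {rhs for u in unit_closure(v) for rhs in g.get(u, ()) if not is_unit(rhs)}
         for v in g}

    # TERM: add T_t → t and replace terminals inside length-two rules.
    final = {}
    for lhs, alts in g.items():
        for rhs in alts:
            if len(rhs) == 2:
                rhs = tuple(f"T_{x}" if x in terminals else x for x in rhs)
            final.setdefault(lhs, set()).add(rhs)
    for t in terminals:
        final[f"T_{t}"] = {(t,)}
    return final, new_start

GRAMMAR = {"A": {("B", "A", "B"), ("B",), ()}, "B": {("0", "0"), ()}}
CNF, CNF_START = to_cnf(GRAMMAR, "A", {"0"}, new_start="S")
for lhs in sorted(CNF):
    print(lhs, "→", " | ".join(sorted(" ".join(rhs) or "ε" for rhs in CNF[lhs])))
```

Because BIN runs before DEL-ε, every right-hand side has at most two symbols when variants are enumerated, so each rule yields at most three variants.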

12 / 23

slide-61
SLIDE 61

Caution

Sipser gives a different procedure

1 START
2 DEL-ε
3 UNIT
4 BIN
5 TERM

This procedure works but can lead to an exponential blow up in the number of rules!

In general, if DEL-ε comes before BIN, then ∣G′∣ is O(2^(2∣G∣)); if BIN comes before DEL-ε, then ∣G′∣ is O(∣G∣^2)

UNIT is responsible for the quadratic blow up

So use whichever procedure you’d like, but Sipser’s can be very bad (Sipser’s is bad if you have long rules with lots of variables with ε-rules)

13 / 23

slide-63
SLIDE 63

Example blow up

A → BCDEEDCB ∣ CBEDDEBC
B → 0 ∣ ε
C → 1 ∣ ε
D → 2 ∣ ε
E → 3 ∣ ε

has five variables and 10 rules

Converting using START, BIN, DEL-ε, UNIT, TERM gives a CFG with 18 variables and 125 rules

Converting using START, DEL-ε, UNIT, BIN, TERM gives a CFG with 1394 variables and 1953 rules
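To see where the gap comes from, the sketch below counts only the ε-variants produced by DEL-ε for the single rule A → BCDEEDCB, once applied directly to the long rule and once after splitting it into a chain of length-two rules. It does not reproduce the totals quoted above, which cover the whole conversion. The helper `epsilon_variants` and the chain variables X1 to X6 are hypothetical names for this illustration.

```python
from itertools import product

def epsilon_variants(rhs, nullable):
    """All distinct nonempty right-hand sides obtained by dropping nullable symbols."""
    choices = [((x,), ()) if x in nullable else ((x,),) for x in rhs]
    variants = set()
    for pick in product(*choices):
        variant = tuple(s for part in pick for s in part)
        if variant:
            variants.add(variant)
    return variants

# DEL-ε applied directly to the long rule: up to 2^8 - 1 variants.
LONG_RULE = tuple("BCDEEDCB")
DIRECT = epsilon_variants(LONG_RULE, set("BCDE"))

# BIN first: A → B X1, X1 → C X2, ..., X6 → C B; every X_i is nullable too.
CHAIN = [("B", "X1"), ("C", "X2"), ("D", "X3"), ("E", "X4"),
         ("E", "X5"), ("D", "X6"), ("C", "B")]
NULLABLE = set("BCDE") | {f"X{i}" for i in range(1, 7)}
AFTER_BIN = sum(len(epsilon_variants(rhs, NULLABLE)) for rhs in CHAIN)

print("variants without BIN:", len(DIRECT))
print("variants with BIN first:", AFTER_BIN)
```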

14 / 23

slide-67
SLIDE 67

Prefix

Recall Prefix(L) = {w ∣ for some x ∈ Σ∗, wx ∈ L}

Theorem

The class of context-free languages is closed under Prefix.

Proof idea

Consider the language {w#w^R ∣ w ∈ {a, b}∗} generated by

T → aTa ∣ bTb ∣ #

Let’s convert to CNF

S → AU ∣ BV ∣ #
T → AU ∣ BV ∣ #
U → TA
V → TB
A → a
B → b

15 / 23

slide-68
SLIDE 68

Derivation of ab#ba

S ⇒ AU ⇒ aU ⇒ aTA ⇒ aBVA ⇒ abVA ⇒ abTBA ⇒ ab#BA ⇒ ab#bA ⇒ ab#ba

The prefix ab# includes

  – all terminals from subtrees with a blue root;
  – some terminals from subtrees with a violet root;
  – no terminals from subtrees with a red root

[Figure: parse tree of ab#ba with root S, internal nodes A, U, T, B, V, T, B, A and leaves a, b, #, b, a]

16 / 23

slide-69
SLIDE 69

Desired derivation for the prefix

We would like a derivation like this

S ⇒ AU ⇒ aU ⇒ aTA ⇒ aBVA ⇒ abVA ⇒ abTBA ⇒ ab#BA ⇒ ab#εA ⇒ ab#εε

Everything left of the violet path is produced
Everything right of the violet path becomes ε
The leaf connected to the violet path is produced

[Figure: the same parse tree, with the leaves under the rightmost B and A replaced by ε]

17 / 23

slide-70
SLIDE 70

The proof idea

The violet path corresponds to the point where we “split” the prefix from the remainder of the string

We want to construct a CFG that keeps track of whether a given variable in the derivation is

  L  left of the split,
  S  part of the split, or
  R  right of the split

We can construct a new CFG whose variables are ⟨A, L⟩, ⟨A, S⟩, or ⟨A, R⟩ where A is a variable in the original CFG

We have to deal with the three types of rules

  • S → ε
  • A → BC
  • A → t

and produce new rules corresponding to the variable on the LHS being left of, right of, or on the split

18 / 23

slide-71
SLIDE 71

Proof

If L = ∅, then Prefix(L) = ∅ which is CF. Otherwise, let L be CF and generated by the CFG G = (V, Σ, R, S) in CNF.

Construct a new CFG (not in CNF) G′ = (V ′, Σ, R′, S′) where

V ′ = {⟨A, D⟩ ∣ A ∈ V and D ∈ {L, S, R}}
S′ = ⟨S, S⟩

Now we just need to specify R′. We’ll start with R′ = ∅ and add rules to it

19 / 23

slide-72
SLIDE 72

Proof continued

Since L is nonempty, ε ∈ Prefix(L) so add the rule ⟨S, S⟩ → ε to R′

For each rule of the form A → BC in R, add the following rules to R′

⟨A, L⟩ → ⟨B, L⟩⟨C, L⟩                    left of the split
⟨A, S⟩ → ⟨B, L⟩⟨C, S⟩ ∣ ⟨B, S⟩⟨C, R⟩    one of B or C is on the split
⟨A, R⟩ → ⟨B, R⟩⟨C, R⟩                    right of the split

For each rule of the form A → t in R, add the following rules to R′

⟨A, L⟩ → t
⟨A, S⟩ → t
⟨A, R⟩ → ε
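The rule construction above can be sketched in Python as follows. This is an illustration under assumptions: `prefix_grammar` is a hypothetical name, a variable ⟨A, D⟩ is encoded as the tuple `(A, D)`, and the input is taken to be a CNF grammar for a nonempty language, here the grammar for {w#w^R}.

```python
def prefix_grammar(rules, start):
    """rules: CNF grammar as {variable: set of RHS tuples}. Returns (R', S')."""
    new_rules = {}

    def add(lhs, rhs):
        new_rules.setdefault(lhs, set()).add(rhs)

    add((start, "S"), ())  # ε is a prefix of every string in a nonempty L
    for a, alternatives in rules.items():
        for rhs in alternatives:
            if len(rhs) == 2:                                   # A → BC
                b, c = rhs
                add((a, "L"), ((b, "L"), (c, "L")))             # left of the split
                add((a, "S"), ((b, "L"), (c, "S")))             # split is inside C
                add((a, "S"), ((b, "S"), (c, "R")))             # split is inside B
                add((a, "R"), ((b, "R"), (c, "R")))             # right of the split
            elif len(rhs) == 1:                                 # A → t
                t = rhs[0]
                add((a, "L"), (t,))
                add((a, "S"), (t,))
                add((a, "R"), ())
    return new_rules, (start, "S")

CNF_RULES = {
    "S": {("A", "U"), ("B", "V"), ("#",)},
    "T": {("A", "U"), ("B", "V"), ("#",)},
    "U": {("T", "A")},
    "V": {("T", "B")},
    "A": {("a",)},
    "B": {("b",)},
}
PREFIX_RULES, PREFIX_START = prefix_grammar(CNF_RULES, "S")
print(len(PREFIX_RULES), "variables,",
      sum(len(alts) for alts in PREFIX_RULES.values()), "rules")
```

An original rule S → ε, if present, is skipped by the loop; the rule ⟨S, S⟩ → ε added at the top already covers it.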

20 / 23

slide-73
SLIDE 73

Proof continued

For each w = w1w2⋯wn ∈ L, S ⇒∗ A1A2⋯An where Ai ⇒ wi

By construction,

⟨S, S⟩ ⇒∗ ⟨A1, L⟩⋯⟨Ai−1, L⟩⟨Ai, S⟩⟨Ai+1, R⟩⋯⟨An, R⟩ ⇒∗ w1w2⋯wi

for each 1 ≤ i ≤ n

I.e., G′ derives every nonempty prefix of every string in L (the empty prefix comes from ⟨S, S⟩ → ε)

A similar argument works to show that if G′ derives a string then it’s a prefix of some string in L

21 / 23

slide-74
SLIDE 74

Applying the construction

Deriving ab#

⟨S, S⟩ ⇒ ⟨A, L⟩⟨U, S⟩ ⇒ a⟨U, S⟩ ⇒ a⟨T, S⟩⟨A, R⟩ ⇒ a⟨B, L⟩⟨V, S⟩⟨A, R⟩ ⇒ ab⟨V, S⟩⟨A, R⟩ ⇒ ab⟨T, S⟩⟨B, R⟩⟨A, R⟩ ⇒ ab#⟨B, R⟩⟨A, R⟩ ⇒ ab#⟨A, R⟩ ⇒ ab#

[Figure: parse tree with root ⟨S, S⟩, internal nodes ⟨A, L⟩, ⟨U, S⟩, ⟨T, S⟩, ⟨B, L⟩, ⟨V, S⟩, ⟨T, S⟩, ⟨B, R⟩, ⟨A, R⟩ and leaves a, b, #, ε, ε]

22 / 23

slide-75
SLIDE 75

Similarities with regular expression

Proving things about

  • Regular languages. Assume there exists a regular expression that generates the language and consider the six cases
  • Context-free languages. Assume there exists a CFG in CNF that generates the language and consider the three types of rules

23 / 23