INF2080 Context-Free Languages, Daniel Lupp, Universitetet i Oslo

slide-1
SLIDE 1

INF2080

Context-Free Languages

Daniel Lupp

Universitetet i Oslo

1st February 2018

Department of Informatics University of Oslo

INF2080 Lecture :: 1st February 1 / 37

slide-2
SLIDE 2

Repetition

We’ve looked at one of the simpler computational models: finite automata.
We defined (non)deterministic finite automata (NFAs/DFAs) and the languages they accept: regular languages.
We defined regular expressions, useful as a shorthand for describing languages.
A language L is regular ↔ there exists a regular expression that describes L.
The pumping lemma is a useful tool for showing that a language is nonregular.

INF2080 Lecture :: 1st February 2 / 37

slide-7
SLIDE 7

Pumping Lemma revisited

Recall the example from last week: L = {a^n b^n | n ≥ 0}
We used the pumping lemma to show that this language is not regular.
What about the following language, for Σ = {a, b, c}:
L = {a b^n c^n | n ≥ 0} ∪ {a^k w | k ≠ 1, and w ∈ Σ∗ doesn’t start with a}
Union of two languages:
first language: all words of the form a b^n c^n
second language: all words over Σ that start with either zero or at least two a’s
→ L is a disjoint union

INF2080 Lecture :: 1st February 3 / 37

slide-12
SLIDE 12

Pumping Lemma revisited

Lemma (Pumping Lemma) If A is a regular language, then there is a number p, called the pumping length, such that every word w in A of length ≥ p can be divided into three parts, w = xyz, such that

1 x y^i z ∈ A for every i ≥ 0,
2 |y| > 0,
3 |xy| ≤ p.

L = {a b^n c^n | n ≥ 0} ∪ {a^k w | k ≠ 1, and w ∈ Σ∗ doesn’t start with a}
Does L satisfy the conclusion of the pumping lemma?

INF2080 Lecture :: 1st February 4 / 37

slide-13
SLIDE 13

Pumping Lemma revisited

Lemma (Pumping Lemma, shortened) If A is regular, every w ∈ A with |w| ≥ p can be divided into three parts, w = xyz, such that

1 x y^i z ∈ A for every i ≥ 0,
2 |y| > 0,
3 |xy| ≤ p.

L = {a b^n c^n | n ≥ 0} ∪ {a^k w | k ≠ 1, and w ∈ Σ∗ doesn’t start with a}
Let p be the pumping length.
Each word in L is either of the form a b^n c^n or of the form a^k w.

INF2080 Lecture :: 1st February 5 / 37

slide-15
SLIDE 15

Pumping Lemma revisited

Lemma (Pumping Lemma, shortened) If A is regular, every w ∈ A with |w| ≥ p can be divided into three parts, w = xyz, such that

1 x y^i z ∈ A for every i ≥ 0,
2 |y| > 0,
3 |xy| ≤ p.

L = {a b^n c^n | n ≥ 0} ∪ {a^k w | k ≠ 1, and w ∈ Σ∗ doesn’t start with a}
Assume s = a b^n c^n, where n is such that |s| ≥ p.
Choose x = ε, y = a, z = b^n c^n. Then |y| > 0 and |xy| ≤ p.
The string xz = b^n c^n = a^0 b^n c^n is of the form a^k w for k = 0 ≠ 1 and w ∈ Σ∗ not starting with a.
⇒ xz ∈ L.

INF2080 Lecture :: 1st February 6 / 37

slide-19
SLIDE 19

Pumping Lemma revisited

Lemma (Pumping Lemma, shortened) If A is regular, every w ∈ A with |w| ≥ p can be divided into three parts, w = xyz, such that

1 x y^i z ∈ A for every i ≥ 0,
2 |y| > 0,
3 |xy| ≤ p.

L = {a b^n c^n | n ≥ 0} ∪ {a^k w | k ≠ 1, and w ∈ Σ∗ doesn’t start with a}
Assume s = a b^n c^n, where n is such that |s| ≥ p.
Choose x = ε, y = a, z = b^n c^n. Then |y| > 0 and |xy| ≤ p.
The string x y^i z = a^i b^n c^n for i ≥ 2 is of the form a^k w for k = i ≠ 1 and w ∈ Σ∗ not starting with a.
⇒ x y^i z ∈ L.

INF2080 Lecture :: 1st February 7 / 37

slide-21
SLIDE 21

Pumping Lemma revisited

L = {a b^n c^n | n ≥ 0} ∪ {a^k w | k ≠ 1, and w ∈ Σ∗ doesn’t start with a}
Assume s = a^k w1 w2 · · · wn, for k ≠ 1 and w = w1 w2 · · · wn ∈ Σ∗ not starting with a, where n, k are such that |s| ≥ p.
If k = 0, choose x = w1, y = w2, z = w3 · · · wn (this requires p ≥ 2, so that w2 exists). Then |y| > 0 and |xy| = 2 ≤ p.
Choosing y = w1 instead would not work in general: xz = w2 · · · wn could then start with exactly one a.
The strings xz and x y^i z for i ≥ 2 all start with w1 ≠ a, so they are in Σ∗ and don’t start with a
⇒ xz, x y^i z ∈ L.

INF2080 Lecture :: 1st February 8 / 37

slide-25
SLIDE 25

Pumping Lemma revisited

L = {a b^n c^n | n ≥ 0} ∪ {a^k w | k ≠ 1, and w ∈ Σ∗ doesn’t start with a}
Assume s = a^k w1 w2 · · · wn, for k ≠ 1 and w = w1 w2 · · · wn ∈ Σ∗ not starting with a, where n, k are such that |s| ≥ p.
If k = 2, choose x = ε, y = aa, z = w1 w2 · · · wn. Then |y| > 0 and |xy| = 2 ≤ p (for p ≥ 2).
The string xz is in Σ∗ and doesn’t start with a.
⇒ xz ∈ L.
The string x y^i z for i ≥ 1 starts with 2i ≥ 2 a’s, followed by a word w ∈ Σ∗ that does not start with an a.
⇒ x y^i z ∈ L.

INF2080 Lecture :: 1st February 9 / 37

slide-30
SLIDE 30

Pumping Lemma revisited

L = {a b^n c^n | n ≥ 0} ∪ {a^k w | k ≠ 1, and w ∈ Σ∗ doesn’t start with a}
Assume s = a^k w1 w2 · · · wn, for k ≠ 1 and w = w1 w2 · · · wn ∈ Σ∗ not starting with a, where n, k are such that |s| ≥ p.
If k ≥ 3, choose x = ε, y = a, z = a^{k−1} w1 w2 · · · wn. Then |y| > 0 and |xy| ≤ p.
The string xz is of the form a^{k−1} w with k − 1 ≥ 2, where w ∈ Σ∗ doesn’t start with a.
⇒ xz ∈ L.
The string x y^i z for i ≥ 1 is of the form a^{k+i−1} w, where w ∈ Σ∗ does not start with an a.
⇒ x y^i z ∈ L.

INF2080 Lecture :: 1st February 10 / 37
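
The case analysis above can be sanity-checked by brute force. The following is only a sketch under stated assumptions, not a proof: it tests words up to a fixed length and pumps only for i = 0..5. The helper names in_L, pumpable and all_pumpable are hypothetical, and p = 2 is an assumed choice of pumping length.

```python
from itertools import product

def in_L(s: str) -> bool:
    """Membership in L = {a b^n c^n | n >= 0} ∪ {a^k w | k != 1, w does not start with a}."""
    k = len(s) - len(s.lstrip("a"))   # number of leading a's
    if k != 1:
        return True                    # second language
    rest = s[1:]
    n = len(rest) // 2
    return rest == "b" * n + "c" * n   # first language: a b^n c^n

def pumpable(s: str, p: int, max_i: int = 5) -> bool:
    """True if some split s = xyz with |y| > 0 and |xy| <= p keeps x y^i z in L for i = 0..max_i."""
    for end in range(1, min(p, len(s)) + 1):   # end = |xy|
        for start in range(end):               # start = |x|
            x, y, z = s[:start], s[start:end], s[end:]
            if all(in_L(x + y * i + z) for i in range(max_i + 1)):
                return True
    return False

def all_pumpable(p: int, max_len: int = 7) -> bool:
    """Check every word of L with p <= length <= max_len over the alphabet {a, b, c}."""
    words = ("".join(t) for n in range(p, max_len + 1) for t in product("abc", repeat=n))
    return all(pumpable(s, p) for s in words if in_L(s))

print(all_pumpable(2))     # every tested word has a valid split
print(pumpable("bab", 1))  # with p = 1 the only split removes the b and leaves "ab"
```

With p = 1 the word bab has no valid split, which is why the sketch assumes p = 2.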

slide-35
SLIDE 35

Pumping Lemma revisited

L = {a b^n c^n | n ≥ 0} ∪ {a^k w | k ≠ 1, and w ∈ Σ∗ doesn’t start with a} can be pumped!!
Does that mean L is regular?
If L is regular, then so is L ∩ abΣ∗ (recall: regular languages are closed under intersection).
L ∩ abΣ∗ = {a b^n c^n | n ≥ 1}
Exercise: show that this language is nonregular! (analogous to the proof for a^n b^n)
So L is nonregular... is this a counter-example to the pumping lemma?
No, the pumping lemma is not an if-and-only-if statement!

INF2080 Lecture :: 1st February 11 / 37
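
The identity L ∩ abΣ∗ = {a b^n c^n | n ≥ 1} can be checked on all short words. This is a minimal sketch that assumes the membership test in_L below matches the definition of L; both function names are hypothetical, and the check covers only words up to a fixed length.

```python
from itertools import product

def in_L(s: str) -> bool:
    """Membership in L = {a b^n c^n | n >= 0} ∪ {a^k w | k != 1, w does not start with a}."""
    k = len(s) - len(s.lstrip("a"))   # number of leading a's
    if k != 1:
        return True
    rest = s[1:]
    n = len(rest) // 2
    return rest == "b" * n + "c" * n

def intersection_matches(max_len: int = 8) -> bool:
    """Compare L ∩ abΣ* with {a b^n c^n | n >= 1} on every word up to max_len."""
    for length in range(max_len + 1):
        for t in product("abc", repeat=length):
            s = "".join(t)
            in_intersection = in_L(s) and s.startswith("ab")
            n = (len(s) - 1) // 2
            expected = n >= 1 and s == "a" + "b" * n + "c" * n
            if in_intersection != expected:
                return False
    return True

print(intersection_matches())
```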

slide-40
SLIDE 40

Context-Free Grammars

Today: Context-free grammars and languages
Grammars describe the syntax of a language; they try to describe the relationship of all the parts to one another, such as the placement of nouns/verbs in sentences.
Useful for programming languages, specifically compilers and parsers: if a grammar for a programming language is available, a parser can be built from it quite systematically.

INF2080 Lecture :: 1st February 12 / 37

slide-43
SLIDE 43

Context-Free Grammars

First example:
S → aSb
S → ε
Every grammar consists of rules, each of which is a pair consisting of one variable (to the left of →) and a string of variables and symbols (to the right of →).
Every grammar contains a start variable (above: variable S). Common convention: the first listed variable is the start variable (if you choose a different start variable, you must specify it!).
Words are generated by starting with the start variable and recursively replacing variables with the right-hand side of a rule:
S ⇒ aSb ⇒ aaSbb ⇒ aaεbb = aabb

INF2080 Lecture :: 1st February 13 / 37
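
The derivation process can be sketched as a breadth-first search over strings of variables and terminals. This is an illustrative sketch, not part of the lecture: the names RULES and generate are hypothetical, variables are assumed to be single characters, and the pruning assumes that every recursive rule adds terminals (true for S → aSb).

```python
from collections import deque

RULES = {"S": ["aSb", ""]}   # "" stands for ε

def generate(rules, start, max_len):
    """Return all terminal words of length <= max_len derivable from start."""
    words, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        # position of the leftmost variable, if any
        idx = next((i for i, ch in enumerate(form) if ch in rules), None)
        if idx is None:
            words.add(form)          # no variables left: a generated word
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(ch not in rules for ch in new)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(words, key=lambda w: (len(w), w))

print(generate(RULES, "S", 6))   # ['', 'ab', 'aabb', 'aaabbb']
```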

slide-47
SLIDE 47

Parse Trees

Derivations of the form S ⇒ aSb ⇒ aaSbb ⇒ aaεbb = aabb can also be encoded as a parse tree:

        S
      / | \
     a  S  b
      / | \
     a  S  b
        |
        ε

INF2080 Lecture :: 1st February 14 / 37

slide-48
SLIDE 48

Context-Free Grammars

Second example:
S → aSa
S → bSb
S → cSc
S → ε
To simplify notation, you can combine multiple rules with the same left-hand side into one line: S → aSa | bSb | cSc | ε.
The symbol | takes on the meaning of “or.”
→ palindromes of even length over {a, b, c}.

INF2080 Lecture :: 1st February 15 / 37
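
A recognizer for this grammar can mirror its rules directly: a word is generated either by S → ε or by a rule S → xSx that peels off matching outer symbols. The function name derives is hypothetical, and the sketch is tied to this one grammar rather than being a general parser.

```python
def derives(word: str) -> bool:
    """Decide whether S → aSa | bSb | cSc | ε generates word."""
    if word == "":
        return True                      # S → ε
    if len(word) >= 2 and word[0] == word[-1] and word[0] in "abc":
        return derives(word[1:-1])       # S → xSx for x in {a, b, c}
    return False

print(derives("abccba"))  # even-length palindrome
print(derives("aba"))     # odd length, not generated
```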

slide-52
SLIDE 52

Context-Free Grammar

Definition (Context-Free Grammar) A context-free grammar is a 4-tuple (V, Σ, R, S) where

1 V is a finite set of variables,
2 Σ is a finite set of terminals, disjoint from V,
3 R is a finite set of rules, each consisting of a variable and a string of variables and terminals,
4 and S ∈ V is the start variable.

For a context-free grammar G, we call L(G) the language generated by G. A language is called a context-free language if it is generated by a context-free grammar.

INF2080 Lecture :: 1st February 16 / 37
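
The 4-tuple can be written down as a small data structure. This is one possible encoding, assumed for illustration only: the class name CFG, its field names, and the representation of a rule as a pair (variable, tuple of symbols) are not from the lecture.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CFG:
    variables: frozenset   # V
    terminals: frozenset   # Σ
    rules: frozenset       # R: pairs (variable, tuple of symbols)
    start: str             # S

    def __post_init__(self):
        # Enforce the side conditions of the definition.
        if not self.variables.isdisjoint(self.terminals):
            raise ValueError("V and Σ must be disjoint")
        if self.start not in self.variables:
            raise ValueError("the start variable must be in V")
        symbols = self.variables | self.terminals
        for head, body in self.rules:
            if head not in self.variables or any(s not in symbols for s in body):
                raise ValueError(f"malformed rule: {head} → {body}")

# The first example grammar, S → aSb | ε (the empty tuple stands for ε).
G = CFG(
    variables=frozenset({"S"}),
    terminals=frozenset({"a", "b"}),
    rules=frozenset({("S", ("a", "S", "b")), ("S", ())}),
    start="S",
)
print(G.start, len(G.rules))
```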

slide-54
SLIDE 54

Context-Free Grammar

So what can context-free grammars (CFGs) express?
Regular languages?
Is the class of context-free languages closed under union/intersection/concatenation/complement/Kleene star?
Regular languages can be modelled by an automaton with finite memory... what about context-free languages?
Answers to these over the course of this and the next lecture (and group sessions).

INF2080 Lecture :: 1st February 17 / 37

slide-59
SLIDE 59

RLs and CFLs

Can regular languages be described using context-free grammars?
Given a regular language L, there exists some DFA (Q, Σ, δ, q0, F) that accepts L.
What if we encode traversing the DFA into grammar rules, i.e., for each transition δ(qi, a) = qj we create a rule Qi → aQj?
The variables of our grammar correspond to the states in Q, with Q0 as the start variable.
How do we deal with accept states?
For each qi ∈ F, add the rule Qi → ε.
Theorem Every regular language is context-free.

INF2080 Lecture :: 1st February 18 / 37
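
The construction can be sketched in a few lines. The example DFA (words over {a, b} with an even number of a's), the function names dfa_to_cfg and derives, and the dictionary encoding of δ are assumptions made for illustration. State names double as variable names here.

```python
def dfa_to_cfg(states, delta, start, accepting):
    """Build rules Q_i → a Q_j for each transition and Q_i → ε for each accept state.

    delta maps (state, symbol) to a state; a rule body is () for ε or (symbol, variable).
    """
    rules = {q: [] for q in states}
    for (q, a), r in delta.items():
        rules[q].append((a, r))      # Q_q → a Q_r
    for q in accepting:
        rules[q].append(())          # Q_q → ε
    return rules, start

def derives(rules, var, word):
    """Check whether var derives word using the right-linear rules built above."""
    if word == "":
        return () in rules[var]
    return any(
        body and body[0] == word[0] and derives(rules, body[1], word[1:])
        for body in rules[var]
    )

# Hypothetical example DFA: even number of a's.
delta = {("even", "a"): "odd", ("even", "b"): "even",
         ("odd", "a"): "even", ("odd", "b"): "odd"}
rules, start = dfa_to_cfg({"even", "odd"}, delta, "even", {"even"})
print(derives(rules, start, "abab"), derives(rules, start, "ab"))
```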

slide-66
SLIDE 66

Properties of CFLs

Closure under union/concatenation/Kleene star?
Let G1 = (V1, Σ1, R1, S1) and G2 = (V2, Σ2, R2, S2) be two grammars that generate L1, L2 respectively (rename variables if necessary so that V1 and V2 are disjoint).
Union: create a grammar G_{L1∪L2} that generates all words w ∈ L1 ∪ L2.
Create a new start variable S. G_{L1∪L2} = (V, Σ, R, S) where V = V1 ∪ V2 ∪ {S}, Σ = Σ1 ∪ Σ2, and R = R1 ∪ R2 ∪ {S → S1 | S2}.

INF2080 Lecture :: 1st February 19 / 37

slide-70
SLIDE 70

CFL Union: Example

S1 → aS1b | ε
∪
S2 → aS2a | bS2b | cS2c | ε

Combined grammar:
S → S1 | S2
S1 → aS1b | ε
S2 → aS2a | bS2b | cS2c | ε

INF2080 Lecture :: 1st February 20 / 37

slide-72
SLIDE 72

Properties of CFLs: Concatenation

Let G1 = (V1, Σ1, R1, S1) and G2 = (V2, Σ2, R2, S2) be two grammars that generate L1, L2 respectively.
Concatenation: create a grammar G_{L1L2} = (V, Σ, R, S) that generates all words w = w1w2, where w1 ∈ L1 and w2 ∈ L2.
New start variable S, V = V1 ∪ V2 ∪ {S}, Σ = Σ1 ∪ Σ2, and R = R1 ∪ R2 ∪ {S → S1S2}.

INF2080 Lecture :: 1st February 21 / 37

slide-74
SLIDE 74

CFL Concatenation: Example

S1 → aS1b | ε
S2 → aS2a | bS2b | cS2c | ε

Combined grammar:
S → S1S2
S1 → aS1b | ε
S2 → aS2a | bS2b | cS2c | ε

INF2080 Lecture :: 1st February 22 / 37
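
The union and concatenation constructions can be tried out on the two example grammars. This is a sketch under assumptions: grammars are encoded as dictionaries from variables to lists of rule bodies (tuples of symbols), the helper words enumerates generated words by breadth-first leftmost derivation, and all function names are hypothetical. The enumeration is only guaranteed to terminate for grammars like these, where pruning by terminal count bounds the search.

```python
from collections import deque

def words(rules, start, max_len):
    """All terminal words of length <= max_len, found by leftmost derivation."""
    found, seen, queue = set(), {(start,)}, deque([(start,)])
    while queue:
        form = queue.popleft()
        i = next((j for j, sym in enumerate(form) if sym in rules), None)
        if i is None:
            found.add("".join(form))
            continue
        for body in rules[form[i]]:
            new = form[:i] + tuple(body) + form[i + 1:]
            if sum(sym not in rules for sym in new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found

def combine(r1, s1, r2, s2, bodies, new_start="S"):
    """Merge two rule sets with disjoint variables and add rules for a new start variable."""
    if set(r1) & set(r2) or new_start in r1 or new_start in r2:
        raise ValueError("variable names must be disjoint")
    return {**r1, **r2, new_start: bodies}, new_start

def union(r1, s1, r2, s2):
    return combine(r1, s1, r2, s2, [(s1,), (s2,)])     # S → S1 | S2

def concat(r1, s1, r2, s2):
    return combine(r1, s1, r2, s2, [(s1, s2)])         # S → S1 S2

G1 = {"S1": [("a", "S1", "b"), ()]}
G2 = {"S2": [("a", "S2", "a"), ("b", "S2", "b"), ("c", "S2", "c"), ()]}

L1, L2 = words(G1, "S1", 4), words(G2, "S2", 4)
print(words(*union(G1, "S1", G2, "S2"), 4) == L1 | L2)
print(sorted(words(*concat(G1, "S1", G2, "S2"), 2)))
```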

slide-76
SLIDE 76

Properties of CFLs: Kleene star

Let G1 = (V1, Σ1, R1, S1) generate language L1.
Kleene star: create a grammar G = (V, Σ, R, S) that generates all words in L1∗.
New start variable S, V = V1 ∪ {S}, Σ = Σ1, R = R1 ∪ {S → ε, S → S1S}.
S must be a new variable: if S1 occurs on the right-hand side of a rule in R1, adding S1 → ε and S1 → S1S1 directly would generate words outside L1∗ (e.g. aababb for the example below).
Example:
S1 → aS1b | ε
becomes
S → ε | S1S
S1 → aS1b | ε

INF2080 Lecture :: 1st February 23 / 37
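
A star construction can be checked on the example L1 = {a^n b^n | n ≥ 0}. The sketch below assumes a fresh start variable S with rules S → ε | S1 S, so that rules of G1 mentioning S1 on their right-hand side are left untouched; that choice, the dictionary encoding of grammars, and the helper names words and star are assumptions of this sketch.

```python
from collections import deque

def words(rules, start, max_len):
    """All terminal words of length <= max_len, found by leftmost derivation."""
    found, seen, queue = set(), {(start,)}, deque([(start,)])
    while queue:
        form = queue.popleft()
        i = next((j for j, sym in enumerate(form) if sym in rules), None)
        if i is None:
            found.add("".join(form))
            continue
        for body in rules[form[i]]:
            new = form[:i] + tuple(body) + form[i + 1:]
            if sum(sym not in rules for sym in new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found

def star(r1, s1, new_start="S"):
    """Add a fresh start variable with rules S → ε | S1 S."""
    if new_start in r1:
        raise ValueError("the new start variable must not occur in G1")
    return {**r1, new_start: [(), (s1, new_start)]}, new_start

G1 = {"S1": [("a", "S1", "b"), ()]}
rules, start = star(G1, "S1")
generated = words(rules, start, 6)
print(sorted(generated, key=lambda w: (len(w), w)))
print("aababb" in generated)   # balanced, but not a concatenation of blocks a^n b^n
```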

slide-79
SLIDE 79

Properties of CFLs

Closure under complement/intersection?
No, but we need to know more before we can show that a language is not context-free. (next week)

INF2080 Lecture :: 1st February 24 / 37

slide-81
SLIDE 81

Ambiguity

Consider the grammar
E → E + E | E × E | (E) | a
Here: the alphabet is {a, +, ×, (, )}.
→ arithmetic expressions over a
What does the parse tree for the string a + a × a look like?

INF2080 Lecture :: 1st February 25 / 37

slide-85
SLIDE 85

Ambiguity

        E
      / | \
     E  +  E
     |   / | \
     a  E  ×  E
        |     |
        a     a

Intuitively corresponds to a + (a × a)

          E
        / | \
       E  ×  E
     / | \   |
    E  +  E  a
    |     |
    a     a

Intuitively corresponds to (a + a) × a
This is called ambiguity

INF2080 Lecture :: 1st February 26 / 37

slide-90
SLIDE 90

Ambiguity

But just having multiple possible derivations does not mean that a grammar is ambiguous.
Two derivations could look different, yet be “structurally” the same: they apply the same rules to the same variables, but in a different order.
E ⇒ E + E ⇒ E + E × E ⇒ a + E × E ⇒ a + a × E ⇒ a + a × a
E ⇒ E + E ⇒ a + E ⇒ a + E × E ⇒ a + a × E ⇒ a + a × a
Both have the same parse tree!

        E
      / | \
     E  +  E
     |   / | \
     a  E  ×  E
        |     |
        a     a

INF2080 Lecture :: 1st February 27 / 37

slide-94
SLIDE 94

Ambiguity

We are interested in structurally different derivations, i.e., two derivations of the same word that remain different even when the order of derivation steps is fixed in advance.
Definition A leftmost derivation of a string replaces, in each derivation step, the leftmost variable. A string is derived ambiguously over a grammar G if it has two or more leftmost derivations over G.
If L(G) contains a string that is derived ambiguously, we say that G is ambiguous.

INF2080 Lecture :: 1st February 28 / 37
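
Leftmost derivations correspond one-to-one to parse trees, so ambiguity of a string can be tested by counting its parse trees. The following sketch does this for the grammar E → E + E | E × E | (E) | a only; the function name count_trees is hypothetical, and input strings are assumed to contain no spaces.

```python
from functools import lru_cache

def count_trees(w: str) -> int:
    """Number of parse trees of w under E → E + E | E × E | (E) | a."""
    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # number of parse trees deriving the nonempty substring w[i:j] from E
        n = 0
        if j - i == 1 and w[i] == "a":
            n += 1                                   # E → a
        if j - i >= 3 and w[i] == "(" and w[j - 1] == ")":
            n += count(i + 1, j - 1)                 # E → (E)
        for k in range(i + 1, j - 1):
            if w[k] in "+×":
                n += count(i, k) * count(k + 1, j)   # E → E + E or E → E × E
        return n
    return count(0, len(w)) if w else 0

print(count_trees("a+a×a"))     # 2: derived ambiguously
print(count_trees("(a+a)×a"))   # 1
```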

slide-97
SLIDE 97

Chomsky Normal Form

Context-free languages have a nice property: every CFL can be described by a CFG in Chomsky Normal Form.
Definition A grammar is in Chomsky Normal Form if every rule is of the form
A → BC
A → a
where a is any terminal, A is any variable, and B, C are any variables that are not the start variable. In addition, the rule S → ε is permitted, where S is the start variable.

slide-101
SLIDE 101

Proof sketch: Let G be an arbitrary grammar. First, add a new start variable S0 and the new rule S0 → S to G. Then remove all rules A → ε, followed by all "unit" rules A → B. When removing A → ε, for each occurrence of A on the right-hand side of a rule, add a new rule with that occurrence deleted. When removing A → B, add the rule A → u for each rule B → u (see the examples on the next slides). Finally, split all rules with three or more right-hand-side symbols into multiple rules with two symbols each, and replace each terminal in a two-symbol rule by a new variable that derives only that terminal (for example, S → aB becomes S → UB with a new rule U → a).

slide-103
SLIDE 103

CNF - Example

Grammar:

S → ASA | aB
A → B | S
B → b | ε

First, add a new start variable:

S0 → S
S → ASA | aB
A → B | S
B → b | ε

slide-105
SLIDE 105

CNF - Example

S0 → S
S → ASA | aB
A → B | S
B → b | ε

Then, remove B → ε:

S0 → S
S → ASA | aB | a
A → B | ε | S
B → b

slide-107
SLIDE 107

CNF - Example

S0 → S
S → ASA | aB | a
A → B | ε | S
B → b

Then, remove A → ε:

S0 → S
S → ASA | SA | AS | S | aB | a
A → S | B
B → b
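The ε-removal step can be sketched in code. This is a minimal illustration, assuming a grammar is stored as a dict from variables to lists of right-hand sides (tuples of symbols, with the empty tuple standing for ε). The function name `remove_epsilon_rule` and the representation are assumptions made for this example. It handles a single ε-rule at a time and does not track ε-rules removed in earlier steps.

```python
from itertools import product


def remove_epsilon_rule(rules, nullable):
    """Remove the rule `nullable -> ε` and, for every rule, add all variants
    obtained by deleting some occurrences of `nullable` from its right side."""
    new_rules = {}
    for head, bodies in rules.items():
        result = []
        for body in bodies:
            if head == nullable and body == ():
                continue  # this is the rule being removed
            positions = [i for i, s in enumerate(body) if s == nullable]
            # Each occurrence of `nullable` is independently kept or dropped.
            for keep in product([True, False], repeat=len(positions)):
                dropped = {p for p, k in zip(positions, keep) if not k}
                variant = tuple(s for i, s in enumerate(body) if i not in dropped)
                if head == nullable and variant == ():
                    continue  # do not reintroduce the removed rule
                if variant not in result:
                    result.append(variant)
        new_rules[head] = result
    return new_rules
```

Applied to the grammar above with `nullable="A"`, the rule S → ASA yields the variants ASA, AS, SA and S, which matches the slide up to the order of alternatives.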

slide-109
SLIDE 109

CNF - Example

S0 → S
S → ASA | SA | AS | S | aB | a
A → B | S
B → b

Then remove S → S:

S0 → S
S → ASA | SA | AS | aB | a
A → B | S
B → b

slide-111
SLIDE 111

CNF - Example

S0 → S
S → ASA | SA | AS | aB | a
A → B | S
B → b

Remove unit rule S0 → S:

S0 → ASA | SA | AS | aB | a
S → ASA | SA | AS | aB | a
A → B | S
B → b
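Unit-rule removal can be sketched in the same style. The helper below assumes a grammar stored as a dict from variables to lists of right-hand sides (tuples of symbols). The name `remove_unit_rule` and the `already_removed` parameter are illustrative assumptions. It removes one unit rule and copies the target's rules to the head, skipping unit rules that were removed before.

```python
def remove_unit_rule(rules, head, target, already_removed=()):
    """Remove the unit rule `head -> target`; for each rule `target -> u`,
    add `head -> u` unless that is a unit rule that was already removed.

    already_removed: iterable of (head, body) pairs removed in earlier steps.
    """
    unit = (target,)
    skip = set(already_removed) | {(head, unit)}
    new_rules = {h: list(bodies) for h, bodies in rules.items()}
    # Drop the unit rule itself.
    new_rules[head] = [b for b in new_rules[head] if b != unit]
    # Copy the target's right-hand sides over to the head.
    for body in rules.get(target, []):
        if (head, body) in skip or body in new_rules[head]:
            continue
        new_rules[head].append(body)
    return new_rules
```

Calling it with `head="S0"` and `target="S"` on the grammar above gives S0 the same right-hand sides as S, as on the slide. Calling it with `head="S"` and `target="S"` simply deletes S → S.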

slide-114
SLIDE 114

CNF - Example

S0 → ASA | SA | AS | aB | a
S → ASA | SA | AS | aB | a
A → B | S
B → b

You would then continue by removing the unit rules A → S, etc. But how do we convert, say, S → ASA into rules with only two symbols on the right? Introduce helper variables! The rule S → ASA becomes

S → AA1
A1 → SA
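The helper-variable trick generalizes to right-hand sides of any length. The sketch below assumes a grammar stored as a dict from variables to lists of right-hand sides (tuples of symbols). The function name `split_long_rules` is hypothetical, and the helper names A1, A2, ... follow the slide's naming. Terminals inside two-symbol rules (such as S → aB) would still need to be replaced by variables in a separate step.

```python
def split_long_rules(rules):
    """Split every rule with three or more right-hand-side symbols into a
    chain of two-symbol rules, using fresh helper variables A1, A2, ..."""
    new_rules = {head: [] for head in rules}
    counter = 0

    def fresh():
        # Generate a helper name that does not clash with an existing variable.
        nonlocal counter
        while True:
            counter += 1
            name = f"A{counter}"
            if name not in new_rules:
                return name

    for head, bodies in rules.items():
        for body in bodies:
            current, rest = head, tuple(body)
            # X -> s1 s2 ... sn becomes X -> s1 H, H -> s2 ... sn, and so on.
            while len(rest) > 2:
                helper = fresh()
                new_rules[current].append((rest[0], helper))
                new_rules[helper] = []
                current, rest = helper, rest[1:]
            new_rules[current].append(rest)
    return new_rules
```

For the single rule S → ASA this produces S → A A1 and A1 → S A, as on the slide.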

slide-118
SLIDE 118

CNF

Thus, we see how every CFG can be converted to a CFG in CNF.

This is a useful property, both for practical purposes and for theoretical work: knowing what the grammar looks like can be very beneficial (we will see an example next week).

How can finite automata be enriched so as to accept context-free languages? → next week!
