CS 301 Lecture 09 – Context-free grammars, Stephen Checkoway



slide-1
SLIDE 1

CS 301

Lecture 09 – Context-free grammars Stephen Checkoway February 14, 2018

1 / 22

slide-4
SLIDE 4

Context-free grammars (CFGs)

Method of generating (or describing) languages by giving rules to derive strings

Rules contain

  • Terminals: symbols from an alphabet (written in typewriter font)
  • Variables: symbols that expand to sequences of terminals and variables (typically upper-case letters)

Rules have a variable on the left, an arrow (→), and a sequence of terminals and variables on the right

Example:

    S → AB
    A → aA
    A → ε
    B → bB
    B → ε

We often combine multiple rules with the same left-hand side using ∣

    S → AB
    A → aA ∣ ε
    B → bB ∣ ε

2 / 22

slide-5
SLIDE 5

Deriving strings

A CFG derives a string by starting with the start variable (usually the variable on the left in the first rule) and applying rules until no variables remain

The CFG

    S → AB
    A → aA ∣ ε
    B → bB ∣ ε

derives the following strings

    S ⇒ AB ⇒ εB ⇒ εε = ε
    S ⇒ AB ⇒ aAB ⇒ aεB ⇒ aεε = a
    S ⇒ AB ⇒ aAB ⇒ aaAB ⇒ aaεB ⇒ aabB ⇒ aabε = aab
    ⋮
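The derivation process above can be sketched in Python as a bounded search over sentential forms. The encoding is an assumption made for illustration: variables are upper-case letters, terminals are lower-case letters, and the empty string stands for ε. The names `GRAMMAR`, `derived_strings`, `max_len`, and `slack` are hypothetical, and the length cut-off makes this a sketch, not a complete enumeration algorithm for arbitrary grammars.

```python
from collections import deque

# Hypothetical encoding of the slide's grammar: "" stands for ε.
GRAMMAR = {
    "S": ["AB"],
    "A": ["aA", ""],
    "B": ["bB", ""],
}

def derived_strings(grammar, start="S", max_len=2, slack=2):
    """Collect terminal strings of length <= max_len found by exploring
    left-most derivations. Sentential forms longer than max_len + slack
    are discarded, so the search is bounded (an assumption of this sketch)."""
    seen = {start}
    queue = deque([start])
    results = set()
    while queue:
        form = queue.popleft()
        # Position of the left-most variable, or None if only terminals remain.
        pos = next((i for i, c in enumerate(form) if c in grammar), None)
        if pos is None:
            results.add(form)
            continue
        for rhs in grammar[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            terminals = sum(1 for c in new if c not in grammar)
            if (terminals <= max_len
                    and len(new) <= max_len + slack
                    and new not in seen):
                seen.add(new)
                queue.append(new)
    return results

print(sorted(derived_strings(GRAMMAR), key=lambda w: (len(w), w)))
```

Raising `max_len` to 3 also reaches aab, the third string derived on the slide.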

3 / 22

slide-6
SLIDE 6

Derivations

The order in which we replace a variable in a derivation with the RHS of a production rule doesn’t matter (except in one case we’ll get to)

In a left-most derivation, we replace the left-most variable in each step

In a right-most derivation, we replace the right-most variable in each step

4 / 22

slide-14
SLIDE 14

Left-most/right-most derivation example

    S → ST ∣ aTa
    T → S ∣ aTa ∣ b

Left-most derivation of aabaaaba:

    S ⇒ ST ⇒ aTaT ⇒ aaTaaT ⇒ aabaaT ⇒ aabaaaTa ⇒ aabaaaba

Right-most derivation of aabaaaba:

    S ⇒ ST ⇒ SaTa ⇒ Saba ⇒ aTaaba ⇒ aaTaaaba ⇒ aabaaaba
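Both derivations can be replayed mechanically. The sketch below assumes single-character symbols and takes the list of right-hand sides to apply, in order; `RULES` and `replay` are hypothetical names introduced here, not part of the lecture.

```python
# The slide's grammar, with each variable mapped to its right-hand sides.
RULES = {"S": ["ST", "aTa"], "T": ["S", "aTa", "b"]}

def replay(start, choices, leftmost=True):
    """Apply the chosen right-hand sides one at a time, always to the
    left-most (or right-most) variable, and return every sentential form."""
    form, steps = start, [start]
    for rhs in choices:
        positions = [i for i, c in enumerate(form) if c in RULES]
        pos = positions[0] if leftmost else positions[-1]
        if rhs not in RULES[form[pos]]:
            raise ValueError(f"{form[pos]} -> {rhs} is not a rule")
        form = form[:pos] + rhs + form[pos + 1:]
        steps.append(form)
    return steps

left = replay("S", ["ST", "aTa", "aTa", "b", "aTa", "b"])
right = replay("S", ["ST", "aTa", "b", "aTa", "aTa", "b"], leftmost=False)
print(" => ".join(left))
print(" => ".join(right))
```

The two runs pass through different sentential forms but end at the same string.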

5 / 22

slide-15
SLIDE 15

Another example

The CFG

    S → aSb ∣ ε

derives

    S ⇒ ε
    S ⇒ aSb ⇒ ab
    S ⇒ aSb ⇒ aaSbb ⇒ aabb
    ⋮
    S ⇒ aSb ⇒ ⋯ ⇒ a^n S b^n ⇒ a^n b^n
    ⋮

The language of this CFG is {a^n b^n ∣ n ≥ 0}
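A membership test can mirror the two rules directly. This is a minimal sketch; the function name `in_anbn` is hypothetical.

```python
def in_anbn(w):
    """Mirror the grammar S → aSb | ε: the empty string is accepted
    (rule S → ε); otherwise strip one leading a and one trailing b
    (rule S → aSb) and recurse on what is left."""
    if w == "":
        return True
    return w.startswith("a") and w.endswith("b") and in_anbn(w[1:-1])

for w in ["", "ab", "aabb", "aab", "ba", "abab"]:
    print(repr(w), in_anbn(w))
```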

6 / 22

slide-17
SLIDE 17

Nested brackets

Given the alphabet Σ = {(, ), [, ]}, design a CFG that generates the language of properly nested brackets.

  • ε
  • ()
  • []
  • ([])[](())
  • . . .

    S → P ∣ B ∣ SS ∣ ε
    P → (S)
    B → [S]
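To check candidate strings against this language, a stack-based test is convenient. Note that this is a different technique from deriving strings with the grammar: it recognizes the same language of properly nested brackets but does not use the rules for S, P, and B. The names `PAIRS` and `properly_nested` are hypothetical.

```python
# Each closing bracket maps to the opening bracket it must match.
PAIRS = {")": "(", "]": "["}

def properly_nested(w):
    """Return True if w over {(, ), [, ]} is properly nested."""
    stack = []
    for c in w:
        if c in ("(", "["):
            stack.append(c)
        elif c in PAIRS:
            # A closer must match the most recent unmatched opener.
            if not stack or stack.pop() != PAIRS[c]:
                return False
        else:
            return False  # symbol outside the alphabet
    return not stack      # every opener must have been closed

for w in ["", "()", "[]", "([])[](())", "(]", "(()"]:
    print(repr(w), properly_nested(w))
```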

7 / 22

slide-18
SLIDE 18

More CFG examples

Let Σ = {a, b}. Construct a CFG for each of the following languages over Σ

  • A = Σ∗
  • B = {w ∣ w contains at least three bs}
  • C = {w ∣ w starts and ends with different symbols}
  • D = {w ∣ the length of w is odd and the middle symbol is b}
  • E = {w ∣ w = w^R}
  • F = ∅

8 / 22

slide-22
SLIDE 22

Formally speaking

A CFG is a 4-tuple G = (V, Σ, R, S) where

  • V is a finite set of variables (or nonterminals)
  • Σ is a finite set of terminals (V ∩ Σ = ∅)
  • R is a finite set of production rules
  • S ∈ V is the start variable

If u, v, w ∈ (Σ ∪ V)∗ and G has a rule A → v, then we say uAw yields uvw and write uAw ⇒ uvw

We say u derives v, written u ⇒∗ v, to mean either u = v or there exist u1, u2, …, un ∈ (Σ ∪ V)∗ such that u = u1 ⇒ u2 ⇒ ⋯ ⇒ un = v

The language of G is L(G) = {w ∣ w ∈ Σ∗ and S ⇒∗ w}

We say G generates a language A if L(G) = A
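The definitions translate fairly directly into code. The sketch below assumes single-character symbols and represents a rule A → v as the pair `(A, v)`; the names `CFG`, `yields`, `derives`, and `max_steps` are hypothetical. The "derives" check is bounded by a step limit, and the number of forms can grow quickly for larger grammars, so treat it as an illustration of the definition only.

```python
from typing import NamedTuple

class CFG(NamedTuple):
    variables: frozenset   # V
    terminals: frozenset   # Σ, disjoint from V
    rules: frozenset       # R, as pairs (A, v) standing for A → v
    start: str             # S ∈ V

def yields(g, form):
    """All strings x such that form ⇒ x: one rule applied to one occurrence."""
    out = set()
    for i, sym in enumerate(form):
        for head, body in g.rules:
            if sym == head:
                out.add(form[:i] + body + form[i + 1:])
    return out

def derives(g, u, v, max_steps=10):
    """Bounded check of u ⇒∗ v: u = v, or v is reachable from u
    in at most max_steps applications of ⇒."""
    frontier = {u}
    for _ in range(max_steps + 1):
        if v in frontier:
            return True
        frontier = {x for f in frontier for x in yields(g, f)}
    return False

# The grammar S → aSb | ε from the earlier example.
G = CFG(frozenset("S"), frozenset("ab"),
        frozenset({("S", "aSb"), ("S", "")}), "S")
print(yields(G, "aSb"))
print(derives(G, G.start, "aabb"))
```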

9 / 22

slide-24
SLIDE 24

Arithmetic expressions

Given the alphabet Σ = {(, ), +, -, *, /, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, design a CFG that generates the language of arithmetic expressions

  • 37
  • 8+22-8/6
  • 10*(8-2)
  • . . .

An expression can be a number or two expressions separated by an operator or a parenthesized expression

A number is one or more digits

    E → N ∣ E*E ∣ E/E ∣ E+E ∣ E-E ∣ (E)
    N → DN ∣ D
    D → 0 ∣ 1 ∣ 2 ∣ 3 ∣ 4 ∣ 5 ∣ 6 ∣ 7 ∣ 8 ∣ 9

10 / 22

slide-37
SLIDE 37

Parse trees give a way to visualize a derivation

    E → N ∣ E*E ∣ E/E ∣ E+E ∣ E-E ∣ (E)
    N → DN ∣ D
    D → 0 ∣ 1 ∣ 2 ∣ 3 ∣ 4 ∣ 5 ∣ 6 ∣ 7 ∣ 8 ∣ 9

Derivation

    E ⇒ E+E ⇒ E+E*E ⇒ N+E*E ⇒ D+E*E ⇒ 3+E*E ⇒ 3+N*E ⇒ 3+D*E ⇒ 3+8*E ⇒ 3+8*N ⇒ 3+8*D ⇒ 3+8*7

Parse tree (bracket notation, [X …] is a node X followed by its children)

    [E [E [N [D 3]]] + [E [E [N [D 8]]] * [E [N [D 7]]]]]
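The parse tree for 3+8*7 can be written as nested tuples to make the link between tree and derivation concrete. The encoding (a node is a tuple whose first element is the variable, a leaf is a terminal string) and the names `TREE`, `tree_yield`, and `count_rule_applications` are assumptions made for this sketch.

```python
# A node is (variable, child, child, ...); a leaf is a terminal string.
TREE = ("E",
        ("E", ("N", ("D", "3"))),
        "+",
        ("E",
         ("E", ("N", ("D", "8"))),
         "*",
         ("E", ("N", ("D", "7")))))

def tree_yield(node):
    """Concatenate the leaves from left to right: the derived string."""
    if isinstance(node, str):
        return node
    return "".join(tree_yield(child) for child in node[1:])

def count_rule_applications(node):
    """Each internal node corresponds to one rule application (one ⇒ step)."""
    if isinstance(node, str):
        return 0
    return 1 + sum(count_rule_applications(child) for child in node[1:])

print(tree_yield(TREE))
print(count_rule_applications(TREE))
```

The count matches the number of ⇒ steps in the derivation shown on the slide.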

11 / 22

slide-50
SLIDE 50

Different derivations can give rise to the same parse tree

Two different derivations give the same parse tree

    E ⇒ E+E ⇒ E+E*E ⇒ N+E*E ⇒ D+E*E ⇒ 3+E*E ⇒ 3+N*E ⇒ 3+D*E ⇒ 3+8*E ⇒ 3+8*N ⇒ 3+8*D ⇒ 3+8*7

    E ⇒ E+E ⇒ N+E ⇒ D+E ⇒ 3+E ⇒ 3+E*E ⇒ 3+N*E ⇒ 3+D*E ⇒ 3+8*E ⇒ 3+8*N ⇒ 3+8*D ⇒ 3+8*7

Parse tree for both (bracket notation, [X …] is a node X followed by its children)

    [E [E [N [D 3]]] + [E [E [N [D 8]]] * [E [N [D 7]]]]]

You can think of the derivations as filling out the tree in different orders

12 / 22

slide-51
SLIDE 51

Different derivations can give rise to different parse trees

Two different left-most derivations give rise to different parse trees

    E ⇒ E+E ⇒ N+E ⇒ D+E ⇒ 3+E ⇒ 3+E*E ⇒ 3+N*E ⇒ 3+D*E ⇒ 3+8*E ⇒ 3+8*N ⇒ 3+8*D ⇒ 3+8*7
    Parse tree: [E [E [N [D 3]]] + [E [E [N [D 8]]] * [E [N [D 7]]]]]

    E ⇒ E*E ⇒ E+E*E ⇒ N+E*E ⇒ D+E*E ⇒ 3+E*E ⇒ 3+N*E ⇒ 3+D*E ⇒ 3+8*E ⇒ 3+8*N ⇒ 3+8*D ⇒ 3+8*7
    Parse tree: [E [E [E [N [D 3]]] + [E [N [D 8]]]] * [E [N [D 7]]]]

13 / 22

slide-52
SLIDE 52

Ambiguity

Our grammar can derive this string in two different ways

    [E [E [N [D 3]]] + [E [E [N [D 8]]] * [E [N [D 7]]]]]
    [E [E [E [N [D 3]]] + [E [N [D 8]]]] * [E [N [D 7]]]]

This grammar is ambiguous because it has two different parse trees for the same string in the language

Imagine a calculator or a compiler parsing this expression

Depending on which parse tree it used, it gets different results
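To see the two results concretely, the sketch below evaluates both parse trees of 3+8*7. The nested-tuple encoding and the names `OPS`, `num`, `PLUS_AT_ROOT`, `TIMES_AT_ROOT`, and `evaluate` are assumptions for illustration; the evaluator only handles single-digit numbers and binary operators, which is all these two trees need.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def num(digit):
    """Subtree E → N → D → digit for a single-digit number."""
    return ("E", ("N", ("D", digit)))

# The two parse trees of 3+8*7 under the ambiguous grammar.
PLUS_AT_ROOT = ("E", num("3"), "+", ("E", num("8"), "*", num("7")))
TIMES_AT_ROOT = ("E", ("E", num("3"), "+", num("8")), "*", num("7"))

def evaluate(node):
    """Evaluate a parse tree bottom-up (single-digit numbers only)."""
    if isinstance(node, str):            # a digit leaf
        return int(node)
    children = node[1:]
    if len(children) == 1:               # E → N, N → D, D → digit
        return evaluate(children[0])
    left, op, right = children           # E → E op E
    return OPS[op](evaluate(left), evaluate(right))

print(evaluate(PLUS_AT_ROOT))    # tree with + at the root
print(evaluate(TIMES_AT_ROOT))   # tree with * at the root
```

The first tree computes 3+(8*7) = 59 and the second computes (3+8)*7 = 77.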

14 / 22

slide-54
SLIDE 54

Resolving ambiguity

In some cases, we can redesign the grammar to get rid of ambiguity

Instead of just expressions, let’s have expressions (E), terms (T), and factors (F)

    E → E+T ∣ E-T ∣ T
    T → T*F ∣ T/F ∣ F
    F → (E) ∣ N
    N → DN ∣ D
    D → 0 ∣ 1 ∣ ⋯ ∣ 9

This CFG has exactly the same language as the previous one but now there’s exactly one way to parse 3+8*7

    [E [E [T [F [N [D 3]]]]] + [T [T [F [N [D 8]]]] * [F [N [D 7]]]]]
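One way to see the E/T/F layering at work is a small recursive-descent evaluator with one function per variable. This is a sketch under assumptions not in the slides: the left-recursive rules E → E+T and T → T*F are handled with loops (a common workaround, since a naive recursive function for a left-recursive rule would never terminate), "/" is taken to be ordinary division, and the names `parse`, `expr`, `term`, and `factor` are hypothetical.

```python
DIGITS = set("0123456789")

def parse(text):
    """Evaluate an arithmetic expression following the E/T/F grammar."""
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else ""

    def expr():                     # E → E+T | E-T | T
        nonlocal pos
        value = term()
        while peek() in ("+", "-"):
            op = text[pos]
            pos += 1
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term():                     # T → T*F | T/F | F
        nonlocal pos
        value = factor()
        while peek() in ("*", "/"):
            op = text[pos]
            pos += 1
            rhs = factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor():                   # F → (E) | N, with N = one or more digits
        nonlocal pos
        if peek() == "(":
            pos += 1
            value = expr()
            if peek() != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1
            return value
        start = pos
        while peek() in DIGITS:
            pos += 1
        if start == pos:
            raise ValueError(f"expected a digit at position {pos}")
        return int(text[start:pos])

    value = expr()
    if pos != len(text):
        raise ValueError(f"unexpected symbol at position {pos}")
    return value

print(parse("3+8*7"))
print(parse("10*(8-2)"))
```

Because multiplication is only reachable through T, the product 8*7 is grouped before the sum, matching the single parse tree above.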

15 / 22

slide-55
SLIDE 55

Ambiguity

Equivalent statements about a CFG G

  1. G is ambiguous if a word in L(G) has two different parse trees
  2. G is ambiguous if a word in L(G) has two different left-most derivations
  3. G is ambiguous if a word in L(G) has two different right-most derivations

It is not the case that G is ambiguous if a word merely has two different derivations

16 / 22

slide-56
SLIDE 56

Context-free languages

A language A is a context-free language (CFL) if there is a CFG G that generates A (i.e., L(G) = A)

Theorem

Context-free languages are closed under union, concatenation, and Kleene star.

17 / 22

slide-61
SLIDE 61

Union

Proof.

Let G1 = (V1, Σ, R1, S1) generate A and G2 = (V2, Σ, R2, S2) generate B (assume V1 ∩ V2 = ∅, otherwise rename some variables)

Construct a new CFG G = (V, Σ, R, S), where S ∉ V1 ∪ V2 is a new start variable, to generate A ∪ B where

    V = V1 ∪ V2 ∪ {S}
    R = R1 ∪ R2 ∪ {S → S1 ∣ S2}

If w ∈ A, then G1 derives w, S1 ⇒∗ w, and so G derives w via S ⇒ S1 ⇒∗ w. If w ∈ B, then S2 ⇒∗ w so S ⇒ S2 ⇒∗ w. In either case w ∈ L(G).

If w ∈ L(G), then either S ⇒ S1 ⇒∗ w or S ⇒ S2 ⇒∗ w. Because V1 ∩ V2 = ∅, a derivation from S1 uses only rules in R1 and a derivation from S2 uses only rules in R2, so w ∈ A or w ∈ B. Thus w ∈ A ∪ B.

18 / 22

slide-63
SLIDE 63

Concatenation

Proof.

Let G1 = (V1, Σ, R1, S1) generate A and G2 = (V2, Σ, R2, S2) generate B

Construct a new CFG G = (V, Σ, R, S) to generate A ◦ B where

    V = V1 ∪ V2 ∪ {S}
    R = R1 ∪ R2 ∪ {S → S1S2}

A similar argument shows why L(G) = A ◦ B

19 / 22

slide-65
SLIDE 65

Kleene star

Proof.

Let G1 = (V1, Σ, R1, S1) generate A.

Construct a new CFG G = (V, Σ, R, S) to generate A∗ where

    V = V1 ∪ {S}
    R = R1 ∪ {S → SS1 ∣ ε}
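The three closure constructions (union, concatenation, Kleene star) each add one new start variable and one or two rules, which a short sketch can make explicit. The representation is an assumption: a grammar is a pair `(rules, start)`, `rules` maps each variable to a list of right-hand sides, each right-hand side is a tuple of symbols, and the empty tuple stands for ε. The function names and the `new_start` parameter are hypothetical.

```python
def _check_fresh(new_start, *rule_sets):
    """The new start variable must not already be a variable of any grammar."""
    for rules in rule_sets:
        if new_start in rules:
            raise ValueError(f"{new_start} is already a variable")

def union(g1, g2, new_start="S"):
    """R = R1 ∪ R2 ∪ {S → S1 | S2}; assumes V1 ∩ V2 = ∅."""
    (r1, s1), (r2, s2) = g1, g2
    if set(r1) & set(r2):
        raise ValueError("rename variables so that V1 and V2 are disjoint")
    _check_fresh(new_start, r1, r2)
    return {**r1, **r2, new_start: [(s1,), (s2,)]}, new_start

def concat(g1, g2, new_start="S"):
    """R = R1 ∪ R2 ∪ {S → S1 S2}; assumes V1 ∩ V2 = ∅."""
    (r1, s1), (r2, s2) = g1, g2
    if set(r1) & set(r2):
        raise ValueError("rename variables so that V1 and V2 are disjoint")
    _check_fresh(new_start, r1, r2)
    return {**r1, **r2, new_start: [(s1, s2)]}, new_start

def star(g1, new_start="S"):
    """R = R1 ∪ {S → S S1 | ε}."""
    r1, s1 = g1
    _check_fresh(new_start, r1)
    return {**r1, new_start: [(new_start, s1), ()]}, new_start

# Example inputs: G1 generates {a^n b^n}, G2 generates c∗.
G1 = ({"S1": [("a", "S1", "b"), ()]}, "S1")
G2 = ({"S2": [("c", "S2"), ()]}, "S2")

print(union(G1, G2))
print(concat(G1, G2))
print(star(G1))
```

Each function leaves the input rule sets untouched and returns a new grammar, which keeps the constructions composable.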

20 / 22

slide-72
SLIDE 72

Regular languages are context-free

Theorem

Every regular language is context-free.

Proof.

We can use induction on the structure of regular expressions. Three base cases

  • ∅. S → S
  • ε. S → ε
  • t for t ∈ Σ. S → t

Three inductive cases.

  • R1R2
  • R1 ∣ R2
  • R1∗

By the inductive hypothesis, L(R1) and L(R2) are context-free and context-free languages are closed under concatenation, union, and star.

21 / 22

slide-73
SLIDE 73

Ambiguity

An inherently ambiguous context-free language is one for which every context-free grammar that generates it is ambiguous

{a^i b^j c^k ∣ i = j or j = k} is inherently ambiguous

22 / 22