SLIDE 1

Theory of Computer Science

C5. Context-free Languages: Normal Form and PDA

Gabriele Röger

University of Basel

April 1, 2020

SLIDE 2

C5. Context-free Languages: Normal Form and PDA

C5.1 Context-free Grammars and ε-Rules
C5.2 Chomsky Normal Form
C5.3 Push-Down Automata
C5.4 Summary

SLIDE 3

Overview

Automata & Formal Languages
◮ Languages & Grammars
◮ Regular Languages
◮ Context-free Languages
    ◮ ε-rules
    ◮ Chomsky Normal Form
    ◮ PDAs
    ◮ Pumping Lemma
    ◮ Closure Properties
    ◮ Decidability
◮ Context-sensitive & Type-0 Languages

SLIDE 4

C5.1 Context-free Grammars and ε-Rules

SLIDE 5

Repetition: Context-free Grammars

Definition (Context-free Grammar)
A context-free grammar is a 4-tuple ⟨Σ, V, P, S⟩ with
1. Σ a finite alphabet of terminal symbols,
2. V a finite set of variables (with V ∩ Σ = ∅),
3. P ⊆ (V × (V ∪ Σ)⁺) ∪ {⟨S, ε⟩} a finite set of rules,
4. if S → ε ∈ P, then all other rules are in V × ((V \ {S}) ∪ Σ)⁺, and
5. S ∈ V the start variable.

A rule X → ε is only allowed if X = S and S never occurs on a right-hand side.
With regular grammars, this restriction could be lifted. How about context-free grammars?

SLIDE 7

Reminder: Start Variable in Right-Hand Side of Rules

For every type-0 language L there is a grammar where the start variable does not occur on the right-hand side of any rule.

Theorem
For every grammar G = ⟨Σ, V, P, S⟩ there is a grammar G′ = ⟨Σ, V′, P′, S⟩ with rules P′ ⊆ (V′ ∪ Σ)⁺ × ((V′ \ {S}) ∪ Σ)∗ such that L(G) = L(G′).

In the proof we constructed a suitable grammar, where the rules in P′ were not fundamentally different from the rules in P:
◮ for rules from V × (V ∪ Σ)⁺, we only introduced additional rules from V′ × (V′ ∪ Σ)⁺, and
◮ for rules from V × {ε}, we only introduced rules from V′ × {ε},
where V′ = V ∪ {S′} for some new variable S′ ∉ V.

SLIDE 8

ε-Rules

Theorem
For every grammar G with rules P ⊆ V × (V ∪ Σ)∗ there is a context-free grammar G′ with L(G) = L(G′).

Proof.
Let G = ⟨Σ, V, P, S⟩ be a grammar with P ⊆ V × (V ∪ Σ)∗.
Let G′ = ⟨Σ, V′, P′, S⟩ be a grammar with L(G) = L(G′) and P′ ⊆ V′ × ((V′ \ {S}) ∪ Σ)∗.
Let Vε = {A ∈ V′ | A ⇒∗ ε in G′}. We can find this set Vε by first collecting all variables A with a rule A → ε ∈ P′ and then successively adding further variables B if there is a rule B → A1A2 . . . Ak ∈ P′ and the variables Ai are already in the set for all 1 ≤ i ≤ k. . . .
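The computation of Vε described in the proof can be sketched in Python. This is only an illustration: representing rules as (head, body) pairs with tuple bodies, and the empty tuple for ε, is an assumption of this sketch and not part of the slides.

```python
def nullable_variables(rules):
    """Return the set of variables that can derive the empty word.

    `rules` is a list of (head, body) pairs; `body` is a tuple of symbols,
    and the empty tuple () stands for ε (assumed representation).
    """
    # Start with all variables A that have a rule A → ε.
    nullable = {head for head, body in rules if body == ()}
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            # Add B if some rule B → A1...Ak has only nullable variables.
            if head not in nullable and all(sym in nullable for sym in body):
                nullable.add(head)
                changed = True
    return nullable


# Hypothetical example grammar.
rules = [
    ("S", ("A", "B")),
    ("A", ("a", "A")),
    ("A", ()),
    ("B", ("A", "A")),
    ("C", ("c",)),
]
print(sorted(nullable_variables(rules)))  # ['A', 'B', 'S']
```

Terminal symbols never enter the set because only rule heads are added, so a body containing a terminal can never satisfy the test.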

SLIDE 9

ε-Rules

Theorem
For every grammar G with rules P ⊆ V × (V ∪ Σ)∗ there is a context-free grammar G′ with L(G) = L(G′).

Proof (continued).
Let P′′ be the rule set that is constructed from P′ by
◮ adding rules that obviate the need for A → ε rules: for every existing rule B → w with B ∈ V′, w ∈ (V′ ∪ Σ)⁺, let Iε be the set of positions where w contains a variable A ∈ Vε. For every non-empty set I′ ⊆ Iε, add a new rule B → w′, where w′ is constructed from w by removing the variables at all positions in I′.
◮ removing all rules of the form A → ε (A ≠ S).
Then G′′ = ⟨Σ, V′, P′′, S⟩ is context-free and L(G) = L(G′′).
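A minimal sketch of the construction of P′′ in Python, under the assumption that rules are (head, body) pairs with tuple bodies, that () stands for ε, and that the set Vε has already been computed and is passed in as `nullable`. The function name and the example grammar are hypothetical.

```python
from itertools import combinations


def eliminate_epsilon_rules(rules, nullable, start):
    """Build P'' from P': add shortened rules, then drop A → ε for A ≠ S."""
    new_rules = set(rules)
    for head, body in rules:
        if not body:
            continue  # only rules B → w with non-empty w are expanded
        positions = [i for i, sym in enumerate(body) if sym in nullable]
        # Every non-empty subset I' of the nullable positions gives a new rule.
        for size in range(1, len(positions) + 1):
            for removed in combinations(positions, size):
                shortened = tuple(
                    sym for i, sym in enumerate(body) if i not in removed
                )
                new_rules.add((head, shortened))
    # Remove every ε-rule except a possible S → ε.
    return {(h, b) for h, b in new_rules if b or h == start}


rules = {("S", ("A", "b", "A")), ("A", ("a",)), ("A", ())}
result = eliminate_epsilon_rules(rules, nullable={"A"}, start="S")
for head, body in sorted(result):
    print(head, "→", " ".join(body))
```

For the example, the rule S → AbA yields the additional rules S → bA, S → Ab and S → b, and A → ε disappears.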

SLIDE 10

C5.2 Chomsky Normal Form

SLIDE 12

Chomsky Normal Form: Motivation

As in logical formulas (and other kinds of structured objects), normal forms for grammars are useful:
◮ they show which aspects are critical for defining grammars and which ones are just syntactic sugar
◮ they allow proofs and algorithms to be restricted to a limited set of grammars (inputs): those in normal form
Hence we now consider a normal form for context-free grammars.

SLIDE 13

Chomsky Normal Form: Definition

Definition (Chomsky Normal Form)
A context-free grammar G is in Chomsky normal form (CNF) if all rules have one of the following three forms:
◮ A → BC with variables A, B, C, or
◮ A → a with variable A, terminal symbol a, or
◮ S → ε with start variable S.

German: Chomsky-Normalform

in short: rule set P ⊆ (V × (VV ∪ Σ)) ∪ {⟨S, ε⟩}
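The three rule shapes translate directly into a membership test. The following Python sketch assumes rules are given as (head, body) pairs with tuple bodies and () for ε; this representation and the function name are assumptions for illustration only.

```python
def is_cnf(rules, variables, terminals, start):
    """Check whether every rule has one of the three CNF shapes."""
    for head, body in rules:
        if head not in variables:
            return False
        if len(body) == 2 and all(sym in variables for sym in body):
            continue  # A → BC
        if len(body) == 1 and body[0] in terminals:
            continue  # A → a
        if body == () and head == start:
            continue  # S → ε
        return False
    return True


variables, terminals = {"S", "A", "B"}, {"a", "b"}
good = [("S", ("A", "B")), ("A", ("a",)), ("B", ("b",)), ("S", ())]
bad = [("S", ("a", "B"))]  # terminal inside a rule of length 2
print(is_cnf(good, variables, terminals, "S"))  # True
print(is_cnf(bad, variables, terminals, "S"))   # False
```

The sketch checks only the shape of the rules; it does not check the side condition from the definition of context-free grammars that S must not occur on a right-hand side when S → ε is present.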

SLIDE 14

Chomsky Normal Form: Theorem

Theorem
For every context-free grammar G there is a context-free grammar G′ in Chomsky normal form with L(G) = L(G′).

Proof.
The following algorithm converts the rule set of G into CNF:

Step 1: Eliminate rules of the form A → B with variables A, B.
If there are sets of variables {B1, . . . , Bk} with rules B1 → B2, B2 → B3, . . . , Bk−1 → Bk, Bk → B1, then replace these variables by a new variable B.
Define a strict total order < on the variables such that A → B ∈ P implies that A < B. Iterate from the largest to the smallest variable A and eliminate all rules of the form A → B while adding rules A → w for every rule B → w with w ∈ (V ∪ Σ)⁺. . . .
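The effect of Step 1 can be illustrated with a short Python sketch. Note that it does not follow the order-based procedure from the proof: it uses the common closure-based variant instead, which first computes for each variable the variables reachable through rules A → B and therefore handles cycles without merging variables. The rule representation as (head, body) pairs and the assumption that ε-rules were already dealt with are choices of this sketch.

```python
def eliminate_unit_rules(rules, variables):
    """Remove rules A → B (B a variable) without changing the language."""
    def is_unit(body):
        return len(body) == 1 and body[0] in variables

    # reach[A] = all variables reachable from A using only unit rules.
    reach = {a: {a} for a in variables}
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if is_unit(body):
                for a in variables:
                    if head in reach[a] and body[0] not in reach[a]:
                        reach[a].add(body[0])
                        changed = True

    # A inherits every non-unit rule of each variable it can reach.
    return {
        (a, body)
        for a in variables
        for head, body in rules
        if head in reach[a] and not is_unit(body)
    }


rules = {
    ("S", ("A",)),
    ("A", ("B",)),
    ("B", ("A",)),
    ("A", ("a",)),
    ("B", ("b", "B")),
}
for head, body in sorted(eliminate_unit_rules(rules, {"S", "A", "B"})):
    print(head, "→", " ".join(body))
```

In the example, A and B lie on a cycle of unit rules, so S, A and B all end up with the rules → a and → bB.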

SLIDE 15

Chomsky Normal Form: Theorem

Theorem
For every context-free grammar G there is a context-free grammar G′ in Chomsky normal form with L(G) = L(G′).

Proof (continued).
Step 2: Eliminate rules with terminal symbols on the right-hand side that do not have the form A → a.
For every terminal symbol a ∈ Σ add a new variable Aa and the rule Aa → a.
Replace all terminal symbols in all rules that do not have the form A → a with the corresponding newly added variables. . . .
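Step 2 as a Python sketch, assuming rules are (head, body) pairs with tuple bodies. The naming scheme "A_" + a for the new variables is hypothetical and assumes that no existing variable already has such a name.

```python
def replace_terminals(rules, terminals):
    """Keep rules A → a; elsewhere replace each terminal a by the variable A_a."""
    def var_for(a):
        return "A_" + a  # hypothetical name of the new variable for terminal a

    # One new rule A_a → a per terminal symbol.
    new_rules = {(var_for(a), (a,)) for a in terminals}
    for head, body in rules:
        if len(body) == 1 and body[0] in terminals:
            new_rules.add((head, body))  # already of the form A → a
        else:
            new_rules.add(
                (head, tuple(var_for(s) if s in terminals else s for s in body))
            )
    return new_rules


rules = {("S", ("a", "S", "b")), ("S", ("c",))}
for head, body in sorted(replace_terminals(rules, {"a", "b", "c"})):
    print(head, "→", " ".join(body))
```

After this step, every right-hand side is either a single terminal symbol or a word over variables only.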

SLIDE 16

Chomsky Normal Form: Theorem

Theorem
For every context-free grammar G there is a context-free grammar G′ in Chomsky normal form with L(G) = L(G′).

Proof (continued).
Step 3: Eliminate rules of the form A → B1B2 . . . Bk with k > 2.
For every rule of the form A → B1B2 . . . Bk with k > 2, add new variables C2, . . . , Ck−1 and replace the rule with
A → B1C2
C2 → B2C3
. . .
Ck−1 → Bk−1Bk
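Step 3 can be sketched in Python as follows. Rules are assumed to be (head, body) pairs with tuple bodies. The fresh variable names C1, C2, . . . come from a single global counter, which differs slightly from the per-rule numbering C2, . . . , Ck−1 used in the proof, and they are assumed not to clash with existing variables.

```python
def binarize(rules):
    """Split every rule A → B1 ... Bk with k > 2 into a chain of binary rules."""
    new_rules = []
    fresh = 0
    for head, body in rules:
        current = head
        rest = list(body)
        while len(rest) > 2:
            fresh += 1
            new_var = f"C{fresh}"  # hypothetical fresh variable name
            new_rules.append((current, (rest[0], new_var)))
            current, rest = new_var, rest[1:]
        # The remaining body has length at most 2 and is kept as it is.
        new_rules.append((current, tuple(rest)))
    return new_rules


for head, body in binarize([("A", ("B1", "B2", "B3", "B4")), ("X", ("Y", "Z"))]):
    print(head, "→", " ".join(body))
# A → B1 C1
# C1 → B2 C2
# C2 → B3 B4
# X → Y Z
```

A rule with k symbols on the right-hand side is replaced by k − 1 binary rules and needs k − 2 new variables.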

SLIDE 17

Chomsky Normal Form: Length of Derivations

Observation
Let G be a grammar in Chomsky normal form, and let w ∈ L(G) be a non-empty word generated by G. Then all derivations of w have exactly 2|w| − 1 derivation steps.

Proof.
Exercises
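One possible way to see where the number 2|w| − 1 comes from is a counting argument. It is given here only as a sketch and is not the solution from the exercises; the symbols n, p and t are introduced for this sketch.

```latex
% Let n = |w| >= 1. Consider a derivation S =>* w that applies
% p rules of the form A -> BC and t rules of the form A -> a.
% The rule S -> epsilon cannot be used because w is non-empty.
\begin{align*}
  t &= n
    && \text{each of the } n \text{ terminals of } w \text{ is produced by exactly one rule } A \to a,\\
  1 + p &= n
    && \text{each rule } A \to BC \text{ lengthens the derived word by one symbol, starting from } S,\\
  p + t &= (n - 1) + n = 2n - 1 = 2|w| - 1.
\end{align*}
```

For example, the word ab needs one rule A → BC and two rules of the form A → a, that is 3 = 2 · 2 − 1 steps.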

SLIDE 18

C5.3 Push-Down Automata

SLIDE 20

Limitations of Finite Automata

[Figure: finite automaton with states q0, q1, q2 and a transition labeled 0,1]

◮ Language L is regular. ⇔ There is a finite automaton that accepts L.
◮ What information can a finite automaton “store” about the already read part of the word?
◮ Infinite memory would be required for L = {x1x2 . . . xnxn . . . x2x1 | n > 0, xi ∈ {a, b}}.
◮ therefore: extension of the automata model with memory

SLIDE 21

Stack

A stack is a data structure following the last-in-first-out (LIFO) principle. It supports the following operations:
◮ push: puts an object on top of the stack
◮ pop: removes the object at the top of the stack
◮ peek: returns the top object without removing it

German: Keller, Stapel
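As a small illustration, the three stack operations can be tried out with a Python list. Using the end of the list as the top of the stack is a convention chosen for this sketch.

```python
stack = []             # an empty stack; the end of the list is the top

stack.append("A")      # push A
stack.append("B")      # push B
top = stack[-1]        # peek: B is on top, the stack is unchanged
popped = stack.pop()   # pop: removes and returns B
print(top, popped, stack)  # B B ['A']
```

The object pushed last (B) is the first one to be removed, which is exactly the LIFO principle.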

SLIDE 22

Push-down Automata: Visually

[Figure: a push-down automaton with a read head on an input tape (containing the word "Input") and with stack access to a stack]

German: Kellerautomat, Eingabeband, Lesekopf, Kellerzugriff

SLIDE 23

Push-down Automata: Definition

Definition (Push-down Automaton)
A push-down automaton (PDA) is a 6-tuple M = ⟨Q, Σ, Γ, δ, q0, #⟩ with
◮ Q finite set of states
◮ Σ the input alphabet
◮ Γ the stack alphabet
◮ δ : Q × (Σ ∪ {ε}) × Γ → Pf(Q × Γ∗) the transition function (where Pf(X) denotes the set of all finite subsets of X)
◮ q0 ∈ Q the start state
◮ # ∈ Γ the bottommost stack symbol

German: Kellerautomat, Eingabealphabet, Kelleralphabet, Überführungsfunktion

SLIDE 24

Push-down Automata: Transition Function

Let M = ⟨Q, Σ, Γ, δ, q0, #⟩ be a push-down automaton.

What is the intuitive meaning of the transition function δ?
◮ ⟨q′, B1 . . . Bk⟩ ∈ δ(q, a, A): If M is in state q, reads symbol a and has A as the topmost stack symbol, then M can transition to q′ in the next step while replacing A with B1 . . . Bk (afterwards B1 is the topmost stack symbol)

[Figure: transition from state q to state q′ labeled a, A → B1 . . . Bk]

◮ special case a = ε is allowed (spontaneous transition)

SLIDE 25

Push-down Automata: Example

[Figure: state diagram of M. Loop at q: a, A → AA; a, B → AB; a, # → A#; b, A → BA; b, B → BB; b, # → B#. Edge from q to q′: a, A → ε; b, B → ε. Loop at q′: a, A → ε; b, B → ε; ε, # → ε.]

M = ⟨{q, q′}, {a, b}, {A, B, #}, δ, q, #⟩ with

δ(q, a, A) = {⟨q, AA⟩, ⟨q′, ε⟩}    δ(q, b, A) = {⟨q, BA⟩}             δ(q, ε, A) = ∅
δ(q, a, B) = {⟨q, AB⟩}             δ(q, b, B) = {⟨q, BB⟩, ⟨q′, ε⟩}    δ(q, ε, B) = ∅
δ(q, a, #) = {⟨q, A#⟩}             δ(q, b, #) = {⟨q, B#⟩}             δ(q, ε, #) = ∅
δ(q′, a, A) = {⟨q′, ε⟩}            δ(q′, b, A) = ∅                    δ(q′, ε, A) = ∅
δ(q′, a, B) = ∅                    δ(q′, b, B) = {⟨q′, ε⟩}            δ(q′, ε, B) = ∅
δ(q′, a, #) = ∅                    δ(q′, b, #) = ∅                    δ(q′, ε, #) = {⟨q′, ε⟩}

SLIDE 26

Push-down Automata: Configuration

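Definition (Configuration of a Push-down Automaton)
A configuration of a push-down automaton M = ⟨Q, Σ, Γ, δ, q0, #⟩ is given by a triple c ∈ Q × Σ∗ × Γ∗: the current state, the part of the input that has not been read yet, and the stack content with the topmost symbol first.

German: Konfiguration

Example
[Figure: a PDA in state q reading the input tape "Input"]
Configuration ⟨q, ut, BAC#⟩.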

SLIDE 27

Push-down Automata: Steps

Definition (Transition/Step of a Push-down Automaton)
We write c ⊢M c′ if a push-down automaton M = ⟨Q, Σ, Γ, δ, q0, #⟩ can transition from configuration c to configuration c′ in one step. Exactly the following transitions are possible:

⟨q, a1 . . . an, A1 . . . Am⟩ ⊢M
    ⟨q′, a2 . . . an, B1 . . . BkA2 . . . Am⟩      if ⟨q′, B1 . . . Bk⟩ ∈ δ(q, a1, A1)
    ⟨q′, a1a2 . . . an, B1 . . . BkA2 . . . Am⟩    if ⟨q′, B1 . . . Bk⟩ ∈ δ(q, ε, A1)

German: Übergang

If M is clear from context, we only write c ⊢ c′.
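The step relation can be sketched as a Python function that returns all successor configurations. The encoding is an assumption of this sketch: configurations are triples of strings, the first character of the stack string is the topmost symbol, the empty string stands for ε, and δ is a dictionary in which missing keys mean the empty set. The state name p in the example is hypothetical.

```python
def successors(delta, config):
    """Return all configurations c' with config ⊢ c'.

    `delta` maps (state, symbol_or_empty_string, top) to a set of
    (new_state, pushed_string) pairs (assumed encoding).
    """
    state, word, stack = config
    if not stack:
        return set()  # no topmost symbol, so no transition applies
    top, below = stack[0], stack[1:]
    result = set()
    if word:
        # Transitions that read the next input symbol a1.
        for new_state, pushed in delta.get((state, word[0], top), ()):
            result.add((new_state, word[1:], pushed + below))
    # Spontaneous (ε) transitions leave the input untouched.
    for new_state, pushed in delta.get((state, "", top), ()):
        result.add((new_state, word, pushed + below))
    return result


delta = {
    ("q", "a", "#"): {("q", "A#")},
    ("q", "a", "A"): {("q", "AA"), ("p", "")},
}
print(sorted(successors(delta, ("q", "ab", "A#"))))
# [('p', 'b', '#'), ('q', 'b', 'AA#')]
```

Because δ yields a set of pairs, a configuration can have several successors, which reflects the nondeterminism of the automaton.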

SLIDE 28

Push-down Automata: Reachability of Configurations

Definition (Reachable Configuration)
Configuration c′ is reachable from configuration c in PDA M (written c ⊢∗M c′) if there are configurations c0, . . . , cn (n ≥ 0) where
◮ c0 = c,
◮ ci ⊢M ci+1 for all i ∈ {0, . . . , n − 1}, and
◮ cn = c′.

German: c′ ist in M von c erreichbar

SLIDE 29

Push-down Automata: Recognized Words

Definition (Recognized Word of a Push-down Automaton)
PDA M = ⟨Q, Σ, Γ, δ, q0, #⟩ recognizes the word w = a1 . . . an iff the configuration ⟨q, ε, ε⟩ (word processed and stack empty) for some q ∈ Q is reachable from the start configuration ⟨q0, w, #⟩.

M recognizes w iff ⟨q0, w, #⟩ ⊢∗M ⟨q, ε, ε⟩ for some q ∈ Q.

German: M erkennt w, Startkonfiguration
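The definition can be turned into a small search over reachable configurations. The following Python sketch encodes the transition function of the example PDA from the slide "Push-down Automata: Example"; writing p for the state q′, using the empty string for ε and storing δ as a dictionary are assumptions of this sketch. The search is only guaranteed to terminate if finitely many configurations are reachable, which holds here because the only ε-transition removes a stack symbol.

```python
from collections import deque

# Example PDA: "" plays the role of ε, "p" stands for q′,
# and missing keys mean the empty set.
DELTA = {
    ("q", "a", "A"): {("q", "AA"), ("p", "")},
    ("q", "b", "A"): {("q", "BA")},
    ("q", "a", "B"): {("q", "AB")},
    ("q", "b", "B"): {("q", "BB"), ("p", "")},
    ("q", "a", "#"): {("q", "A#")},
    ("q", "b", "#"): {("q", "B#")},
    ("p", "a", "A"): {("p", "")},
    ("p", "b", "B"): {("p", "")},
    ("p", "", "#"): {("p", "")},
}


def recognizes(delta, start_state, word, bottom="#"):
    """Breadth-first search from the start configuration (q0, w, #)."""
    start = (start_state, word, bottom)
    seen = {start}
    queue = deque([start])
    while queue:
        state, rest, stack = queue.popleft()
        if not rest and not stack:
            return True  # word processed and stack empty
        if not stack:
            continue  # empty stack but input left: dead end
        top, below = stack[0], stack[1:]
        moves = []
        if rest:
            for new_state, pushed in delta.get((state, rest[0], top), ()):
                moves.append((new_state, rest[1:], pushed + below))
        for new_state, pushed in delta.get((state, "", top), ()):
            moves.append((new_state, rest, pushed + below))
        for config in moves:
            if config not in seen:
                seen.add(config)
                queue.append(config)
    return False


print(recognizes(DELTA, "q", "bbabbabb"))  # True
print(recognizes(DELTA, "q", "bbab"))      # False
```

The accepting run for bbabbabb pushes B, B, A, B while reading bbab, switches to p on the fifth symbol, pops one stack symbol per remaining input symbol and finally removes # with the ε-transition.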

SLIDE 30

Push-down Automata: Recognized Word Example

[Figure: state diagram of the example PDA M from the slide "Push-down Automata: Example"]

Example: this PDA recognizes bbabbabb (shown on the blackboard).

SLIDE 31

Push-down Automata: Accepted Language

Definition (Accepted Language of a Push-down Automaton)
Let M be a push-down automaton with input alphabet Σ. The language accepted by M is defined as
L(M) = {w ∈ Σ∗ | M recognizes w}.

Example: blackboard

SLIDE 32

PDAs Accept Exactly the Context-free Languages

Theorem
A language L is context-free if and only if L is accepted by a push-down automaton.

SLIDE 33

C5.4 Summary

SLIDE 34

Summary

◮ Every context-free language has a grammar in Chomsky normal form. All rules have one of the forms
    ◮ A → BC with variables A, B, C, or
    ◮ A → a with variable A, terminal symbol a, or
    ◮ S → ε with start variable S.
◮ Push-down automata (PDAs) extend NFAs with memory.
◮ PDAs accept not with end states but with an empty stack.
◮ The languages accepted by PDAs are exactly the context-free languages.
