
COMP3630/6360: Theory of Computation
Semester 1, 2020
The Australian National University
Normal Forms and Closure Properties [Slide 1 / 33]

This lecture covers Chapter 7 of HMU: Properties of CFLs
- Chomsky Normal Form
- Pumping Lemma for CFLs
- Closure Properties of CFLs
- Decision Properties of CFLs
Additional Reading: Chapter 7 of HMU.

Chomsky Normal Form (CNF) for CFG [Slide 3 / 33]

Chomsky Normal Form (CNF) for CFG: Chomsky Normal Forms [Slide 4 / 33]
- A normal or canonical form (be it in algebra, matrices, or languages) is a standardized way of presenting an object (in this case, a language).
- A normal form for CFGs prescribes a structure for the grammar without compromising its power to define all context-free languages.
- Every non-empty context-free language $L$ with $\epsilon \notin L$ has a Chomsky Normal Form grammar $G = (V, T, P, S)$ in which every production rule has one of the forms:
  - $A \to BC$ for $A, B, C \in V$, or
  - $A \to a$ for $A \in V$ and $a \in T$.
- CNF disallows:
  - $A \to \epsilon$ [$\epsilon$-productions].
  - $A \to B$ for $A, B \in V$ [unit productions].
  - $A \to B_1 \cdots B_k$ with $A \in V$, $B_i \in V \cup T$ and $k \ge 2$, unless $k = 2$ and $B_1, B_2 \in V$ [complex productions].
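The two permitted rule shapes can be checked mechanically. The following is a minimal sketch, not part of the lecture: it assumes a grammar is stored as a dictionary mapping each head to a list of bodies, with each body a tuple of symbols and the empty tuple standing for $\epsilon$. The function name `is_cnf` and the two sample grammars are illustrative assumptions.

```python
def is_cnf(variables, terminals, productions):
    """Return True if every production is of the form A -> BC or A -> a."""
    for head, bodies in productions.items():
        if head not in variables:
            return False
        for body in bodies:
            if len(body) == 1 and body[0] in terminals:
                continue  # A -> a
            if len(body) == 2 and all(s in variables for s in body):
                continue  # A -> BC
            return False  # epsilon, unit, or complex production
    return True


# Hypothetical sample grammars, for illustration only.
V = {"S", "A", "B"}
T = {"a", "b"}
cnf_grammar = {"S": [("A", "B")], "A": [("a",)], "B": [("b",)]}
# Violations: a terminal in a body of length 2, an epsilon-production,
# and a unit production.
non_cnf_grammar = {"S": [("A", "b")], "A": [("a",), ()], "B": [("A",)]}

print(is_cnf(V, T, cnf_grammar))      # True
print(is_cnf(V, T, non_cnf_grammar))  # False
```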

Chomsky Normal Form (CNF) for CFG: Towards CNF [Step 1: Remove $\epsilon$-Productions] [Slide 5 / 33]
- $\epsilon$-production: $A \to \epsilon$ for some $A \in V$.
- Call a variable $A \in V$ nullable if $A \Rightarrow_G^* \epsilon$.
- We can identify nullable variables as follows:
  - Basis: $A \in V$ is nullable if $A \to \epsilon$ is a production rule in $P$.
  - Induction: $B \in V$ is nullable if $B \to A_1 \cdots A_k$ is in $P$ and each $A_i$ is nullable.
Procedure to Eliminate $\epsilon$-Productions
- Given $G = (V, T, P, S)$, define $G_{\text{no-}\epsilon} = (V, T, P_{\text{no-}\epsilon}, S)$ as follows:
  1. Start with $P_{\text{no-}\epsilon} = P$.
  2. Find all nullable variables of $G$.
  3. For each production rule in $P$: if the body contains $k > 0$ occurrences of nullable variables, add to $P_{\text{no-}\epsilon}$ the $2^k$ productions obtained by choosing a subset of those occurrences and replacing each chosen one by $\epsilon$.
  4. Delete any production in $P_{\text{no-}\epsilon}$ of the form $Y \to \epsilon$ for any $Y \in V$.
- For example, suppose that in a given grammar $B$ and $D$ are nullable and $C$ is not. If $A \to BCD$ is a rule in $P$, then $A \to BCD \mid CD \mid BC \mid C$ are rules in $P_{\text{no-}\epsilon}$. Similarly, if $A \to BD$ is a rule in $P$, then $A \to BD \mid B \mid D$ are rules in $P_{\text{no-}\epsilon}$.

Chomsky Normal Form (CNF) for CFG: Towards CNF [Step 1: Remove $\epsilon$-Productions] [Slide 6 / 33]
An Example
Suppose $G = (\{A, B, C\}, \{0, 1\}, P, A)$ with $P$: $A \to BC$; $B \to 0B \mid \epsilon$; $C \to C11 \mid \epsilon$.
- $B$ and $C$ are nullable since $B \to \epsilon$ and $C \to \epsilon$. Then $A$ is also nullable.
- Define $G_{\text{no-}\epsilon} = (\{A, B, C\}, \{0, 1\}, P_{\text{no-}\epsilon}, A)$ with $P_{\text{no-}\epsilon}$ containing the following rules (the alternatives $A \to \epsilon$, $B \to \epsilon$ and $C \to \epsilon$ are struck out, since they are deleted in the final step):
  - $A \to BC \mid B \mid C$
  - $B \to 0B \mid 0$
  - $C \to C11 \mid 11$
Theorem 7.1.1 The induction procedure on Slide 5 identifies all nullable variables.
Theorem 7.1.2 $L(G_{\text{no-}\epsilon}) = L(G) \setminus \{\epsilon\}$. (Proof in the Additional Proofs section at the end.)
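Step 1 can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the lecture's own code: productions are stored as a dictionary from each head to a list of bodies, each body is a tuple of symbols, and the empty tuple stands for $\epsilon$. The function names `nullable_variables` and `remove_epsilon_productions` are hypothetical.

```python
from itertools import product


def nullable_variables(productions):
    """Apply the basis and induction rules until no new variable is found."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head in nullable:
                continue
            # all() over an empty body is True, which covers the basis A -> epsilon.
            if any(all(s in nullable for s in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


def remove_epsilon_productions(productions):
    """Build P_no-eps: add every keep/drop variant, then discard Y -> epsilon."""
    nullable = nullable_variables(productions)
    result = {}
    for head, bodies in productions.items():
        new_bodies = set()
        for body in bodies:
            # A nullable occurrence may be kept or dropped; other symbols are kept.
            options = [((s,), ()) if s in nullable else ((s,),) for s in body]
            for choice in product(*options):
                new_body = tuple(s for part in choice for s in part)
                if new_body:  # drop epsilon-productions
                    new_bodies.add(new_body)
        result[head] = new_bodies
    return result


# The example grammar from the slide: A -> BC; B -> 0B | eps; C -> C11 | eps.
P = {
    "A": [("B", "C")],
    "B": [("0", "B"), ()],
    "C": [("C", "1", "1"), ()],
}
P_no_eps = remove_epsilon_productions(P)
for head in sorted(P_no_eps):
    print(head, "->", sorted(P_no_eps[head]))
```

On the slide's example this should yield the rules $A \to BC \mid B \mid C$, $B \to 0B \mid 0$ and $C \to C11 \mid 11$.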

Chomsky Normal Form (CNF) for CFG: Towards CNF [Step 2: Remove Unit Productions] [Slide 7 / 33]
- Given a grammar $G$ and variables $A, B \in V$, we say $(A, B)$ form a unit pair if $A \Rightarrow_G^* B$ using unit productions alone.
- We can identify unit pairs as follows:
  - Basis: For each $A \in V$, $(A, A)$ is a unit pair (since $A \Rightarrow_G^* A$).
  - Induction: If $(A, B)$ is a unit pair and $B \to C$ is a production in $P$ with $C \in V$, then $(A, C)$ is a unit pair.
- Note: If $A \to BC$ and $C \to \epsilon$ are productions, then $A \Rightarrow_G^* B$, but $(A, B)$ is not a unit pair.
Procedure to Eliminate Unit Productions
- Given $G = (V, T, P, S)$, define $G_{\text{no-unit}} = (V, T, P_{\text{no-unit}}, S)$ as follows:
  1. Start with $P_{\text{no-unit}} = P$. Find all unit pairs of $G$.
  2. For every unit pair $(A, B)$ and non-unit production rule $B \to \alpha$, add the rule $A \to \alpha$ to $P_{\text{no-unit}}$.
  3. Delete all unit production rules in $P_{\text{no-unit}}$.

Chomsky Normal Form (CNF) for CFG: Towards CNF [Step 2: Remove Unit Productions] [Slide 8 / 33]
An Example
Suppose $G = (\{A, B, C, D\}, \{a, b\}, P, A)$ with $P$: $A \to B \mid aC$; $B \to A \mid bD$; $C \to aC \mid \epsilon$; $D \to bD \mid \epsilon$.
- $(A, B)$ and $(B, A)$ are the only two non-trivial unit pairs.
- Define $G_{\text{no-unit}} = (\{A, B, C, D\}, \{a, b\}, P_{\text{no-unit}}, A)$ with $P_{\text{no-unit}}$ containing the following rules (the unit productions $A \to B$ and $B \to A$ are struck out):
  - $A \to aC \mid bD$
  - $B \to bD \mid aC$
  - $C \to aC \mid \epsilon$
  - $D \to bD \mid \epsilon$
- Note: Rules with $B$ as the head can never be used, since $B$ is no longer reachable from the start symbol $A$.
Theorem 7.1.3 The induction procedure on Slide 7 identifies all unit pairs.
Theorem 7.1.4 $L(G_{\text{no-unit}}) = L(G)$. (Outline of the proof is given in the Additional Proofs section at the end.)
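Step 2 can be sketched in the same style. As before, this is only an illustration: the dictionary-of-tuples grammar encoding and the function names `unit_pairs` and `remove_unit_productions` are assumptions, not part of the lecture. Because $(A, A)$ is always a unit pair, collecting the non-unit bodies of $B$ for every unit pair $(A, B)$ gives the same rule set as starting from $P$, adding rules, and deleting the unit productions.

```python
def unit_pairs(variables, productions):
    """Compute all unit pairs by iterating the basis and induction rules."""
    pairs = {(a, a) for a in variables}  # basis: (A, A) for every variable
    changed = True
    while changed:
        changed = False
        for a, b in list(pairs):
            for body in productions.get(b, []):
                # Induction: (A, B) is a unit pair and B -> C is a unit production.
                if len(body) == 1 and body[0] in variables:
                    if (a, body[0]) not in pairs:
                        pairs.add((a, body[0]))
                        changed = True
    return pairs


def remove_unit_productions(variables, productions):
    """For each unit pair (A, B), give A every non-unit body of B."""
    result = {a: set() for a in variables}
    for a, b in unit_pairs(variables, productions):
        for body in productions.get(b, []):
            is_unit = len(body) == 1 and body[0] in variables
            if not is_unit:
                result[a].add(body)
    return result


# The example grammar from the slide; the empty tuple stands for epsilon.
V = {"A", "B", "C", "D"}
P = {
    "A": [("B",), ("a", "C")],
    "B": [("A",), ("b", "D")],
    "C": [("a", "C"), ()],
    "D": [("b", "D"), ()],
}
P_no_unit = remove_unit_productions(V, P)
for head in sorted(P_no_unit):
    print(head, "->", sorted(P_no_unit[head]))
```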

Chomsky Normal Form (CNF) for CFG: Towards CNF [Step 3: Remove Useless Variables] [Slide 9 / 33]
- A symbol $X \in V \cup T$ is said to be
  - generating if $X \Rightarrow_G^* w$ for some $w \in T^*$;
  - reachable if $S \Rightarrow_G^* \alpha X \beta$ for some $\alpha, \beta \in (V \cup T)^*$; and
  - useful if $S \Rightarrow_G^* \alpha X \beta \Rightarrow_G^* w$ for some $w \in T^*$ and $\alpha, \beta \in (V \cup T)^*$.
  (Useful $\Rightarrow$ reachable and generating, but not necessarily vice versa!)
- Given a grammar $G$, we can identify generating symbols as follows:
  - Basis: For each $s \in T$, $s \Rightarrow_G^* s$. So $s$ is generating.
  - Induction: If $A \to \alpha$ and every symbol of $\alpha$ is generating, so is $A$.
- Given a grammar $G$, we can identify reachable symbols as follows:
  - Basis: $S \Rightarrow_G^* S$, so $S$ is reachable.
  - Induction: If $A \to \alpha$ and $A$ is reachable, so is every symbol of $\alpha$.

Chomsky Normal Form (CNF) for CFG: Towards CNF [Step 3: Remove Useless Variables] [Slide 10 / 33]
Procedure to Eliminate Useless Variables
- Given $G = (V, T, P, S)$, define $G_G = (V_G, T, P_G, S)$ as follows:
  - Find all generating symbols of $G$.
  - $V_G$ is the set of all generating variables.
  - $P_G$ is the set of production rules involving only generating symbols.
- Now, define $G_{GR} = (V_{GR}, T_{GR}, P_{GR}, S)$ as follows:
  - Find all reachable symbols of $G_G$.
  - $V_{GR}$ is the set of all reachable variables, and $T_{GR}$ is the set of all reachable terminals.
  - $P_{GR}$ is the set of production rules involving only reachable symbols.
The Order of Eliminating Variables is Important!
- Consider $G = (\{A, B, S\}, \{0, 1\}, P, S)$ with $P$: $S \to AB \mid 0$; $A \to 1A$; $B \to 1$.
- $A$ is not generating. Removing $A$ and the rules $S \to AB$ and $A \to 1A$ results in $B$ being unreachable. Removing $B$ and $B \to 1$ yields $G_{GR} = (\{S\}, \{0\}, \{S \to 0\}, S)$.
- Reversing the order, we first see that all symbols are reachable; then removing the non-generating symbol $A$ and the production rules $S \to AB$ and $A \to 1A$ yields $G_{RG} = (\{B, S\}, \{0, 1\}, \{S \to 0, B \to 1\}, S)$. But $B$ is unreachable now!
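The two-pass procedure, and the effect of running the passes in the wrong order, can be sketched as follows. The grammar encoding (a dictionary from heads to lists of tuple bodies) and all function names here are illustrative assumptions rather than the lecture's notation.

```python
def generating_symbols(terminals, productions):
    """Basis: every terminal. Induction: A if some body of A is all-generating."""
    generating = set(terminals)
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in generating and any(
                all(s in generating for s in body) for body in bodies
            ):
                generating.add(head)
                changed = True
    return generating


def reachable_symbols(start, productions):
    """Basis: the start symbol. Induction: every symbol in a reachable head's body."""
    reachable = {start}
    frontier = [start]
    while frontier:
        head = frontier.pop()
        for body in productions.get(head, []):
            for s in body:
                if s not in reachable:
                    reachable.add(s)
                    frontier.append(s)
    return reachable


def restrict(productions, keep):
    """Keep only the rules whose head and body symbols all lie in `keep`."""
    return {
        head: [b for b in bodies if all(s in keep for s in b)]
        for head, bodies in productions.items()
        if head in keep
    }


def remove_useless(terminals, productions, start):
    """Generating pass first, then reachable pass on the result."""
    p_g = restrict(productions, generating_symbols(terminals, productions))
    return restrict(p_g, reachable_symbols(start, p_g))


# The example from the slide: S -> AB | 0; A -> 1A; B -> 1.
T = {"0", "1"}
P = {"S": [("A", "B"), ("0",)], "A": [("1", "A")], "B": [("1",)]}

P_GR = remove_useless(T, P, "S")

# Wrong order: reachable pass first, then generating pass.
p_r = restrict(P, reachable_symbols("S", P))
P_RG = restrict(p_r, generating_symbols(T, p_r))

print(P_GR)  # only S -> 0 survives
print(P_RG)  # B -> 1 survives although B is no longer reachable
```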

Chomsky Normal Form (CNF) for CFG: Towards CNF [Step 3: Remove Useless Variables] [Slide 11 / 33]
Theorem 7.1.5 The induction procedure on Slide 9 identifies all generating variables.
Theorem 7.1.6 The induction procedure on Slide 9 identifies all reachable variables.
Theorem 7.1.7 (1) $L(G) = L(G_{GR})$; and (2) every symbol in $G_{GR}$ is useful. (Proof in the Additional Proofs section at the end.)

Chomsky Normal Form (CNF) for CFG: Towards CNF [Step 4: Remove Complex Productions] [Slide 12 / 33]
Procedure to Eliminate Complex Productions
- Given $G = (V, T, P, S)$, define $\hat{G} = (\hat{V}, T, \hat{P}, S)$ as follows:
  - Start with $\hat{G} = G$ and perform the following operations.
  - For every terminal $a \in T$ that appears in a body of length 2 or more, introduce a new variable $A_a$ and a new production rule $A_a \to a$.
  - Replace all occurrences of such terminals in bodies of length 2 or more by the introduced variables.
  - Replace every rule $A \to B_1 \cdots B_k$ with $k > 2$ by introducing $k - 2$ new variables $D_1, \ldots, D_{k-2}$ and replacing the rule by the following $k - 1$ rules:
    $A \to B_1 D_1$, $D_1 \to B_2 D_2$, $D_2 \to B_3 D_3$, $\ldots$, $D_{k-3} \to B_{k-2} D_{k-2}$, $D_{k-2} \to B_{k-1} B_k$.
  - Note: Each introduced variable appears in the head exactly once.
Theorem 7.1.8 $L(G) = L(\hat{G})$. (Outline of the proof is given in the Additional Proofs section at the end.)
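Step 4 can be sketched as below. This is a hedged illustration, not the lecture's implementation: the grammar encoding, the function name `remove_complex_productions`, and the naming scheme for new variables (`T_a` for the variable standing in for terminal `a`, `D_1, D_2, ...` for the chain variables) are all assumptions, and the sketch assumes these names do not clash with existing variables. The sample grammar is hypothetical.

```python
def remove_complex_productions(terminals, productions):
    """Lift terminals out of long bodies, then split bodies longer than 2."""
    new_prods = {}
    term_var = {}  # terminal -> name of the variable introduced for it
    counter = 0    # numbers the chain variables D_1, D_2, ...

    def var_for(t):
        # Introduce T_t -> t the first time terminal t occurs in a long body.
        if t not in term_var:
            term_var[t] = f"T_{t}"
            new_prods[term_var[t]] = [(t,)]
        return term_var[t]

    for head, bodies in productions.items():
        for body in bodies:
            if len(body) >= 2:
                body = tuple(var_for(s) if s in terminals else s for s in body)
            current = head
            # A -> B1 B2 ... Bk becomes A -> B1 D1, D1 -> B2 D2, ...
            while len(body) > 2:
                counter += 1
                d = f"D_{counter}"
                new_prods.setdefault(current, []).append((body[0], d))
                current, body = d, body[1:]
            new_prods.setdefault(current, []).append(body)
    return new_prods


# Hypothetical grammar: S -> aBCb | a; B -> b; C -> c.
T = {"a", "b", "c"}
P = {"S": [("a", "B", "C", "b"), ("a",)], "B": [("b",)], "C": [("c",)]}
P_hat = remove_complex_productions(T, P)
for head, bodies in P_hat.items():
    print(head, "->", bodies)
```

For the body of length $k = 4$ in this sample, the sketch introduces $k - 2 = 2$ chain variables and $k - 1 = 3$ rules, matching the count in the procedure.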

Chomsky Normal Form (CNF) for CFG: The Chomsky Normal Form [Slide 13 / 33]
Theorem 7.1.9 For every context-free language $L$ containing a non-empty string, there exists a grammar $G$ in Chomsky Normal Form such that $L \setminus \{\epsilon\} = L(G)$.
Proof
- Since $L$ is a CFL, $L = L(G)$ for some CFG $G$.
- Eliminate $\epsilon$-productions (Step 1) to derive a grammar $G_1$ from $G$ such that $L(G_1) = L(G) \setminus \{\epsilon\}$.
- Eliminate unit productions (Step 2) to derive a grammar $G_2$ from $G_1$ such that $L(G_2) = L(G_1)$.
- Eliminate useless variables (Step 3) to derive a grammar $G_3$ from $G_2$ such that $L(G_3) = L(G_2)$.
- Eliminate complex productions (Step 4) to derive a grammar $G_4$ from $G_3$ such that $L(G_4) = L(G_3)$.
- $G_4$ contains no $\epsilon$-productions, no unit productions, no useless variables, and no productions with a body of 3 or more symbols. Every body of length 1 is therefore a terminal, and Step 4 leaves only variables in bodies of length 2. Hence $G_4$ is in CNF.

Pumping Lemma for CFLs [Slide 14 / 33]
