
CISC4090: Theory of Computation Chapter 2 Context-Free Languages Courtesy of Prof. Arthur G. Werschulz Fordham University Department of Computer and Information Sciences Spring, 2014 Courtesy of Prof. Arthur G. Werschulz CISC4090 /Spring, 2014/Chapter 2

Overview
In Chapter 1, we introduced two equivalent methods for describing a language: finite automata and regular expressions. In this chapter, we do something analogous.
- We introduce context-free grammars (CFGs).
- We introduce pushdown automata (PDAs).
- PDAs recognize exactly the languages generated by CFGs (the context-free languages, or CFLs).
- We have another Pumping Lemma.

Why Context-Free Grammars?
- They were first used to study human languages. You may have even seen something like them before.
- They are used for many typical computer languages (C, C++, ...), where they define the language.
- A parser uses the grammar to parse the input.
- Of course, you can also parse English.

Section 2.1: Context-Free Grammars

A context-free grammar
Here is G_1, an example of a CFG:
    A → 0A1
    A → B
    B → #
- A grammar has substitution rules, or productions. Each rule has a variable, an arrow, and a combination of variables and terminal symbols.
- We capitalize variables, but not terminal symbols.
- The start symbol is a special variable, usually the one on the left-hand side of the topmost rule.
- Here the variables are A and B, and the terminals are 0, 1, and #.

Using the grammar
Use the grammar to generate a language by replacing variables using the rules in the grammar, starting with the start variable.
Give me some strings that G_1 generates.
One answer: 000#111. The sequence of substitution steps is called the derivation. For this example, the derivation is
    A ⇒ 0A1 ⇒ 00A11 ⇒ 000A111 ⇒ 000B111 ⇒ 000#111.
A derivation can also be represented with a parse tree.
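The derivation above can be replayed mechanically. The following is a minimal sketch, assuming the rules of G_1 are stored in a Python dictionary; the names RULES and derive are illustrative and not part of the slides.

```python
# Rules of G_1 from the slide: A -> 0A1 | B, B -> #
RULES = {"A": ["0A1", "B"], "B": ["#"]}

def derive(n):
    """Return the sentential forms in the derivation of 0^n # 1^n from A."""
    current = "A"
    steps = [current]
    for _ in range(n):
        current = current.replace("A", RULES["A"][0], 1)  # apply A -> 0A1
        steps.append(current)
    current = current.replace("A", RULES["A"][1], 1)      # apply A -> B
    steps.append(current)
    current = current.replace("B", RULES["B"][0], 1)      # apply B -> #
    steps.append(current)
    return steps

print(" => ".join(derive(3)))
```

Calling derive(3) lists the same six sentential forms as the derivation of 000#111 on the slide.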

The language of grammar G_1
L(G_1) is the language of all strings generated by G_1:
    L(G_1) = { 0^n # 1^n : n ≥ 0 }.
This should look familiar. Can we generate this with an FA?

An example: simplified English grammar G_2
    ⟨sentence⟩ → ⟨noun-phrase⟩ ⟨verb-phrase⟩
    ⟨noun-phrase⟩ → ⟨cmplx-noun⟩ | ⟨cmplx-noun⟩ ⟨prep-phrase⟩
    ⟨verb-phrase⟩ → ⟨cmplx-verb⟩ | ⟨cmplx-verb⟩ ⟨prep-phrase⟩
    ⟨prep-phrase⟩ → ⟨prep⟩ ⟨cmplx-noun⟩
    ⟨cmplx-noun⟩ → ⟨article⟩ ⟨noun⟩
    ⟨cmplx-verb⟩ → ⟨verb⟩ | ⟨verb⟩ ⟨noun-phrase⟩
    ⟨article⟩ → a | the
    ⟨noun⟩ → boy | girl | flower
    ⟨verb⟩ → touches | likes | sees
    ⟨prep⟩ → with
Derivation for "a boy sees"?
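One way to explore the question is to search for a leftmost derivation by brute force. The sketch below encodes G_2 as a Python dictionary and tries the alternatives of the leftmost variable in order. The names G2 and leftmost_derivation are illustrative assumptions, and the pruning relies on G_2 having no ε-rules.

```python
# G_2 from the slide; keys are variables, every other word is a terminal.
G2 = {
    "sentence": [["noun-phrase", "verb-phrase"]],
    "noun-phrase": [["cmplx-noun"], ["cmplx-noun", "prep-phrase"]],
    "verb-phrase": [["cmplx-verb"], ["cmplx-verb", "prep-phrase"]],
    "prep-phrase": [["prep", "cmplx-noun"]],
    "cmplx-noun": [["article", "noun"]],
    "cmplx-verb": [["verb"], ["verb", "noun-phrase"]],
    "article": [["a"], ["the"]],
    "noun": [["boy"], ["girl"], ["flower"]],
    "verb": [["touches"], ["likes"], ["sees"]],
    "prep": [["with"]],
}

def leftmost_derivation(target, form=("sentence",), steps=None):
    """Depth-first search for a leftmost derivation of the word list `target`."""
    steps = (steps or []) + [form]
    # Position of the leftmost variable in the current sentential form, if any.
    idx = next((i for i, sym in enumerate(form) if sym in G2), None)
    if idx is None:
        return steps if list(form) == target else None
    # Prune: G_2 has no epsilon rules, so a form never shrinks, and the
    # terminals to the left of the leftmost variable are already final.
    if len(form) > len(target) or list(form[:idx]) != target[:idx]:
        return None
    for body in G2[form[idx]]:
        new_form = form[:idx] + tuple(body) + form[idx + 1:]
        result = leftmost_derivation(target, new_form, steps)
        if result:
            return result
    return None

for sentential_form in leftmost_derivation(["a", "boy", "sees"]):
    print(" ".join(sentential_form))
```

Each printed line is one sentential form, starting from sentence and ending with the three words of the target.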

Formal definition of a CFG
A context-free grammar (CFG) is a 4-tuple (V, Σ, R, S), where
- V is a finite set of variables,
- Σ is a finite set, disjoint from V, of terminals,
- R is a finite set of rules, with each rule v → s consisting of a variable v ∈ V and a string s ∈ (V ∪ Σ)* of variables and terminals, and
- S ∈ V is the start variable.
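The 4-tuple can be written down directly as a data structure. This is a minimal sketch, assuming rules are stored as (variable, tuple of symbols) pairs; the class name CFG and its field names are illustrative choices, not notation from the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CFG:
    variables: frozenset   # V
    terminals: frozenset   # Sigma
    rules: frozenset       # R, as pairs (variable, tuple of symbols)
    start: str             # S

    def __post_init__(self):
        # Check the side conditions of the formal definition.
        if self.variables & self.terminals:
            raise ValueError("V and Sigma must be disjoint")
        if self.start not in self.variables:
            raise ValueError("the start variable must belong to V")
        symbols = self.variables | self.terminals
        for head, body in self.rules:
            if head not in self.variables:
                raise ValueError(f"left-hand side {head!r} is not a variable")
            if any(sym not in symbols for sym in body):
                raise ValueError(f"rule body {body!r} uses an unknown symbol")

# G_1 with rules A -> 0A1 | B and B -> #; an empty tuple would stand for epsilon.
G1 = CFG(
    variables=frozenset({"A", "B"}),
    terminals=frozenset({"0", "1", "#"}),
    rules=frozenset({("A", ("0", "A", "1")), ("A", ("B",)), ("B", ("#",))}),
    start="A",
)
```

Constructing a grammar whose variable and terminal sets overlap, or whose start symbol is not a variable, raises a ValueError.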

Example
Grammar G_3 = ({S}, {a, b}, R, S), where the set R consists of only one rule, namely
    S → aSb | SS | ε.
What does this generate? abab, aaabbb, aababb, ...
If you view a as ( and b as ), then you get all strings of properly nested parentheses. Note that ()() is permissible.
Key property? Every prefix of the string has at least as many a's as b's, and the whole string has equally many a's and b's.
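The nesting property can be checked against the grammar on small inputs. The sketch below compares a counter-based test for proper nesting with a brute-force test that follows the three alternatives of the rule S → aSb | SS | ε. Both function names are illustrative, and the comparison covers only strings of length at most 6.

```python
from itertools import product

def is_balanced(w):
    """True if w over {a, b} is properly nested, reading a as '(' and b as ')'."""
    depth = 0
    for ch in w:
        depth += 1 if ch == "a" else -1
        if depth < 0:            # some prefix has more b's than a's
            return False
    return depth == 0            # equally many a's and b's overall

def generated(w):
    """True if S derives w under S -> aSb | SS | epsilon (brute force)."""
    if w == "":
        return True                                        # S -> epsilon
    if w[0] == "a" and w[-1] == "b" and generated(w[1:-1]):
        return True                                        # S -> aSb
    # S -> SS, splitting w into two nonempty parts
    return any(generated(w[:i]) and generated(w[i:]) for i in range(1, len(w)))

words = ["".join(p) for n in range(7) for p in product("ab", repeat=n)]
print(all(is_balanced(w) == generated(w) for w in words))
```

The brute-force check is exponential in the length of the string, so it is only meant for small experiments like this one.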

Another example
Grammar G_4 = (V, Σ, R, ⟨expr⟩), where V = {⟨expr⟩, ⟨term⟩, ⟨factor⟩} and Σ = {a, +, ×, (, )}. Productions:
    ⟨expr⟩ → ⟨expr⟩ + ⟨term⟩ | ⟨term⟩
    ⟨term⟩ → ⟨term⟩ × ⟨factor⟩ | ⟨factor⟩
    ⟨factor⟩ → ( ⟨expr⟩ ) | a
Let's do parse trees for a + a × a and (a + a) × a.
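A small parser makes the two trees concrete. The sketch below is a hand-written recursive-descent parser for G_4 in which the left-recursive rules are handled with loops, a standard implementation technique rather than something the slides prescribe. It assumes * is typed in place of × and returns a condensed tree of nested tuples rather than a full parse tree with every variable shown.

```python
def parse(text):
    """Parse text under G_4 and return a condensed tree as nested tuples."""
    tokens = list(text.replace(" ", ""))
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError(f"expected {expected!r} at position {pos}")
        pos += 1

    def expr():                      # <expr> -> <expr> + <term> | <term>
        tree = term()
        while peek() == "+":
            eat("+")
            tree = ("+", tree, term())
        return tree

    def term():                      # <term> -> <term> * <factor> | <factor>
        tree = factor()
        while peek() == "*":
            eat("*")
            tree = ("*", tree, factor())
        return tree

    def factor():                    # <factor> -> ( <expr> ) | a
        if peek() == "(":
            eat("(")
            tree = expr()
            eat(")")
            return tree
        eat("a")
        return "a"

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected {peek()!r} at position {pos}")
    return tree

print(parse("a + a * a"))
print(parse("(a + a) * a"))
```

In the first tree the product sits below the sum, and in the second the sum sits below the product, which is how the grammar encodes precedence and parentheses.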

Designing CFGs
Like designing FAs, designing CFGs requires some creativity. CFGs are perhaps harder to design than FAs, since they are more expressive. (We'll show that soon.) Here are some guidelines:
- If the CFL is the union of simpler CFLs, design grammars for the simpler ones and then combine them. For example, add the rule S → S_1 | S_2 | S_3, where S_i is the start variable of the grammar G_i for the i-th simpler language.
- If the language is regular, then we can design a CFG that mimics a DFA:
    Make a variable R_i for every state q_i.
    If δ(q_i, a) = q_j, then add the rule R_i → a R_j.
    Add R_i → ε if q_i is an accepting state.
    Make R_0 the start variable, where q_0 is the start state of the DFA.
Assuming that this really works, what did we just show?
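The DFA-to-CFG recipe is short enough to sketch in code. The example DFA below (binary strings with an even number of 1s) and all function and variable names are assumptions made for illustration. The generates helper relies on the DFA being deterministic, so each sentential form has exactly one applicable rule per input symbol.

```python
def dfa_to_cfg(states, alphabet, delta, start, accepting):
    """Build the rules R_i -> a R_j and R_i -> epsilon from a DFA."""
    name = {q: f"R{i}" for i, q in enumerate(states)}    # one variable per state
    rules = []
    for q in states:
        for a in alphabet:
            rules.append((name[q], (a, name[delta[(q, a)]])))
        if q in accepting:
            rules.append((name[q], ()))                  # () stands for epsilon
    return rules, name[start]

def generates(rules, start_var, w):
    """Follow the unique derivation of w in a grammar built by dfa_to_cfg."""
    var = start_var
    for ch in w:
        nxt = [body[1] for head, body in rules if head == var and body[:1] == (ch,)]
        if not nxt:
            return False
        var = nxt[0]
    return (var, ()) in rules                            # finish with R_i -> epsilon

# Hypothetical DFA: accepts binary strings containing an even number of 1s.
states = ["q0", "q1"]
delta = {("q0", "0"): "q0", ("q0", "1"): "q1",
         ("q1", "0"): "q1", ("q1", "1"): "q0"}
rules, start_var = dfa_to_cfg(states, "01", delta, "q0", {"q0"})
for head, body in rules:
    print(head, "->", " ".join(body) or "epsilon")
```

Because each rule has at most one variable, at the right end of its body, a derivation in this grammar tracks the DFA's state one input symbol at a time.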

Designing CFGs (continued)
A final guideline: Certain CFLs contain strings with two substrings that are linked, in the sense that a machine recognizing the language would need to remember an unbounded amount of information about one substring to "verify" the other substring. This is sometimes trivial with a CFG.
Example: The language { 0^n 1^n : n ≥ 0 }. The grammar is
    S → 0S1 | ε.
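The way the rule S → 0S1 links the two halves can be mirrored by a recursive check that peels one 0 from the front and one 1 from the back at each step. This is an illustrative sketch, and the function name is an assumption.

```python
def in_language(w):
    """True if w is derivable from S under S -> 0S1 | epsilon."""
    if w == "":
        return True                       # S -> epsilon
    # S -> 0S1: the outermost symbols must be 0 and 1, and the middle is again an S
    return w[0] == "0" and w[-1] == "1" and in_language(w[1:-1])

print([w for w in ["", "01", "0011", "001", "10", "0101"] if in_language(w)])
```

Each recursive call corresponds to one application of S → 0S1, so the recursion depth equals n for an input of the form 0^n 1^n.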

Ambiguity
Sometimes a grammar can generate the same string in multiple ways. If a grammar generates even a single string in multiple ways, the grammar is ambiguous.
Example:
    ⟨expr⟩ → ⟨expr⟩ + ⟨expr⟩ | ⟨expr⟩ × ⟨expr⟩ | ( ⟨expr⟩ ) | a
This generates the string a + a × a ambiguously. Try it: generate two parse trees. Using your extensive knowledge of arithmetic, insert parentheses to show what each parse tree really expresses.
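The two parse trees can also be found by exhaustive search. The sketch below enumerates every way the ambiguous grammar can parse a token string and reports each parse as a fully parenthesized expression. It assumes * is typed in place of ×, and the function name parses is illustrative.

```python
from functools import lru_cache

def parses(tokens):
    """Return every parse of tokens as a fully parenthesized string."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def expr(i, j):                      # all parses of tokens[i:j] as <expr>
        results = []
        if j - i == 1 and tokens[i] == "a":
            results.append("a")                          # <expr> -> a
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            results.extend(expr(i + 1, j - 1))           # <expr> -> ( <expr> )
        for k in range(i + 1, j - 1):                    # top-level operator at k
            if tokens[k] in "+*":
                for left in expr(i, k):
                    for right in expr(k + 1, j):
                        results.append(f"({left}{tokens[k]}{right})")
        return tuple(results)

    return list(expr(0, len(tokens)))

for p in parses("a+a*a"):
    print(p)
```

For a+a*a the search finds two parses, one grouping the product first and one grouping the sum first, matching the two parse trees the slide asks for.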

An English example from grammar G_2
Grammar G_2 ambiguously generates "the girl touches the boy with the flower". Given your extensive knowledge of English, what are the two meanings of this phrase?

Definition of ambiguity
A grammar generates a string ambiguously if there are two different parse trees for that string. Two derivations may differ only in the order in which the rules are applied; if they yield the same parse tree, the string is not really generated ambiguously.
Definitions:
- A derivation is a leftmost derivation if at every step, the leftmost remaining variable is the one replaced.
- A string w is derived ambiguously in a CFG if it has two or more different leftmost derivations.

Chomsky Normal Form
It is often convenient to convert a CFG into a simplified form. A CFG is in Chomsky normal form if every rule is of the form
    A → BC    or    A → a,
where a is any terminal and A, B, and C are variables, except that neither B nor C can be the start variable. The start variable can also go to ε, i.e., we permit S → ε.
Any CFL can be generated by a CFG in Chomsky normal form.
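The definition translates into a rule-by-rule check. This is a minimal sketch, assuming rules are given as (head, body) pairs with the body a tuple of symbols and the empty tuple standing for ε; the function name and the two sample grammars are illustrative.

```python
def is_cnf(rules, variables, terminals, start):
    """Check every rule against the Chomsky normal form conditions."""
    for head, body in rules:
        if head not in variables:
            return False
        if len(body) == 0:
            if head != start:                    # only S -> epsilon is permitted
                return False
        elif len(body) == 1:
            if body[0] not in terminals:         # A -> a
                return False
        elif len(body) == 2:
            # A -> BC with B, C variables other than the start variable
            if any(sym not in variables or sym == start for sym in body):
                return False
        else:
            return False
    return True

# A hypothetical CNF grammar for the language {ab} together with epsilon.
cnf_rules = [("S0", ("A", "B")), ("S0", ()), ("A", ("a",)), ("B", ("b",))]
print(is_cnf(cnf_rules, {"S0", "A", "B"}, {"a", "b"}, "S0"))

# G_3 (S -> aSb | SS | epsilon) is not in Chomsky normal form.
g3_rules = [("S", ("a", "S", "b")), ("S", ("S", "S")), ("S", ())]
print(is_cnf(g3_rules, {"S"}, {"a", "b"}, "S"))
```

G_3 fails for two reasons: the body aSb has three symbols, and the start variable S appears on a right-hand side.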

Converting CFG to Chomsky Normal Form
- Add a new start variable S_0 and the rule S_0 → S, where S was the original start variable.
- Remove ε-rules whose LHS is not the start variable: remove A → ε, and for each occurrence of such an A on the RHS of a rule, add a new rule with that occurrence deleted. Example: if R → uAvAw is a rule, add
    R → uvAw
    R → uAvw
    R → uvw
- Handle all unit rules: remove the unit rule A → B and, whenever a rule B → u exists, add A → u.
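The ε-rule step has to consider every combination of deleted occurrences, which is easy to get wrong by hand. The sketch below generates all bodies obtained by deleting any subset of the occurrences of one variable. The function name is an assumption, and the code covers only this single sub-step, not the whole conversion.

```python
from itertools import product

def delete_occurrences(body, var):
    """All bodies obtained from body by deleting any subset of occurrences of var."""
    # Each occurrence of var may be kept or dropped; other symbols are always kept.
    options = [((sym,), ()) if sym == var else ((sym,),) for sym in body]
    variants = set()
    for choice in product(*options):
        variants.add(tuple(sym for part in choice for sym in part))
    return variants

# The example R -> uAvAw, with u, v, w treated as single symbols for illustration.
for variant in sorted(delete_occurrences(("u", "A", "v", "A", "w"), "A")):
    print("R ->", " ".join(variant))
```

The result includes the original body along with the three new ones. An empty tuple in the result would correspond to a new ε-rule, which then has to be handled in turn.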

Converting CFG to Chomsky Normal Form (cont'd)
- Replace each rule A → u_1 u_2 ... u_k with k ≥ 3 by the rules
    A → u_1 A_1
    A_1 → u_2 A_2
    A_2 → u_3 A_3
    ...
    A_{k−2} → u_{k−1} u_k
  where A_1, ..., A_{k−2} are new variables. Each u_i is a variable or a terminal; any terminal u_i in these rules is then replaced by a new variable U_i, together with the rule U_i → u_i.
You will have a homework question like this. Before attempting it, go over Example 2.10 in the textbook (page 108).
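The chain of new rules follows a simple loop. This is a minimal sketch, assuming the fresh variables can be named by appending an index to the head and that these names do not clash with existing variables. It performs only the splitting, not the later replacement of terminals.

```python
def split_long_rule(head, body):
    """Split head -> u_1 ... u_k (k >= 3) into rules with two symbols each."""
    k = len(body)
    if k < 3:
        return [(head, tuple(body))]           # nothing to split
    rules = []
    current = head
    for i in range(k - 2):
        fresh = f"{head}_{i + 1}"              # hypothetical names A_1, A_2, ...
        rules.append((current, (body[i], fresh)))
        current = fresh
    rules.append((current, (body[k - 2], body[k - 1])))   # A_{k-2} -> u_{k-1} u_k
    return rules

for head, body in split_long_rule("A", ["u1", "u2", "u3", "u4"]):
    print(head, "->", " ".join(body))
```

A rule with k symbols on the right-hand side becomes k − 1 rules and introduces k − 2 new variables.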

Section 2.2: Pushdown Automata
