COMP3630/6360: Theory of Computation Semester 1, 2020 The - PowerPoint PPT Presentation

COMP3630/6360: Theory of Computation Semester 1, 2020 The Australian National University Context Free Languages

This lecture covers Chapter 5 of HMU: Context-free Grammars
• (Context-free) Grammars
• (Leftmost and Rightmost) Derivations
• Parse Trees
• An Equivalence between Derivations and Parse Trees
• Ambiguity in Grammars
Additional Reading: Chapter 5 of HMU.

Grammars: Introduction to Grammars
• We have so far seen machine-like means (e.g., DFAs) and declarative means (e.g., regular expressions) of defining languages.
• Grammars are a generative means of defining languages.
• Grammars can define a strictly larger class of languages than DFAs and regular expressions can.
• They are especially useful in compiler and parser design; they can be used to check whether:
  - parentheses are balanced in a program,
  - each occurrence of else has a matching if, etc.

Grammars: Formal Definition
• A context-free grammar (CFG) is a tuple $G = (V, T, P, S)$, where
  - $V$ is a finite set whose elements are called variables or non-terminal symbols. Notation: upper case letters, e.g., $A, B, \ldots$
  - $T$ is a finite set whose elements are called terminal symbols; $T$ is precisely the alphabet of the language generated by the grammar $G$. Notation: lower case letters, e.g., $s_1, s_2, \ldots$
  - $P \subseteq V \times (V \cup T)^*$ is a finite set of production rules.
  - Each production rule $(A, \alpha)$ is also written as $A \to \alpha$. Terminology: $A$ and $\alpha$ are called the head and the body of the production rule, respectively.
  - $S \in V$ is the unique variable (non-terminal), called the start symbol, that 'generates' the language.
Notation
  - Strings consisting of non-terminals and/or terminals will be denoted by Greek letters, e.g., $\alpha, \beta, \ldots$
  - Strings of terminals will be denoted by lower case letters, e.g., $w, u, v$.
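As a rough illustration of the four components, the tuple can be written down as a small data structure that checks the conditions of the definition when it is built. This is only a sketch: the class name `CFG`, its field names, the choice to store bodies as tuples of symbols, and the disjointness check on V and T are assumptions made for this example, not part of the lecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    """Hypothetical container for a grammar G = (V, T, P, S)."""
    variables: frozenset    # V
    terminals: frozenset    # T
    productions: frozenset  # P: set of (head, body) pairs, body is a tuple of symbols
    start: str              # S

    def __post_init__(self):
        # Assumption of this sketch: V and T share no symbols.
        if self.variables & self.terminals:
            raise ValueError("V and T must be disjoint")
        if self.start not in self.variables:
            raise ValueError("the start symbol must be a variable (S in V)")
        symbols = self.variables | self.terminals
        for head, body in self.productions:
            if head not in self.variables:
                raise ValueError(f"head {head!r} is not a variable")
            if any(sym not in symbols for sym in body):
                raise ValueError(f"body {body!r} uses an unknown symbol")


# A one-variable grammar over {0, 1}; the empty tuple stands for the body ε.
EXAMPLE = CFG(
    variables=frozenset({"S"}),
    terminals=frozenset({"0", "1"}),
    productions=frozenset({
        ("S", ()), ("S", ("0",)), ("S", ("1",)),
        ("S", ("0", "S", "0")), ("S", ("1", "S", "1")),
    }),
    start="S",
)
```

Choosing a terminal as the start symbol, or using a symbol outside V ∪ T in a body, raises `ValueError` in this sketch.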

Derivations: How do Grammars Generate Languages?
• A string $w \in T^*$ is in the language $L(G)$ generated by $G = (V, T, P, S)$ iff we can derive $w$ from $S$, i.e., start from $S$ and use production rule(s) repeatedly to replace heads of the rules by their bodies until a string in $T^*$ is obtained.
Example
Let $G = (\{S\}, \{0, 1\}, P, S)$ be a CFG with $P$ given in any of the following equivalent forms:
(1) $P = \{(S, \epsilon), (S, 0), (S, 1), (S, 0S0), (S, 1S1)\}$
(2) $S \to \epsilon$, $S \to 0$, $S \to 1$, $S \to 0S0$, $S \to 1S1$
(3) $S \to \epsilon \mid 0 \mid 1 \mid 0S0 \mid 1S1$
[Figure: strings derivable from S (Start), branching through the sentential forms 0S0, 1S1, 00S00, 01S10, 10S01, 11S11 to terminal strings such as 0, 1, 00, 11, 000, 010, 101, 111, 0000, 0110, 1001, 1111, 00000, 00100, 01010, 01110, 10001, 10101, 11011, 11111.]
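For this particular grammar, membership in L(G) can be checked by running the productions backwards: a string is derivable if it is ε, 0 or 1, or if its first and last symbols match and the string between them is derivable. The function name `derivable` is made up for this sketch, and the approach is specific to this example grammar rather than a general CFG membership algorithm.

```python
def derivable(w: str) -> bool:
    """Return True iff w can be derived from S using S -> ε | 0 | 1 | 0S0 | 1S1."""
    if any(ch not in "01" for ch in w):
        return False  # w is not even a string over T = {0, 1}
    if w in ("", "0", "1"):
        return True   # produced directly by S -> ε, S -> 0 or S -> 1
    # Otherwise the first step must be S -> 0S0 or S -> 1S1: the outer
    # symbols have to match and the inner part must itself be derivable.
    return w[0] == w[-1] and derivable(w[1:-1])


for s in ["", "0110", "11011", "01"]:
    print(repr(s), derivable(s))
```

The strings accepted this way are exactly the binary palindromes, which matches the strings listed in the figure.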

Derivations: Formal Definition
Definition
Given $G = (V, T, P, S)$ and $\alpha, \beta \in (V \cup T)^*$, a derivation of $\beta$ from $\alpha$ is a finite sequence of strings $\gamma_1 \Rightarrow_G \gamma_2 \Rightarrow_G \cdots \Rightarrow_G \gamma_k$ for some $k \in \mathbb{N}$ where
1. $\gamma_1 = \alpha$ and $\gamma_k = \beta$;
2. $\gamma_1, \ldots, \gamma_k \in (V \cup T)^*$;
3. for each $i = 1, \ldots, k-1$, either $\gamma_i = \gamma_{i+1}$, or $\gamma_{i+1}$ is obtained from $\gamma_i$ by replacing one occurrence of the head of a production rule of $P$ by its body.
The following phrases are used interchangeably:
$\beta$ is derived from $\alpha$ ⇔ there exists a derivation of $\beta$ from $\alpha$ ⇔ $\alpha \Rightarrow_G^* \beta$.
Example
For the grammar $G = (\{S\}, \{0, 1\}, P, S)$ with $P$ given by $S \to \epsilon \mid 0 \mid 1 \mid 0S0 \mid 1S1$, the following is a derivation of 010111010 from $S$ (the rule used at each step is shown in brackets):
$S \Rightarrow_G 0S0$ [$S \to 0S0$]
$\Rightarrow_G 01S10$ [$S \to 1S1$]
$\Rightarrow_G 010S010$ [$S \to 0S0$]
$\Rightarrow_G 0101S1010$ [$S \to 1S1$]
$\Rightarrow_G 010111010$ [$S \to 1$]
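Condition 3 of the definition can be checked mechanically. The sketch below assumes every symbol is a single character, so that sentential forms can be stored as Python strings; the names `RULES`, `is_step` and `is_derivation` are illustrative only.

```python
# Productions of the example grammar as (head, body) pairs; "" stands for ε.
RULES = {("S", ""), ("S", "0"), ("S", "1"), ("S", "0S0"), ("S", "1S1")}


def is_step(gamma: str, gamma_next: str, rules=RULES) -> bool:
    """True iff gamma_next equals gamma or rewrites one head occurrence in gamma."""
    if gamma == gamma_next:
        return True  # trivial steps are allowed by the definition
    for head, body in rules:
        for i, sym in enumerate(gamma):
            if sym == head and gamma[:i] + body + gamma[i + 1:] == gamma_next:
                return True
    return False


def is_derivation(strings, rules=RULES) -> bool:
    """True iff every consecutive pair in the sequence is a valid step."""
    return len(strings) >= 1 and all(
        is_step(a, b, rules) for a, b in zip(strings, strings[1:])
    )


steps = ["S", "0S0", "01S10", "010S010", "0101S1010", "010111010"]
print(is_derivation(steps))
```

The sequence `steps` is the example derivation of 010111010 from S; a sequence such as `["S", "01"]` is rejected because no single production turns S into 01.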

Derivations: Sentential Forms and Language Generated by a Grammar
Definition
Given $G = (V, T, P, S)$, any string in $(V \cup T)^*$ derived from $S$ is a sentential form.
• The set of all sentential forms of $G$ (denoted by $SF(G)$) is defined inductively:
  - Basis: $S \in SF(G)$.
  - Induction: if $\alpha A \gamma \in SF(G)$ for some $\alpha, \gamma \in (V \cup T)^*$ and $A \in V$, and $A \to \beta$ is a production rule, then $\alpha \beta \gamma \in SF(G)$.
  - Only those strings that are generated by the above induction are sentential forms.
Definition
Given a CFG $G = (V, T, P, S)$, the language $L(G)$ generated by $G$ is the set of sentential forms that are in $T^*$, i.e., $L(G) = SF(G) \cap T^*$.
Example
For the CFG $G = (\{S\}, \{0, 1\}, P, S)$ with $P$ given by $S \to \epsilon \mid 0 \mid 1 \mid 0S0 \mid 1S1$:
(1) $S$, $\epsilon$, 0, 1, $0S0$, 00, 000, 010, $1S1$, 11, 101, 111, ... are all sentential forms.
(2) $\epsilon$, 0, 1, 00, 000, 010, 11, 101, 111, ... are in $L(G)$; $S$, $0S0$ and $1S1$ are not, because they contain the variable $S$.
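The inductive definition of SF(G) translates into a closure computation: start from S and keep applying productions until nothing new appears. Since SF(G) is infinite for the example grammar, the sketch below only keeps forms with at most `n` terminal symbols. The names `sentential_forms` and `language`, and the length cap of `n + 1`, are assumptions of this sketch; the cap is safe only because forms of this grammar contain at most one variable.

```python
from collections import deque

RULES = [("S", ""), ("S", "0"), ("S", "1"), ("S", "0S0"), ("S", "1S1")]
TERMINALS = set("01")


def sentential_forms(n, start="S", rules=RULES):
    """Sentential forms with at most n terminals (and length at most n + 1)."""
    seen = {start}            # basis: S is a sentential form
    queue = deque([start])
    while queue:
        form = queue.popleft()
        for i, sym in enumerate(form):
            for head, body in rules:
                if sym != head:
                    continue
                # induction: from αAγ and A -> β conclude αβγ
                new = form[:i] + body + form[i + 1:]
                n_terminals = sum(ch in TERMINALS for ch in new)
                if n_terminals <= n and len(new) <= n + 1 and new not in seen:
                    seen.add(new)
                    queue.append(new)
    return seen


def language(n):
    """L(G) restricted to length <= n: the sentential forms that lie in T*."""
    return {f for f in sentential_forms(n) if all(ch in TERMINALS for ch in f)}


print(sorted(language(3), key=lambda s: (len(s), s)))
```

Pruning by terminal count rather than by total length matters because S → ε shortens a form: 0S0 has length 3 but still yields the length-2 string 00.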

Derivations: Other Sentential Forms
• At each step of a derivation, one can replace any variable by a suitable production.
• If at each non-trivial step of the derivation the leftmost (or rightmost) variable is replaced by a production rule, then the derivation is said to be a leftmost (or rightmost) derivation, respectively. We write $\alpha \Rightarrow_{LM}^* \beta$ (or $\alpha \Rightarrow_{RM}^* \beta$) to denote the existence of a leftmost (or rightmost) derivation of $\beta$ from $\alpha$, respectively.
• Sentential forms derived via leftmost (or rightmost) derivations are known as leftmost (or rightmost) sentential forms, respectively.
Balanced Parentheses Example
Consider the CFG $G = (\{S\}, \{(, )\}, P, S)$ with $P$ given by $S \to SS \mid (S) \mid ()$.
[Derivation] S↑ ⇒_G S↑S ⇒_G (S)S↑ ⇒_G (S↑)() ⇒_G (())()
[Leftmost Derivation] S↑ ⇒_G S↑S ⇒_G (S↑)S ⇒_G (())S↑ ⇒_G (())()
[Rightmost Derivation] S↑ ⇒_G SS↑ ⇒_G S↑() ⇒_G (S↑)() ⇒_G (())()
In the above, ↑ marks the variable that is replaced in the following step.
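The difference between leftmost and rightmost derivations is only in which variable gets rewritten at each step. A minimal sketch for the balanced-parentheses grammar, assuming single-character symbols and a single variable S (the helper names `rewrite` and `derive` are made up for this example):

```python
VARIABLES = {"S"}
BODIES = {"SS", "(S)", "()"}  # bodies of the productions S -> SS | (S) | ()


def rewrite(form: str, body: str, leftmost: bool = True) -> str:
    """Replace the leftmost (or rightmost) variable in form by body."""
    if body not in BODIES:
        raise ValueError(f"S -> {body} is not a production")
    positions = [i for i, sym in enumerate(form) if sym in VARIABLES]
    if not positions:
        raise ValueError("no variable left to rewrite")
    i = positions[0] if leftmost else positions[-1]
    return form[:i] + body + form[i + 1:]


def derive(bodies, leftmost=True, start="S"):
    """Apply the given production bodies in order and return all sentential forms."""
    forms = [start]
    for body in bodies:
        forms.append(rewrite(forms[-1], body, leftmost))
    return forms


print(" => ".join(derive(["SS", "(S)", "()", "()"], leftmost=True)))
print(" => ".join(derive(["SS", "()", "(S)", "()"], leftmost=False)))
```

Both calls end in (())() and reproduce the leftmost and rightmost derivations listed above; note that the same productions are used, only in a different order.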

Parse Trees
• Parse trees are a graphical method of representing derivations.
• They are used in compilers to represent the source program.
Definition
Given a CFG $G = (V, T, P, S)$, a parse tree for $G$ is any directed labelled tree that meets the following three conditions:
  - every interior node is labelled by a non-terminal (i.e., variable);
  - every leaf node is labelled by a non-terminal, a terminal, or $\epsilon$; however, if it is labelled by $\epsilon$, it is the sole child of its parent;
  - if an interior node is labelled by $A \in V$, and its children are labelled $s_1, \ldots, s_k \in V \cup T \cup \{\epsilon\}$, then $A \to s_1 \cdots s_k$ is a production rule in $P$.
• The yield of a parse tree is the string formed from the labels of the tree leaves read from left to right.
Note: The yield is not necessarily a string of terminals.
[Figure: a parse tree for a balanced-parentheses grammar $G = (\{S\}, \{(, )\}, P, S)$ with root $S$ and yield (())().]
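The three conditions and the yield can be expressed directly over a small tree type. The sketch below uses the grammar S → SS | (S) | () and assumes single-character symbols; `Node`, `tree_yield` and `is_parse_tree` are names invented for this example.

```python
from dataclasses import dataclass, field

EPS = "ε"


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


def tree_yield(node: Node) -> str:
    """Concatenate the leaf labels from left to right (an ε leaf contributes nothing)."""
    if not node.children:
        return "" if node.label == EPS else node.label
    return "".join(tree_yield(child) for child in node.children)


def is_parse_tree(node, variables, terminals, productions) -> bool:
    """Check the three conditions of the definition, recursively."""
    if not node.children:  # leaf: a variable, a terminal, or ε
        return node.label in variables or node.label in terminals or node.label == EPS
    if node.label not in variables:  # interior nodes must be variables
        return False
    labels = [child.label for child in node.children]
    if EPS in labels and len(labels) != 1:  # an ε leaf must be the sole child
        return False
    body = "" if labels == [EPS] else "".join(labels)
    return (node.label, body) in productions and all(
        is_parse_tree(child, variables, terminals, productions)
        for child in node.children
    )


P = {("S", "SS"), ("S", "(S)"), ("S", "()")}
pair = lambda: Node("S", [Node("("), Node(")")])
tree = Node("S", [Node("S", [Node("("), pair(), Node(")")]), pair()])
print(tree_yield(tree), is_parse_tree(tree, {"S"}, {"(", ")"}, P))
```

A single node labelled S is also a parse tree under this definition; its yield is the string S, which illustrates the note that a yield need not consist of terminals only.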

An Equivalence between Parse Trees and Derivations: Derivations and Parse Trees
• Parse trees, derivations, leftmost derivations, and rightmost derivations are equivalent means of generating the language $L(G)$ of a CFG $G$.
• The proof for equivalence of rightmost derivations mirrors that of leftmost derivations. (So we'll not delve into rightmost derivations.)
Theorem 5.5.1
Let CFG $G = (V, T, P, S)$ be given. Let $A \in V$ and $w \in T^*$. Then,
$A \Rightarrow_G^* w$ ⇔ $A \Rightarrow_{LM}^* w$ ⇔ there exists a parse tree with root $A$ and yield $w$ ⇔ $A \Rightarrow_{RM}^* w$.
Proof Idea
We'll show the following cycle of implications:
  - (a) $A \Rightarrow_G^* w$ ⟹ there exists a parse tree with root $A$ and yield $w$;
  - (b) there exists a parse tree with root $A$ and yield $w$ ⟹ $A \Rightarrow_{LM}^* w$;
  - by definition, $A \Rightarrow_{LM}^* w$ ⟹ $A \Rightarrow_G^* w$.
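Implication (b) can be made concrete: reading a parse tree top-down and always expanding the leftmost unexpanded interior node produces a leftmost derivation of its yield. The sketch below is one way to do this and is not taken from the lecture's proof. It assumes a tree encoding chosen for this example: an interior node is a pair `(label, children)`, a terminal leaf is a plain string, and an empty child list stands for an ε-production.

```python
def leftmost_derivation(tree):
    """Return the sentential forms of the leftmost derivation encoded by tree."""

    def render(items):
        # Interior nodes show their variable, terminal leaves show themselves.
        return "".join(t if isinstance(t, str) else t[0] for t in items)

    frontier = [tree]  # the current sentential form, as a list of subtrees
    forms = [render(frontier)]
    while True:
        # The leftmost unexpanded interior node is the leftmost variable.
        idx = next((i for i, t in enumerate(frontier) if not isinstance(t, str)), None)
        if idx is None:
            return forms
        _label, children = frontier[idx]
        frontier[idx:idx + 1] = children  # apply the production used at this node
        forms.append(render(frontier))


# Parse tree for (())() in the grammar S -> SS | (S) | ()
tree = ("S", [("S", ["(", ("S", ["(", ")"]), ")"]), ("S", ["(", ")"])])
print(" => ".join(leftmost_derivation(tree)))
```

Picking the last non-string entry of `frontier` instead of the first would give the rightmost derivation of the same tree, which is the symmetry the slide alludes to.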

An Equivalence between Parse Trees and Derivations
Part (a) of Proof of Theorem 5.5.1: $A \Rightarrow_G^* w$ ⟹ ∃ parse tree
• We prove the following generalization of Part (a) by induction on the length of the derivation.
Lemma 5.5.2
Let CFG $G = (V, T, P, S)$ be given. Let $A \in V$ and $\alpha \in SF(G)$ with $\alpha \neq A$. Then,
$A \Rightarrow_G^* \alpha$ ⟹ there exists a parse tree with root $A$ and yield $\alpha$.
Proof of Lemma 5.5.2 (induction on the length of the derivation)
  - Since $\alpha \neq A$, the derivation has at least one step.
  - Basis: Let $A \Rightarrow_G \alpha$ be a one-step derivation. Since $\alpha \neq A$, this step must apply the production rule $A \to \alpha$, i.e., $(A, \alpha) \equiv (A \to \alpha) \in P$.
  - Hence, writing $\alpha = s_1 \cdots s_\ell$, the parse tree is simply the tree with root $A$ whose children are the leaves $s_1, s_2, \ldots, s_\ell$, in that order.
[Figure (Basis): root $A$ with children $s_1, s_2, \ldots, s_\ell$.]

An Equivalence between Parse Trees and Derivations
Part (a) of Proof of Theorem 5.5.1: $A \Rightarrow_G^* w$ ⟹ ∃ parse tree
Proof of Lemma 5.5.2 (induction on the length of the derivation)
  - Induction: Suppose that the claim is true for all derivations of length $k-1$ or less, for some $k \geq 2$.
  - Suppose a derivation of $\alpha$ from $A$ in $k$ steps exists: $A = \gamma_1 \Rightarrow_G \gamma_2 \Rightarrow_G \gamma_3 \Rightarrow_G \cdots \Rightarrow_G \gamma_{k-1} \Rightarrow_G \gamma_k = \alpha$.
  - We may assume $\gamma_{k-1} \neq A$. So by the induction hypothesis, there exists a parse tree with root $A$ and yield $\gamma_{k-1}$. [If $\gamma_{k-1} = A$, then $A \Rightarrow_G \alpha$ is effectively a one-step derivation, and the basis case applies.]
  - We may assume that $\gamma_{k-1} \neq \gamma_k$; otherwise the parse tree corresponding to the derivation of $\gamma_{k-1}$ from $A$ is also a parse tree with yield $\alpha$ and root label $A$.
  - Thus, the last step involves the application of a production rule. Hence, $\gamma_{k-1} = \beta B \omega$ and $\alpha = \beta \lambda \omega$ where (a) $\beta, \omega \in (V \cup T)^*$, (b) $B \in V$, and (c) $B \to \lambda$ is a production rule.
[Figure: the parse tree for $A \Rightarrow_G^* \gamma_{k-1} = \beta B \omega$, with the leaf $B$ expanded by $B \to \lambda$ to give the parse tree for $A \Rightarrow_G^* \beta \lambda \omega = \alpha$.]
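The induction can be read as a tree-building procedure: keep the parse tree for the derivation so far, and for each new step attach children labelled by the body under the leaf being rewritten. The slide text stops before spelling out this final grafting step, so the sketch below assumes the natural completion suggested by the figure. The step encoding `(index, body)`, the names `Node` and `tree_from_derivation`, and the single-character symbols are all assumptions of this example.

```python
class Node:
    def __init__(self, label):
        self.label = label
        self.children = []


def tree_from_derivation(start, steps, rules):
    """Build a parse tree from derivation steps.

    Each step (i, body) rewrites the symbol at index i of the current
    sentential form using the production (symbol, body) from rules.
    Returns the root of the tree and the final sentential form (the yield).
    """
    root = Node(start)
    leaves = [root]  # leaves in left-to-right order; their labels spell the form
    for i, body in steps:
        leaf = leaves[i]
        if (leaf.label, body) not in rules:
            raise ValueError(f"{leaf.label} -> {body!r} is not a production")
        # Graft the body under the leaf; an empty body becomes a single ε child.
        leaf.children = [Node(sym) for sym in body] or [Node("ε")]
        # ε contributes nothing to the sentential form, so it is not kept as a leaf.
        leaves[i:i + 1] = leaf.children if body else []
    return root, "".join(leaf.label for leaf in leaves)


P = {("S", "SS"), ("S", "(S)"), ("S", "()")}
# S => SS => (S)S => (S)() => (())()
steps = [(0, "SS"), (0, "(S)"), (3, "()"), (1, "()")]
root, final_form = tree_from_derivation("S", steps, P)
print(final_form, [child.label for child in root.children])
```

Each loop iteration corresponds to one application of the induction step: the tree before the iteration has yield βBω, and the tree after it has yield βλω.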
