COMP3630/6360: Theory of Computation, Semester 1, 2020



  1. COMP3630/6360: Theory of Computation, Semester 1, 2020. The Australian National University. Context Free Languages

  2. This lecture covers Chapter 5 of HMU: Context-free Grammars
     - (Context-free) Grammars
     - (Leftmost and Rightmost) Derivations
     - Parse Trees
     - An Equivalence between Derivations and Parse Trees
     - Ambiguity in Grammars
     Additional Reading: Chapter 5 of HMU.

  3. Grammars: Introduction to Grammars
     - We have so far seen machine-like means (e.g., DFAs) and declarative means (e.g., regular expressions) of defining languages.
     - Grammars are a generative means of defining languages.
     - Grammars can be used to define a strictly larger class of languages than the regular languages.
     - They are especially useful in compiler and parser design; they can be used to check if:
       - parentheses are balanced in a program,
       - else occurrences have a matching if, etc.

  4. Grammars: Formal Definition
     - A context-free grammar (CFG) is a tuple G = (V, T, P, S), where
       - V is a finite set whose elements are called variables or non-terminal symbols. Notation: upper case letters, e.g., A, B, ...
       - T is a finite set whose elements are called terminal symbols; T is precisely the alphabet of the language generated by the grammar G. Notation: lower case letters, e.g., s_1, s_2, ...
       - P ⊆ V × (V ∪ T)* is a finite set of production rules.
       - Each production rule (A, α) is also written as A → α. Terminology: A and α are called the head and the body of the production rule, respectively.
       - S ∈ V is the unique variable/non-terminal that 'generates' the language (the start symbol).
     - Notation
       - Strings consisting of non-terminals and/or terminals will be denoted by Greek letters, e.g., α, β, ...
       - Strings of terminals will be denoted by lower case letters, e.g., w, u, v.
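
The four components of the definition map directly onto a small data structure. The following is a minimal sketch, not part of the lecture: the class name `CFG`, its field names, and the `validate` method are hypothetical, and it additionally assumes that V and T are disjoint. The example instance is the grammar S → ε | 0 | 1 | 0S0 | 1S1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    """Hypothetical container for a grammar G = (V, T, P, S)."""
    variables: frozenset    # V
    terminals: frozenset    # T
    productions: frozenset  # P: (head, body) pairs, body is a tuple of symbols
    start: str              # S

    def validate(self):
        """Return True iff the tuple satisfies the conditions of the definition."""
        if self.variables & self.terminals:
            return False  # assumption of this sketch: V and T are disjoint
        if self.start not in self.variables:
            return False  # the start symbol must be a variable
        symbols = self.variables | self.terminals
        # P must be a subset of V x (V ∪ T)*
        return all(
            head in self.variables and all(s in symbols for s in body)
            for head, body in self.productions
        )


# Example grammar: S -> ε | 0 | 1 | 0S0 | 1S1 (the empty tuple stands for ε)
palindromes = CFG(
    variables=frozenset({"S"}),
    terminals=frozenset({"0", "1"}),
    productions=frozenset({
        ("S", ()), ("S", ("0",)), ("S", ("1",)),
        ("S", ("0", "S", "0")), ("S", ("1", "S", "1")),
    }),
    start="S",
)
```

Calling `palindromes.validate()` checks each condition in turn; a rule whose head is a terminal, for instance, is rejected.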

  5. Derivations: How do Grammars Generate Languages?
     - A string w ∈ T* is in the language L(G) generated by G = (V, T, P, S) iff we can derive w from S, i.e., start from S and use production rule(s) repeatedly to replace heads of the rules by their bodies until a string in T* is obtained.
     Example: Let G = ({S}, {0, 1}, P, S) be a CFG with P given by any of the following equivalent notations:
       (1) P = {(S, ε), (S, 0), (S, 1), (S, 0S0), (S, 1S1)}
       (2) S → ε, S → 0, S → 1, S → 0S0, S → 1S1
       (3) S → ε | 0 | 1 | 0S0 | 1S1
     [Figure: tree of strings derivable from S (Start). Its nodes include the sentential forms 0S0, 1S1, 00S00, 01S10, 10S01, 11S11 and terminal strings such as 00, 11, 000, 010, 101, 111, 0110, 1001, 00100, 01010, 10101, 11011.]
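
The "replace heads by bodies until a string in T* is obtained" process can be sketched as a breadth-first search. This is an illustration only: the function name `generate` and the encoding of rules as a dictionary of single-character strings are assumptions of the sketch. The pruning (never keep a form with more than `max_len` terminals) is enough for the example grammar to terminate; grammars such as S → SS would need a stronger bound.

```python
from collections import deque

# S -> ε | 0 | 1 | 0S0 | 1S1, with "" standing for ε
RULES = {"S": ["", "0", "1", "0S0", "1S1"]}


def generate(start, rules, max_len):
    """Return all terminal strings of length <= max_len derivable from start."""
    seen = {start}
    queue = deque([start])
    language = set()
    while queue:
        form = queue.popleft()
        # Position of the leftmost variable, if there is one.
        pos = next((i for i, c in enumerate(form) if c in rules), None)
        if pos is None:
            language.add(form)  # no variables left: form is a string in T*
            continue
        for body in rules[form[pos]]:
            new = form[:pos] + body + form[pos + 1:]
            # Terminals are never removed by later steps, so forms with
            # too many terminals can be discarded.
            if sum(c not in rules for c in new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return language


print(sorted(generate("S", RULES, 3), key=lambda s: (len(s), s)))
# ['', '0', '1', '00', '11', '000', '010', '101', '111']
```

Every string produced is a palindrome over {0, 1}, as expected for this grammar.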

  6. Derivations: Formal Definition
     Definition: Given G = (V, T, P, S) and α, β ∈ (V ∪ T)*, a derivation of β from α is a finite sequence of strings γ_1 ⇒_G γ_2 ⇒_G ··· ⇒_G γ_k for some k ∈ N, where
       1. γ_1 = α and γ_k = β;
       2. γ_1, ..., γ_k ∈ (V ∪ T)*;
       3. for each i = 1, ..., k − 1, either γ_i = γ_{i+1}, or γ_{i+1} is obtained from γ_i by replacing the head of a production rule of P by its body.
     The following phrases are used interchangeably: β is derived from α ⇔ there exists a derivation of β from α ⇔ α ⇒*_G β.
     Example: For the grammar G = ({S}, {0, 1}, P, S) with P given by S → ε | 0 | 1 | 0S0 | 1S1, the following is a derivation of 010111010 from S (the rule used at each step is shown in brackets):
       S ⇒_G 0S0 [S → 0S0] ⇒_G 01S10 [S → 1S1] ⇒_G 010S010 [S → 0S0] ⇒_G 0101S1010 [S → 1S1] ⇒_G 010111010 [S → 1].
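
Condition 3 of the definition can be checked mechanically. The sketch below is illustrative: the helper names `one_step` and `is_derivation`, and the encoding of symbols as single characters, are assumptions rather than anything from the slides.

```python
# S -> ε | 0 | 1 | 0S0 | 1S1, with "" standing for ε
RULES = {"S": ["", "0", "1", "0S0", "1S1"]}


def one_step(src, dst, rules):
    """True iff dst results from src by replacing one occurrence of a
    rule head (a variable) with one of its bodies."""
    for i, sym in enumerate(src):
        for body in rules.get(sym, []):
            if src[:i] + body + src[i + 1:] == dst:
                return True
    return False


def is_derivation(forms, rules):
    """Check condition 3: consecutive forms are equal (trivial step)
    or related by a single production step."""
    return all(a == b or one_step(a, b, rules)
               for a, b in zip(forms, forms[1:]))


example = ["S", "0S0", "01S10", "010S010", "0101S1010", "010111010"]
print(is_derivation(example, RULES))  # True
```

A sequence such as `["S", "0S1"]` is rejected, because no single rule turns S into 0S1.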

  7. Derivations: Sentential Forms and Language Generated by a Grammar: Definitions
     Definition: Given G = (V, T, P, S), any string in (V ∪ T)* derived from S is a sentential form.
     - The set of all sentential forms of G (denoted by SF(G)) is defined inductively:
       - Basis: S ∈ SF(G).
       - Induction: if αAγ ∈ SF(G) for some α, γ ∈ (V ∪ T)* and A ∈ V, and A → β is a production rule, then αβγ ∈ SF(G).
       - Only those strings that are generated by the above induction are sentential forms.
     Definition: Given a CFG G = (V, T, P, S), the language L(G) generated by G is the set of sentential forms that are in T*, i.e., L(G) = SF(G) ∩ T*.
     Example: For the CFG G = ({S}, {0, 1}, P, S) with P given by S → ε | 0 | 1 | 0S0 | 1S1,
       (1) S, ε, 0, 1, 0S0, 00, 000, 010, 1S1, 11, 101, 111, ... are all sentential forms.
       (2) Of these, ε, 0, 1, 00, 000, 010, 11, 101, 111, ... are in L(G); S, 0S0 and 1S1 are not, since they contain the variable S.
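
The inductive definition of SF(G) reads like a closure computation, which the following sketch mirrors. The function name `sentential_forms` and the cut-off `max_terminals` are assumptions introduced here to keep the (infinite) set finite; the cut-off is adequate for the example grammar but not for arbitrary grammars.

```python
# S -> ε | 0 | 1 | 0S0 | 1S1, with "" standing for ε
RULES = {"S": ["", "0", "1", "0S0", "1S1"]}
TERMINALS = {"0", "1"}


def sentential_forms(rules, start, max_terminals):
    """Sentential forms containing at most max_terminals terminal symbols."""
    forms = {start}          # Basis: S is in SF(G)
    frontier = [start]
    while frontier:
        form = frontier.pop()
        for i, sym in enumerate(form):
            # Induction: form = αAγ with A -> β a rule gives αβγ
            for body in rules.get(sym, []):
                new = form[:i] + body + form[i + 1:]
                n_terminals = sum(c not in rules for c in new)
                if n_terminals <= max_terminals and new not in forms:
                    forms.add(new)
                    frontier.append(new)
    return forms


sf = sentential_forms(RULES, "S", 2)
# L(G) = SF(G) ∩ T*, restricted here to the bounded set computed above
language = {f for f in sf if all(c in TERMINALS for c in f)}
print(sorted(sf, key=lambda s: (len(s), s)))
print(sorted(language, key=lambda s: (len(s), s)))
```

With the bound 2, the forms S, 0S0 and 1S1 appear in `sf` but not in `language`, matching the distinction drawn in the example.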

  8. Derivations: Other Sentential Forms
     - At each step of a derivation, one can replace any variable by a suitable production.
     - If at each non-trivial step of the derivation the leftmost (or rightmost) variable is replaced by a production rule, then the derivation is said to be a leftmost (or rightmost) derivation, respectively. We write α ⇒*_LM β (or α ⇒*_RM β) to denote the existence of a leftmost (or rightmost) derivation of β from α, respectively.
     - Sentential forms derived via leftmost (or rightmost) derivations are known as leftmost (or rightmost) sentential forms, respectively.
     Balanced Parentheses Example: Consider the CFG G = ({S}, {(, )}, P, S) with P given by S → SS | (S) | ().
       [Derivation]           S↑ ⇒_G S↑S ⇒_G (S)S↑ ⇒_G (S↑)() ⇒_G (())()
       [Leftmost Derivation]  S↑ ⇒_G S↑S ⇒_G (S↑)S ⇒_G (())S↑ ⇒_G (())()
       [Rightmost Derivation] S↑ ⇒_G SS↑ ⇒_G S↑() ⇒_G (S↑)() ⇒_G (())()
     In the above, ↑ marks the variable that is replaced in the following step.
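
The leftmost and rightmost derivations of (())() can be replayed step by step. This is a sketch under assumptions: the function `derive` and its `leftmost` flag are hypothetical names, and symbols are encoded as single characters.

```python
# S -> SS | (S) | ()
RULES = {"S": ["SS", "(S)", "()"]}


def derive(start, bodies, rules, leftmost=True):
    """Apply the given rule bodies in order, always rewriting the leftmost
    (or rightmost) variable; return the list of sentential forms."""
    forms = [start]
    for body in bodies:
        form = forms[-1]
        positions = [i for i, c in enumerate(form) if c in rules]
        if not positions:
            raise ValueError("no variable left to rewrite")
        i = positions[0] if leftmost else positions[-1]
        if body not in rules[form[i]]:
            raise ValueError(f"{form[i]} -> {body} is not a production")
        forms.append(form[:i] + body + form[i + 1:])
    return forms


print(derive("S", ["SS", "(S)", "()", "()"], RULES, leftmost=True))
# ['S', 'SS', '(S)S', '(())S', '(())()']
print(derive("S", ["SS", "()", "(S)", "()"], RULES, leftmost=False))
# ['S', 'SS', 'S()', '(S)()', '(())()']
```

Both runs end in the same string but pass through different intermediate sentential forms, which is exactly the difference between leftmost and rightmost sentential forms.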

  9. Parse Trees
     - Parse trees are a graphical method of representing derivations.
     - They are used in compilers to represent the source program.
     Definition: Given a CFG G = (V, T, P, S), a parse tree for G is any directed labelled tree that meets the following three conditions:
       - every interior node is labelled by a non-terminal (i.e., variable);
       - every leaf node is labelled by a non-terminal, a terminal, or ε; however, if it is labelled by ε, it is the sole child of its parent;
       - if an interior node is labelled by A ∈ V, and its children are labelled s_1, ..., s_k ∈ V ∪ T ∪ {ε}, then A → s_1 ··· s_k is a production rule in P.
     - The yield of a parse tree is the string formed from the labels of the tree leaves read from left to right. Note: The yield is not necessarily a string of terminals.
     [Figure: parse tree for G = ({S}, {(, )}, P, S) with P: S → SS | (S) | (). The root S has two children labelled S; the left child expands to ( S ) with the inner S expanding to ( ), and the right child expands to ( ). Yield = (())().]
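
The three conditions and the notion of yield can be expressed over a small tree type. The following is a sketch only: the class `Node` and the functions `tree_yield` and `is_parse_tree` are hypothetical names, symbols are single characters, and the empty string stands for ε.

```python
from dataclasses import dataclass, field

# S -> SS | (S) | ()
RULES = {"S": ["SS", "(S)", "()"]}


@dataclass
class Node:
    label: str                                   # variable, terminal, or "" for ε
    children: list = field(default_factory=list)


def tree_yield(node):
    """Concatenate the leaf labels from left to right."""
    if not node.children:
        return node.label
    return "".join(tree_yield(c) for c in node.children)


def is_parse_tree(node, rules):
    """Check the three conditions of the definition at and below node."""
    if not node.children:
        return True                              # leaf: checked by its parent
    if node.label not in rules:
        return False                             # interior nodes must be variables
    labels = [c.label for c in node.children]
    if any(len(lab) > 1 for lab in labels):
        return False                             # each child carries one symbol or ε
    if "" in labels and len(labels) > 1:
        return False                             # ε must be the sole child
    return ("".join(labels) in rules[node.label]
            and all(is_parse_tree(c, rules) for c in node.children))


tree = Node("S", [
    Node("S", [Node("("), Node("S", [Node("("), Node(")")]), Node(")")]),
    Node("S", [Node("("), Node(")")]),
])
print(is_parse_tree(tree, RULES), tree_yield(tree))  # True (())()
```

A partial tree such as `Node("S", [Node("S"), Node("S")])` is also a valid parse tree; its yield SS illustrates that the yield need not be a string of terminals.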

  10. An Equivalence between Parse Trees and Derivations: Derivations and Parse Trees
      - Parse trees, derivations, leftmost derivations, and rightmost derivations are equivalent means of generating the language L(G) of a CFG G.
      - The proof for equivalence of rightmost derivations mirrors that of leftmost derivations. (So we'll not delve into rightmost derivations.)
      Theorem 5.5.1: Let CFG G = (V, T, P, S) be given. Let A ∈ V and w ∈ T*. Then,
        A ⇒*_G w ⇔ A ⇒*_LM w ⇔ there exists a parse tree with root A and yield w ⇔ A ⇒*_RM w.
      Proof Idea: We'll show the following cycle of implications.
        (a) A ⇒*_G w implies the existence of a parse tree with root A and yield w.
        (b) The existence of a parse tree with root A and yield w implies A ⇒*_LM w.
        By definition, A ⇒*_LM w implies A ⇒*_G w.
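
One direction of the equivalence, from a parse tree to a leftmost derivation, can be illustrated by expanding interior nodes from left to right. This is a sketch and not the lecture's proof: the function `leftmost_derivation` and the encoding of trees as `(label, children)` tuples are assumptions, and the tree's yield is assumed to be a string of terminals, as in the theorem.

```python
def leftmost_derivation(tree):
    """Return the sentential forms of the leftmost derivation read off a
    parse tree given as nested (label, children) tuples."""
    # frontier: subtrees whose labels, left to right, spell the current form
    frontier = [tree]
    forms = ["".join(t[0] for t in frontier)]
    while True:
        # Leftmost subtree that still has children, i.e. the leftmost
        # variable (all leaves are terminals or ε by assumption).
        i = next((k for k, t in enumerate(frontier) if t[1]), None)
        if i is None:
            return forms
        frontier[i:i + 1] = frontier[i][1]   # replace the head by its body
        forms.append("".join(t[0] for t in frontier))


def leaf(symbol):
    return (symbol, [])


# Parse tree with root S and yield (())() for S -> SS | (S) | ()
tree = ("S", [
    ("S", [leaf("("), ("S", [leaf("("), leaf(")")]), leaf(")")]),
    ("S", [leaf("("), leaf(")")]),
])
print(leftmost_derivation(tree))
# ['S', 'SS', '(S)S', '(())S', '(())()']
```

The output coincides with the leftmost derivation of (())() for the balanced-parentheses grammar.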

  11. An Equivalence between Parse Trees and Derivations: Part (a) of Proof of Theorem 5.5.1: A ⇒*_G w ⇒ ∃ Parse Tree
      - We prove the following generalization of Part (a) by induction on the length of the derivation.
      Lemma 5.5.2: Let CFG G = (V, T, P, S) be given. Let A ∈ V and α ∈ SF(G) with α ≠ A. Then,
        A ⇒*_G α ⇒ there exists a parse tree with root A and yield α.
      Proof of Lemma 5.5.2 (Induction on the length of derivation)
      - Since α ≠ A, the length of the derivation is at least 1.
      - Basis: Let A ⇒_G α be a one-step derivation. Since α ≠ A, this derivation has to be the production rule A → α.
      - Hence, the parse tree is trivially the one in the figure.
      [Figure (Basis): a tree with root A and children s_1, s_2, ..., s_ℓ, where α = s_1 ··· s_ℓ and (A, α) ≡ (A → α) ∈ P.]

  12. An Equivalence between Parse Trees and Derivations: Part (a) of Proof of Theorem 5.5.1: A ⇒*_G w ⇒ ∃ Parse Tree
      Proof of Lemma 5.5.2 (Induction on the length of derivation)
      - Induction: Suppose that the claim is true for all derivations of length k − 1 or fewer, for some k ≥ 2.
      - Suppose a derivation of α from A in k steps exists: A = γ_1 ⇒_G γ_2 ⇒_G γ_3 ⇒_G ··· ⇒_G γ_{k−1} ⇒_G γ_k = α.
      - We may assume γ_{k−1} ≠ A. So by the induction hypothesis, there exists a parse tree with root A and yield γ_{k−1}. [If γ_{k−1} = A, the derivation contains one step, and the basis case applies.]
      - We may assume that γ_{k−1} ≠ γ_k, or else the parse tree corresponding to the derivation of γ_{k−1} from A is also a parse tree with yield α and root label A.
      - Thus, the last step involves the application of a production rule. Hence, γ_{k−1} = βBω and α = βλω, where (a) β, ω ∈ (V ∪ T)*, (b) B ∈ V, and (c) B → λ is a production rule.
      [Figure: the parse tree for A ⇒*_G γ_{k−1} = βBω, extended at the leaf labelled B by the production B → λ, gives the parse tree for A ⇒*_G βλω = α.]
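
The induction step suggests a construction: start from a single node labelled A and, for each non-trivial step, hang the body λ of the applied rule below the leaf labelled B. The sketch below illustrates that idea under assumptions that are not in the slides: the function `tree_from_derivation` is a hypothetical name, each step is given explicitly as an `(index, body)` pair naming the position in the current sentential form that is rewritten, and nodes are `[label, children]` lists.

```python
# S -> SS | (S) | ()
RULES = {"S": ["SS", "(S)", "()"]}


def tree_from_derivation(start, steps, rules):
    """Build a parse tree from a derivation given as (index, body) steps.
    Returns (root, yield); nodes are [label, children] lists."""
    root = [start, []]
    frontier = [root]                  # leaves spelling the current form
    for index, body in steps:
        leaf = frontier[index]         # the leaf labelled B in βBω
        if body not in rules.get(leaf[0], []):
            raise ValueError(f"{leaf[0]} -> {body} is not a production")
        # Children for B -> λ; an empty body gets a single ε child.
        children = [[s, []] for s in body] or [["", []]]
        leaf[1].extend(children)
        # The new form is βλω; ε leaves contribute no symbol to it.
        frontier[index:index + 1] = [c for c in children if c[0]]
    return root, "".join(node[0] for node in frontier)


# S => SS => (S)S => (S)() => (())()
steps = [(0, "SS"), (0, "(S)"), (3, "()"), (1, "()")]
root, tree_yield = tree_from_derivation("S", steps, RULES)
print(tree_yield)  # (())()
```

Each loop iteration corresponds to one application of the induction step: the tree built so far has yield γ_{k−1}, and attaching the children of B produces a tree with yield γ_k.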
