Lecture Slides for MAT-73006 Theoretical Computer Science, PART Ib: Automata and Languages. Context-Free Languages



SLIDE 1

Lecture Slides for MAT-73006 Theoretical Computer Science
PART Ib: Automata and Languages. Context-Free Languages

Henri Hansen, January 26, 2015

SLIDE 2

Context-free languages

  • There are several very simple languages that are not regular, such as {0^n 1^n | n ≥ 0}.
  • They are "simple" to describe mathematically, but computationally the situation is different.
  • An important class of such languages is the class of context-free languages.
  • We shall explore a way of describing these languages, called context-free grammars.

SLIDE 3

  • An important area of application for these grammars is found in programming languages.

SLIDE 4

Context-free grammar

  • Let us start with an example of a grammar:

    A → 0A1
    A → B
    B → #

  • These three rules are substitution rules. The left-hand side of each rule contains a variable, and the right-hand side contains a string consisting of variables and terminal symbols.

SLIDE 5

  • Terminal symbols are symbols of the language that is being defined, i.e., Σ is the set of terminal symbols.
  • A grammar describes a language by generating the strings in the language. This happens by the following procedure:
  • 1. Write down the start variable. Unless otherwise stated, it is the left-hand side of the topmost rule.
  • 2. Find a variable that has been written down, and a rule that has this variable as its left-hand side. Replace the written-down variable with the right-hand side of the rule.
  • 3. Repeat step 2 until no variables remain.
SLIDE 6

  • For example, the example grammar can generate the string 000#111.
  • The sequence of substitutions that results in the string is called a derivation.
  • A derivation can also have a graphic representation as a parse tree.
  • The set of strings that can be generated by a given grammar is called the language of the grammar.
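The generation procedure above can be sketched in a few lines of Python (a hypothetical encoding, not from the slides: rules live in a dictionary, and a list of rule choices drives the leftmost substitutions):

```python
RULES = {"A": ["0A1", "B"], "B": ["#"]}   # the example grammar

def derive(choices, start="A"):
    """Run the written-down procedure: repeatedly replace the leftmost
    variable, using the rule alternative picked by `choices`, until no
    variables remain.  Returns every intermediate string."""
    s = start
    steps = [s]
    for c in choices:
        i = next(k for k, ch in enumerate(s) if ch in RULES)   # leftmost variable
        s = s[:i] + RULES[s[i]][c] + s[i + 1:]
        steps.append(s)
    return steps

# A => 0A1 => 00A11 => 000A111 => 000B111 => 000#111
print(derive([0, 0, 0, 1, 0]))
```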

SLIDE 7

A more complicated example:

    SENTENCE → NOUN-PHRASE VERB-PHRASE
    NOUN-PHRASE → CMPLX-NOUN | CMPLX-NOUN PREP-PHRASE
    VERB-PHRASE → CMPLX-VERB | CMPLX-VERB PREP-PHRASE
    PREP-PHRASE → PREP CMPLX-NOUN
    CMPLX-NOUN → ARTICLE NOUN
    CMPLX-VERB → VERB | VERB NOUN-PHRASE
    ARTICLE → a | the
    NOUN → boy | girl | flower
    VERB → likes | sees | touches
    PREP → with

SLIDE 8

Formal definition of CFG

  • A context-free grammar is a 4-tuple (V, Σ, R, S), where
  • 1. V is a finite set called the variables
  • 2. Σ is a finite set, disjoint from V, called the terminals (AKA the alphabet)
  • 3. R is a finite set of rules, a rule being a pair (v, σ) where v is a variable and σ is a string of variables and terminals; also written as v → σ
  • 4. S ∈ V is the starting variable

SLIDE 9

  • If u, v and w are strings of variables and terminals, and A → w is a rule of the grammar, then uAv yields the string uwv, written uAv ⇒ uwv.
  • We say that u derives v, written u ⇒* v, if u = v or if there is some sequence u ⇒ u1 ⇒ u2 ⇒ · · · ⇒ uk ⇒ v.
  • The language of the grammar is the set {w ∈ Σ* | S ⇒* w}.
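The definition S ⇒* w suggests a direct, if naive, membership test: breadth-first search over sentential forms. A Python sketch, assuming a grammar with no ε-rules so that forms longer than the target can be pruned:

```python
from collections import deque

def generates(rules, start, target):
    """Decide S =>* target by breadth-first search over sentential forms.
    Assumes no ε-rules, so any form longer than target can be discarded."""
    seen, queue = {start}, deque([start])
    while queue:
        s = queue.popleft()
        if s == target:
            return True
        for i, ch in enumerate(s):
            if ch not in rules:
                continue
            for rhs in rules[ch]:
                t = s[:i] + rhs + s[i + 1:]
                if len(t) <= len(target) and t not in seen:
                    seen.add(t)
                    queue.append(t)
    return False

RULES = {"A": ["0A1", "B"], "B": ["#"]}
print(generates(RULES, "A", "00#11"))   # True
print(generates(RULES, "A", "0#11"))    # False
```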

SLIDE 10

Examples of CFGs

  • Often we write a CFG by simply giving the rules; the variables are the symbols that appear on left-hand sides and the others are terminals.
  • S → aSb | SS | ε (think of a as "(" and b as ")")
  • E → E + T | T
    T → T × F | F
    F → (E) | n

SLIDE 11

Here the alphabet is {n, +, ×, (, )}.

  • A compiler of a programming language translates code into another form; CFGs are used, for instance, in describing programming language syntax.
  • The process by which the meaning of a string is found by relating it to a grammar is known as parsing.
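As a small illustration of parsing, here is a sketch of a recursive-descent recognizer for the expression grammar above. Since recursive descent cannot handle left-recursive rules directly, each left recursion is unrolled into iteration (E → T ('+' T)*), and the ASCII '*' stands in for ×; the function names are my own:

```python
def parse_expr(tokens):
    """Recognizer for E -> E+T | T, T -> T*F | F, F -> (E) | n, with the
    left recursion unrolled into iteration."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r} at position {pos}")
        pos += 1

    def expr():            # E -> T ('+' T)*
        term()
        while peek() == "+":
            eat("+"); term()

    def term():            # T -> F ('*' F)*
        factor()
        while peek() == "*":
            eat("*"); factor()

    def factor():          # F -> (E) | n
        if peek() == "(":
            eat("("); expr(); eat(")")
        else:
            eat("n")

    expr()
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return True

print(parse_expr(list("n+n*(n+n)")))   # True
```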

SLIDE 12

Ambiguity

  • Consider the grammar E → E + E | E × E | (E) | a. There are several derivations for strings such as a + a × a.
  • Definition: A grammar is ambiguous if there are two or more leftmost derivations (equivalently, parse trees) for some string of its language.
  • Ambiguity makes (unique) parsing impossible, so obviously one should strive to describe languages unambiguously whenever possible.
  • Some languages are inherently ambiguous, i.e., all grammars that generate them are ambiguous.
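For a fixed string, the ambiguity of the grammar E → E + E | E × E | (E) | a can be checked mechanically by enumerating leftmost derivations (each parse tree corresponds to exactly one leftmost derivation). A small Python sketch, with '*' standing in for ×:

```python
RULES = ["E+E", "E*E", "(E)", "a"]   # alternatives for the single variable E

def leftmost_derivations(s, target, steps, out):
    """Collect every leftmost derivation of `target` from sentential form s."""
    i = s.find("E")
    if i == -1:                        # no variables left
        if s == target:
            out.append(steps)
        return
    # prune: forms never shrink, and the terminal prefix is already fixed
    if len(s) > len(target) or not target.startswith(s[:i]):
        return
    for rhs in RULES:
        t = s[:i] + rhs + s[i + 1:]
        leftmost_derivations(t, target, steps + [t], out)

out = []
leftmost_derivations("E", "a+a*a", [], out)
print(len(out))   # 2: a+a*a has two leftmost derivations, i.e. two parse trees
```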

SLIDE 13

Pushdown automata

  • Regular languages were defined as the languages that are recognized by some finite automaton.
  • Context-free languages can similarly be recognized by a certain kind of automaton; due to the recursive nature of context-free languages, some form of memory is needed.
  • Informally, pushdown automata are like nondeterministic finite automata, but instead of simply moving from one state to another, they use a stack to store information about what the automaton has done in the past, and this information affects what the automaton does next.

SLIDE 14

  • When a pushdown automaton is in a given state, it responds to the symbol that is read from the input, and to the symbol that is on top of the stack.
  • Let us write Σε for the set Σ ∪ {ε} (and similarly Γε for Γ ∪ {ε}).
  • Formally: A pushdown automaton is a 6-tuple (Q, Σ, Γ, δ, q0, F), where
  • 1. Q is the (finite) set of states
  • 2. Σ is the input alphabet
  • 3. Γ is the stack alphabet
SLIDE 15

  • 4. δ : Q × Σε × Γε → 2^(Q × Γε) is the nondeterministic transition function
  • 5. q0 ∈ Q is the start state
  • 6. F ⊆ Q is the set of accept states
  • A pushdown automaton (PDA) M = (Q, Σ, Γ, δ, q0, F) accepts an input a1 · · · an (where ai ∈ Σε) if and only if there is some sequence of states q0 q1 · · · qn and a sequence of strings g0, g1, . . . , gn ∈ Γ* such that the following conditions are met:
  • 1. g0 = ε, i.e., the automaton starts with an empty stack
SLIDE 16

  • 2. for 0 ≤ i ≤ n − 1 we have (qi+1, x) ∈ δ(qi, ai+1, y), gi = yt and gi+1 = xt for some t ∈ Γ*; i.e., the content of the stack is the same after the move, except possibly the topmost element
  • 3. qn ∈ F
  • To understand the transition function: if (qi+1, x) ∈ δ(qi, ai+1, y), then this transition can be executed if y is on top of the stack, the automaton is in state qi, and the next input symbol read is ai+1. After it is executed, y is removed from the stack, x is put on top, and the automaton has moved to state qi+1.
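The acceptance condition can be turned into a small nondeterministic search over configurations (state, input position, stack). This is an illustrative sketch, not the textbook construction; the stack-depth bound is an ad-hoc cutoff that keeps ε-pushing loops finite, and all names are my own:

```python
def pda_accepts(delta, start, finals, word, max_stack=None):
    """Search over PDA configurations (state, input position, stack).
    `delta` maps (q, a, y) -> set of (q2, x), where a ∈ Σ∪{ε} and
    y, x ∈ Γ∪{ε}, with ε written as "".  The stack is a string whose
    top is at index 0."""
    if max_stack is None:
        max_stack = len(word) + 2
    seen = set()
    frontier = [(start, 0, "")]
    while frontier:
        q, i, st = frontier.pop()
        if (q, i, st) in seen:
            continue
        seen.add((q, i, st))
        if i == len(word) and q in finals:
            return True                  # input consumed and qn ∈ F
        reads = [""] + ([word[i]] if i < len(word) else [])
        tops = [""] + ([st[0]] if st else [])
        for a in reads:                  # ε-move or read one input symbol
            for y in tops:               # ignore the stack or pop its top
                for q2, x in delta.get((q, a, y), ()):
                    new = x + (st[1:] if y else st)
                    if len(new) <= max_stack:
                        frontier.append((q2, i + len(a), new))
    return False

# PDA for {0^n 1^n | n >= 0}: push $, push one "0" per 0, pop one per 1.
DELTA = {
    ("q0", "", ""):   {("q1", "$")},
    ("q1", "0", ""):  {("q1", "0")},
    ("q1", "", "$"):  {("q3", "")},   # empty-word case
    ("q1", "1", "0"): {("q2", "")},
    ("q2", "1", "0"): {("q2", "")},
    ("q2", "", "$"):  {("q3", "")},
}
print(pda_accepts(DELTA, "q0", {"q3"}, "0011"))   # True
print(pda_accepts(DELTA, "q0", {"q3"}, "0010"))   # False
```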

SLIDE 17

Example

  • Consider the language {a^i b^j c^k | i = j or i = k}, i.e., either the number of b's or the number of c's is the same as the number of a's.
  • Informally, it is relatively easy to construct a PDA that accepts the language: first read all a's, pushing a counter onto the stack for each one. Then nondeterministically choose to count either the b's or the c's and match their number against the a's.

SLIDE 18

[State diagram of the PDA for {a^i b^j c^k | i = j or i = k}: states q0–q6, with transitions of the forms ε, ε → $; a, ε → a; b, a → ε; b, ε → ε; c, a → ε; c, ε → ε; and ε, $ → ε on the two accepting branches, matching the a's against either the b's or the c's.]
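The PDA's nondeterministic "guess" (match the a's against either the b's or the c's) can be mirrored by a direct membership check; a hypothetical helper, not part of the slides:

```python
def in_language(w):
    """Membership in {a^i b^j c^k | i = j or i = k}: count the three
    blocks, reject anything not of the shape a^i b^j c^k."""
    i = len(w) - len(w.lstrip("a"))
    rest = w[i:]
    j = len(rest) - len(rest.lstrip("b"))
    tail = rest[j:]
    k = len(tail) - len(tail.lstrip("c"))
    if tail[k:]:                  # leftover symbols -> not of shape a^i b^j c^k
        return False
    return i == j or i == k

print(in_language("aabbc"))    # True  (i = j = 2)
print(in_language("aabcc"))    # True  (i = k = 2)
print(in_language("aabbbc"))   # False
```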

SLIDE 19

Equivalence

  • Pushdown automata and context-free grammars are equivalent in the same way as regular expressions and finite automata are:
  • Theorem: A language is context-free if and only if there is a pushdown automaton that recognizes it.
  • First we explain how to prove one direction. Let A be a context-free language. By definition it then has a CFG, say G, that generates it.

SLIDE 20

  • The idea of the proof is as follows: we construct a nondeterministic PDA that, when reading an input, "guesses" what substitutions are needed for the given string.
  • 1. Initially, the PDA puts the start variable on the stack.
  • 2. After this, the automaton always looks at the top symbol of the stack. If it is a variable, then it nondeterministically chooses a rule to apply, removes the variable and pushes the rule's right-hand side in its place (in reverse order, so that its first symbol ends up on top).
  • 3. If the top symbol is a terminal, then it compares it to the next input symbol. If the symbols differ, this branch rejects; otherwise the top symbol is simply removed.
SLIDE 21

  • 4. If the stack is empty when the input ends, the automaton accepts.
  • Please verify that the automaton accepts exactly the strings that are generated by the grammar!
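Steps 1–4 can be simulated directly: the stack is a string with its top at index 0, variables are expanded nondeterministically, and terminals are matched against the input. A hedged sketch with an ad-hoc stack bound (it can fail for grammars whose derivations need deeper sentential forms):

```python
def cfg_pda_accepts(rules, start, word):
    """Simulate the PDA built from a CFG: the stack starts holding the
    start variable; a variable on top is replaced nondeterministically
    by some rule's right-hand side, and a terminal on top must match
    the next input symbol."""
    limit = len(word) + 2
    seen = set()

    def run(i, stack):
        if (i, stack) in seen or len(stack) > limit:
            return False
        seen.add((i, stack))
        if not stack:
            return i == len(word)        # step 4: empty stack at end of input
        top, rest = stack[0], stack[1:]
        if top in rules:                 # step 2: expand a variable
            return any(run(i, rhs + rest) for rhs in rules[top])
        # step 3: compare a terminal against the next input symbol
        return i < len(word) and word[i] == top and run(i + 1, rest)

    return run(0, start)

RULES = {"A": ["0A1", "B"], "B": ["#"]}
print(cfg_pda_accepts(RULES, "A", "00#11"))   # True
print(cfg_pda_accepts(RULES, "A", "00#1"))    # False
```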

  • The other direction is proven by generating a context-free grammar from the transition relation of a PDA.
  • Given a PDA P, three modifications are made:
  • 1. It will contain only one accepting state, qa. This is not a problem, because nondeterminism is allowed.

SLIDE 22

  • 2. The automaton only accepts after it has emptied the stack. This is not a restriction either.
  • 3. Every transition either pushes a symbol (but does not remove one) or removes a symbol (but does not push one). Again, this is not a restriction, because such transitions can be obtained by "splitting" a transition into two.
  • The PDA is then used as a recipe for creating a grammar that generates exactly the language accepted by the PDA; let p be the first state and q be the last state (the unique accept state).
  • When P is computing on a string, say x, conditions 2 and 3 require that the first operation adds and the last operation

SLIDE 23

removes a symbol from the stack. If the symbols are different, then the stack must have been empty at some point in between (why?)

  • If the symbols are the same, we create the rule Apq → aArsb, where a is the input read at the first move and b at the last move.
  • If the symbols are not the same, then there is some state r in which the stack is empty; we create a rule Apq → AprArq, and so on.
  • To formalize the proof, let (Q, Σ, Γ, δ, q0, {qa}) be the PDA (after the modifications).

SLIDE 24

  • 1. For each p, q, r, s ∈ Q, u ∈ Γ and a, b ∈ Σε: if δ(p, a, ε) contains (r, u) and δ(s, b, u) contains (q, ε), put the rule Apq → aArsb in G.
  • 2. For each p, q, r ∈ Q, put the rule Apq → AprArq in G.
  • 3. Finally, for each p ∈ Q, put the rule App → ε in G.
  • Lemma: If Apq generates x, then P has an execution from p (with empty stack) to q (with empty stack) reading x.
  • This can be proven by induction:
  • 1. If the derivation of x happens in one step, then the right-hand side contains no variables, only terminals. The only such rule generated by this construction is App → ε; hence x must be the empty string.

SLIDE 25

  • 2. Assume the claim holds for all derivations with at most k steps. If Apq ⇒* x in k + 1 steps, the first step is either Apq → aArsb or Apq → AprArq. Both cases reduce to derivations of length at most k.
  • Lemma: If P has an execution reading x from p to q (with empty stack at both ends), then Apq generates x.
  • This again is done by induction:
  • 1. If the computation contains 0 steps, the automaton cannot read any symbols, so x is the empty string and the automaton stays in state p; App → ε generates x.

SLIDE 26

  • 2. The inductive step is as before.
SLIDE 27

Non-context-free languages

  • There are languages that are neither regular nor context-free.
  • There is a lemma, similar to the pumping lemma, for context-free languages:
  • If A is a context-free language, then there is a number p such that, if s ∈ A with |s| ≥ p, then s can be divided into 5 parts s = wvxyz such that
  • 1. w v^i x y^i z ∈ A for every i ≥ 0,
  • 2. |vy| > 0, and

SLIDE 28

  • 3. |vxy| ≤ p
  • Proof: Let A be a CFL. Then it has a grammar G that generates it. Let s be a "very long" string of the language.
  • Because s is "very long" (longer than p), its derivation will use (at least) one of the variable symbols more than once on (at least) one branch of the derivation tree (please compare to the pumping lemma!). Let this variable be called R.
  • Let x be the string that is derived from the last occurrence of R, and let the occurrence before the last derive vxy.
SLIDE 29

  • Then we can replace the last occurrence of R with exactly the same subtree as the second-to-last occurrence.
  • Therefore, instead of vxy, we derive vvxyy.
  • This can be done arbitrarily many times over.
SLIDE 30

Examples of non-context-free languages

  • The language {a^n b^n c^n | n ≥ 0} is not context-free.
  • The language {a^i b^j c^k | 0 ≤ i ≤ j ≤ k} is not context-free.
  • The language {ww | w ∈ {0, 1}*} is not context-free.
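The pumping-lemma argument for {a^n b^n c^n | n ≥ 0} can be checked by brute force for a small pumping length: try every split s = wvxyz with |vxy| ≤ p and |vy| > 0, and test a few pumping exponents. A sketch (checking i only up to 3, which happens to suffice to rule every split out here):

```python
def in_L(s):
    """Membership in {a^n b^n c^n | n >= 0}."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n

def pumpable(s, p):
    """Is there a split s = w v x y z with |vxy| <= p and |vy| > 0 such
    that w v^i x y^i z stays in the language for i = 0..3?"""
    n = len(s)
    for a in range(n + 1):                      # vxy occupies s[a:b]
        for b in range(a, min(a + p, n) + 1):
            for c in range(a, b + 1):           # v = s[a:c]
                for d in range(c, b + 1):       # x = s[c:d], y = s[d:b]
                    w, v, x, y, z = s[:a], s[a:c], s[c:d], s[d:b], s[b:]
                    if not v + y:
                        continue
                    if all(in_L(w + v * i + x + y * i + z) for i in range(4)):
                        return True
    return False

p = 3
print(pumpable("a" * p + "b" * p + "c" * p, p))   # False: no split pumps
```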

SLIDE 31

Deterministic CFLs

  • Deterministic and nondeterministic finite automata are equivalent, but the same does not hold for pushdown automata.
  • To formalize the theory, let us begin with the definition of a deterministic PDA, or DPDA.
  • A deterministic pushdown automaton is a 6-tuple (Q, Σ, Γ, δ, q0, F) such that
  • 1. Q is a finite set of states
  • 2. Σ is the (input) alphabet

SLIDE 32

  • 3. Γ is the stack alphabet
  • 4. δ : Q × Σε × Γε → (Q × Γε) ∪ {∅} is the transition function
  • 5. q0 ∈ Q is the start state
  • 6. F ⊆ Q is the set of accept states
  • The transition function is furthermore required to be nonempty for exactly one of the values δ(q, a, x), δ(q, a, ε), δ(q, ε, x), δ(q, ε, ε), for every q ∈ Q, a ∈ Σ, and x ∈ Γ.

SLIDE 33

  • In other words, the automaton either reads an input symbol and moves (the first two cases) or moves without reading (the last two), and either way it behaves in a unique manner.
  • A language accepted by a DPDA is called a deterministic context-free language.

SLIDE 34

Examples

  • The language {0^n 1^n | n ≥ 0} is deterministic: the automaton reads input 0s, pushing a counter token each time until the first 1, after which it removes a counter every time it reads a 1.
  • The language {a^i b^j c^k | i = j ∨ i = k} is not deterministic.
  • The language of palindromes is not deterministic.
  • Proving that a language is deterministic is relatively easy: simply give the deterministic PDA.
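The deterministic PDA for {0^n 1^n | n ≥ 0} is easy to transcribe directly; a sketch where the stack stores one token per unmatched 0:

```python
def dpda_0n1n(w):
    """Deterministic PDA sketch for {0^n 1^n | n >= 0}: push one token
    per 0, pop one per 1; reject out-of-order or unmatched symbols."""
    stack = []
    seen_one = False
    for ch in w:
        if ch == "0":
            if seen_one:
                return False      # a 0 after the first 1
            stack.append("x")
        elif ch == "1":
            seen_one = True
            if not stack:
                return False      # more 1s than 0s
            stack.pop()
        else:
            return False          # symbol outside the alphabet
    return not stack              # accept iff every 0 was matched

print(dpda_0n1n("0011"))   # True
print(dpda_0n1n("0101"))   # False
```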

SLIDE 35

  • Proving that a language is not deterministic is much harder, and for that we need some more theory.

SLIDE 36

Properties of deterministic CFLs

  • Lemma: Every deterministic PDA has an equivalent automaton that always reads the entire input string.
    – There are two ways in which a DPDA might fail to read the whole input: hanging, where the automaton is forced to pop an empty stack, and looping, where the automaton makes an endless loop of ε-reads.
    – Hanging is prevented by putting a special symbol into the stack before the automaton starts; popping this symbol before the input ends results in reading the rest of the input and rejecting.

SLIDE 37

    – Looping is solved by identifying loops structurally: an ε-loop is replaced by reading the entire input and rejecting.
    – The exception is situations where the whole input has already been read: if accept states are visited in such situations, the automaton should accept.
  • Theorem: The class of deterministic CFLs is closed under complementation.
    – Swapping accept and non-accept states works for DFAs.
    – DPDAs need to solve an additional problem: if the automaton enters both accepting and non-accepting states

SLIDE 38

at the end of an input, it accepts even after complementation. This is solved by requiring that only states which read input are allowed to accept.

    – Swapping accept/non-accept states in such a DPDA complements the language accepted.
  • This yields at least one test for nondeterminism: if the complement of a given CFL is not context-free, then the language is not deterministic.
  • Sometimes it is easier to look at a modified language. Let A be a language, and let ⊥ be a symbol not in the alphabet. We call A⊥ = {w⊥ | w ∈ A} the end-marked language.

SLIDE 39

  • Theorem: A is a deterministic CFL if and only if A⊥ is a deterministic CFL.
    – Proof of "only if": accept states of a DPDA are replaced by a transition reading ⊥ and accepting.
    – Proof of "if": let P⊥ accept A⊥. Construct P as follows: if P⊥ would accept after reading ⊥ without looking at the stack, simply accept immediately. For other situations, the stack contains "two stacks" as a memory. When ⊥ would be read (and possibly accepted, depending on the stack), the behaviour of P⊥ is simulated and accepted accordingly, but if P⊥ would reject, then the stack is "reverted".

SLIDE 40

Deterministic CFGs

  • Deterministic PDAs have a counterpart in grammars, called deterministic context-free grammars.
  • Deterministic CFGs and deterministic languages have some attractive properties and restrictions on how strings can be derived.
  • A reduce step is a substitution in reverse: for example, if R → xyz is a rule, then xyz is reduced into R, and xyz is the reducing string. The reverse of a derivation of a string is called a reduction.

SLIDE 41

  • When a rule T → h is used backwards on a string xhy to produce xTy, we write xhy ↪ xTy.
  • A reduction from u is a sequence u = u1 ↪ u2 ↪ · · · ↪ uk = S, with S the start symbol.
  • The reduction is a leftmost reduction if each reducing string is reduced only after all other reducing strings that appear to its left.
  • If the rule T → h is used in a leftmost reduction to produce ui ↪ ui+1, then h (with this rule) is called the handle of ui.

SLIDE 42

  • A string that appears in a leftmost reduction (for instance, ui) is called a valid string.
  • If v = xhy is a valid string and h is its handle, we say that h is a forced handle if h is the unique handle of every valid string of the form xhz, where z ∈ Σ*.
  • A context-free grammar is deterministic iff every valid string has a forced handle.
  • In other words, in deterministic grammars, reduction depends only on the leftmost part of the string.
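A leftmost reduction can be sketched as a greedy loop that repeatedly reduces the leftmost matching right-hand side. (Real parsers select handles; plain leftmost matching happens to suffice for the earlier example grammar A → 0A1 | B, B → #.)

```python
RULES = [("A", "0A1"), ("A", "B"), ("B", "#")]   # (variable, right-hand side)

def leftmost_reduction(s, start="A"):
    """Reverse a derivation: repeatedly reduce the leftmost matching
    right-hand side (preferring the longer one on ties) until only the
    start symbol remains, recording every step."""
    steps = [s]
    while s != start:
        best = None
        for var, rhs in RULES:
            i = s.find(rhs)
            if i >= 0 and (best is None or i < best[0] or
                           (i == best[0] and len(rhs) > len(best[2]))):
                best = (i, var, rhs)
        if best is None:
            raise ValueError("stuck: no reducing string found")
        i, var, rhs = best
        s = s[:i] + var + s[i + len(rhs):]
        steps.append(s)
    return steps

print(leftmost_reduction("000#111"))
# ['000#111', '000B111', '000A111', '00A11', '0A1', 'A']
```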

SLIDE 43

  • This does not immediately give us a way of detecting determinism, but there is one test that we can derive from it.

SLIDE 44

The DK-test

  • For any CFG G we can construct a deterministic finite automaton DK that identifies handles. Specifically, DK accepts z if
  • 1. z is the prefix of some valid string v = zy, and
  • 2. z ends with a handle of v.
  • We first define a nondeterministic automaton, K:
  • 1. Let J be an NFA that accepts any string that ends with the right-hand side of some grammar rule.

SLIDE 45

  • 2. In any accepting run of J, it "follows" the right-hand side of a rule. Let us denote this so-called "rule-state" by B → u′v when the automaton has read u and v has not yet been read. Then the rule-state B → uv′ is accepting.
  • 3. K works like J but with slight modifications:
  • 4. For every rule-state B → u′Cv there is an ε-transition to a rule-state with C as the left-hand side that has not read anything yet.
  • Lemma: K may enter state T → u′v on reading z if and only if z = xu and xuvy is a valid string with handle uv and reducing rule T → uv, for some y ∈ Σ*.

SLIDE 46

  • The proof should be obvious from the construction.
  • Corollary: K may enter accept state T → h′ on input z if and only if z = xh and h is a handle of some valid string xhy with reducing rule T → h.
  • This gives us the DK-test: make K deterministic and check that every accept state contains
  • 1. exactly one completed rule-state, and
  • 2. no rule-state in which a terminal symbol immediately follows the read part, i.e., no rule-state of the form B → u′av, for some a ∈ Σ.
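The rule-states of K and the subset construction behind DK correspond to LR(0) items with their closure and goto operations; a sketch for a small hypothetical grammar (the names and the grammar are my own, not from the slides):

```python
RULES = {"S": ["aSb", "ab"]}   # hypothetical grammar: S -> aSb | ab

def closure(items, rules):
    """Rule-states B -> u′v represented as (variable, rhs, dot position).
    Mirrors K's ε-transitions: whenever the dot stands before a variable
    C, add C's rules with the dot at position 0."""
    items = set(items)
    changed = True
    while changed:
        changed = False
        for var, rhs, dot in list(items):
            if dot < len(rhs) and rhs[dot] in rules:
                for r in rules[rhs[dot]]:
                    if (rhs[dot], r, 0) not in items:
                        items.add((rhs[dot], r, 0))
                        changed = True
    return items

def goto(items, symbol, rules):
    """Advance the dot over `symbol`: one transition of the determinized K."""
    moved = {(v, r, d + 1) for v, r, d in items
             if d < len(r) and r[d] == symbol}
    return closure(moved, rules)

start = closure({("S'", "S", 0)}, RULES)   # augmented start rule, dot at 0
state = goto(start, "a", RULES)
print(len(state))   # 4 rule-states after reading "a"
```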

SLIDE 47

  • Theorem: G passes the DK-test iff G is deterministic.
  • If G is nondeterministic, there is some valid string with a handle that is not forced. If DK is run on such a string, DK must enter an accept state at the end of the handle. Because the handle is not forced, it is not unique, so the accept state contains another accepting rule-state, or some continuation of the current string leads to an accepting state, and the test fails.
  • If the DK-test fails, then there is a valid string with two handles: either the second handle is complete, or there is a continuation of the valid string with a different handle.
SLIDE 48

Practical applications of the theory

  • Deterministic CFLs are very important in practice, because parsing of deterministic CFGs is efficient. That is why the syntax of most programming languages is given as a deterministic CFG.
  • The requirement of forced handles is, however, sometimes too restrictive, because it restricts the use of intuition in designing grammars: it is not always easy to make sure all handles are forced.
  • There is a slightly broader class of grammars, however, that is both practical and intuitive.

SLIDE 49

  • The so-called LR(k) grammars use lookahead. The idea is that you are allowed to have nondeterminism, as long as you can resolve it by looking ahead no more than k symbols of the input before choosing the handle.
  • Formally: if h is the handle of v = xhy, then we say that h is forced by a lookahead of k if h is the unique handle of every valid string xhz where y and z agree on their first k symbols.
  • LR(0) grammars are exactly the deterministic CFGs.
  • LR(k) grammars are those for which the handle of every valid string is forced by a lookahead of k.