INF2080 Context-Free Languages, Daniel Lupp, Universitetet i Oslo

SLIDE 1

INF2080

Context-Free Languages

Daniel Lupp

Universitetet i Oslo

1st February 2016

Department of Informatics University of Oslo

INF2080 Lecture :: 1st February 1 / 23

SLIDE 6

Repetition

We’ve looked at one of the simpler computational models: finite automata.

We defined (non)deterministic finite automata (NFAs/DFAs) and the languages they accept: regular languages.

We defined regular expressions, useful as a shorthand for describing languages.

A language L is regular ↔ there exists a regular expression that describes L.

The pumping lemma is a useful tool for showing that a language is nonregular.

SLIDE 9

Context-Free Grammars

Today: Context-free grammars and languages

Grammars describe the syntax of a language: they describe how the parts relate to one another, such as the placement of nouns and verbs in sentences.

They are useful for programming languages, specifically for compilers and parsers: if the grammar of a programming language is available, a parser can be built from it systematically.

SLIDE 12

Context-Free Grammars

Recall the example from last week: L = {a^n b^n | n ≥ 0}

We used the pumping lemma to show that this language is not regular.

→ first example of a context-free language

SLIDE 16

Context-Free Grammars

First example:

S → aSb
S → ε

Every grammar consists of rules. Each rule is a pair consisting of one variable (to the left of →) and a string of variables and terminal symbols (to the right of →).

Every grammar contains a start variable (above: the variable S). Common convention: the first listed variable is the start variable (if you choose a different start variable, you must specify it!).

Words are generated by starting with the start variable and repeatedly replacing a variable with the right-hand side of one of its rules:

S ⇒ aSb ⇒ aaSbb ⇒ aaεbb = aabb
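The derivation process for the first example grammar can be sketched in a few lines of Python. This is only an illustration: the function name `derive_anbn` is made up here, and sentential forms are modelled as plain strings in which the character `S` is the only variable.

```python
def derive_anbn(n: int) -> list[str]:
    """Return the sentential forms of a derivation of a^n b^n."""
    form = "S"
    steps = [form]
    for _ in range(n):
        form = form.replace("S", "aSb", 1)  # apply the rule S -> aSb
        steps.append(form)
    form = form.replace("S", "", 1)         # apply the rule S -> ε
    steps.append(form)
    return steps


print(" => ".join(derive_anbn(2)))  # S => aSb => aaSbb => aabb
```

The number of times the rule S → aSb is applied determines n, which is why every generated word has as many a's as b's.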

SLIDE 17

Parse Trees

Derivations of the form S ⇒ aSb ⇒ aaSbb ⇒ aaεbb = aabb can also be encoded as a parse tree (children are indented below their parent):

S
  a
  S
    a
    S
      ε
    b
  b
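As a small illustration, a parse tree can be stored as nested tuples, and the derived word is obtained by reading the leaves from left to right. The representation and the names `tree` and `tree_yield` are assumptions made for this sketch, not notation from the lecture.

```python
EPSILON = ""  # the empty word

# (variable, child, child, ...); leaves are plain strings.
tree = ("S", "a", ("S", "a", ("S", EPSILON), "b"), "b")


def tree_yield(node) -> str:
    """Concatenate the leaves of a parse tree from left to right."""
    if isinstance(node, str):
        return node
    _variable, *children = node
    return "".join(tree_yield(child) for child in children)


print(tree_yield(tree))  # aabb
```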

SLIDE 21

Context-Free Grammars

Second example:

S → aSa
S → bSb
S → cSc
S → ε

To simplify notation, you can combine multiple rules with the same left-hand side into one line: S → aSa | bSb | cSc | ε. The symbol | takes on the meaning of “or.”

→ palindromes of even length over {a, b, c}.
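One way to convince yourself that this grammar generates exactly the even-length palindromes is a recursive check that mirrors the rules one to one. The function name `generated` is hypothetical; the sketch only covers this particular grammar.

```python
def generated(word: str) -> bool:
    """True iff word is derivable from S -> aSa | bSb | cSc | ε."""
    if word == "":
        return True                      # rule S -> ε
    if len(word) >= 2 and word[0] == word[-1] and word[0] in "abc":
        return generated(word[1:-1])     # rule S -> xSx for x in {a, b, c}
    return False


print(generated("abccba"))  # True
print(generated("aba"))     # False: odd length
```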

SLIDE 23

Context-Free Grammar

Definition (Context-Free Grammar) A context-free grammar is a 4-tuple (V, Σ, R, S) where

1 V is a finite set of variables,
2 Σ is a finite set of terminals, disjoint from V,
3 R is a finite set of rules, each consisting of a variable and a string of variables and terminals,
4 and S ∈ V is the start variable.

For a context-free grammar G, we call L(G) the language generated by G. A language is called a context-free language if it is generated by some context-free grammar.
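The definition translates almost literally into a data structure. The following is a minimal sketch; the class name `CFG`, its field names, and the choice of tuples for right-hand sides are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    """A context-free grammar (V, Σ, R, S); field names are illustrative."""
    variables: frozenset   # V
    terminals: frozenset   # Σ
    rules: frozenset       # R: pairs (variable, tuple of symbols)
    start: str             # S

    def __post_init__(self):
        if self.variables & self.terminals:
            raise ValueError("V and Σ must be disjoint")
        if self.start not in self.variables:
            raise ValueError("the start variable must be in V")
        symbols = self.variables | self.terminals
        for head, body in self.rules:
            if head not in self.variables:
                raise ValueError(f"left-hand side {head!r} is not a variable")
            if any(symbol not in symbols for symbol in body):
                raise ValueError(f"unknown symbol in rule {head} -> {body}")


# The first example grammar: S -> aSb | ε (the empty tuple stands for ε).
G = CFG(
    variables=frozenset({"S"}),
    terminals=frozenset({"a", "b"}),
    rules=frozenset({("S", ("a", "S", "b")), ("S", ())}),
    start="S",
)
```

The checks in `__post_init__` correspond to the conditions in the definition: disjointness of V and Σ, S ∈ V, and rules built only from known symbols.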

SLIDE 28

Context-Free Grammar

So what can context-free grammars (CFGs) express?

Regular languages?

Is the class of context-free languages closed under union/intersection/concatenation/complement/Kleene star?

Regular languages could be modelled by an automaton with finite memory... what about context-free languages?

Answers to these over the course of this and the next lecture (and group sessions).

SLIDE 35

RLs and CFLs

Can regular languages be described using context-free grammars?

Given a regular language L, there exists some DFA (Q, Σ, δ, q0, F) that accepts L.

What if we encode traversing the DFA into grammar rules? That is, for each transition δ(q1, a) = q2 we create a rule Q1 → aQ2.

The variables of our grammar correspond to the states in Q, with Q0 as the start variable.

How do we deal with accept states? For each qi ∈ F, add the rule Qi → ε.

Theorem Every regular language is context-free.
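The construction behind the theorem can be sketched as follows. The function names, the dictionary encoding of δ, and the example DFA are assumptions for this illustration; state names double as variable names.

```python
def dfa_to_cfg(states, alphabet, delta, start, accepting):
    """Build grammar rules from a DFA; delta maps (state, symbol) to a state."""
    rules = set()
    for (q, a), r in delta.items():
        rules.add((q, (a, r)))      # δ(q, a) = r  becomes  Q -> aR
    for q in accepting:
        rules.add((q, ()))          # accept state q becomes  Q -> ε
    return set(states), set(alphabet), rules, start


def generates(grammar, word):
    """Check membership by following the rules, which are deterministic here."""
    _variables, _terminals, rules, current = grammar
    successor = {(head, body[0]): body[1] for head, body in rules if body}
    for symbol in word:
        if (current, symbol) not in successor:
            return False
        current = successor[(current, symbol)]
    return (current, ()) in rules


# Hypothetical DFA over {a, b} accepting words with an even number of a's.
delta = {("q0", "a"): "q1", ("q0", "b"): "q0",
         ("q1", "a"): "q0", ("q1", "b"): "q1"}
grammar = dfa_to_cfg({"q0", "q1"}, {"a", "b"}, delta, "q0", {"q0"})
print(generates(grammar, "abab"))  # True
print(generates(grammar, "ab"))    # False
```

Because the grammar comes from a DFA, each variable has exactly one rule per terminal, so a derivation simply retraces the run of the automaton.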

SLIDE 39

Properties of CFLs

Closure under union/concatenation/Kleene star? Yes, group sessions!

Closure under complement/intersection? No, but we need to know more before we can show that a language is not context-free.
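For the positive closure results, one standard construction adds a fresh start variable on top of the given grammars. The sketch below assumes that a grammar is a pair (rules, start), that rules map each variable to a list of right-hand sides, that the two grammars use disjoint variable names, and that `new_start` is unused; all function names are made up for this illustration.

```python
def union(g1, g2, new_start):
    """L(G1) ∪ L(G2): new rules S -> S1 | S2."""
    (rules1, s1), (rules2, s2) = g1, g2
    return {**rules1, **rules2, new_start: [(s1,), (s2,)]}, new_start


def concatenation(g1, g2, new_start):
    """L(G1) L(G2): new rule S -> S1 S2."""
    (rules1, s1), (rules2, s2) = g1, g2
    return {**rules1, **rules2, new_start: [(s1, s2)]}, new_start


def star(g, new_start):
    """L(G)*: new rules S -> S1 S | ε (the empty tuple stands for ε)."""
    rules, s = g
    return {**rules, new_start: [(s, new_start), ()]}, new_start


g1 = ({"S1": [("a", "S1", "b"), ()]}, "S1")   # a^n b^n
g2 = ({"S2": [("c",)]}, "S2")                 # the single word c
print(union(g1, g2, "S")[0]["S"])             # [('S1',), ('S2',)]
```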

SLIDE 43

Ambiguity

Consider the grammar E → E + E | E × E | (E) | a

Here: the alphabet is {a, +, ×, (, )}.

→ arithmetic expressions over a

What does the parse tree for the string a + a × a look like?

SLIDE 48

Ambiguity

First parse tree (children are indented below their parent):

E
  E
    a
  +
  E
    E
      a
    ×
    E
      a

Intuitively corresponds to a + (a × a)

Second parse tree:

E
  E
    E
      a
    +
    E
      a
  ×
  E
    a

Intuitively corresponds to (a + a) × a

This is called ambiguity: the same string has two different parse trees.

SLIDE 53

Ambiguity

But just having multiple possible derivations does not mean that a grammar is ambiguous. Two derivations can look different yet be “structurally” the same: they apply the same rules to the same variables, only in a different order.

We are interested in structurally different derivations, i.e., two derivations of the same word that are still different once the order of derivation is fixed.

Definition A leftmost derivation of a string replaces, in each derivation step, the leftmost variable. A string is derived ambiguously in a grammar G if it has two or more leftmost derivations in G. If L(G) contains a string that is derived ambiguously, we say that G is ambiguous.
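Leftmost derivations correspond one to one to parse trees, so ambiguity of a string can be checked by counting its parse trees. The sketch below does this for the expression grammar E → E + E | E × E | (E) | a only; the function name `count_parse_trees` is hypothetical, and the input is assumed to be written without spaces.

```python
from functools import lru_cache


def count_parse_trees(word: str) -> int:
    """Count parse trees of word under E -> E + E | E × E | (E) | a."""

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # Number of parse trees with root E that derive word[i:j].
        total = 0
        if j - i == 1 and word[i] == "a":
            total += 1                                  # E -> a
        if j - i >= 3 and word[i] == "(" and word[j - 1] == ")":
            total += count(i + 1, j - 1)                # E -> (E)
        for k in range(i + 1, j - 1):
            if word[k] in "+×":
                total += count(i, k) * count(k + 1, j)  # E -> E + E, E -> E × E
        return total

    return count(0, len(word))


print(count_parse_trees("a+a×a"))    # 2, so the string is derived ambiguously
print(count_parse_trees("(a+a)×a"))  # 1
```

A count of two or more for any single string is enough to conclude that the grammar is ambiguous.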

SLIDE 54

Chomsky Normal Form

Context-free languages have a nice property: every CFL can be described by a CFG in Chomsky Normal Form.

Definition A grammar is in Chomsky Normal Form if every rule is of the form

A → BC
A → a

where a is any terminal, A is any variable, and B, C are any variables other than the start variable. In addition, the rule S → ε is permitted, where S is the start variable.
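The definition can be turned into a simple check. This sketch assumes that a grammar is given as a dictionary from each variable to a list of right-hand sides (tuples of symbols, with the empty tuple standing for ε) and that every variable appears as a key; the name `is_cnf` is made up for illustration.

```python
def is_cnf(rules, start) -> bool:
    """Check whether every rule has one of the forms allowed in CNF."""
    variables = set(rules)
    for head, bodies in rules.items():
        for body in bodies:
            if len(body) == 0:
                if head != start:           # only S -> ε is permitted
                    return False
            elif len(body) == 1:
                if body[0] in variables:    # unit rule A -> B is not allowed
                    return False
            elif len(body) == 2:
                # A -> BC: both must be variables other than the start variable
                if any(s not in variables or s == start for s in body):
                    return False
            else:
                return False                # three or more symbols
    return True


print(is_cnf({"S0": [("A", "B"), ()], "A": [("a",)], "B": [("b",)]}, "S0"))  # True
print(is_cnf({"S": [("a", "S", "b"), ()]}, "S"))                             # False
```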

SLIDE 58

Proof sketch: Let G be an arbitrary grammar. First, add a new start variable S0 and the new rule S0 → S to G. Then remove all rules A → ε, followed by all “unit” rules A → B. When removing A → ε, for each occurrence of A on the right-hand side of a rule, add a new rule with that occurrence deleted. When removing a unit rule A → B, add a rule A → u for every rule B → u (see the examples on the next slides). Finally, split every rule with three or more right-hand-side symbols into multiple rules with only two symbols each; terminals that occur in right-hand sides of length two are replaced by new variables that derive them.

SLIDE 60

CNF - Example

Grammar:

S → ASA | aB
A → B | S
B → b | ε

First, add a new start variable:

S0 → S
S → ASA | aB
A → B | S
B → b | ε

SLIDE 62

CNF - Example

S0 → S
S → ASA | aB
A → B | S
B → b | ε

Then, remove B → ε:

S0 → S
S → ASA | aB | a
A → B | ε | S
B → b

SLIDE 64

CNF - Example

S0 → S
S → ASA | aB | a
A → B | ε | S
B → b

Then, remove A → ε:

S0 → S
S → ASA | SA | AS | S | aB | a
A → S | B
B → b
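The two ε-removal steps of the example can be reproduced with a short sketch. It assumes rules are stored as a set of pairs (variable, tuple of symbols), with the empty tuple standing for ε; the function name `remove_epsilon_rule` and the `removed` parameter (variables whose ε-rule was already eliminated) are assumptions of this illustration.

```python
from itertools import combinations


def remove_epsilon_rule(rules, var, removed=frozenset()):
    """Remove var -> ε and add the variants in which occurrences of var are deleted."""
    blocked = set(removed) | {var}       # never (re)introduce these ε-rules
    new_rules = {rule for rule in rules if rule != (var, ())}
    for head, body in rules:
        positions = [i for i, symbol in enumerate(body) if symbol == var]
        for k in range(1, len(positions) + 1):
            for drop in combinations(positions, k):
                variant = tuple(s for i, s in enumerate(body) if i not in drop)
                if variant == () and head in blocked:
                    continue
                new_rules.add((head, variant))
    return new_rules


start_rules = {
    ("S0", ("S",)),
    ("S", ("A", "S", "A")), ("S", ("a", "B")),
    ("A", ("B",)), ("A", ("S",)),
    ("B", ("b",)), ("B", ()),
}
step1 = remove_epsilon_rule(start_rules, "B")
step2 = remove_epsilon_rule(step1, "A", removed={"B"})
print(sorted(body for head, body in step2 if head == "S"))
```

After the first call the set contains A → ε, and after the second call S has the six right-hand sides ASA, SA, AS, S, aB and a, as in the example.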

SLIDE 66

CNF - Example

S0 → S
S → ASA | SA | AS | S | aB | a
A → B | S
B → b

Then remove S → S:

S0 → S
S → ASA | SA | AS | aB | a
A → B | S
B → b

SLIDE 68

CNF - Example

S0 → S
S → ASA | SA | AS | aB | a
A → B | S
B → b

Remove the unit rule S0 → S:

S0 → ASA | SA | AS | aB | a
S → ASA | SA | AS | aB | a
A → B | S
B → b
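Removing a unit rule can be sketched in the same style: the unit rule head → target is dropped, and head receives a copy of every right-hand side of target. Rules are again assumed to be a set of pairs (variable, tuple of symbols); the name `remove_unit_rule` and the `removed` parameter (unit rules eliminated earlier) are hypothetical.

```python
def remove_unit_rule(rules, head, target, removed=frozenset()):
    """Remove head -> target and add head -> u for every rule target -> u."""
    unit = (head, (target,))
    blocked = set(removed) | {unit}      # do not bring back removed unit rules
    new_rules = set(rules) - {unit}
    for h, body in rules:
        if h == target and (head, body) not in blocked:
            new_rules.add((head, body))
    return new_rules


rules = {
    ("S0", ("S",)),
    ("S", ("A", "S", "A")), ("S", ("S", "A")), ("S", ("A", "S")),
    ("S", ("a", "B")), ("S", ("a",)),
    ("A", ("B",)), ("A", ("S",)),
    ("B", ("b",)),
}
result = remove_unit_rule(rules, "S0", "S")
print(sorted(body for h, body in result if h == "S0"))
```

For the example grammar, S0 ends up with the same five right-hand sides as S.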

SLIDE 71

CNF - Example

S0 → ASA | SA | AS | aB | a
S → ASA | SA | AS | aB | a
A → B | S
B → b

You would then continue by removing the unit rules A → S, etc.

But how do we convert, say, S → ASA into rules with only two symbols on the right? Introduce helper variables! The rule S → ASA becomes

S → AA1
A1 → SA
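The splitting step generalises to right-hand sides of any length by chaining helper variables. In this sketch the function name `split_long_rule` is made up, and `fresh` is assumed to be an iterator that yields variable names not used elsewhere in the grammar.

```python
def split_long_rule(head, body, fresh):
    """Split head -> body (three or more symbols) into rules with two symbols each."""
    rules = []
    while len(body) > 2:
        helper = next(fresh)                       # new helper variable
        rules.append((head, (body[0], helper)))    # head -> first symbol + helper
        head, body = helper, body[1:]              # helper derives the rest
    rules.append((head, tuple(body)))
    return rules


print(split_long_rule("S", ("A", "S", "A"), iter(["A1"])))
# [('S', ('A', 'A1')), ('A1', ('S', 'A'))]
```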

SLIDE 72

CNF

Thus, we see how every CFG can be converted to a CFG in CNF.

This is a useful property to have, both for practical purposes and for theoretical work: knowing what the rules of a grammar look like can be very beneficial (we will see an example next week).

Next time: how can finite automata be enriched so as to accept context-free languages?