Theoretical Computer Science (Bridging Course): Context Free Languages



slide-1
SLIDE 1

Theoretical Computer Science (Bridging Course)

Gian Diego Tipaldi

Context Free Languages

slide-2
SLIDE 2

Topics Covered

  • Context free grammars
  • Pushdown automata
  • Equivalence of PDAs and CFGs
  • Non-context free grammars
  • The pumping lemma
slide-3
SLIDE 3

Context Free Grammars

  • Extend regular expressions
  • First studied for natural languages
  • Often used in computer languages
  • Compilers
  • Parsers
  • Pushdown automata
slide-4
SLIDE 4

Context Free Grammars

  • Collection of substitution rules
  • Rules: Symbol -> string
  • Variable symbols (Uppercase)
  • Terminal symbols (lowercase)
  • Start variable
slide-5
SLIDE 5

Context Free Grammars

  • Example grammar G1: A -> 0A1, A -> B, B -> #
  • A, B are variables
  • 0,1,# are terminals
  • A is the start variable
slide-6
SLIDE 6

Context Free Grammars

Example string: 000#111

Does it belong to the grammar?

slide-7
SLIDE 7

Context Free Grammars

Example string: 000#111

  • A ⇒ 0A1
  • 0A1 ⇒ 00A11
  • 00A11 ⇒ 000A111
  • 000A111 ⇒ 000B111
  • 000B111 ⇒ 000#111
slide-8
SLIDE 8

Context Free Grammars

Example string: 000#111

  • A ⇒ 0A1
  • 0A1 ⇒ 00A11
  • 00A11 ⇒ 000A111
  • 000A111 ⇒ 000B111
  • 000B111 ⇒ 000#111

[Figure: parse tree for 000#111 in G1]
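The derivation above can be replayed mechanically. The following is a minimal sketch, assuming G1 has the rules A -> 0A1, A -> B and B -> # (read off the derivation steps); the names `RULES` and `derive` are illustrative and not part of the slides.

```python
# Rules of G1 as read off the derivation above (an assumption of this sketch).
RULES = {"A": ["0A1", "B"], "B": ["#"]}

def derive(start, choices):
    """Rewrite the leftmost variable once per entry of `choices`.

    Each entry is the index of the rule to apply to that variable.
    Returns the list of sentential forms, starting with `start`.
    """
    current = start
    steps = [current]
    for choice in choices:
        # The leftmost variable is the first symbol that has rules.
        pos = next(i for i, sym in enumerate(current) if sym in RULES)
        rhs = RULES[current[pos]][choice]
        current = current[:pos] + rhs + current[pos + 1:]
        steps.append(current)
    return steps

# Three uses of A -> 0A1, then A -> B, then B -> #.
steps = derive("A", [0, 0, 0, 1, 0])
print(" => ".join(steps))
```

Each printed form corresponds to one line of the derivation on the slide; a different choice sequence yields a different string of the form 0^n # 1^n.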

slide-10
SLIDE 10

Natural Language Example

  • A boy sees
  • The boy sees the flower
  • A girl with the flower likes the boy

<SENTENCE> → <NOUN-PHRASE>< ><VERB-PHRASE>
<NOUN-PHRASE> → <CMPLX-NOUN> | <CMPLX-NOUN>< ><PREP-PHRASE>
<VERB-PHRASE> → <CMPLX-VERB> | <CMPLX-VERB>< ><PREP-PHRASE>
<PREP-PHRASE> → <PREP>< ><CMPLX-NOUN>
<CMPLX-NOUN> → <ARTICLE>< ><NOUN>
<CMPLX-VERB> → <VERB> | <VERB>< ><NOUN-PHRASE>
<ARTICLE> → a | the
<NOUN> → boy | girl | flower
<VERB> → touches | likes | sees
<PREP> → with

slide-11
SLIDE 11

Context Free Grammar

Definition 2.2: A context-free grammar is a 4-tuple (V, Σ, R, S) where:

  • V is the set of variables
  • Σ is the set of terminals, Σ ∩ V = ∅
  • R is the set of rules
  • S ∈ V is the start symbol
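The 4-tuple translates directly into a small data structure. The sketch below is one possible encoding, not the course's own: the class name `CFG`, its field names, and the choice to store each rule as a (variable, tuple of symbols) pair are all assumptions made for illustration. The example instance uses the rules of G1 as read off the earlier derivation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CFG:
    variables: frozenset   # V
    terminals: frozenset   # Sigma
    rules: frozenset       # R: pairs (variable, tuple of symbols)
    start: str             # S

    def __post_init__(self):
        # Sigma and V must be disjoint.
        if self.variables & self.terminals:
            raise ValueError("V and Sigma must be disjoint")
        # The start symbol must be a variable.
        if self.start not in self.variables:
            raise ValueError("start symbol must be a variable")
        symbols = self.variables | self.terminals
        for head, body in self.rules:
            if head not in self.variables:
                raise ValueError(f"rule head {head!r} is not a variable")
            if any(sym not in symbols for sym in body):
                raise ValueError(f"rule body {body!r} uses unknown symbols")

g1 = CFG(
    variables=frozenset({"A", "B"}),
    terminals=frozenset({"0", "1", "#"}),
    rules=frozenset({("A", ("0", "A", "1")), ("A", ("B",)), ("B", ("#",))}),
    start="A",
)
```

Validating in `__post_init__` keeps the conditions of the definition (disjoint alphabets, start symbol in V) next to the data they constrain.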
slide-12
SLIDE 12

Language of a grammar

  • u, v, w are strings, A -> w a rule
  • uAv yields uwv: uAv ⇒ uwv
  • u derives v: u ⇒* v if u = v or u ⇒ u1 ⇒ u2 ⇒ … ⇒ uk ⇒ v
  • Language of a grammar: L(G) = {w ∈ Σ* | S ⇒* w}
slide-13
SLIDE 13

Parsing a string

  • Consider the following grammar G3 = (V, Σ, R, <Expr>):
      V = {<Expr>, <Term>, <Factor>}
      Σ = {a, +, ×, (, )}
      R is
        <Expr> → <Expr> + <Term> | <Term>
        <Term> → <Term> × <Factor> | <Factor>
        <Factor> → (<Expr>) | a
  • What are the parse trees of
      • a + a × a
      • (a + a) × a

slide-14
SLIDE 14

Parsing a string

slide-15
SLIDE 15

Designing Grammars

Harder than designing automata

Few techniques can be used

  • Union of context free languages
  • Conversion from DFA (regular)
  • Exploit linked variables (0^n 1^n)
  • Exploit recursive structure (trickier)
slide-16
SLIDE 16

Union of Different CFGs

G1: S1 → 0 S1 1 | ε
G2: S2 → 1 S2 0 | ε
G: S → S1 | S2

L(G1) = {0^n 1^n | n ≥ 0}
L(G2) = {1^n 0^n | n ≥ 0}
L(G) = L(G1) ∪ L(G2)
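The union construction only adds a fresh start variable with one rule per component grammar. The sketch below illustrates this for the two grammars on this slide; the helper names `union_grammar` and `generate`, and the encoding of rules as a dict from variable to a list of bodies, are assumptions of the sketch. `generate` prunes on the number of terminals in a sentential form, which is enough for these grammars but is not a general-purpose enumeration procedure.

```python
from collections import deque

def union_grammar(rules1, start1, rules2, start2, new_start="S"):
    """Combine two rule dicts (variable -> list of bodies) under a fresh start."""
    if set(rules1) & set(rules2):
        raise ValueError("rename variables so the two grammars are disjoint")
    if new_start in rules1 or new_start in rules2:
        raise ValueError("new start variable must be fresh")
    combined = {new_start: [[start1], [start2]]}   # S -> S1 | S2
    combined.update(rules1)
    combined.update(rules2)
    return combined

def generate(rules, start, max_len):
    """Collect all generated strings with at most max_len symbols."""
    seen, found = {(start,)}, set()
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        # Index of the leftmost variable, or None if the form is a string.
        idx = next((i for i, s in enumerate(form) if s in rules), None)
        if idx is None:
            found.add("".join(form))
            continue
        for body in rules[form[idx]]:
            new = form[:idx] + tuple(body) + form[idx + 1:]
            terminals = sum(1 for s in new if s not in rules)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found

g1 = {"S1": [["0", "S1", "1"], []]}   # S1 -> 0 S1 1 | epsilon
g2 = {"S2": [["1", "S2", "0"], []]}   # S2 -> 1 S2 0 | epsilon
g = union_grammar(g1, "S1", g2, "S2")
print(sorted(generate(g, "S", 4)))
```

The variable sets of the two grammars must be disjoint before they are merged; otherwise rules of one grammar could be used inside derivations of the other.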

slide-17
SLIDE 17

Conversion from DFAs

  • Take the same vocabulary: Σg = Σa
  • For each state qi insert a variable Ri
  • For each transition δ(qi, a) = qj insert Ri → a Rj
  • For each accept state qk insert Rk → ε
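The conversion can be sketched in a few lines. In the code below, the function names, the `R_<state>` naming of variables, and the example DFA (two states, accepting strings with an odd number of 1s) are hypothetical and are not the automaton drawn on the slides. Taking the variable of the DFA's start state as the start variable is the usual completion of the construction; the slide does not state it explicitly.

```python
def dfa_to_cfg(states, alphabet, delta, start, accepting):
    """Return (rules, start_variable); rules maps a variable to a list of bodies."""
    var = {q: f"R_{q}" for q in states}
    rules = {var[q]: [] for q in states}
    for q in states:
        for a in alphabet:
            # Transition delta(q_i, a) = q_j becomes the rule R_i -> a R_j.
            rules[var[q]].append([a, var[delta[(q, a)]]])
        if q in accepting:
            # Accept state q_k becomes the rule R_k -> epsilon.
            rules[var[q]].append([])
    return rules, var[start]

def generates(rules, start, word):
    """Follow the unique derivation of `word` in the right-linear grammar."""
    current = start
    for ch in word:
        nxt = [body[1] for body in rules[current] if body and body[0] == ch]
        if not nxt:
            return False
        current = nxt[0]   # unique, because the automaton is deterministic
    return [] in rules[current]

# Hypothetical DFA: accepts binary strings with an odd number of 1s.
states = ["q1", "q2"]
alphabet = ["0", "1"]
delta = {("q1", "0"): "q1", ("q1", "1"): "q2",
         ("q2", "0"): "q2", ("q2", "1"): "q1"}
rules, start_var = dfa_to_cfg(states, alphabet, delta, "q1", {"q2"})
print(generates(rules, start_var, "0100"))
```

Because every rule has the shape R -> a R' or R -> ε, a derivation mirrors a run of the DFA step by step, which is why `generates` needs no search.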

slide-18
SLIDE 18

Conversion from DFAs

  • Take the same vocabulary: Σ = {0,1}
  • Insert all the variables: V = {R1, R2}
  • Insert the rules:

[Figure: two-state DFA with states q1 and q2]

slide-19
SLIDE 19

Designing Linked Strings

  • Languages of the type
  • Create rules of the form
  • For the language above
slide-20
SLIDE 20

Designing Recursive Strings

  • Examples are arithmetic expressions
  • Create the recursive structure <Expr>
  • Place it where it appears: <Factor>
slide-21
SLIDE 21

Ambiguity

  • Generate a string in several ways
  • E.g., grammar G5:
  • No usual notion of precedence
  • Natural language processing
  • “a boy touches a girl with the flower”
slide-22
SLIDE 22

Ambiguity

  • Consider the string: a + a x a
slide-23
SLIDE 23

Ambiguity – Definition

  • Leftmost derivation: At every step, replace the leftmost variable
  • A string is generated ambiguously if it has multiple leftmost derivations
  • A CFG is ambiguous if it generates some string ambiguously
  • Some context free languages are inherently ambiguous
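Leftmost derivations correspond one-to-one to parse trees, so ambiguity can be checked by counting parse trees. The sketch below assumes the standard ambiguous expression grammar E -> E+E | ExE | (E) | a; the slides' grammar G5 is not reproduced in the text, so this grammar and the function name `count_parse_trees` are assumptions. The letter `x` stands for the multiplication sign, as in the string a + a x a used on the slides.

```python
from functools import lru_cache

def count_parse_trees(s):
    """Count parse trees of s under E -> E+E | ExE | (E) | a."""
    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees deriving the substring s[i:j] from E.
        if j <= i:
            return 0
        if j - i == 1:
            return 1 if s[i] == "a" else 0
        total = 0
        # Root rule E -> (E).
        if s[i] == "(" and s[j - 1] == ")":
            total += count(i + 1, j - 1)
        # Root rule E -> E+E or E -> ExE, with the operator at position k.
        for k in range(i + 1, j - 1):
            if s[k] in "+x":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(s))

print(count_parse_trees("a+axa"))
```

A count above 1 means the string is generated ambiguously by this grammar; parenthesizing the string fixes the grouping and brings the count back to 1.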

slide-24
SLIDE 24

Chomsky Normal Form (CNF)

Definition 2.8: A context-free grammar is in Chomsky normal form if every rule is of the form A → BC or A → a, where a is any terminal and A, B, and C are any variables, except that B and C may not be the start variable. In addition we permit the rule S → ε, where S is the start variable.

slide-25
SLIDE 25

Chomsky Normal Form (CNF)

Theorem 2.9: Any context-free language is generated by a context-free grammar in Chomsky normal form.

slide-26
SLIDE 26

Proof Idea

  • Rewrite the rules not in CNF
  • Introduce new variables
  • Four cases:
  • Start variable on the right side
  • Epsilon rules: A → ε
  • Unit rules: A → B
  • Long and/or mixed rules: A → u1 u2 … uk (three or more symbols, or terminals mixed with variables)
slide-27
SLIDE 27

Proof Idea

  • Start variable on the right side
      • Introduce a new start variable S1 and the rule S1 → S0
  • Epsilon rules: A → ε
      • Introduce new rules without A
  • Unit rules: A → B
      • Replace B with its productions
  • Long and/or mixed rules: A → u1 u2 … uk
      • New variables and new rules
slide-28
SLIDE 28

Formal Proof: by Construction

  • 1. Add a new start symbol S0 and the rule S0 → S, where S is the old start
  • 2. Remove all rules A → ε:
      • For each R → uAv add R → uv
      • For each R → A add R → ε
      • Repeat until all gone (keep S0 → ε)
  • 3. Remove all rules A → B:
      • For each B → u add A → u
      • Repeat until all gone
slide-29
SLIDE 29

Formal Proof: by Construction

  • 4. Convert all rules A → u_1 … u_k, k ≥ 3 into:
      • A → u_1 A_1
      • A_1 → u_2 A_2, …
      • A_{k−2} → u_{k−1} u_k
  • 5. Convert all rules A → u_1 u_2:
      • Replace any terminal u_i with U_i
      • Add the rules U_i → u_i
  • Be careful of cycles!
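Steps 4 and 5 are purely mechanical, as the following sketch shows. It assumes that steps 1 to 3 have already removed ε-rules and unit rules, that rules are given as (head, body) pairs, and that the generated names (`S1`, `U_a`, and so on) do not clash with existing variables; the function names are illustrative. Unlike the worked example in the slides, this sketch does not share chain variables between rules with identical tails.

```python
def to_binary(rules):
    """Step 4: split A -> u1 ... uk (k >= 3) into a chain of two-symbol rules."""
    out, counter = [], {}
    for head, body in rules:
        body = list(body)
        current = head
        while len(body) >= 3:
            counter[head] = counter.get(head, 0) + 1
            fresh = f"{head}{counter[head]}"     # A1, A2, ... (assumed unused)
            out.append((current, (body[0], fresh)))
            current, body = fresh, body[1:]
        out.append((current, tuple(body)))
    return out

def lift_terminals(rules, terminals):
    """Step 5: in two-symbol bodies, replace terminal u by U_u and add U_u -> u."""
    out, added = [], set()
    for head, body in rules:
        if len(body) == 2:
            new_body = []
            for sym in body:
                if sym in terminals:
                    new_body.append(f"U_{sym}")
                    added.add((f"U_{sym}", (sym,)))
                else:
                    new_body.append(sym)
            body = tuple(new_body)
        out.append((head, body))
    return out + sorted(added)

rules = [("S", ("A", "S", "A")), ("S", ("a", "B")), ("S", ("a",))]
cnf = lift_terminals(to_binary(rules), {"a", "b"})
for head, body in cnf:
    print(head, "->", " ".join(body))
```

Single-terminal bodies such as S -> a are already in Chomsky normal form, so step 5 only touches bodies of length two.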
slide-30
SLIDE 30

CNF: Example 2.10 from Book

  • Convert the CFG to CNF
  • Added rules in bold
  • Removed rules struck through

S → ASA | aB
A → B | S
B → b | ε

slide-31
SLIDE 31

CNF: Example 2.10 from Book

  • Add the new start symbol

S0 → S (added)
S → ASA | aB
A → B | S
B → b | ε

slide-32
SLIDE 32

CNF: Example 2.10 from Book

  • Remove the empty rule B → ε

S0 → S
S → ASA | aB | a (added)
A → B | S | ε (added)
B → b | ε (removed)

slide-33
SLIDE 33

CNF: Example 2.10 from Book

  • Remove the empty rule A → ε

S0 → S
S → ASA | aB | a | SA (added) | AS (added) | S (added)
A → B | S | ε (removed)
B → b

slide-34
SLIDE 34

CNF: Example 2.10 from Book

  • Remove unit rule: S → S

S0 → S
S → ASA | aB | a | SA | AS | S (removed)
A → B | S
B → b

slide-35
SLIDE 35

CNF: Example 2.10 from Book

  • Remove unit rule: S0 → S

S0 → S (removed) | ASA (added) | aB (added) | a (added) | SA (added) | AS (added)
S → ASA | aB | a | SA | AS
A → B | S
B → b

slide-36
SLIDE 36

CNF: Example 2.10 from Book

  • Remove unit rule: A → B

S0 → ASA | aB | a | SA | AS
S → ASA | aB | a | SA | AS
A → B (removed) | S | b (added)
B → b

slide-37
SLIDE 37

CNF: Example 2.10 from Book

  • Remove unit rule: A → S

S0 → ASA | aB | a | SA | AS
S → ASA | aB | a | SA | AS
A → S (removed) | b | ASA (added) | aB (added) | a (added) | SA (added) | AS (added)
B → b

slide-38
SLIDE 38

CNF: Example 2.10 from Book

  • Convert the remaining rules

S0 → AA1 | UB | a | SA | AS
S → AA1 | UB | a | SA | AS
A → b | AA1 | UB | a | SA | AS
A1 → SA (added)
U → a (added)
B → b

slide-39
SLIDE 39

Pushdown Automata (PDA)

  • Extend NFAs with a stack
  • The stack provides additional memory
  • Equivalent to context free grammars
  • They recognize context free languages
slide-40
SLIDE 40

Finite State Automata

  • Can be simplified as follows
  • State control for states and transitions
  • Tape to store the input string

[Figure: state control reading an input tape that holds a a b b]

slide-41
SLIDE 41

Pushdown Automata

  • Introduce a stack component
  • Symbols can be read and written there

[Figure: state control with an input tape holding a a b b and a stack holding a a b]

slide-42
SLIDE 42

What is a Stack?

  • Stacks are special containers
  • Symbols are “pushed” on top
  • Symbols can be “popped” from top
  • Last in first out principle
  • Similar to plates in a cafeteria
slide-43
SLIDE 43

Formal Definition of PDA

A pushdown automaton is a 6-tuple (Q, Σ, Γ, δ, q0, F)

  • Q is a finite set of states
  • Σ is a finite set, the input alphabet
  • Γ is a finite set, the stack alphabet
  • δ: Q × Σε × Γε → P(Q × Γε) is the transition function, where Σε = Σ ∪ {ε} and Γε = Γ ∪ {ε}
  • q0 ∈ Q is the initial state
  • F ⊆ Q is the set of accept states
slide-44
SLIDE 44

Transition Function

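  • Maps (state, input symbol, stack top) to a set of (state, stack symbol) pairs
  • Can include empty symbols
  • $ is used to indicate the stack end

Nonempty entries of δ (every other entry is ∅; q4 has no outgoing transitions):

  δ(q1, ε, ε) = {(q2, $)}
  δ(q2, 0, ε) = {(q2, 0)}
  δ(q2, 1, 0) = {(q3, ε)}
  δ(q3, 1, 0) = {(q3, ε)}
  δ(q3, ε, $) = {(q4, ε)}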

slide-45
SLIDE 45

Example PDA

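  • PDA for the language {0^n 1^n | n ≥ 0}

[Figure: state diagram with states q1, q2, q3, q4 and transitions q1 → q2: ε,ε → $; q2 → q2: 0,ε → 0; q2 → q3: 1,0 → ε; q3 → q3: 1,0 → ε; q3 → q4: ε,$ → ε]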

slide-48
SLIDE 48

Computation of the PDA

Compute keeping track of

  • String
  • State
  • Stack

[Figure: state diagram of the PDA from slide 45]

Computation on input 0011, written as (remaining input, state, stack):

  (0011, q1, ε)
  ↓ (0011, q2, $)
  ↓ (011, q2, 0$)
  ↓ (11, q2, 00$)
  ↓ (1, q3, 0$)
  ↓ (ε, q3, $)
  ↓ (ε, q4, ε) accept
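The bookkeeping of string, state and stack can be automated with a breadth-first search over configurations. The sketch below encodes the five transitions of the example PDA; the names `DELTA`, `ACCEPT` and `accepts` are illustrative. Treating q1 as an accept state in addition to q4 follows the textbook version of this automaton and is an assumption here, since the extracted slide text only shows the run ending in q4. The search terminates for this PDA because it has no ε-loops that keep pushing symbols.

```python
from collections import deque

EPS = ""   # stands for the empty symbol

# (state, input symbol, popped symbol) -> set of (next state, pushed symbol)
DELTA = {
    ("q1", EPS, EPS): {("q2", "$")},
    ("q2", "0", EPS): {("q2", "0")},
    ("q2", "1", "0"): {("q3", EPS)},
    ("q3", "1", "0"): {("q3", EPS)},
    ("q3", EPS, "$"): {("q4", EPS)},
}
ACCEPT = {"q1", "q4"}   # q1 accepting is an assumption (accepts the empty string)

def accepts(word, delta=DELTA, start="q1", accept=ACCEPT):
    """Explore all configurations (input position, state, stack; top first)."""
    start_cfg = (0, start, "")
    seen, queue = {start_cfg}, deque([start_cfg])
    while queue:
        pos, state, stack = queue.popleft()
        if pos == len(word) and state in accept:
            return True
        # A move may read one input symbol or nothing, and pop the top or nothing.
        reads = [EPS] + ([word[pos]] if pos < len(word) else [])
        pops = [EPS] + ([stack[0]] if stack else [])
        for a in reads:
            for b in pops:
                for nxt, push in delta.get((state, a, b), ()):
                    cfg = (pos + len(a), nxt, push + stack[len(b):])
                    if cfg not in seen:
                        seen.add(cfg)
                        queue.append(cfg)
    return False

print(accepts("0011"), accepts("001"))
```

Acceptance is checked only when the whole input has been read, matching the remark that there is no explicit test for the end of the input.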

slide-49
SLIDE 49

Definition of Computation

Let M = (Q, Σ, Γ, δ, q0, F) be a pushdown automaton and let w be a string over Σ. M accepts w if w can be written as w = w_1 … w_n, where each w_i ∈ Σε, and a sequence of states r_0, …, r_n in Q and strings s_0, …, s_n in Γ* exist such that:

  • 1. r_0 = q0 and s_0 = ε
  • 2. For all i = 0, …, n − 1: (r_{i+1}, b) ∈ δ(r_i, w_{i+1}, a), where s_i = at and s_{i+1} = bt for some a, b ∈ Γε and some t ∈ Γ*
  • 3. r_n ∈ F

No explicit test for empty stack and end of input.

slide-50
SLIDE 50

Another Example of PDA

L = {a^i b^j c^k | i, j, k ≥ 0 and i = j or i = k}

[Figure: state diagram of a PDA with states q1 to q7; edge labels include a,ε → a; b,a → ε; b,ε → ε; c,a → ε; c,ε → ε; ε,ε → ε; ε,$ → ε]

slide-51
SLIDE 51

Another Example of PDA

L = {w w^R | w ∈ {0,1}*}, where w^R is w written “backwards”

[Figure: state diagram with states q1, q2, q3, q4 and transitions q1 → q2: ε,ε → $; q2 → q2: 0,ε → 0 and 1,ε → 1; q2 → q3: ε,ε → ε; q3 → q3: 0,0 → ε and 1,1 → ε; q3 → q4: ε,$ → ε]

slide-52
SLIDE 52

Equivalence of PDAs and CFLs

Theorem 2.20: A language is context free if and only if some pushdown automaton recognizes it.

Lemma 2.21: If a language is context free, then some pushdown automaton recognizes it. (Forward direction of proof)

slide-53
SLIDE 53

Lemma 2.21: Proof Idea

  • Construct a PDA P for the grammar
  • P accepts w if there is a derivation of w
  • Nondeterminism handles the choice among multiple rules
  • Represent intermediate strings on the PDA
  • Store the variables on the stack
slide-54
SLIDE 54

Lemma 2.21: Proof Idea

  • Representing 01A1A0

[Figure: state control, input tape, and a stack holding A 1 A 0 $, representing the intermediate string 01A1A0]

slide-55
SLIDE 55

Proof by Construction

  • 1. Place the marker symbol $ and the start variable on the stack.
  • 2. Repeat the following steps forever. There are three possible cases:
      • a. The top of stack is a variable symbol A;
      • b. The top of stack is a terminal symbol a;
      • c. The top of stack is the symbol $
slide-56
SLIDE 56

Proof by Construction

The top of stack is a variable symbol A

Nondeterministically select one of the rules for A and replace A on the stack with the right-hand side of that rule.

The top of stack is a terminal symbol a

Read the next symbol from the input and compare it to a. If they match, repeat. If they do not match, reject the branch.

slide-57
SLIDE 57

Proof by Construction

The top of stack is the symbol $

Enter the accept state. Doing so accepts the input if it has all been read.
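The three cases can be simulated directly by keeping the stack as a tuple and exploring the nondeterministic branches one after another. This is a minimal sketch under stated assumptions: the function name `pda_accepts`, the rule encoding, and the cut-off `max_stack` are not part of the construction. The cut-off keeps left-recursive rules from growing the stack forever and may cause false rejections for grammars whose derivations need longer intermediate strings. The symbol `$` is assumed not to occur in the grammar. The example grammar is S → aTb | b, T → Ta | ε.

```python
def pda_accepts(rules, start, word, max_stack=None):
    """Simulate the PDA built from a CFG (rules: variable -> list of bodies)."""
    if max_stack is None:
        max_stack = 2 * len(word) + 2       # heuristic cut-off, an assumption
    seen = set()
    todo = [(0, (start, "$"))]              # (input position, stack; top first)
    while todo:
        pos, stack = todo.pop()
        if (pos, stack) in seen or len(stack) > max_stack:
            continue
        seen.add((pos, stack))
        top, rest = stack[0], stack[1:]
        if top == "$":
            # Case c: accept if the whole input has been read.
            if pos == len(word):
                return True
        elif top in rules:
            # Case a: replace the variable by each of its right-hand sides.
            for body in rules[top]:
                todo.append((pos, tuple(body) + rest))
        elif pos < len(word) and word[pos] == top:
            # Case b: the terminal matches the next input symbol.
            todo.append((pos + 1, rest))
        # Otherwise the terminal does not match and this branch is rejected.
    return False

rules = {"S": [["a", "T", "b"], ["b"]], "T": [["T", "a"], []]}
print(pda_accepts(rules, "S", "aaab"), pda_accepts(rules, "S", "ba"))
```

Pushing a whole right-hand side at once corresponds to the chain of extra states the construction uses to substitute a string for a single stack symbol.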

slide-58
SLIDE 58

Proof by Construction

  • PDA to substitute a whole string
slide-59
SLIDE 59

Proof by Construction

  • Final PDA to accept the string
slide-60
SLIDE 60

Example 2.25 From the Book

  • Construct a PDA for the CFG

S → aTb | b
T → Ta | ε

slide-62
SLIDE 62

Equivalence of PDAs and CFLs

Lemma 2.27: If a pushdown automaton recognizes some language, then it is context free. (Backward direction of proof)

Assumptions:

  • 1. The PDA has a single accept state
  • 2. The PDA empties the stack before accepting
  • 3. Each transition either pushes or pops a symbol, but not both
slide-63
SLIDE 63

Lemma 2.27: Assumptions

  • Assumption 1
      • Create a new accept state with empty transitions from the previous accept states
  • Assumption 2
      • Create dummy transitions to empty the stack before accepting

slide-64
SLIDE 64

Lemma 2.27: Assumptions

  • Assumption 3
      • Replace each transition that both pushes and pops with two transitions and a new state
      • Replace each transition that neither pushes nor pops with two transitions that push and pop a dummy symbol, and a new state

slide-65
SLIDE 65

Lemma 2.27: Proof

Say that P = (Q, Σ, Γ, δ, q0, {q_accept}) and construct G. The variables of G are {A_pq | p, q ∈ Q}. The start variable is A_{q0,q_accept}. Now we describe G's rules.

  • For each p, q, r, s ∈ Q, t ∈ Γ, and a, b ∈ Σε, if δ(p, a, ε) contains (r, t) and δ(s, b, t) contains (q, ε), put the rule A_pq → a A_rs b in G.
  • For each p, q, r ∈ Q, put the rule A_pq → A_pr A_rq in G.
  • Finally, for each p ∈ Q, put the rule A_pp → ε in G.

You may gain some intuition for this construction from the following figures.
slide-66
SLIDE 66

Inserting A_pq → a A_rs b

slide-67
SLIDE 67

Inserting A_pq → A_pr A_rq

slide-68
SLIDE 68

Lemma 2.27: Proof

  • We now need to prove that the construction works
  • A_pq generates x iff x brings P from p with an empty stack to q with an empty stack
  • Prove by induction
slide-69
SLIDE 69

Lemma 2.27: Proof (Forward)

If A_pq generates x, it brings P from p with empty stack to q with empty stack.

Basis: The derivation has 1 step. There is only one rule possible, A_pp → ε, which trivially brings P from p to p.

slide-70
SLIDE 70

Lemma 2.27: Proof (Forward)

Induction: Assume true for k steps, prove for k+1.

Case a): A_pq ⇒ a A_rs b. Then x = ayb and A_rs ⇒* y in k steps, so y brings P from r to s with empty stack (induction assumption). Now, because A_pq → a A_rs b is in G, we have δ(p, a, ε) ∋ (r, t) and δ(s, b, t) ∋ (q, ε). Therefore, x can bring P from p to q with empty stack.

slide-71
SLIDE 71

Lemma 2.27: Proof (Forward)

Induction: Assume true for k steps, prove for k+1.

Case b): A_pq ⇒ A_pr A_rq. Then x = yz such that A_pr ⇒* y and A_rq ⇒* z in at most k steps, so y brings P from p to r and z brings P from r to q, each with empty stack. Therefore, x can bring P from p to q with empty stack.

slide-72
SLIDE 72

Lemma 2.27: Proof (Backward)

If x brings P from p with empty stack to q with empty stack, then A_pq generates x.

Basis: The computation has 0 steps. If it has 0 steps, it starts and ends in the same state. P can only read the empty string. The rule A_pp → ε generates it.
slide-73
SLIDE 73

Lemma 2.27: Proof (Backward)

Induction: Assume true for k steps, prove for k+1.

Case a): Stack is not empty in between. The symbol pushed at the beginning is the same one popped at the end; we therefore have A_pq → a A_rs b in the grammar. We have x = ayb; from induction we have A_rs ⇒* y, therefore A_pq ⇒* ayb.

slide-74
SLIDE 74

Lemma 2.27: Proof (Backward)

Induction: Assume true for k steps, prove for k+1.

Case b): Stack is empty in between. There exists a state r in between, and the computations from p to r and from r to q have at most k steps. We have x = yz; from induction A_pr ⇒* y and A_rq ⇒* z. Since A_pq → A_pr A_rq is in the grammar, we have that A_pq ⇒* yz.

slide-75
SLIDE 75

Regular vs. Context Free

  • Every regular language is context free
  • NFAs are PDAs without a stack!

[Figure: the regular languages drawn as a subset of the context free languages]

slide-76
SLIDE 76

Pumping Lemma

Theorem (Pumping Lemma): If A is a context free language, then there is a number p such that, if s is any string in A of length at least p, then s may be divided into s = uvxyz such that:

  • 1. For each i ≥ 0, u v^i x y^i z ∈ A
  • 2. |vy| > 0
  • 3. |vxy| ≤ p

slide-77
SLIDE 77

Remember the Parse Tree?

slide-78
SLIDE 78

Pumping Lemma: Proof Idea

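  • Let T be a parse tree for a string s in A
  • Show that s can be broken into uvxyz
  • Prove that the conditions hold

[Figure, built up over several slides: parse tree T for s in which a variable R repeats on one path; the upper R yields vxy, the lower R yields x, and the leaves read u v x y z]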

slide-84
SLIDE 84

Pumping Lemma: Proof

  • Let b be the maximum number of symbols on the right-hand side of a rule
  • The number of leaves in a parse tree of height h is at most b^h
  • Hence, for any string s generated by such a parse tree, its length |s| ≤ b^h
  • Let |V| be the number of variables and choose the pumping length p = b^(|V|+2)

slide-85
SLIDE 85

Pumping Lemma: Proof

  • For any s with |s| ≥ p: possible parse trees for s have height at least |V| + 1
  • Let τ be the smallest parse tree for s
  • It must contain a path P from the root to a leaf of length at least |V| + 1
  • P has at least |V| + 2 nodes: one terminal and the rest variables
  • P has at least |V| + 1 variables, so some variable must be repeated!

slide-86
SLIDE 86

Pumping Lemma: Proof Cnd. 1

  • Divide s into uvxyz as in the picture.
  • R generates vxy, with a larger subtree, or just x, with a smaller subtree.
  • Pumping down gives uxz; pumping up gives u v^i x y^i z with i ≥ 1

[Figure: parse tree T with two nested occurrences of R; the leaves read u v x y z]

slide-87
SLIDE 87

Pumping Lemma: Proof Cnd. 2

  • Condition states |vy| > 0.
  • We must be sure v and y are not both ε.
  • Assuming they were both ε, substituting the smaller for the bigger subtree would lead to a parse tree for s with fewer nodes.
  • Contradiction: τ was chosen to be the parse tree with the fewest nodes

slide-88
SLIDE 88

Pumping Lemma: Proof Cnd. 3

  • Condition states |vxy| ≤ p
  • Upper occurrence of R generates vxy
  • R chosen such that both occurrences fall within the bottom |V| + 1 variables on the path, and the path is the longest one
  • Subtree where R generates vxy is at most |V| + 2 high.
  • A tree of height |V| + 2 can generate strings of length at most b^(|V|+2) = p

slide-89
SLIDE 89

Non Context Free Languages

B = {a^n b^n c^n | n ≥ 0}

  • Choose s = a^p b^p c^p
  • Find uvxyz, either v or y not empty (2)
  • Two cases:
      • v and y each contain only one type of symbol: impossible to keep the three counts equal
      • v or y contains mixed symbols: impossible to keep the order of symbols

slide-90
SLIDE 90

Non Context Free Languages

C = {a^i b^j c^k | 0 ≤ i ≤ j ≤ k}

  • Choose s = a^p b^p c^p
  • Find uvxyz, either v or y not empty (2)
  • Two cases as before:
      • v and y each contain only one type of symbol: more complex to prove (next slide)
      • v or y contains mixed symbols: impossible to keep the order of symbols

slide-91
SLIDE 91

Non Context Free Languages

C = {a^i b^j c^k | 0 ≤ i ≤ j ≤ k}

  • v and y each contain only one type of symbol
      • a does not appear: u v^0 x y^0 z ∉ C (fewer b's or c's than a's)
      • b does not appear:
          • if a appears, u v^2 x y^2 z ∉ C (more a's than b's)
          • if c appears, u v^0 x y^0 z ∉ C (more b's than c's)
      • c does not appear: u v^2 x y^2 z ∉ C (more a's or b's than c's)
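The case analysis can be sanity-checked by brute force for a small pumping length. The sketch below enumerates every split s = uvxyz with |vy| > 0 and |vxy| ≤ p and tests whether pumping with i = 0 or i = 2 leaves the language. It covers the two languages from the last slides, {a^n b^n c^n | n ≥ 0} and {a^i b^j c^k | 0 ≤ i ≤ j ≤ k}. The function names and the choice p = 3 are assumptions, and checking one value of p is an illustration of the argument, not a proof.

```python
def in_B(w):
    """Membership in {a^n b^n c^n | n >= 0}."""
    n = len(w) // 3
    return w == "a" * n + "b" * n + "c" * n

def in_C(w):
    """Membership in {a^i b^j c^k | 0 <= i <= j <= k}."""
    i = len(w) - len(w.lstrip("a"))     # length of the leading block of a's
    k = len(w) - len(w.rstrip("c"))     # length of the trailing block of c's
    j = len(w) - i - k
    return w == "a" * i + "b" * j + "c" * k and i <= j <= k

def every_split_fails(s, p, member):
    """True if each split s = uvxyz with |vy| > 0 and |vxy| <= p
    leaves the language for i = 0 or i = 2."""
    n = len(s)
    for start in range(n + 1):                           # vxy = s[start:end]
        for end in range(start + 1, min(start + p, n) + 1):
            for v_end in range(start, end + 1):          # v = s[start:v_end]
                for y_start in range(v_end, end + 1):    # y = s[y_start:end]
                    u, v = s[:start], s[start:v_end]
                    x, y, z = s[v_end:y_start], s[y_start:end], s[end:]
                    if not (v or y):
                        continue
                    if all(member(u + v * i + x + y * i + z) for i in (0, 2)):
                        return False    # this split survives both tests
    return True

p = 3
s = "a" * p + "b" * p + "c" * p
print(every_split_fails(s, p, in_B), every_split_fails(s, p, in_C))
```

For a context free language such as {a^n b^n | n ≥ 0} the same check finds a split that survives (for example v = a, y = b around the middle), which is the behaviour the pumping lemma guarantees.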

slide-92
SLIDE 92

Example Exam Question

slide-93
SLIDE 93

Summary

  • Context free grammars
  • Pushdown Automata
  • Equivalence of PDAs and CFGs
  • Non-context free grammars
  • Pumping lemma