MA/CSSE 474 Theory of Computation: Bottom-up Parsing, Pumping Theorem for CFLs



SLIDE 1

Bottom-up parsing Pumping Theorem for CFLs

MA/CSSE 474 Theory of Computation

Recap: Going One Way

Lemma: Each context-free language is accepted by some PDA. Proof (by construction): The idea: Let the stack do the work. Two approaches:

  • Top down
  • Bottom up
SLIDE 2

Top-down VS Bottom-up

Approach                       Top-down        Bottom-up
Read the input string          left-to-right   left-to-right
Derivation                     leftmost        rightmost
Order of derivation discovery  forward         backward

Bottom-Up PDA

A top-down parser discovers a leftmost derivation of the input string (if any). A bottom-up parser discovers a rightmost derivation (in reverse order).

The outline of M is: M = ({p, q}, Σ, V, Δ, p, {q}), where Δ contains:

  • The shift transitions: ((p, c, ε), (p, c)), for each c ∈ Σ.
  • The reduce transitions: ((p, ε, (s1s2…sn)^R), (p, X)), for each rule X → s1s2…sn in G. Each one undoes an application of this rule.
  • The finish-up transition: ((p, ε, S), (q, ε)).
SLIDE 3

Bottom-Up PDA

(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → (E)
(6) F → id

Reduce Transitions:
(1) (p, ε, T + E), (p, E)
(2) (p, ε, T), (p, E)
(3) (p, ε, F * T), (p, T)
(4) (p, ε, F), (p, T)
(5) (p, ε, )E( ), (p, F)
(6) (p, ε, id), (p, F)

Shift Transitions:
(7) (p, id, ε), (p, id)
(8) (p, (, ε), (p, ()
(9) (p, ), ε), (p, ))
(10) (p, +, ε), (p, +)
(11) (p, *, ε), (p, *)

The idea: Let the stack keep track of what has been found. Discover a rightmost derivation in reverse order. Start with the string of terminals and attempt to "pull it back" (reduce) to S. When the right side of a production is on the top of the stack, we can replace it by the left side of that production… …or not! That's where the nondeterminism comes in: choice between shift and reduce; choice between two reductions.

Example: id + id * id

Hidden during class, revealed later: Solution to bottom-up example

A bottom-up parser is sometimes called a shift-reduce parser. Show how it works on id + id * id

State  Stack     Remaining input  Transition to use
p      ε         id + id * id     7
p      id        + id * id        6
p      F         + id * id        4
p      T         + id * id        2
p      E         + id * id        10
p      +E        id * id          7
p      id+E      * id             6
p      F+E       * id             4
p      T+E       * id             11
p      *T+E      id               7
p      id*T+E    ε                6
p      F*T+E     ε                3
p      T+E       ε                1
p      E         ε                finish-up
q      ε         ε

Note that the top of the stack is on the left. This is what I should have done in the class for sections 1 and 2 (and I did do it for section 3).
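The nondeterministic choices in the trace above (shift or reduce, and which reduction) can be explored mechanically. The following is a minimal sketch, not part of the original slides: it searches breadth-first over configurations of the bottom-up PDA for the expression grammar. The names `RULES`, `START`, and `accepts` are illustrative assumptions, and the stack is stored with its top on the right, so right-hand sides are matched unreversed.

```python
from collections import deque

# Grammar from the slide: (left side, right side as a tuple of symbols).
RULES = [
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
START = "E"


def accepts(tokens):
    """Breadth-first search over configurations (stack, input position).

    The stack is a tuple whose LAST element is the top. The slides write the
    top on the left, which is why they show each right side reversed.
    The search terminates here because no rule has an empty right side,
    so the stack never grows longer than the input.
    """
    tokens = tuple(tokens)
    start = ((), 0)
    seen = {start}
    queue = deque([start])
    while queue:
        stack, pos = queue.popleft()
        # Finish-up: only the start symbol remains and the input is consumed.
        if stack == (START,) and pos == len(tokens):
            return True
        successors = []
        # Shift: push the next input symbol onto the stack.
        if pos < len(tokens):
            successors.append((stack + (tokens[pos],), pos + 1))
        # Reduce: replace a right side on top of the stack by its left side.
        for left, right in RULES:
            cut = len(stack) - len(right)
            if cut >= 0 and stack[cut:] == right:
                successors.append((stack[:cut] + (left,), pos))
        for config in successors:
            if config not in seen:
                seen.add(config)
                queue.append(config)
    return False


print(accepts(["id", "+", "id", "*", "id"]))
```

A real shift-reduce parser avoids this search by using lookahead to decide between shifting and reducing; the exhaustive search is only meant to mirror the nondeterminism of the PDA.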

SLIDE 4

Acceptance by PDA → derived from CFG

  • Much more complex than the other direction.
  • Nonterminals in the grammar that we build from the PDA M are based on a combination of M's states and stack symbols.

  • It gets very messy.
  • Takes 9½ dense pages in the textbook (265-274).
  • I think we can use our limited course time better.

How Many Context-Free Languages Are There?

(we had a slide just like this for regular languages) Theorem: For any finite input alphabet Σ, there is a countably infinite number of CFLs over Σ. Proof:

  • Upper bound: we can lexicographically enumerate all the CFGs.
  • Lower bound: Each of {a}, {aa}, {aaa}, … is a CFL.

The number of languages over Σ is uncountable. Thus there are more languages than there are context-free languages. So there must be some languages that are not context-free.

SLIDE 5

Languages That Are and Are Not Context-Free

a*b* is regular.
A^nB^n = {a^n b^n : n ≥ 0} is context-free but not regular.
A^nB^nC^n = {a^n b^n c^n : n ≥ 0} is not context-free. We will show this soon.
Is every regular language also context-free?

Showing that L is Context-Free

Techniques for showing that a language L is context-free:

  • 1. Exhibit a CFG for L.
  • 2. Exhibit a PDA for L.
  • 3. Use the closure properties of context-free languages.

Unfortunately, these are weaker than they are for regular languages.
CFLs are closed under: union, reverse, concatenation, Kleene star, intersection of a CFL with a regular language.
CFLs are NOT closed under: intersection, complement, set difference.

SLIDE 6

CFL Pumping Theorem

Show that L is Not Context-Free

Recall the basis for the pumping theorem for regular languages: A DFSM M. If a string is longer than the number of M's states… Why would it be hard to use a PDA to show that long strings from a CFL can be pumped?

SLIDE 7

Some Tree Geometry Basics

The height h of a tree is the length of the longest path from the root to any leaf.
The branching factor b of a tree is the largest number of children associated with any node in the tree.
Theorem: The length of the yield (concatenation of leaf nodes) of any tree T with height h and branching factor b is ≤ b^h.

Shown in CSSE 230.
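As a quick sanity check of the bound (not a proof), the three quantities can be computed on a small tree. This is a sketch under an assumed representation that does not come from the slides: a tree is a nested list of its children, and a leaf is the empty list. The function names are illustrative.

```python
def height(tree):
    """Length, in edges, of the longest path from the root to a leaf."""
    return 0 if not tree else 1 + max(height(child) for child in tree)


def branching_factor(tree):
    """Largest number of children of any node in the tree."""
    return max([len(tree)] + [branching_factor(child) for child in tree])


def yield_length(tree):
    """Number of leaves, i.e. the length of the yield."""
    return 1 if not tree else sum(yield_length(child) for child in tree)


# Root with three children: a node with 2 leaves, a node with 3 leaves, a leaf.
example = [[[], []], [[], [], []], []]

h = height(example)
b = branching_factor(example)
print(h, b, yield_length(example), yield_length(example) <= b ** h)
```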

A Review of Parse Trees

A parse tree (a.k.a. derivation tree) derived from a grammar G = (V, Σ, R, S) is a rooted, ordered tree in which:

  • Every leaf node is labeled with an element of Σ ∪ {ε},
  • The root node is labeled S,
  • Every interior node is labeled with an element of N (i.e., V - Σ),
  • If m is a non-leaf node labeled X and the children of m (left-to-right on the tree) are labeled x1, x2, …, xn, then the rule X → x1 x2 … xn is in R.

SLIDE 8

From Grammars to Trees

Given a context-free grammar G:

  • Let n be the number of nonterminal symbols in G.
  • Let b be the branching factor of G.

Suppose that a tree T is generated by G and no nonterminal appears more than once on any path from the root:

The maximum height of T is: ___
The maximum length of T’s yield is: ___

The Context-Free Pumping Theorem

We use parse trees, not machines, as the basis for our argument. Let L = L(G), and let w ∈ L. Let T be a parse tree for w that has the smallest possible number of nodes among all trees based on a derivation of w from G. Suppose L(G) contains a string w such that |w| is greater than b^n. Then its parse tree must look like (for some nonterminal X):

(Figure not reproduced: a parse tree in which the nonterminal X occurs twice on one root-to-leaf path, as X[1] above X[2].)

X[1] is the lowest place in the tree for which this happens. I.e., there is no other X in the derivation of x from X[2].

SLIDE 9

The Context-Free Pumping Theorem

There is another derivation in G: S ⇒* uXz ⇒* uxz, in which, at X[1], the nonrecursive rule that leads to x is used instead of the recursive one that leads to vXy. So uxz is also in L(G).

Derivation of w

The Context-Free Pumping Theorem

There are infinitely many derivations in G, such as: S ⇒* uXz ⇒* uvXyz ⇒* uvvXyyz ⇒* uvvxyyz
Those derivations produce the strings: uv^2xy^2z, uv^3xy^3z, uv^4xy^4z, …
So all of those strings are also in L(G).
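To make the pumped strings concrete, here is a small sketch using a grammar that is an assumption for illustration and does not appear on these slides: S → aSb | ε, which generates {a^n b^n : n ≥ 0}. Taking X = S and the derivation S ⇒ aSb ⇒ aaSbb ⇒ aabb gives u = a, v = a, x = ε, y = b, z = b. The helper names `pump` and `in_anbn` are hypothetical.

```python
def pump(u, v, x, y, z, q):
    """Return the string u v^q x y^q z."""
    return u + v * q + x + y * q + z


def in_anbn(s):
    """Membership test for {a^n b^n : n >= 0}."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


# Decomposition of w = aabb taken from the derivation S => aSb => aaSbb => aabb.
u, v, x, y, z = "a", "a", "", "b", "b"

pumped = [pump(u, v, x, y, z, q) for q in range(5)]
print(pumped)
print(all(in_anbn(s) for s in pumped))
```

The case q = 0 corresponds to the shorter derivation that skips the recursive rule at X[1], and q = 1 gives back w itself.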

SLIDE 10

The Context-Free Pumping Theorem

If rule1 is X → Xa, we could have v = ε. If rule1 is X → aX, we could have y = ε. But it is not possible that both v and y are ε. If they were, then the derivation S ⇒* uXz ⇒* uxz would also yield w and it would create a parse tree with fewer nodes. But that contradicts the assumption that we started with a parse tree for w with the smallest possible number of nodes.

The Context-Free Pumping Theorem

The height of the subtree rooted at [1] is at most: ___
So |vxy| ≤ ___.

SLIDE 11

The Context-Free Pumping Theorem

If L is a context-free language, then ∃k ≥ 1 (∀ strings w ∈ L, where |w| ≥ k (∃u, v, x, y, z (w = uvxyz, vy ≠ ε, |vxy| ≤ k, and ∀q ≥ 0 (uv^q x y^q z is in L)))).

Write it in contrapositive form. Try to do this before going on.

Pumping Theorem contrapositive

  • We want to write it in contrapositive form, so we can use it to show a language is NOT context-free.

Original: If L is a context-free language, then ∃k ≥ 1 (∀ strings w ∈ L, where |w| ≥ k (∃u, v, x, y, z (w = uvxyz, vy ≠ ε, |vxy| ≤ k, and ∀q ≥ 0 (uv^q x y^q z is in L)))).

Contrapositive: If ∀k ≥ 1 (∃ string w ∈ L, where |w| ≥ k (∀u, v, x, y, z (if w = uvxyz, vy ≠ ε, and |vxy| ≤ k, then ∃q ≥ 0 (uv^q x y^q z is not in L)))), then L is not a CFL.

SLIDE 12

Regular vs. CF Pumping Theorems

Similarities:

  • We don't get to choose k.
  • We choose w, the string to be pumped, based on k.
  • We don't get to choose how w is broken up (into xyz or uvxyz).
  • We choose a value for q that shows that w isn’t pumpable.
  • We may apply closure theorems before we start.

Things that are different in CFL Pumping Theorem:

  • Two regions, v and y, must be pumped in tandem.
  • We don’t know anything about where v and y will fall in the string w. All we know is that they are reasonably “close together”, i.e., |vxy| ≤ k.

  • Either v or y may be empty, but not both.


Example: A^nB^nC^n

SLIDE 13

An Example of Pumping: A^nB^nC^n

A^nB^nC^n = {a^n b^n c^n : n ≥ 0}
Choose w = a^k b^k c^k (we don't get to choose the k)
            1 | 2 | 3 (the regions: all a's, all b's, all c's)

If either v or y spans two regions, then let q = 2 (i.e., pump in once). The resulting string will have letters out of order and thus not be in A^nB^nC^n.

If both v and y each contain only one distinct character, set q to 2. Additional copies of at most two different characters are added, leaving the third unchanged. We no longer have equal numbers of the three letters, so the resulting string is not in A^nB^nC^n.
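The case analysis can be checked by brute force for a small, fixed k. The sketch below is illustrative only: it enumerates every split w = uvxyz with vy ≠ ε and |vxy| ≤ k and confirms that some q in a short list breaks membership. The function names are assumptions, and a finite check for one k is evidence, not a proof. For contrast, the same check is run on a^k b^k, where a split that pumps does exist.

```python
def in_anbncn(s):
    """Membership test for {a^n b^n c^n : n >= 0}."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n


def in_anbn(s):
    """Membership test for {a^n b^n : n >= 0}."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


def decompositions(w, k):
    """Yield every (u, v, x, y, z) with w = uvxyz, vy nonempty, |vxy| <= k."""
    n = len(w)
    for start in range(n + 1):                  # u = w[:start]
        limit = min(start + k, n)               # vxy must end by here
        for v_end in range(start, limit + 1):
            for x_end in range(v_end, limit + 1):
                for y_end in range(x_end, limit + 1):
                    if (v_end - start) + (y_end - x_end) > 0:
                        yield (w[:start], w[start:v_end], w[v_end:x_end],
                               w[x_end:y_end], w[y_end:])


def every_split_fails(w, k, member, qs=(0, 2)):
    """True if, for every allowed split, some q in qs leaves the language."""
    return all(
        any(not member(u + v * q + x + y * q + z) for q in qs)
        for u, v, x, y, z in decompositions(w, k)
    )


k = 4
print(every_split_fails("a" * k + "b" * k + "c" * k, k, in_anbncn))
print(every_split_fails("a" * k + "b" * k, k, in_anbn))
```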

An Example of Pumping: {a^(n^2) : n ≥ 0}

L = {a^(n^2) : n ≥ 0}
The elements of L:

n   w
0   ε
1   a^1
2   a^4
3   a^9
4   a^16
5   a^25
6   a^36
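The table suggests why pumping fails: the gaps between consecutive squares keep growing. One standard way to use this (an assumption about how the argument continues, since the slide only lists the elements) is to take w = a^(k^2). Pumping in once adds |vy| = m characters with 1 ≤ m ≤ k, and k^2 < k^2 + m < (k+1)^2, so the new length is not a perfect square. The sketch below checks that inequality numerically for small k; the function names are illustrative.

```python
from math import isqrt


def is_square(n):
    """True if n is a perfect square."""
    return isqrt(n) ** 2 == n


def pumping_once_always_fails(k):
    """For w = a^(k^2), check every possible |vy| = m with 1 <= m <= k.

    Because the alphabet has one letter, only the length of vy matters:
    pumping with q = 2 produces a string of length k*k + m.
    """
    return all(not is_square(k * k + m) for m in range(1, k + 1))


print([n * n for n in range(7)])    # lengths of the strings in the table
print(all(pumping_once_always_fails(k) for k in range(1, 50)))
```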

SLIDE 14

Nested and Cross-Serial Dependencies

PalEven = {ww^R : w ∈ {a, b}*}
a a b b a a
The dependencies are nested.

WcW = {wcw : w ∈ {a, b}*}
a a b c a a b
Cross-serial dependencies.

Work with another student on these

  • WcW = {wcw : w ∈ {a, b}*}
  • {(ab)^n a^n b^n : n > 0}
  • {x#y : x, y ∈ {0, 1}* and x ≠ y}