1 Algorithms for Context-Free Languages

The parsing problem is, given a string w and a context-free grammar G, to decide if w ∈ L(G), and if so, to produce a parse tree for it. How fast can this be done in general? One can put G into a special form called Chomsky normal form that makes parsing easier. It’s still too slow for large programs, but it can be useful for rapid prototyping in the early stages of language development. Any context-free grammar can be put into Chomsky normal form, roughly speaking.

Definition 1.1  A context-free grammar G = (V, Σ, R, S) is in Chomsky normal form if the right-hand sides of all rules have length 2.

Note that if G is in Chomsky normal form then L(G) cannot contain any strings of length 0 or 1.

Theorem 1.1  For any context-free grammar G there is a context-free grammar G′ in Chomsky normal form such that L(G′) = L(G) − (Σ ∪ {ε}). Thus L(G) and L(G′) agree on strings of length greater than one. Also, G′ can be obtained from G in polynomial time.

1.1 Transforming to Chomsky Normal Form on an Example

We will just show the transformation on an example to illustrate the idea of the proof. Consider this context-free grammar:

    S → SS
    S → (S)
    S → ε

The start symbol is S.
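To make the later steps concrete, here is one possible way to represent this grammar in code. This is only a sketch; the names EXAMPLE_RULES and START, the choice of Python, and the tuple-based encoding are all assumptions made here, not part of the notes. Each rule is a pair (left-hand side, right-hand side), with the right-hand side given as a tuple of symbols and the empty tuple standing for ε. The sketches after the later steps operate on sets of such pairs.

    # Hypothetical representation: a rule A -> X1 ... Xk is the pair
    # ("A", ("X1", ..., "Xk")); the empty tuple () stands for ε.
    EXAMPLE_RULES = {
        ("S", ("S", "S")),       # S -> SS
        ("S", ("(", "S", ")")),  # S -> (S)
        ("S", ()),               # S -> ε
    }
    START = "S"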

1.1.1 Step 1

The first step is to eliminate rules whose right-hand side has length greater than 2. Here there is just one rule like that: S → (S). This is split up into smaller rules whose right-hand sides have length 2. This yields the following grammar:

    S → SS
    S → (S1
    S1 → S)
    S → ε

Note that S1 is a new nonterminal. How would you split up the rule S → UVXY? (One way is sketched below.)
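One way to implement this splitting step in general is the following sketch. It works on the rule representation introduced earlier and simply names the fresh nonterminals X1, X2, ...; the notes use the name S1 instead, and the naming scheme is an arbitrary choice made here.

    from itertools import count

    def binarize(rules):
        """Step 1: split every rule whose right-hand side is longer than
        two symbols into a chain of rules with right-hand sides of
        length 2, introducing fresh nonterminals (named X1, X2, ... here)."""
        fresh = count(1)
        result = set()
        for lhs, rhs in rules:
            while len(rhs) > 2:
                new_nt = "X%d" % next(fresh)          # a fresh nonterminal
                result.add((lhs, (rhs[0], new_nt)))   # keep the first symbol
                lhs, rhs = new_nt, rhs[1:]            # continue on the rest
            result.add((lhs, rhs))
        return result

    # For example, S -> UVXY becomes S -> U X1, X1 -> V X2, X2 -> X Y:
    print(sorted(binarize({("S", ("U", "V", "X", "Y"))})))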

1.1.2 Step 2

The next step is to eliminate rules whose right-hand side is ε. This is done by substituting them into other rules so that they are not needed.

• For example, the rule S → ε can be substituted into S → SS: we replace one of the S's on the right-hand side with ε, yielding S → S. Of course, this rule is unnecessary.
• We can also do this on the rule S1 → S), yielding the rule S1 → ). This is a new rule that should be kept.
• After this step, all rules with ε on the right-hand side can be removed, giving this grammar:

    S → S
    S → SS
    S → (S1
    S1 → S)
    S1 → )

Of course, the first rule can be eliminated, giving this grammar:

    S → SS
    S → (S1
    S1 → S)
    S1 → )

1.1.3 Step 3

Now all rules have right-hand sides of length one or two. It is necessary to eliminate the rules whose right-hand side has length one. This can be done by substituting as before; however, it may be necessary to do a chain of substitutions, if one has something like X → Y and Y → Z and X occurs on the right-hand side of some rule.

• In our grammar, we can substitute the rule S1 → ) into the rule S → (S1, obtaining the rule S → ().
• Then the rule S1 → ) can be eliminated. This yields the following grammar:

    S → SS
    S → (S1
    S1 → S)
    S → ()

This grammar is in Chomsky normal form, and we are done.
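Steps 2 and 3 both rely on the same basic operation: take a rule whose right-hand side has length 0 or 1 and substitute that right-hand side for an occurrence of its left-hand side in the other rules. Below is a minimal sketch of one round of this substitution against the rule representation used above. The function name and the guards against re-introducing ε-rules or trivial rules like S → S are choices made here; a full transformation would repeat this until no new rules appear (to handle chains) and then discard the short rules, as described in the text.

    def substitute(rules, lhs, short_rhs):
        """Substitute the rule lhs -> short_rhs, where short_rhs has length
        0 (an ε-rule, Step 2) or 1 (a single-symbol rule, Step 3), into every
        right-hand side in which lhs occurs, and return the rules this adds."""
        new_rules = set()
        for head, rhs in rules:
            for i, symbol in enumerate(rhs):
                if symbol == lhs:
                    new_rhs = rhs[:i] + tuple(short_rhs) + rhs[i + 1:]
                    # Skip new ε-rules and useless rules of the form A -> A.
                    if new_rhs and new_rhs != (head,):
                        new_rules.add((head, new_rhs))
        return new_rules

    # Step 3 on the example: substituting S1 -> ) into S -> (S1 yields S -> ().
    rules = {("S", ("S", "S")), ("S", ("(", "S1")),
             ("S1", ("S", ")")), ("S1", (")",))}
    print(substitute(rules, "S1", (")",)))   # {('S', ('(', ')'))}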

1.2 Time to Parse in Chomsky Normal Form

• Note that in a Chomsky normal form grammar, each replacement makes a string longer by one symbol.
• In our grammar, we have a derivation like this: S ⇒ SS ⇒ (S1S ⇒ (S)S ⇒ ..., and note that each string is one symbol longer than the one before. So, for example, a derivation of a string of length four has exactly three replacements in it.
• This gives a way to decide if a string w is in L(G): just compute the length n of w and look at all derivations of length n − 1 to see if w can be derived (a brute-force sketch of this is given below).
• However, this is very inefficient. It is possible to do much better. In fact, it can be done in O(n³) time.
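A brute-force recognizer along these lines might look like the sketch below. It is a sketch only, reusing the rule representation from earlier; it assumes every right-hand side has length exactly 2 (so each derivation step adds exactly one symbol) and that w has length at least 2, and it takes exponential time.

    def naive_member(rules, start, w):
        """Enumerate every sentential form reachable from the start symbol in
        exactly len(w) - 1 replacement steps and check whether w is among
        them.  Correct for this normal form, but exponentially slow."""
        forms = {(start,)}
        for _ in range(len(w) - 1):
            next_forms = set()
            for form in forms:
                for i, symbol in enumerate(form):
                    for lhs, rhs in rules:
                        if lhs == symbol:
                            # Replace one occurrence of lhs by rhs.
                            next_forms.add(form[:i] + rhs + form[i + 1:])
            forms = next_forms
        return tuple(w) in forms

    # The Chomsky-normal-form grammar produced above recognizes "(())":
    CNF_RULES = {("S", ("S", "S")), ("S", ("(", "S1")),
                 ("S1", ("S", ")")), ("S", ("(", ")"))}
    print(naive_member(CNF_RULES, "S", "(())"))   # True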

The following gives the idea of the method:

[Illustration: the string ((())()(())) with its substrings labeled by the nonterminals (S and S1) that derive them.]

• The idea is that, to parse a string w, one considers all substrings v of w in order of size, and finds all nonterminals X such that X ⇒* v.
• Each such substring v has to be split up as v1v2 in all possible ways,
• and for each way it is necessary to consider all X1 and X2 such that X1 ⇒* v1 and X2 ⇒* v2, and also all productions X → X1X2 in the grammar.
• The number of substrings of w is O(n²).
• For each substring there are O(n) ways to split it up into v1v2.

• For each way of splitting it there is a constant amount of work (for a fixed grammar), so the total work is O(n³).

Here is an illustration of how the method works on a substring in general:

[Illustration: a substring of ((())()(())) split into two pieces, the first derived by a nonterminal A and the second by a nonterminal B, combined by a production C → AB.]
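A recognizer built directly on this idea might look like the sketch below. It is a minimal sketch, not the notes' own code: it reuses the rule representation from earlier, assumes every right-hand side has length exactly 2 (with right-hand-side symbols that may be terminals or nonterminals, as in the grammar above), and only answers the membership question rather than building a parse tree.

    from collections import defaultdict

    def cyk_member(rules, start, w):
        """Decide whether w is in L(G) for a grammar in the normal form used
        here, in O(n^3) time for a fixed grammar.

        derives[(i, j)] is the set of nonterminals X with X =>* w[i:j]."""
        n = len(w)
        derives = defaultdict(set)
        # Consider all substrings w[i:j] in order of increasing length.
        for length in range(2, n + 1):
            for i in range(n - length + 1):
                j = i + length
                # Split w[i:j] as v1 v2 in all possible ways ...
                for k in range(i + 1, j):
                    # ... and try every production X -> X1 X2.  A right-hand
                    # side symbol matches a piece either because it equals
                    # that piece (a terminal) or because it derives it.
                    for lhs, (x1, x2) in rules:
                        v1_ok = (w[i:k] == x1) or (x1 in derives[(i, k)])
                        v2_ok = (w[k:j] == x2) or (x2 in derives[(k, j)])
                        if v1_ok and v2_ok:
                            derives[(i, j)].add(lhs)
        return start in derives[(0, n)]

    CNF_RULES = {("S", ("S", "S")), ("S", ("(", "S1")),
                 ("S1", ("S", ")")), ("S", ("(", ")"))}
    print(cyk_member(CNF_RULES, "S", "((())()(()))"))   # True

The three nested loops over i, j (via the substring length) and the split point k give the O(n³) bound stated above; the inner loop over productions contributes only a constant factor for a fixed grammar.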
