chomsky normal form

Chomsky Normal Form - PDF document

A context-free grammar is in Chomsky Normal Form (CNF) if every production is of the form A → BC or A → a, where A, B, and C are variables and a is a terminal.
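The definition above can be checked mechanically. The following is a minimal sketch, not part of the original slides: it assumes a grammar is stored as a Python dict mapping each variable to a list of right-hand sides (tuples of symbols), and that any symbol that is not a key of the dict is a terminal. The name `is_cnf` is hypothetical.

```python
def is_cnf(grammar):
    """Return True if every production is A -> BC or A -> a.

    Assumption of this sketch: grammar maps each variable to a list of
    right-hand sides (tuples of symbols); any symbol that is not a key
    of the dict is treated as a terminal.
    """
    variables = set(grammar)
    for bodies in grammar.values():
        for body in bodies:
            if len(body) == 1 and body[0] not in variables:
                continue  # form A -> a
            if len(body) == 2 and all(s in variables for s in body):
                continue  # form A -> BC
            return False
    return True


# S -> AB, A -> a, B -> b is in CNF.
cnf_grammar = {"S": [("A", "B")], "A": [("a",)], "B": [("b",)]}
# S -> A is a unit production and A -> aA mixes a terminal with a variable.
not_cnf = {"S": [("A", "B"), ("A",)], "A": [("a", "A")], "B": [("b",)]}
```

Here `is_cnf(cnf_grammar)` holds, while `not_cnf` fails on both the unit production and the mixed right-hand side.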


  1. Chomsky Normal Form
     • A context-free grammar is in Chomsky Normal Form (CNF) if every production is of the form:
       – A → BC
       – A → a
     • where A, B, and C are variables and a is a terminal.

     Theory Hall of Fame
     • Noam Chomsky
       – The Grammar Guy
       – Born 1928 in Philadelphia, PA
       – PhD, UPenn (1955), Linguistics
       – Prof at MIT (Linguistics) (1955 - present)
       – Probably more famous for his leftist political views.

     Chomsky Normal Form
     • If we can put a CFG into CNF, then we can calculate the “depth” of the longest branch of a parse tree for the derivation of a string.
     • [Figure: parse tree with nodes A, B, C, and a; at most 2 branches at every node]

     Chomsky Normal Form
     • 3 Step process:
       1. Remove ε-Productions
       2. Remove Unit Productions
       3. Remove Useless Symbols

     Removing ε-Productions
     • An ε-production is a production of the form A → ε
     • Basic idea
       – Very similar to removing ε-transitions from an NFA-ε
       – Find the set of all variables A such that A ⇒* ε (the set of nullable variables)
       – For all productions that contain a nullable variable on the right-hand side, add a production that eliminates the nullable variable from the right-hand side

  2. Removing ε-Productions
     • We must be a bit careful here
       – If ε is in a CFL, then the production S → ε must be in the production set.
       – The algorithm to be described will generate L - {ε}

     Removing ε-Productions
     • Step 1: Find the set of nullable variables
       – Example:
         • S → AB
         • A → aAA | ε
         • B → bBB | ε
       – All variables are nullable
         • A and B are nullable since A → ε and B → ε
         • S is nullable since S → AB and A and B are nullable

     Removing ε-Productions
     • Step 2: Remove nullable variables
       – For all productions A → β where β contains nullable variables, add new productions with each combination of nullable variables removed from β

     Removing ε-Productions
     • Step 2: Remove nullable variables
       – Example:
         • S → AB
         • A → aAA | ε
         • B → bBB | ε
       – All variables are nullable

     Removing ε-Productions
     • Step 2: Remove nullable variables
       – Consider: S → AB
         • Add to P: S → A and S → B
       – Consider: A → aAA
         • Add to P: A → aA and A → a
       – Consider: B → bBB
         • Add to P: B → bB and B → b

     Removing ε-Productions
     • Step 2: Remove nullable variables
       – Our grammar now looks like:
         • S → AB | A | B
         • A → aAA | aA | a | ε
         • B → bBB | bB | b | ε
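Steps 1 and 2 above can be sketched in Python. This is an illustrative sketch under stated assumptions, not code from the slides: the grammar is a dict from variables to lists of right-hand-side tuples, the empty tuple `()` stands for an ε-production, and the function names are hypothetical. For brevity, `remove_epsilon` also drops the ε-productions themselves, which the slides do as a separate Step 3.

```python
from itertools import product


def nullable_variables(grammar):
    """Step 1: fixed-point computation of all A with A =>* epsilon."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # all() over the empty tuple is True, so A -> () marks A nullable.
            if any(all(s in nullable for s in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


def remove_epsilon(grammar):
    """Step 2 (plus dropping epsilon bodies): expand nullable occurrences."""
    nullable = nullable_variables(grammar)
    result = {}
    for head, bodies in grammar.items():
        new_bodies = set()
        for body in bodies:
            # Each nullable occurrence may either be kept or dropped.
            options = [((s,), ()) if s in nullable else ((s,),) for s in body]
            for choice in product(*options):
                new_body = tuple(s for part in choice for s in part)
                if new_body:  # never emit an epsilon production
                    new_bodies.add(new_body)
        result[head] = new_bodies
    return result


# The example from the slides: S -> AB, A -> aAA | eps, B -> bBB | eps.
example = {
    "S": [("A", "B")],
    "A": [("a", "A", "A"), ()],
    "B": [("b", "B", "B"), ()],
}
without_epsilon = remove_epsilon(example)
```

On the slide example, `nullable_variables(example)` contains S, A, and B, and `without_epsilon` corresponds to S → AB | A | B, A → aAA | aA | a, B → bBB | bB | b.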

  3. Removing ε-Productions
     • Step 3: Remove your ε-productions
       – Example:
         • Remove A → ε and B → ε
         • Our final grammar looks like:
           – S → AB | A | B
           – A → aAA | aA | a
           – B → bBB | bB | b
       – Questions?

     Removing Unit Productions
     • A unit production is a production of the form A → B, where A and B are variables
     • Basic idea
       – Very similar to removing ε-productions
       – For each variable A, find the set of all variables B such that A ⇒* B by just following unit productions (the A-derivable variables)
       – For all variables B that are A-derivable and for all productions B → α, add the production A → α

     Removing Unit Productions
     • Step 0: Remove ε-productions using the previous algorithm.

     Removing Unit Productions
     • Step 1: For all variables A, find the set of A-derivable variables
       – Recursive definition of A-derivable:
         1. If A → B, then B is A-derivable.
         2. If C is A-derivable and C → B (and B ≠ A), then B is A-derivable.
         3. No other variables are A-derivable.

     Removing Unit Productions
     • Step 1: For all variables A, find the set of A-derivable variables
       – Example:
         • S → S + T | T
         • T → T * F | F
         • F → (S) | a
       – Let’s find the set of S-derivable variables:
         • T is S-derivable since S → T
         • F is S-derivable since T → F and T is S-derivable

     Removing Unit Productions
     • Step 1: For all variables A, find the set of A-derivable variables
       – Example:
         • S → S + T | T
         • T → T * F | F
         • F → (S) | a
       – S-derivable = {T, F}
       – T-derivable = {F}
       – F-derivable = ∅
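The recursive definition of A-derivable variables amounts to a small graph search over unit productions. The sketch below is an illustration only: the dict-of-tuples grammar encoding and the name `derivable_sets` are assumptions, not something the slides prescribe.

```python
def derivable_sets(grammar):
    """For each variable A, return the set of A-derivable variables.

    Assumption of this sketch: grammar maps variables to lists of
    right-hand-side tuples, and a unit production is a body of length 1
    whose single symbol is itself a variable (a key of the dict).
    """
    variables = set(grammar)
    result = {}
    for start in variables:
        reached = set()
        frontier = [start]
        while frontier:
            current = frontier.pop()
            for body in grammar[current]:
                if len(body) == 1 and body[0] in variables:
                    target = body[0]
                    # Rule 2 of the definition excludes the start variable itself.
                    if target != start and target not in reached:
                        reached.add(target)
                        frontier.append(target)
        result[start] = reached
    return result


# The expression grammar from the slides.
expr = {
    "S": [("S", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "S", ")"), ("a",)],
}
sets = derivable_sets(expr)
```

For the expression grammar this yields S-derivable = {T, F}, T-derivable = {F}, and an empty set for F, matching the slide.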

  4. Removing Unit Productions
     • Step 2: For each variable A, if B is A-derivable, then for each non-unit production B → β, add the production A → β

     Removing Unit Productions
     • Step 2:
       – Example:
         • S → S + T | T
         • T → T * F | F
         • F → (S) | a
       – S-derivable = {T, F}
       – T-derivable = {F}
       – Add to P: S → T * F | (S) | a and T → (S) | a

     Removing Unit Productions
     • Step 2:
       – Our new grammar now looks like:
         • S → S + T | T * F | (S) | a | T
         • T → T * F | (S) | a | F
         • F → (S) | a

     Removing Unit Productions
     • Step 3: Remove Unit Productions
       – Remove S → T, T → F
       – Our final grammar looks like:
         • S → S + T | T * F | (S) | a
         • T → T * F | (S) | a
         • F → (S) | a
       – Questions?

     Removing Useless Symbols
     • A symbol X is useful for a grammar G = (V, T, P, S) if
       – S ⇒* αXβ ⇒* w where w ∈ L(G)
     • In other words, a useful symbol will be used somewhere in the derivation of a string in the language.
     • Any symbol that is not useful is useless.
     • Useless symbols do not add to the language generated by a grammar, so it’s okay to remove them.

     Removing Useless Symbols
     • Definitions:
       – We say a symbol X is generating if:
         • X ⇒* w for some terminal string w ∈ T*
       – We say a symbol X is reachable if:
         • S ⇒* αXβ for some α, β
     • Symbols that are useful must be both generating and reachable.
       – Symbols that fail either test (and their associated productions) can be removed.
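Steps 2 and 3 of unit-production removal can be combined into one pass: for each variable, collect the non-unit bodies of the variable itself and of every variable it reaches through unit productions. This is a sketch under the same assumed dict-of-tuples encoding; `remove_unit_productions` is a hypothetical name.

```python
def remove_unit_productions(grammar):
    """Return an equivalent grammar without unit productions A -> B.

    Assumes epsilon productions were already removed (Step 0) and that
    variables are exactly the keys of the grammar dict.
    """
    variables = set(grammar)

    def is_unit(body):
        return len(body) == 1 and body[0] in variables

    result = {}
    for head in grammar:
        # The head itself plus every variable reachable via unit productions.
        closure, frontier = {head}, [head]
        while frontier:
            current = frontier.pop()
            for body in grammar[current]:
                if is_unit(body) and body[0] not in closure:
                    closure.add(body[0])
                    frontier.append(body[0])
        # Copy over only the non-unit bodies; unit productions are dropped.
        result[head] = {
            tuple(body)
            for var in closure
            for body in grammar[var]
            if not is_unit(body)
        }
    return result


expr = {
    "S": [("S", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "S", ")"), ("a",)],
}
unit_free = remove_unit_productions(expr)
```

The result for S is S + T, T * F, (S), and a, which agrees with the final grammar on the slide.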

  5. Removing Useless Symbols
     • Algorithm:
       1. Eliminate all non-generating symbols
       2. Eliminate all non-reachable symbols from the resultant grammar.

     Removing Useless Symbols
     • Finding generating symbols
       1. All symbols in T are generating
       2. If A → α and all symbols in α are generating, then A is generating.
       3. No other symbols are generating.

     Removing Useless Symbols
     • Finding reachable symbols
       1. S is reachable
       2. If A is reachable and A → α, then all variables in α are reachable.

     Removing Useless Symbols
     • Example:
       – S → AB | a
       – A → b
     • B is useless since it is not generating
     • Eliminate it

     Removing Useless Symbols
     • Example:
       – S → a
       – A → b
     • Now A is not reachable, eliminate it!
       – S → a
     • Note that you must eliminate non-generating symbols before non-reachable symbols.

     Recall our goal
     • Chomsky Normal Form
       – A context-free grammar is in Chomsky Normal Form (CNF) if every production is of the form:
         • A → BC
         • A → a
       – where A, B, and C are variables and a is a terminal.
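The two-pass algorithm above (non-generating symbols first, then non-reachable ones) can be sketched as follows. The grammar encoding and the name `remove_useless` are assumptions of this sketch; note that a variable with no productions, such as B in the slide example, must still appear as a key with an empty list so that it is not mistaken for a terminal.

```python
def remove_useless(grammar, start):
    """Drop non-generating variables, then non-reachable ones."""
    variables = set(grammar)

    def body_ok(body, good):
        # Terminals (non-keys) are generating by definition.
        return all(s in good or s not in variables for s in body)

    # Pass 1: fixed point of the generating variables.
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head not in generating and any(body_ok(b, generating) for b in bodies):
                generating.add(head)
                changed = True
    pruned = {
        head: [b for b in bodies if body_ok(b, generating)]
        for head, bodies in grammar.items()
        if head in generating
    }

    # Pass 2: variables reachable from the start symbol in the pruned grammar.
    reachable, frontier = set(), []
    if start in pruned:
        reachable.add(start)
        frontier.append(start)
    while frontier:
        current = frontier.pop()
        for body in pruned[current]:
            for s in body:
                if s in pruned and s not in reachable:
                    reachable.add(s)
                    frontier.append(s)
    return {head: pruned[head] for head in pruned if head in reachable}


# Slide example: S -> AB | a, A -> b, and B has no productions at all.
example = {"S": [("A", "B"), ("a",)], "A": [("b",)], "B": []}
cleaned = remove_useless(example, "S")
```

Running the passes in this order removes B first (so S → AB disappears), after which A is no longer reachable, leaving only S → a as on the slide.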

  6. Chomsky Normal Form
     • Given a CFG G, there is an equivalent CFG G’ in Chomsky Normal Form such that
       – L(G’) = L(G) - {ε}

     Chomsky Normal Form
     • Step 1:
       – Remove ε-Productions
     • Step 2:
       – Remove Unit Productions
     • Step 3:
       – Remove Useless Symbols

     Chomsky Normal Form
     • After steps 1 - 3:
       – All productions are of the form:
         • A → a, where A is a variable and a is a terminal
         • A → β, where |β| ≥ 2 and β contains variables and/or terminals.
       – Step 4: Derive terminals from new variables:
         • For all productions of the 2nd type, A → β, and for all terminals a in β, create a new variable X_a
         • Add a new production X_a → a
         • Replace a in β with X_a

     Chomsky Normal Form
     • Step 4:
       – Let’s go back to our first example:
         – S → AB | A | B
         – A → aAA | aA | a
         – B → bBB | bB | b
       • Removing unit productions:
         – S → AB | aAA | aA | a | bBB | bB | b
         – A → aAA | aA | a
         – B → bBB | bB | b
       • Note that S, A, and B are all useful.

     Chomsky Normal Form
     • Step 4:
       – Define new productions X_a → a and X_b → b, and replace each instance of a with X_a, similarly for b
         – S → AB | aAA | aA | a | bBB | bB | b
         – A → aAA | aA | a
         – B → bBB | bB | b
       • New:
         – S → AB | X_a AA | X_a A | a | X_b BB | X_b B | b
         – A → X_a AA | X_a A | a
         – B → X_b BB | X_b B | b
         – X_a → a
         – X_b → b

     Chomsky Normal Form
     • After steps 1 - 4:
       – All productions are of the form:
         • A → a, where A is a variable and a is a terminal
         • A → β, where |β| ≥ 2 and β contains only variables.
       – Step 5:
         • For all productions of type 2 where |β| > 2, replace the production with a series of new productions, each having exactly 2 variables on the right
         • Best illustrated with an example
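Step 4 above can be sketched in a few lines of Python. As before, this is an illustration under assumptions rather than the slides' own code: the grammar is a dict of right-hand-side tuples, the name `lift_terminals` is hypothetical, and the fresh variable names `X_a`, `X_b` are assumed not to clash with existing variables.

```python
def lift_terminals(grammar):
    """Replace terminals inside bodies of length >= 2 by new variables X_a."""
    variables = set(grammar)
    result = {}
    new_rules = {}
    for head, bodies in grammar.items():
        new_bodies = []
        for body in bodies:
            if len(body) >= 2:
                replaced = []
                for s in body:
                    if s in variables:
                        replaced.append(s)
                    else:
                        # Assumed naming scheme: terminal a gets variable X_a.
                        name = "X_" + s
                        new_rules[name] = [(s,)]
                        replaced.append(name)
                body = tuple(replaced)
            # Bodies of length 1 (A -> a) are left untouched.
            new_bodies.append(body)
        result[head] = new_bodies
    result.update(new_rules)
    return result


# The running example after steps 1 - 3.
example = {
    "S": [("A", "B"), ("a", "A", "A"), ("a", "A"), ("a",),
          ("b", "B", "B"), ("b", "B"), ("b",)],
    "A": [("a", "A", "A"), ("a", "A"), ("a",)],
    "B": [("b", "B", "B"), ("b", "B"), ("b",)],
}
lifted = lift_terminals(example)
```

After the call, every body of length two or more contains only variables, and the grammar gains the productions X_a → a and X_b → b.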

  7. Chomsky Normal Form
     • Step 5:
       – The production:
         • A → BCDBCE
       – Would be replaced with
         • A → BY_1
         • Y_1 → CY_2
         • Y_2 → DY_3
         • Y_3 → BY_4
         • Y_4 → CE

     Chomsky Normal Form
     • Step 5:
       – Back to our example
         – S → AB | X_a AA | X_a A | a | X_b BB | X_b B | b
         – A → X_a AA | X_a A | a
         – B → X_b BB | X_b B | b
         – X_a → a
         – X_b → b
       – Add productions
         • Y_1 → AA
         • Y_2 → BB

     Chomsky Normal Form
     • Step 5:
       – Our final grammar
         – S → AB | X_a Y_1 | X_a A | a | X_b Y_2 | X_b B | b
         – A → X_a Y_1 | X_a A | a
         – B → X_b Y_2 | X_b B | b
         – Y_1 → AA
         – Y_2 → BB
         – X_a → a
         – X_b → b
       – Questions?

     CNF
     • Any grammar can be placed into CNF
     • Why bother?
       – Remember that awful CFG we generated last week?
         • Simplification
       – Gives an upper limit on the size of a parse tree
         • The Pumping Lemma will need this.
     • Questions?
     • Next time
       – The Return of the pumping lemma
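The final step, splitting long right-hand sides into chains of two-variable productions, can be sketched as below. The dict-of-tuples encoding, the name `binarize`, and the fresh names `Y1`, `Y2`, ... (assumed not to clash with existing variables) are choices of this sketch. It reuses one fresh variable per distinct tail, which is how the slides' example ends up sharing Y_1 → AA between S and A.

```python
def binarize(grammar):
    """Replace every body longer than 2 by a chain of 2-symbol bodies."""
    result = {head: [] for head in grammar}
    tail_names = {}  # maps a tail (tuple of symbols) to its fresh variable

    def shorten(body):
        if len(body) <= 2:
            return tuple(body)
        # Keep the first symbol; a fresh variable derives the rest.
        return (body[0], name_for(tuple(body[1:])))

    def name_for(tail):
        if tail not in tail_names:
            fresh = f"Y{len(tail_names) + 1}"  # assumed-unused name
            tail_names[tail] = fresh
            result[fresh] = [shorten(tail)]
        return tail_names[tail]

    for head, bodies in grammar.items():
        for body in bodies:
            result[head].append(shorten(body))
    return result


# The standalone production from the slide: A -> BCDBCE.
chain = binarize({"A": [("B", "C", "D", "B", "C", "E")]})

# The running example after terminals were replaced by X_a and X_b.
example = {
    "S": [("A", "B"), ("X_a", "A", "A"), ("X_a", "A"), ("a",),
          ("X_b", "B", "B"), ("X_b", "B"), ("b",)],
    "A": [("X_a", "A", "A"), ("X_a", "A"), ("a",)],
    "B": [("X_b", "B", "B"), ("X_b", "B"), ("b",)],
    "X_a": [("a",)],
    "X_b": [("b",)],
}
final = binarize(example)
```

For A → BCDBCE the sketch produces A → B Y1, Y1 → C Y2, Y2 → D Y3, Y3 → B Y4, Y4 → C E, and on the running example it introduces Y1 → AA and Y2 → BB.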
