
Kallmeyer/Maier Sommersemester 2009

Parsing beyond context-free grammar: Simple RCG: Simplifying the Grammar

Laura Kallmeyer, Wolfgang Maier, summer semester 2009

Parsing beyond CFG 1 LCFRS/MCFG/sRCG III Kallmeyer/Maier Sommersemester 2009

Overview

  • 1. Elimination of useless rules
  • 2. Elimination of ε-Rules
  • 3. Ordering
  • 4. Binarization
  • 5. Conclusion


Elimination of useless rules (1)

Boullier (1998)

A useless rule is a rule which cannot be used in any derivation S(⟨0, n⟩) ⇒* ε for any w ∈ T∗ (with n = |w|).

Elimination (similar to the CFG case) proceeds in two steps:

  • 1. Compute the set N_T of all symbols A ∈ N that can lead to a tuple of terminal strings, using the following deduction rules:

        ─────   A(α⃗) → ε ∈ P
         [A]

        [A1], . . ., [Am]
        ─────────────────   A(α⃗) → A1(α⃗1) . . . Am(α⃗m) ∈ P
               [A]

Rules that contain non-terminals not in N_T are eliminated.


Elimination of useless rules (2)

  • 2. In the resulting simple RCG, eliminate unreachable rules: compute the set N_S of non-terminals reachable from S, using the following deduction rules:

        ─────
         [S]

               [A]
        ─────────────────   A(α⃗) → A1(α⃗1) . . . Am(α⃗m) ∈ P
        [A1], . . ., [Am]

Rules that contain non-terminals not in N_S are eliminated.
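The two steps (productive non-terminals, then reachable non-terminals) can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: rules are assumed to be encoded as pairs `(lhs_name, [rhs_names])`, with the arguments left out because they play no role in this computation, and the function names `productive`, `reachable` and `remove_useless` are hypothetical.

```python
def productive(rules):
    """Step 1: compute N_T, the non-terminals that can derive terminal tuples."""
    n_t = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            # A rule with an empty rhs is the axiom case A(...) -> epsilon.
            if lhs not in n_t and all(b in n_t for b in rhs):
                n_t.add(lhs)
                changed = True
    return n_t


def reachable(rules, start):
    """Step 2: compute N_S, the non-terminals reachable from the start symbol."""
    n_s = {start}
    agenda = [start]
    while agenda:
        a = agenda.pop()
        for lhs, rhs in rules:
            if lhs == a:
                for b in rhs:
                    if b not in n_s:
                        n_s.add(b)
                        agenda.append(b)
    return n_s


def remove_useless(rules, start):
    """Apply step 1, then step 2 on the resulting grammar."""
    n_t = productive(rules)
    rules = [(l, r) for l, r in rules if l in n_t and all(b in n_t for b in r)]
    n_s = reachable(rules, start)
    return [(l, r) for l, r in rules if l in n_s]
```

For example, with the rules S → A B, S → C, A → ε, B → B, C → ε, D → C, the symbol B is not in N_T, and after removing S → A B only S and C remain reachable.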


Elimination of ε-Rules (1)

An ε-rule is a rule with one of the lhs arguments being ε.

A simple RCG is ε-free if

  • it either contains no ε-rules
  • or there is exactly one rule S(ε) → ε and S does not appear in any rhs of a rule in G.


Elimination of ε-Rules (2)

First, compute for all A ∈ N all possibilities to have ε-components in their yields:

  • We introduce vectors ι⃗ ∈ {0, 1}^dim(A), and
  • we compute the set N_ε of pairs (A, ι⃗) where ι⃗ signifies that it is possible for A to have a tuple τ in its yield with τ(i) = ε if ι⃗(i) = 0 and τ(i) ≠ ε if ι⃗(i) = 1.


Elimination of ε-Rules (3)

The set N_ε is constructed recursively with initial value ∅:

  • 1. For every A(x1, . . ., xk) → ε ∈ P: add (A, ι⃗) to N_ε, where for all 1 ≤ i ≤ k: ι⃗(i) = 0 if xi = ε, else ι⃗(i) = 1.
  • 2. Repeat until N_ε does not change any more: for every A(x1, . . ., xk) → A1(α⃗1) . . . Am(α⃗m) ∈ P and all (A1, ι⃗1), . . . , (Am, ι⃗m) ∈ N_ε:
      • Calculate a vector (x′1, . . ., x′k) from (x1, . . ., xk) by replacing with ε every variable that is the jth variable of Ai in the righthand side such that ι⃗i(j) = 0.
      • Then add (A, ι⃗) to N_ε, where for all 1 ≤ i ≤ k: ι⃗(i) = 0 if x′i = ε, else ι⃗(i) = 1.
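The fixpoint construction of the set of pairs (A, ι) can be sketched as below. This is a minimal sketch under assumed conventions that are not part of the slides: a rule is a triple `(lhs_name, lhs_args, rhs)`, where `lhs_args` is a tuple of tuples of symbols (the empty tuple stands for ε) and `rhs` is a list of `(name, variables)` pairs; the function name `epsilon_vectors` is hypothetical. Rules with an empty rhs are covered by the same loop, since no variable is replaced by ε in that case.

```python
from itertools import product


def epsilon_vectors(rules):
    """Return the set of (predicate, iota) pairs; iota is a tuple of 0/1."""
    n_eps = set()
    changed = True
    while changed:
        changed = False
        for lhs, lhs_args, rhs in rules:
            # For each rhs predicate, the vectors found for it so far.
            options = [[iota for name, iota in n_eps if name == pred]
                       for pred, _ in rhs]
            for choice in product(*options):
                # Variables that become epsilon under this choice of vectors.
                empty = {var
                         for (_, rhs_vars), vec in zip(rhs, choice)
                         for var, bit in zip(rhs_vars, vec) if bit == 0}
                # An lhs argument is epsilon iff all its symbols vanish;
                # terminals never vanish, and an empty argument is epsilon.
                iota = tuple(0 if all(s in empty for s in arg) else 1
                             for arg in lhs_args)
                if (lhs, iota) not in n_eps:
                    n_eps.add((lhs, iota))
                    changed = True
    return n_eps
```

Applied to the rules S(XY) → A(X, Y), A(a, ε) → ε, A(ε, a) → ε, A(a, b) → ε, the sketch returns the pairs (S, 1), (A, 10), (A, 01) and (A, 11).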


Elimination of ε-Rules (4)

Now we can compute the new set of rules: for every clause

    A(x1, . . ., xk) → A1(x^(1)_1, . . ., x^(1)_k1) . . . Am(x^(m)_1, . . ., x^(m)_km)

and all (A1, ι⃗1), . . . , (Am, ι⃗m) ∈ N_ε, we compute a clause for the new grammar:

  • In the rhs, we replace Ai with (Ai, ι⃗i) for all 1 ≤ i ≤ m;
  • in both lhs and rhs, we delete all variables x^(i)_j for 1 ≤ i ≤ m and 1 ≤ j ≤ ki with ι⃗i(j) = 0;
  • in the lhs, we replace A with (A, ι⃗) where ι⃗(i) = 0 iff the ith argument is ε;
  • finally, we delete all ε-arguments in lhs and rhs.

Furthermore, we add a new start symbol S′ with S′(X) → S^1(X), where S^1 abbreviates (S, 1), and, if ε is in the language, S′(ε) → ε.


Elimination of ε-Rules (5)

Example:

Original simple RCG rules:

    S(XY) → A(X, Y), A(a, ε) → ε, A(ε, a) → ε, A(a, b) → ε

Set of pairs characterizing possibilities for ε-components:

    N_ε = {(S, 1), (A, 10), (A, 01), (A, 11)}

Rules after ε-elimination (writing A^ι for the predicate (A, ι)):

    S′(X) → S^1(X),
    S^1(X) → A^10(X), A^10(a) → ε,
    S^1(X) → A^01(X), A^01(a) → ε,
    S^1(XY) → A^11(X, Y), A^11(a, b) → ε


Ordered simple RCG (1)

A simple RCG is ordered if for every rule A(α⃗) → A1(α⃗1) . . . Ak(α⃗k) and every Ai(α⃗i) = Ai(Y1, . . ., Y_dim(Ai)) (1 ≤ i ≤ k), the order of the components of α⃗i in α⃗ is Y1, . . ., Y_dim(Ai).

Crucially, in an ordered simple RCG, the order of the components of the lhs predicate of a rule always corresponds to their order in the input.

For every simple RCG, there exists a weakly equivalent ordered simple RCG.

For the transformation, in addition to the A ∈ N, we introduce new predicates A^p where p is a permutation of 1, . . ., dim(A).
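The ordering condition on a single rule can be checked with a few lines of Python. This is a hedged sketch that only tests the condition and does not perform the transformation; the encoding (lhs arguments as a tuple of tuples of symbols, rhs as a list of `(name, variables)` pairs) and the name `is_ordered` are assumptions for illustration. It also assumes that every rhs variable occurs exactly once in the lhs, as in a simple RCG.

```python
def is_ordered(lhs_args, rhs):
    """Check that the variables of each rhs predicate appear left to right
    in the lhs in the same order as in that predicate's argument list."""
    flat = [sym for arg in lhs_args for sym in arg]
    position = {sym: i for i, sym in enumerate(flat)}
    for _, rhs_vars in rhs:
        indices = [position[v] for v in rhs_vars]
        if indices != sorted(indices):
            return False
    return True
```

For instance, A(X, Y) → A(Y, X) fails the check, while S(XY) → A(X, Y) and A(aX, bY) → A(X, Y) pass it.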


Ordered simple RCG (2)

Transformation into an ordered simple RCG:

    P′ := P;
    repeat until P′ does not change any more:
      for all rules r = A(α⃗) → A1(α⃗1) . . . Ak(α⃗k) in P′:
        for all i, 1 ≤ i ≤ k:
          if Ai(α⃗i) = Ai(Y1, . . . , Y_dim(Ai)) and the order of the
             Y1, . . . , Y_dim(Ai) in α⃗ is p(Y1, . . . , Y_dim(Ai))
             where p is not the identity
          then replace Ai(α⃗i) in r with Ai^p(p(α⃗i));
               for every Ai-rule Ai(γ⃗) → Γ ∈ P′:
                 add a new rule Ai^p(p(γ⃗)) → Γ to P′ (if not yet in P′)

Ordered simple RCG (3)

Example:

Original clauses:

    S(XY) → A(X, Y), A(X, Y) → A(Y, X), A(aX, bY) → A(X, Y), A(a, b) → ε

Clauses after transformation into ordered RCG:

    S(XY) → A(X, Y), A(X, Y) → A^p(X, Y), A(aX, bY) → A(X, Y), A(a, b) → ε,
    A^p(Y, X) → A(Y, X), A^p(bY, aX) → A^p(Y, X), A^p(b, a) → ε

(p being the permutation that switches the two arguments)


Binarization (1)

We call the length of the rhs of a rule its rank. The rank of a simple RCG is given by the maximal rank of its rules.

For every simple RCG, there is an equivalent simple RCG of rank 2.

CNF binarization for CFG: A → BCD is replaced with A → BE, E → CD.

A similar binarization algorithm can be applied to LCFRS/simple RCG: for every rule with rhs length > 2, we introduce a new predicate that comprises all rhs predicates except the first. This is done repeatedly.
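The CFG case (A → BCD replaced with A → BE, E → CD) can be sketched as follows. This is an illustrative sketch only: rules are assumed to be pairs `(lhs, [rhs symbols])`, the function name `binarize_cfg` is hypothetical, and the fresh names `E1`, `E2`, ... are assumed not to clash with existing non-terminals.

```python
def binarize_cfg(rules):
    """Split every rule with more than two rhs symbols, peeling off the
    first rhs symbol and wrapping the rest in a fresh non-terminal."""
    result = []
    counter = 0
    for lhs, rhs in rules:
        while len(rhs) > 2:
            counter += 1
            new = f"E{counter}"  # hypothetical fresh non-terminal name
            result.append((lhs, [rhs[0], new]))
            lhs, rhs = new, rhs[1:]
        result.append((lhs, list(rhs)))
    return result
```

Calling `binarize_cfg([("A", ["B", "C", "D"])])` gives the two rules A → B E1 and E1 → C D.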


Binarization (2)

We define the reduction of a vector α⃗1 ∈ [(T ∪ V)∗]^k1 by ⟨X1, . . ., Xk2⟩ ∈ V^k2, where all Xi for 1 ≤ i ≤ k2 occur in α⃗1, as the following vector of variables:

We take all variables from α⃗1 (in their order) that are not in {X1, . . ., Xk2}, starting a new component whenever a variable is, in α⃗1, in a different component than the preceding variable in the result, or in the same component but not adjacent to it.

Examples:

  • 1. ⟨aX1, X2, bX3⟩ reduced with ⟨X2⟩ yields ⟨X1, X3⟩;
  • 2. ⟨aX1X2bX3⟩ reduced with ⟨X2⟩ yields ⟨X1, X3⟩.
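The reduction operation can be sketched in Python as below. The encoding is an assumption made for illustration: an argument vector is a sequence of components, each a tuple of symbols, and variables are taken to be the symbols starting with an uppercase letter. The names `is_var` and `reduce_args` are hypothetical.

```python
def is_var(sym):
    # Assumed convention: variables start with an uppercase letter.
    return sym[:1].isupper()


def reduce_args(alpha, removed):
    """Keep the variables of alpha that are not in `removed`, starting a new
    component whenever two kept variables are not directly adjacent."""
    result = []
    last = None  # (component index, position) of the last kept variable
    for c, component in enumerate(alpha):
        for p, sym in enumerate(component):
            if not is_var(sym) or sym in removed:
                continue
            if last == (c, p - 1):
                result[-1].append(sym)  # adjacent in the same component
            else:
                result.append([sym])    # start a new component
            last = (c, p)
    return tuple(tuple(comp) for comp in result)
```

Both examples from the slide give the vector with the two components X1 and X3 under this sketch.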


Binarization (3)

Binarization algorithm:

    for all rules r = A(α⃗) → A0(α⃗0) . . . Am(α⃗m) in P with m > 1:
      remove r from P;
      R := ∅;
      pick new predicate names C1, . . . , Cm−1;
      add A(α⃗) → A0(α⃗0) C1(γ⃗1) to R, where γ⃗1 is α⃗ reduced with α⃗0;
      for all i, 1 ≤ i < m − 1:
        add Ci(γ⃗i) → Ai(α⃗i) Ci+1(γ⃗i+1) to R, where γ⃗i+1 is γ⃗i reduced with α⃗i;
      add Cm−1(γ⃗m−1) → Am−1(α⃗m−1) Am(α⃗m) to R;
      for every r′ ∈ R:
        replace rhs arguments of length > 1 with new variables (on both sides)
        and add the result to P

Binarization (4)

Example:

Original simple RCG:

    S(XYZUVW) → A(X, U) B(Y, V) C(Z, W)
    A(aX, aY) → A(X, Y)
    B(bX, bY) → B(X, Y)
    C(cX, cY) → C(X, Y)
    A(a, a) → ε
    B(b, b) → ε
    C(c, c) → ε

For the rule with righthand side of length > 2 we obtain

    R = {S(XYZUVW) → A(X, U) C1(YZ, VW),
         C1(YZ, VW) → B(Y, V) C(Z, W)}

Equivalent simple RCG of rank 2:

    S(XPUQ) → A(X, U) C1(P, Q)
    C1(YZ, VW) → B(Y, V) C(Z, W)
    A(aX, aY) → A(X, Y)
    B(bX, bY) → B(X, Y)
    C(cX, cY) → C(X, Y)
    A(a, a) → ε
    B(b, b) → ε
    C(c, c) → ε
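The binarization of a single rule, including the reduction step and the final replacement of long rhs arguments by new variables, can be sketched as follows. This is a sketch under assumptions, not the authors' code: arguments are tuples of symbols, variables start with an uppercase letter, the new predicate names and the fresh variable names are supplied by the caller, and all function names (`reduce_args`, `collapse`, `binarize_rule`) are hypothetical.

```python
def is_var(sym):
    # Assumed convention: variables start with an uppercase letter.
    return sym[:1].isupper()


def reduce_args(alpha, removed):
    """Kept variables of alpha, split wherever they are not adjacent."""
    result, last = [], None
    for c, component in enumerate(alpha):
        for p, sym in enumerate(component):
            if not is_var(sym) or sym in removed:
                continue
            if last == (c, p - 1):
                result[-1].append(sym)
            else:
                result.append([sym])
            last = (c, p)
    return tuple(tuple(comp) for comp in result)


def collapse(name, lhs_args, rhs, fresh):
    """Replace each rhs argument of length > 1 by a fresh variable,
    on both sides of the rule."""
    lhs = [list(arg) for arg in lhs_args]
    new_rhs = []
    for pred, args in rhs:
        new_args = []
        for arg in args:
            if len(arg) > 1:
                var = next(fresh)
                for comp in lhs:
                    for i in range(len(comp) - len(arg) + 1):
                        if tuple(comp[i:i + len(arg)]) == tuple(arg):
                            comp[i:i + len(arg)] = [var]
                            break
                arg = (var,)
            new_args.append(tuple(arg))
        new_rhs.append((pred, tuple(new_args)))
    return name, tuple(tuple(comp) for comp in lhs), new_rhs


def binarize_rule(name, alpha, rhs, new_names, fresh):
    """Peel off the first rhs predicate repeatedly; new_names must hold
    len(rhs) - 2 new predicate names, fresh is an iterator of variables."""
    rules, head, head_args = [], name, tuple(alpha)
    for k, (pred, args) in enumerate(rhs[:-2]):
        removed = {sym for arg in args for sym in arg}
        gamma = reduce_args(head_args, removed)
        rules.append(collapse(head, head_args,
                              [(pred, args), (new_names[k], gamma)], fresh))
        head, head_args = new_names[k], gamma
    rules.append(collapse(head, head_args, rhs[-2:], fresh))
    return rules
```

Run on the rule S(XYZUVW) → A(X, U) B(Y, V) C(Z, W) with the new predicate name C1 and the fresh variables P and Q, the sketch produces S(XPUQ) → A(X, U) C1(P, Q) and C1(YZ, VW) → B(Y, V) C(Z, W).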


Binarization (5)

There are different ways to binarize a given simple RCG rule: we could choose any partition of the rhs predicates into two sets and then introduce new predicates for each element of the partition.

The arities (dimensions) of the new predicates depend on the partitions we choose.

Gómez-Rodríguez et al. (2009) have shown how to obtain an optimal binarization for a given LCFRS in the sense of obtaining a minimal arity (fan-out in LCFRS terminology).

In addition, one could also try to minimize the number of variables per rule.


Binarization (6)

Example: A(aXY, cZ, dU) → B(X) C(Y, Z) D(U).

  • 1. Partition into B(X) and C(Y, Z)D(U). Leads to
       A(aXY, cZ, dU) → B(X) E1(Y, Z, U), E1(Y, Z, U) → C(Y, Z) D(U)
       (arity 3 and 4 variables)
  • 2. Partition into C(Y, Z) and B(X)D(U). Leads to
       A(aXY, cZ, dU) → E2(X, U) C(Y, Z), E2(X, U) → B(X) D(U)
       (arity 2 and 4 variables)
  • 3. Partition into D(U) and B(X)C(Y, Z). Leads to
       A(aV, cZ, dU) → E3(V, Z) D(U), E3(XY, Z) → B(X) C(Y, Z)
       (arity 2 and 3 variables)

The third possibility is the best one since it gives us a minimal arity and a minimal number of variables per clause.
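The arity and variable counts quoted for the three partitions can be recomputed with a small helper. This is a sketch under assumed conventions: a rule is a triple `(lhs_name, lhs_args, rhs)` with `lhs_args` a tuple of tuples of symbols and `rhs` a list of `(name, variables)` pairs, variables start with an uppercase letter, and "variables" is read as the maximal number of distinct variables in any one clause. The name `binarization_stats` is hypothetical.

```python
def binarization_stats(rules, new_pred):
    """Return (arity of the new predicate, max number of variables per rule)."""
    arity = max(len(args) for name, args, _ in rules if name == new_pred)
    max_vars = 0
    for _, args, rhs in rules:
        seen = {s for arg in args for s in arg if s[:1].isupper()}
        seen |= {v for _, rhs_vars in rhs for v in rhs_vars}
        max_vars = max(max_vars, len(seen))
    return arity, max_vars


# The three binarizations of A(aXY, cZ, dU) -> B(X) C(Y, Z) D(U).
lhs = (("a", "X", "Y"), ("c", "Z"), ("d", "U"))
option1 = [("A", lhs, [("B", ("X",)), ("E1", ("Y", "Z", "U"))]),
           ("E1", (("Y",), ("Z",), ("U",)), [("C", ("Y", "Z")), ("D", ("U",))])]
option2 = [("A", lhs, [("E2", ("X", "U")), ("C", ("Y", "Z"))]),
           ("E2", (("X",), ("U",)), [("B", ("X",)), ("D", ("U",))])]
option3 = [("A", (("a", "V"), ("c", "Z"), ("d", "U")),
            [("E3", ("V", "Z")), ("D", ("U",))]),
           ("E3", (("X", "Y"), ("Z",)), [("B", ("X",)), ("C", ("Y", "Z"))])]
```

Under these assumptions, `binarization_stats` yields (3, 4) for the first option, (2, 4) for the second and (2, 3) for the third, in line with the figures on the slide.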


Conclusion

  • Since LCFRS/sRCG/MCFG is a natural extension of CFG, the transformations proposed for CFG in order to facilitate parsing can be applied to sRCG as well:
  • We can eliminate useless rules and ε-rules, we can order the yield components and we can binarize the rules.
