SLIDE 1 Kallmeyer/Maier ESSLLI 2008
Parsing beyond context-free grammar: Range Concatenation Grammar Parsing
Laura Kallmeyer, Wolfgang Maier
University of Tübingen
ESSLLI Course 2008
Parsing beyond CFG 1 RCG Parsing Kallmeyer/Maier ESSLLI 2008
Overview
- 1. Range Concatenation Grammars (RCG)
- 2. Parsing RCG
  (a) Directional top-down parsing
  (b) Earley-style parsing
Range Concatenation Grammar
The idea behind range concatenation grammar (RCG) is comparable to the idea behind MCFG.
- Predicate-rewriting clauses describe ranges which are not necessarily adjacent.
- A predicate is either true or false for a given string.
- A string w is in the language of an RCG if the start predicate is true for w.
- While in MCFG a string is generated, in RCG a string is reduced to ε.
Expressivity of RCG
- RCG covers exactly the class of PTIME-recognizable languages (Bertsch & Nederhof, 2001).
- Simple RCG (basically non-deleting, non-copying RCG) is equivalent to MCFG.
- RCG can represent languages beyond mild context-sensitivity.
Definition of RCGs: Grammar
Definition: An RCG is a tuple $G = \langle N, T, V, P, S \rangle$ such that
- N is a finite set of predicates, each with a fixed arity,
- T and V are disjoint finite sets of terminals and variables,
- S ∈ N is the start predicate of arity 1, and
- P is a finite set of clauses of the form
  $A_0(x_{01}, \ldots, x_{0a_0}) \to \varepsilon$
  or
  $A_0(x_{01}, \ldots, x_{0a_0}) \to A_1(x_{11}, \ldots, x_{1a_1}) \ldots A_n(x_{n1}, \ldots, x_{na_n})$
  with $n \geq 1$, $A_i \in N$, $x_{ij} \in (T \cup V)^*$, and $a_i$ being the arity of $A_i$.
A predicate $A_n(x_{n1}, \ldots, x_{na_n})$ can be written as $A_n(\vec{x}_n)$.
Definition of RCGs: Instantiation
A given clause C is instantiated with respect to a string w if variables and arguments are consistently replaced by ranges of w.
Example:
  $A(\langle i, j \rangle) \to B(\langle i+1, j \rangle)$
is an instantiation of the clause
  $A(aX) \to B(X)$
if $w_{i+1} = a$.
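The instantiation example can be spelled out in a few lines of Python. This is only a sketch: it assumes the clause in question is A(aX) → B(X), models a range ⟨i, j⟩ as the tuple `(i, j)`, and uses the hypothetical function name `instantiations`.

```python
def instantiations(w):
    """Enumerate instantiations of the (assumed) clause A(aX) -> B(X) over w.

    Each result pairs the LHS range <i, j> with the RHS range <i+1, j>.
    """
    n = len(w)
    return [
        ((i, j), (i + 1, j))
        for i in range(n)
        for j in range(i + 1, n + 1)
        if w[i] == "a"  # w[i] (0-based) is the (i+1)-th symbol of w
    ]


if __name__ == "__main__":
    for lhs, rhs in instantiations("ab"):
        print(f"A{lhs} -> B{rhs}")
```

Only positions i whose following symbol is a give rise to an instantiation; the variable X is then free to cover any range ⟨i+1, j⟩.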
Definition of RCGs: Derivation Relation, Language
- The derivation relation is defined as follows:
  For a predicate A of arity k, a clause $A(\ldots) \to \ldots$, and ranges $\langle i_1, j_1 \rangle, \ldots, \langle i_k, j_k \rangle$ with respect to a given w: if there is an instantiation of this clause with LHS $A(\langle i_1, j_1 \rangle, \ldots, \langle i_k, j_k \rangle)$, then $A(\langle i_1, j_1 \rangle, \ldots, \langle i_k, j_k \rangle)$ can be replaced with the RHS of this instantiation.
- The language of an RCG G is the set of strings that can be reduced to the empty word:
  $L(G) = \{ w \mid S(\langle 0, |w| \rangle) \stackrel{*}{\Rightarrow} \varepsilon \text{ with respect to } w \}$.
A sample RCG (1)
Sample RCG G for the string language $\{ a^n b^k a^n \mid k, n \in \mathbb{N} \}$:
An RCG with N = {S, A, B}, T = {a, b}, V = {X, Y, Z}, start predicate S and clauses
- S(X Y Z) → A(X, Z) B(Y),
- A(a X, a Y) → A(X, Y),
- B(b X) → B(X),
- A(ε, ε) → ε,
- B(ε) → ε
A sample RCG (2)
As an example, consider the reduction of w = aabaa. The first clause is instantiated as follows:

  S(X        Y        Z)       → A(X,       Z)        B(Y)
    w_{0,2}  w_{2,3}  w_{3,5}      w_{0,2}  w_{3,5}    w_{2,3}
    aa       b        aa           aa       aa         b

With this instantiation, S(w_{0,5}) ⇒ A(w_{0,2}, w_{3,5}) B(w_{2,3}). Then
A sample RCG (3)

  B(b        X)       → B(X)
    w_{2,3}  w_{3,3}      w_{3,3}
    b        ε            ε

and B(ε) → ε lead to
A(w_{0,2}, w_{3,5}) B(w_{2,3}) ⇒ A(w_{0,2}, w_{3,5}) B(w_{3,3}) ⇒ A(w_{0,2}, w_{3,5}).
A sample RCG (4)

  A(a        X,       a        Y)       → A(X,       Y)
    w_{0,1}  w_{1,2}  w_{3,4}  w_{4,5}      w_{1,2}  w_{4,5}
    a        a        a        a            a        a

leads to A(w_{0,2}, w_{3,5}) ⇒ A(w_{1,2}, w_{4,5}). Then
A sample RCG (5)

  A(a        X,       a        Y)       → A(X,       Y)
    w_{1,2}  w_{2,2}  w_{4,5}  w_{5,5}      w_{2,2}  w_{5,5}
    a        ε        a        ε            ε        ε

and A(ε, ε) → ε lead to
A(w_{1,2}, w_{4,5}) ⇒ A(w_{2,2}, w_{5,5}) ⇒ ε
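The reduction of aabaa can be replayed mechanically. The following is a minimal sketch of a recognizer for the sample grammar, not one of the parsing algorithms of this course: it assumes a linear, non-combinatorial grammar, models ranges as `(i, j)` tuples, and uses an ad hoc clause encoding (uppercase symbols are variables, lowercase symbols are terminals). The names `CLAUSES`, `bindings` and `recognize` are illustrative.

```python
from functools import lru_cache

# (LHS predicate, LHS arguments, RHS predicates with single-variable arguments)
CLAUSES = [
    ("S", (("X", "Y", "Z"),), [("A", ("X", "Z")), ("B", ("Y",))]),
    ("A", (("a", "X"), ("a", "Y")), [("A", ("X", "Y"))]),
    ("B", (("b", "X"),), [("B", ("X",))]),
    ("A", ((), ()), []),  # A(eps, eps) -> eps
    ("B", ((),), []),     # B(eps) -> eps
]


def bindings(arg, i, j, w, env):
    """Yield variable bindings under which the symbols of arg span w[i:j]."""
    if not arg:
        if i == j:
            yield env
        return
    head, rest = arg[0], arg[1:]
    if head.islower():  # terminal: must match the next input symbol
        if i < j and w[i] == head:
            yield from bindings(rest, i + 1, j, w, env)
    else:  # variable (assumed to occur only once): try every end position
        for k in range(i, j + 1):
            yield from bindings(rest, k, j, w, {**env, head: (i, k)})


def recognize(w):
    """Return True iff S(<0, |w|>) can be reduced to the empty word."""

    @lru_cache(maxsize=None)
    def holds(pred, ranges):
        for lhs_pred, lhs_args, rhs in CLAUSES:
            if lhs_pred != pred or len(lhs_args) != len(ranges):
                continue
            envs = [{}]
            for arg, (i, j) in zip(lhs_args, ranges):
                envs = [e2 for e in envs for e2 in bindings(arg, i, j, w, e)]
            for env in envs:
                # every RHS predicate must hold for its instantiated ranges
                if all(holds(p, tuple(env[v] for v in args)) for p, args in rhs):
                    return True
        return False

    return holds("S", ((0, len(w)),))


if __name__ == "__main__":
    print(recognize("aabaa"), recognize("aab"))
```

Memoising `holds` on (predicate, ranges) keeps the search polynomial for this grammar, since there are only O(n^2) ranges per argument.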
Definition of RCGs: Other properties (1)
- An RCG with maximal predicate arity k is called an RCG of arity k (also called a k-RCG).
- An RCG is called non-combinatorial if each argument in the right-hand sides of its clauses is a single variable.
- An RCG is called linear if no variable appears more than once in the left-hand side of a clause and no variable appears more than once in the right-hand side of a clause.
Definition of RCGs: Other properties (2)
- An RCG is called non-erasing if, for each clause, each variable occurring in the left-hand side also occurs in the right-hand side and vice versa.
- An RCG is called simple if it is non-combinatorial, linear and non-erasing.
- A simple RCG is called ordered simple if the range variables are ordered the same way in the RHS and the LHS predicates. Ordered simple RCG is equivalent to simple RCG.
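The three properties that make up simplicity are easy to check clause by clause. The sketch below assumes an ad hoc encoding in which a clause is a triple (LHS predicate, LHS arguments, RHS predicates), every argument is a tuple of symbols, and variables start with an uppercase letter. All function names are illustrative.

```python
from collections import Counter


def is_var(sym):
    return sym[:1].isupper()


def lhs_vars(clause):
    return [s for arg in clause[1] for s in arg if is_var(s)]


def rhs_vars(clause):
    return [s for _, args in clause[2] for arg in args for s in arg if is_var(s)]


def non_combinatorial(clause):
    # every RHS argument is a single variable
    return all(len(arg) == 1 and is_var(arg[0])
               for _, args in clause[2] for arg in args)


def linear(clause):
    # no variable occurs twice in the LHS, none occurs twice in the RHS
    return all(max(Counter(vs).values(), default=0) <= 1
               for vs in (lhs_vars(clause), rhs_vars(clause)))


def non_erasing(clause):
    # LHS and RHS use exactly the same variables
    return set(lhs_vars(clause)) == set(rhs_vars(clause))


def is_simple(grammar):
    return all(non_combinatorial(c) and linear(c) and non_erasing(c)
               for c in grammar)


SAMPLE = [  # the grammar for a^n b^k a^n
    ("S", (("X", "Y", "Z"),), [("A", (("X",), ("Z",))), ("B", (("Y",),))]),
    ("A", (("a", "X"), ("a", "Y")), [("A", (("X",), ("Y",)))]),
    ("B", (("b", "X"),), [("B", (("X",),))]),
    ("A", ((), ()), []),
    ("B", ((),), []),
]
COPYING = [("S", (("X",),), [("A", (("X",),)), ("B", (("X",),))])]

if __name__ == "__main__":
    print(is_simple(SAMPLE), is_simple(COPYING))
```

The clause S(X) → A(X) B(X) in `COPYING` is rejected because X is used twice in the right-hand side, which is exactly the copying that simple RCG excludes.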
RCG parsing: Treatment of terminals
Without loss of generality, we presuppose that all non-ε clauses contain no terminals in their arguments. For each t ∈ T, we introduce a new clause $T_t(t) \to \varepsilon$, and for each clause C ∈ P,
- we replace each occurrence t′ of t in all arguments of all predicates with a variable $V_{t'}$,
- for each $V_{t'}$, we add the predicate $T_t(V_{t'})$ to the RHS of C.
Furthermore, we assume that the variables of each clause are numbered consecutively from 1 to some j.
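The terminal treatment described above can be sketched as a grammar transformation. The code assumes an ad hoc clause encoding (LHS predicate, tuple of LHS arguments, list of RHS predicates, each argument a tuple of symbols) and invents fresh variable names `V1`, `V2`, ... per clause, which are assumed not to clash with existing variables. The consecutive renumbering of variables is not performed here.

```python
def eliminate_terminals(clauses, terminals):
    """Replace terminal occurrences by fresh variables guarded by T_t predicates."""
    result = []
    for pred, lhs_args, rhs in clauses:
        counter = 0
        extra = []  # T_t(V) predicates to be appended to the RHS

        def lift(arg):
            nonlocal counter
            new_arg = []
            for sym in arg:
                if sym in terminals:
                    counter += 1
                    var = f"V{counter}"  # hypothetical fresh-variable naming
                    extra.append((f"T_{sym}", ((var,),)))
                    new_arg.append(var)
                else:
                    new_arg.append(sym)
            return tuple(new_arg)

        new_lhs = tuple(lift(a) for a in lhs_args)
        new_rhs = [(p, tuple(lift(a) for a in args)) for p, args in rhs]
        result.append((pred, new_lhs, new_rhs + extra))
    # one clause T_t(t) -> eps per terminal
    result += [(f"T_{t}", ((t,),), []) for t in sorted(terminals)]
    return result


if __name__ == "__main__":
    clause = ("A", (("a", "X"), ("a", "Y")), [("A", (("X",), ("Y",)))])
    for c in eliminate_terminals([clause], {"a"}):
        print(c)
```

For A(a X, a Y) → A(X, Y) this yields A(V1 X, V2 Y) → A(X, Y) T_a(V1) T_a(V2) together with T_a(a) → ε.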
RCG parsing: Range vectors
We will use range vectors similar to those used for MCFG parsing. Range vectors are used to describe variable bindings.
- $\phi = (\langle x_1, y_1 \rangle, \ldots, \langle x_k, y_k \rangle)$ is a range vector in w if all $\langle x_i, y_i \rangle$ are ranges in w for $1 \leq i \leq k$.
- $\phi = (\langle x_1, y_1 \rangle, \ldots, \langle x_k, y_k \rangle)$ is a range constraint vector if it contains pairs $\langle x, y \rangle$ where $x, y \in Pos(w) \cup V_r$ ($V_r$ is a set $\{r_1, r_2, \ldots\}$ of range boundary variables) such that if $\langle x, y \rangle \in Pos(w)^2$, then it is a range.
- k is called the dimension of φ.
- φ(i).l denotes the first component and φ(i).r the second component of the i-th element of φ.
RCG parsing: Variable constraint vectors
The variable constraint vector φ of a non-ε clause $A(\vec{x}) \to \Phi$ is a range constraint vector of dimension j, j being the highest variable index in the clause. Its elements are pairs from $V_r \times V_r$, and it must be consistent with the variable adjacencies in the clause. Formally, the elements of φ are pairs from $V_r \times V_r$ such that φ(h).r = φ(i).l iff $X_h X_i$ occurs as a substring in one of the arguments of the clause.
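A variable constraint vector can be computed by starting with distinct boundary variables and identifying the boundaries of adjacent variables. The sketch below assumes that terminals have already been removed, so that every argument is a list of 1-based variable indices; the function name and the naming scheme r1, r2, ... for boundary variables are illustrative.

```python
def variable_constraint_vector(arguments, j):
    """Build the variable constraint vector of a clause with variables X1..Xj.

    arguments: all arguments of the clause (LHS and RHS), each a list of
    variable indices, e.g. [1, 2, 3] for the argument X1 X2 X3.
    """
    # start with a distinct left and right boundary for every variable
    phi = [[("L", i), ("R", i)] for i in range(1, j + 1)]

    def rename(old, new):
        for pair in phi:
            for side in (0, 1):
                if pair[side] == old:
                    pair[side] = new

    for arg in arguments:
        for h, i in zip(arg, arg[1:]):
            # X_h X_i are adjacent: the left boundary of X_i is the right boundary of X_h
            rename(phi[i - 1][0], phi[h - 1][1])

    # give the remaining boundaries readable names r1, r2, ...
    names = {}
    return [tuple(names.setdefault(b, f"r{len(names) + 1}") for b in pair)
            for pair in phi]


if __name__ == "__main__":
    # S(X1 X2 X3) -> A(X1, X3) B(X2)
    print(variable_constraint_vector([[1, 2, 3], [1], [3], [2]], 3))
```

For S(X1 X2 X3) → A(X1, X3) B(X2) the result is (⟨r1, r2⟩, ⟨r2, r3⟩, ⟨r3, r4⟩): the three variables share their inner boundaries because they are adjacent in the left-hand side.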
Update of range vectors
We define an update φ′ of a range constraint vector φ with respect to an identity x = y, with $x, y \in Pos(w) \cup V_r$, as follows:
- if x = y, then φ′ = φ;
- else if $x \in V_r$ and the result ψ of replacing all occurrences of x in φ with y is a range constraint vector, then φ′ = ψ;
- else if $y \in V_r$ and the result ψ of replacing all occurrences of y in φ with x is a range constraint vector, then φ′ = ψ;
- otherwise, φ′ is undefined.
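The update operation translates almost literally into code. In this sketch, positions are modelled as `int`, range boundary variables as `str`, a range constraint vector as a list of pairs, and "undefined" as `None`; these encodings and the function names are assumptions made for illustration. The range check only tests l ≤ r and does not check the bound |w|.

```python
def is_range_constraint_vector(phi):
    """Every pair made of two positions must be a range, i.e. satisfy l <= r."""
    return all(not (isinstance(l, int) and isinstance(r, int)) or l <= r
               for l, r in phi)


def update(phi, x, y):
    """Return the update of phi w.r.t. the identity x = y, or None if undefined."""
    if x == y:
        return phi
    # first try to replace x by y, then y by x (only variables can be replaced)
    for var, val in ((x, y), (y, x)):
        if isinstance(var, str):
            psi = [tuple(val if b == var else b for b in pair) for pair in phi]
            if is_range_constraint_vector(psi):
                return psi
    return None


if __name__ == "__main__":
    phi = [("r1", "r2"), ("r2", "r3")]
    phi = update(phi, "r1", 0)
    print(phi)
    print(update(phi, "r2", 2))
    print(update([(3, "r2")], "r2", 1))
```

The last call prints `None`: binding r2 to 1 would produce the pair ⟨3, 1⟩, which is not a range.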
Directional top-down parsing
Corresponds to the algorithm presented in Boullier (2000). Item form:
- Active items: $[A(\vec{x}) \to \Phi \bullet \Psi, \phi]$
- Passive items: $[A, \psi, \mathit{flag}]$
where
- φ is a range vector of dimension j, j being the highest variable index in the clause,
- ψ is a range vector of dimension k, k being the arity of A,
- flag ∈ {p, c} indicates whether a passive item is predicted or completed.
Directional top-down parsing (axiom and goal)
- The axiom is $[S, (\langle 0, n \rangle), p]$.
- The goal item is $[S, (\langle 0, n \rangle), c]$.
Directional top-down parsing (predict-rule)
We have two predict operations. Predict-rule predicts an active item for a previously introduced passive item:
$$\frac{[A, \psi, p]}{[A(\vec{x}) \to \bullet\Psi, \phi]}$$
where the variable bindings in φ applied to $\vec{x}$ yield ψ. Furthermore, φ respects the adjacency constraints imposed by the variable constraint vector of the clause.
Directional top-down parsing (predict-pred)
Predict-pred predicts a passive item:
$$\frac{[A(\ldots) \to \Phi \bullet B(\vec{x})\Psi, \phi]}{[B, \psi, p]}$$
where ψ results from applying φ to $\vec{x}$.
Directional top-down parsing (scan)
Scan:
$$\frac{[A, (\langle l, r \rangle), p]}{[A, (\langle l, r \rangle), c]} \quad A(x) \to \varepsilon,\ \langle l, r \rangle(w) = x$$
Directional top-down parsing (complete)
Complete moves the dot over a predicate in the RHS of an active item if the corresponding passive item has been completed:
$$\frac{[B, \phi_B, c], \quad [A(\ldots) \to \Phi \bullet B(\vec{x})\Psi, \phi]}{[A(\ldots) \to \Phi B(\vec{x}) \bullet \Psi, \phi]}$$
where $\phi_B$ must be the result of applying φ to $\vec{x}$.
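Directional top-down parsing (convert)
Once the dot has arrived at the right end of the RHS of a clause, we can convert the active item to a passive item.
Convert:
$$\frac{[A(\vec{x}) \to \Phi\bullet, \phi]}{[A, \psi, c]}$$
where, as in predict-pred, ψ results from applying φ to $\vec{x}$.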
Earley-style parsing
Presented in Kallmeyer & Maier (2009) (in preparation). Item form:
- Active items: $[A(\vec{x}) \to \Phi \bullet \Psi, \phi]$
- Passive items: $[A, \psi, \mathit{flag}]$
where
- φ is a range constraint vector of dimension j, j being the highest variable index in the clause,
- ψ is a range constraint vector of dimension k, k being the arity of A,
- flag ∈ {p, c} indicates whether a passive item is predicted or completed.
Earley-style parsing (initialization and goal)
- Initialization: $[S, (\langle 0, n \rangle), p]$.
- The goal item is $[S, (\langle 0, n \rangle), c]$.
Earley-style parsing (predict)
We have two predict operations. As in the top-down case, predict-rule predicts active items with the dot on the left of the RHS, for a given previously introduced passive item:
$$\frac{[A, \psi, p]}{[A(x_1 \ldots y_1, \ldots, x_k \ldots y_k) \to \bullet\Psi, \phi']}$$
where, starting from the variable constraint vector φ of the clause, we obtain φ′ by updating with the following identities: $\phi(x_i).l = \psi(i).l$ and $\phi(y_i).r = \psi(i).r$ for all $1 \leq i \leq k$.
Note the difference from the top-down case: we are now dealing with range constraint vectors, i.e., some variable boundaries remain unspecified.
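The side condition of predict-rule amounts to a sequence of updates of the clause's variable constraint vector. The sketch below assumes positions are `int`, boundary variables are `str` (with names disjoint between clause and item), vectors are lists of pairs, and LHS arguments are lists of 1-based variable indices. `update` is a compact rendering of the update operation for range constraint vectors; both function names are illustrative.

```python
def update(phi, x, y):
    """Update phi w.r.t. the identity x = y; None means 'undefined'."""
    if x == y:
        return phi
    for var, val in ((x, y), (y, x)):
        if isinstance(var, str):  # only boundary variables can be replaced
            psi = [tuple(val if b == var else b for b in pair) for pair in phi]
            if all(not (isinstance(l, int) and isinstance(r, int)) or l <= r
                   for l, r in psi):
                return psi
    return None


def predict_rule(phi, lhs_args, psi):
    """Compute phi' from the clause's constraint vector phi and the item's psi."""
    for arg, (l, r) in zip(lhs_args, psi):
        # left boundary of the first variable x_i of the i-th argument
        phi = update(phi, phi[arg[0] - 1][0], l)
        if phi is None:
            return None
        # right boundary of the last variable y_i of the i-th argument
        phi = update(phi, phi[arg[-1] - 1][1], r)
        if phi is None:
            return None
    return phi


if __name__ == "__main__":
    # clause S(X1 X2 X3) -> ..., predicted for the passive item [S, (<0, 5>), p]
    phi = [("r1", "r2"), ("r2", "r3"), ("r3", "r4")]
    print(predict_rule(phi, [[1, 2, 3]], [(0, 5)]))
```

Only the outer boundaries get fixed to 0 and 5; the inner boundaries r2 and r3 stay open, which is exactly what distinguishes this rule from its top-down counterpart.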
Earley-style parsing (predict-pred)
Also as in the top-down case, predict-pred predicts a passive item for the predicate following the dot in an active item:
$$\frac{[A(\ldots) \to \Phi \bullet B(x_1 \ldots y_1, \ldots, x_k \ldots y_k)\Psi, \phi]}{[B, \psi, p]}$$
where $\psi(i).l = \phi(x_i).l$ and $\psi(i).r = \phi(y_i).r$ for all $1 \leq i \leq k$.
Earley-style parsing (scan)
Scan:
$$\frac{[A, (\langle l, r \rangle), p]}{[A, (\langle l', r' \rangle), c]} \quad A(x) \to \varepsilon,\ \langle l', r' \rangle(w) = x,\ \langle l, r \rangle \text{ compatible with } \langle l', r' \rangle$$
- Scan reduces a single terminal to ε (recall the definition).
- Here, “compatible with” means that there is a function $f : \{l, r\} \to \{l', r'\}$ such that $f(l) = l'$, $f(r) = r'$ and $f(x) = x$ if $x \in Pos(w)$.
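The compatibility condition can be tested directly by trying to build the function f. In this sketch, positions are `int`, boundary variables are `str`, and a passive item is a triple (predicate, (l, r), flag); the `scan` helper additionally assumes that the clause is of the form A(t) → ε for a single terminal t. Both names are illustrative.

```python
def compatible(l, r, l2, r2):
    """True iff some f maps l -> l2 and r -> r2 and is the identity on positions."""
    f = {}
    for src, dst in ((l, l2), (r, r2)):
        if isinstance(src, int) and src != dst:
            return False  # positions must be mapped to themselves
        if f.setdefault(src, dst) != dst:
            return False  # f would have to map one boundary to two values
    return True


def scan(item, terminal, w):
    """Completed items for a predicted item [A, (<l, r>), p] and clause A(terminal) -> eps."""
    pred, (l, r), flag = item
    if flag != "p":
        return []
    return [(pred, (i, i + 1), "c")
            for i in range(len(w))
            if w[i] == terminal and compatible(l, r, i, i + 1)]


if __name__ == "__main__":
    print(scan(("T_a", (1, "r2"), "p"), "a", "aabaa"))
```

With the left boundary already fixed to 1, only the occurrence of a at ⟨1, 2⟩ is scanned; with both boundaries open, every occurrence of a would yield a completed item.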
Earley-style parsing (complete)
Complete moves the dot over a predicate in the RHS of an active item if the corresponding passive item has been completed:
$$\frac{[B, \phi_B, c], \quad [A(\vec{x}) \to \Phi \bullet B(x_1 \ldots y_1, \ldots, x_k \ldots y_k)\Psi, \phi]}{[A(\vec{x}) \to \Phi B(x_1 \ldots y_1, \ldots, x_k \ldots y_k) \bullet \Psi, \phi']}$$
where φ′ is φ updated with all new constraint information coming from $\phi_B$, i.e., φ′ is an update of φ with respect to the identities $\phi(x_j).l = \phi_B(j).l$ and $\phi(y_j).r = \phi_B(j).r$ for all $1 \leq j \leq k$.
Earley-style parsing (convert)
Convert turns an active item with the dot at the end of the right-hand side into a completed passive item:
$$\frac{[A(x_1 \ldots y_1, \ldots, x_k \ldots y_k) \to \Psi\bullet, \phi]}{[A, \psi, c]}$$
where $\psi(i).l = \phi(x_i).l$ and $\psi(i).r = \phi(y_i).r$ for all $1 \leq i \leq k$.
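Convert only needs to project the constraint vector of the clause onto the arguments of the left-hand side: take the left boundary of the first variable and the right boundary of the last variable of each argument. The sketch below assumes arguments are non-empty lists of 1-based variable indices and encodes an active item as the hypothetical tuple (predicate, LHS arguments, RHS, dot position, φ).

```python
def project(phi, args):
    """psi(i) = <phi(x_i).l, phi(y_i).r> for the first/last variable of each argument."""
    return [(phi[arg[0] - 1][0], phi[arg[-1] - 1][1]) for arg in args]


def convert(active_item):
    """Turn a finished active item into a completed passive item, else None."""
    pred, lhs_args, rhs, dot, phi = active_item
    if dot != len(rhs):
        return None  # the dot has not reached the end of the RHS yet
    return (pred, project(phi, lhs_args), "c")


if __name__ == "__main__":
    # S(X1 X2 X3) -> A(X1, X3) B(X2) with all boundaries instantiated
    item = ("S", [[1, 2, 3]], ["A", "B"], 2, [(0, 2), (2, 3), (3, 5)])
    print(convert(item))
```

The same projection gives the range constraint vector of the item predicted by predict-pred, applied there to the arguments of the predicate following the dot.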
RCG as a tool
One can use RCG as an intermediary device or pivot formalism. We will see two applications:
- RCG for TAG parsing
- RCG for Syntax-Directed Machine Translation
TAG → simple RCG
A TAG can straightforwardly be converted into an RCG.
- Introduce a predicate for each elementary tree.
- A predicate corresponding to
  – an auxiliary tree β has the form β(L, R), where L and R cover the yield of β to the left and the right of the foot node, including all material added to it,
  – an initial tree α has the form α(X), with X covering the yield of α and all trees added to it.
- A predicate α/β reduces the input by determining which parts of the string come from α/β itself and which parts come from substituted/adjoined trees.
TAG → simple RCG: Example
Elementary trees (in bracket notation):
  α:  [S ε]
  β1: [S_NA a [S S*_NA a]]
  β2: [S_NA b [S S*_NA b]]
Start predicate: S(X) → α(X)
α(ε) → ε
α(B1 B2) → β1(B1, B2) | β2(B1, B2)
β1(a B1, a B2) → β1(B1, B2) | β2(B1, B2)
β2(b B1, b B2) → β1(B1, B2) | β2(B1, B2)
β1(a, a) → ε
β2(b, b) → ε
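The clauses of this example can be transcribed into a small recursive recognizer to see which strings they accept. This is a direct, hand-written sketch for this one grammar, working on substrings rather than ranges; the function names are illustrative and `beta(t, l, r)` stands for β1 when t = "a" and for β2 when t = "b".

```python
def beta(t, left, right):
    """beta1 (t == 'a') or beta2 (t == 'b') applied to the pair (left, right)."""
    if (left, right) == (t, t):  # beta_t(t, t) -> eps
        return True
    # beta_t(t B1, t B2) -> beta1(B1, B2) | beta2(B1, B2)
    return (left[:1] == t and right[:1] == t
            and (beta("a", left[1:], right[1:]) or beta("b", left[1:], right[1:])))


def alpha(x):
    """alpha(eps) -> eps, alpha(B1 B2) -> beta1(B1, B2) | beta2(B1, B2)."""
    if x == "":
        return True
    return any(beta(t, x[:k], x[k:])
               for k in range(len(x) + 1)
               for t in ("a", "b"))


def recognize(w):
    """Start predicate: S(X) -> alpha(X)."""
    return alpha(w)


if __name__ == "__main__":
    for w in ["", "aa", "abab", "abba", "aba"]:
        print(repr(w), recognize(w))
```

The accepted strings are exactly those of the form ww over {a, b}: each β clause removes the same terminal from the front of both halves.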
RCG for MT
- Binary 2-RCG can be used for efficient syntax-based machine translation.
- Intuitively, the first argument of a clause specifies the source language input, while the parser determines the destination language via string variables, i.e., variables in the parser input that are instantiated by lexical items during parsing.
- Main advantage over previous systems based on synchronous versions of CFG/TAG/etc.: higher expressivity through the availability of copying/deleting while remaining in the same complexity class ($O(n^6)$).
- Refer to Søgaard (2008) (COLING 2008) for a complete presentation.
RCG for MT: Example (1)
Example grammar:
S(X1 X2, Y1 Y2) → NP(X1, Y1) VP(X2, Y2)
VP(X1 X2, Y1 Y2 Y3) → V(X1, Y1) ObjP(X2, Y2) Part(X1, Y3)
VP(X1, Y1 Y2) → V(X1, Y1) Part(X1, Y2)
ObjP(X1, Y1 Y2) → NP(X1, Y2) Prep(X1, Y1)
NP(X1 X2, Y1 Y2) → Art(X1, Y1) N(X2, Y2)
NP(he, er) → ε
V(entered, trat) → ε
Part(entered, ein) → ε
Prep(the room, in) → ε
Art(the, das) → ε
N(room, Zimmer) → ε
RCG for MT: Example (2)
- Call the parser with the input string w = He entered S, where S is a string variable, and the start predicate S(X1 X2, Y1 Y2).
- The algorithm should infer that S = Y1 Y2 = trat ein in order to reduce X1 X2 to ε.
Example derivation:
  he entered the room (the room)
  er trat in das Zimmer ein
Conclusions
- Range concatenation languages coincide with the class of
PTIME recognizable languages.
- We have seen a top-down algorithm and an Earley-style
algorithm.
- Other parsing strategies are possible (cf. Kallmeyer&Maier
(2009)).
- Range concatenation grammars are used as an intermediary formalism in different applications.