slide-1
SLIDE 1

A constraint driven metagrammar

Joseph Le Roux (1) - Benoît Crabbé (2) - Yannick Parmentier (3)

(1) Calligramme Project, LORIA - INPL
(2) HCRC / ICCS, University of Edinburgh
(3) Langue Et Dialogue Project, INRIA / LORIA - UHP

TAG+8 – Sydney 15 July 2006

1 / 32

slide-2
SLIDE 2

Introduction

◮ Our concern: semi-automatic grammar development of real-scale Lexicalised TAGs.

◮ Related problems: design and maintenance issues raised by the redundancy inherent to strong lexicalisation.

◮ The MetaGrammar approach:

  ◮ capturing linguistic generalisations among grammatical structures (i.e., trees),

  ◮ describing the trees of a grammar as combinations of elementary tree fragments.

◮ This work was carried out in collaboration with Prof. Denys Duchier.

2 / 32

slide-3
SLIDE 3

Outline

eXtensible MetaGrammar
Constraining admissible structures
An efficient implementation of constraints
Some features of the XMG system
Conclusion

3 / 32

slide-4
SLIDE 4

eXtensible MetaGrammar (1 / 2)

◮ Monotonic description of the grammar trees using an expressive and relatively intuitive language.

◮ MetaGrammar ≡ manipulation of elementary tree descriptions using a control language.

◮ (1) description of tree fragments and (2) combinations of these fragments.

◮ Two methodological axes of description (Crabbé, 05):

  1. structure sharing (i.e. reusable elementary tree fragments),
  2. alternatives (i.e. combination of fragments using conjunction and disjunction).

4 / 32

slide-5
SLIDE 5

eXtensible MetaGrammar (2 / 2)

◮ A language to describe tree fragments:

Description ::= x → y | x →+ y | x →∗ y | x ≺ y | x ≺+ y | x ≺∗ y | x[f :E] | x(p:E) (1)

◮ A language to combine tree fragments:

Class ::= Name → Content (2)

Content ::= Description | Name | Content ∨ Content | Content ∧ Content (3)

5 / 32
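
The combination language of rules (2) and (3) can be illustrated with a minimal Python sketch. It is not the XMG implementation: descriptions are kept as opaque labels, the names `Desc`, `Ref`, `And`, `Or`, `expand` and the class `Subject` are hypothetical, and `SubjectRel` is borrowed from a later slide. The sketch only shows how conjunction and disjunction unfold into alternative sets of fragments to accumulate.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Desc:
    """An atomic tree description, kept here as an opaque label."""
    text: str


@dataclass(frozen=True)
class Ref:
    """A reference to another class by name."""
    name: str


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


def expand(content, classes):
    """Return every alternative as a tuple of accumulated descriptions.

    Assumes the class definitions are not recursive.
    """
    if isinstance(content, Desc):
        return [(content.text,)]
    if isinstance(content, Ref):
        return expand(classes[content.name], classes)
    if isinstance(content, And):
        lefts = expand(content.left, classes)
        rights = expand(content.right, classes)
        # Conjunction: accumulate one alternative from each side.
        return [l + r for l, r in product(lefts, rights)]
    if isinstance(content, Or):
        # Disjunction: keep the alternatives of both sides.
        return expand(content.left, classes) + expand(content.right, classes)
    raise TypeError(f"unknown content: {content!r}")


# Hypothetical metagrammar: 'Subject' is an illustrative class, not from the slides.
CLASSES = {
    "SubjectCan": Desc("SubjectCan"),
    "SubjectRel": Desc("SubjectRel"),
    "Active": Desc("Active"),
    "Subject": Or(Ref("SubjectCan"), Ref("SubjectRel")),
    "Intransitive": And(Ref("Subject"), Ref("Active")),
}

alternatives = expand(Ref("Intransitive"), CLASSES)
```

Each tuple in `alternatives` is one set of fragments whose combined description would then be handed to a tree description solver.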

slide-6
SLIDE 6

Example (1 / 2)

◮ Tree fragment #1:

SubjectCan → (X[cat:s] → Y[cat:v]) ∧ (X → Z(mark:subst)[cat:n]) ∧ (Z ≺ Y)

[Tree for SubjectCan: root X[cat:s] with daughters Z↓[cat:n] and Y[cat:v], in that order]

6 / 32

slide-7
SLIDE 7

Example (1 / 2)

◮ Tree fragment #2:

Active → (X[cat:s] ∧ Y(mark:anchor)[cat:v]) ∧ (X → Y)

[Tree for Active: root X[cat:s] with daughter Y⋄[cat:v]]

7 / 32

slide-8
SLIDE 8

Example (1 / 2)

◮ Combination rule: Intransitive → SubjectCan ∧ Active (∗)

8 / 32

slide-9
SLIDE 9

Example (2 / 2)

Some trees for intransitive verbs (e.g., the lexical item sleeps)

S N↓ V (Canonical Subject)

S V⋄ (Active verb morph)

S N↓ V⋄ (e.g. the boy sleeps)

N N* S N↓ V (Extracted Subject)

S V⋄ (Active verb morph)

N N* S N↓ V⋄ (e.g. the boy who sleeps)

9 / 32

slide-10
SLIDE 10

About namespaces

◮ Scope of the node variables used within descriptions?

  ◮ local scope by default,

  ◮ possibility to explicitly manage namespaces via Import / Export declarations.

◮ Furthermore, introduction of an inheritance mechanism whose semantics corresponds to class conjunction and namespace merging.

10 / 32

slide-11
SLIDE 11

Outline

eXtensible MetaGrammar
Constraining admissible structures
An efficient implementation of constraints
Some features of the XMG system
Conclusion

11 / 32

slide-12
SLIDE 12

Constraining admissible structures

◮ Further constraining the tree structures generated from the metagrammar.

◮ Specifying constraints on the well-formedness of trees.

◮ Interest: avoid manual checking (e.g. no tree with more than one foot node, etc.).

◮ Classification of these constraints into 4 categories:

  1. Formal constraints
  2. Operational constraints
  3. Language-dependent constraints
  4. Theoretical constraints

12 / 32

slide-13
SLIDE 13

Formal constraints

◮ Constraints ensuring that the trees generated by the model builder are regular TAG trees.

◮ On top of being trees, the output structures must respect some specific criteria:

  ◮ each node has a category label,

  ◮ each leaf node is marked as either subst, foot or anchor,

  ◮ the category of the foot node is identical to that of the root node,

  ◮ etc.

13 / 32
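
The formal criteria listed above can be pictured as a simple check on a finished tree. The following Python sketch is only illustrative: the `Node` class and the `violations` function are hypothetical names, and it tests the criteria after the fact, whereas in XMG they are stated as constraints for the solver.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    cat: Optional[str]                 # category label, e.g. "s", "n", "v"
    mark: Optional[str] = None         # "subst", "foot", "anchor" or None
    children: List["Node"] = field(default_factory=list)


def violations(root):
    """Collect the formal constraints that the tree rooted at `root` breaks."""
    problems = []
    feet = []

    def visit(node):
        if node.cat is None:
            problems.append("node without category label")
        if not node.children:
            if node.mark not in ("subst", "foot", "anchor"):
                problems.append(f"unmarked leaf: {node.cat}")
            if node.mark == "foot":
                feet.append(node)
        for child in node.children:
            visit(child)

    visit(root)
    if len(feet) > 1:
        problems.append("more than one foot node")
    for foot in feet:
        if foot.cat != root.cat:
            problems.append("foot category differs from root category")
    return problems


# S(N↓, V⋄): a well-formed initial tree.
good = Node("s", children=[Node("n", "subst"), Node("v", "anchor")])

# N(N*, S(N↓, V)): the V leaf carries no mark, so one criterion is violated.
bad = Node("n", children=[
    Node("n", "foot"),
    Node("s", children=[Node("n", "subst"), Node("v")]),
])
```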

slide-14
SLIDE 14

Operational constraint (1 / 3)

◮ Constraints controlling the combinations of tree fragments (closely linked to the concept of Resources / Needs).

◮ Constraints based on a colouring of the nodes.

◮ Each node of the description is labelled either Black, Red or White.

◮ During minimal model computation, nodes are identified according to the following rules:

  ◦w + ◦w = ◦w
  •b + ◦w = •b
  •b + •b = ⊥
  •r + {◦w ; •b ; •r} = ⊥

14 / 32
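
The identification rules can be written down as a small lookup. This is a minimal sketch with assumed names (`merge`, `merge_all`, `saturated`); the failure value ⊥ is represented by `None`, which is a choice made for this example only.

```python
WHITE, BLACK, RED = "white", "black", "red"


def merge(c1, c2):
    """Colour of the node obtained by identifying two nodes, or None for ⊥."""
    if RED in (c1, c2):
        return None          # a red node cannot be identified with anything
    if c1 == BLACK and c2 == BLACK:
        return None          # two black nodes cannot be identified
    if BLACK in (c1, c2):
        return BLACK         # black + white = black
    return WHITE             # white + white = white


def merge_all(colours):
    """Fold `merge` over a non-empty list of colours."""
    result = colours[0]
    for colour in colours[1:]:
        if result is None:
            return None
        result = merge(result, colour)
    return result


def saturated(colour):
    """In a valid model every node ends up black or red."""
    return colour in (BLACK, RED)
```

For instance, the white S node of a subject fragment and the black S node of an active-verb fragment merge into a black node, while a white node left on its own is not saturated.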

slide-15
SLIDE 15

Operational constraint (2 / 3)

Benefits:

◮ Avoids node naming issues (no global names).

◮ Allows the metagrammatical description to be reduced (node equations are replaced with implicit coloured node identifications).

◮ Facilitates the reuse of the same tree fragment several times.

15 / 32

slide-16
SLIDE 16

Operational constraint (3 / 3)

Example:

S◦w N•r V◦w (SubjectCan)

N•r N•r S◦w N•r V◦w (SubjectRel)

S•b V⋄•b (Active)

S◦w V◦w N↓•r (ObjectCan)

16 / 32

slide-17
SLIDE 17

Language-dependent constraints (1 / 2)

◮ For French, the ordering and uniqueness of clitics.

◮ (Perlmutter, 70): first, they appear in front of the verb in a fixed order according to their rank (a-b) and, second, two different clitics in front of the verb cannot have the same rank (c).

◮ For instance, the clitics le and la have rank 3 and lui has rank 4 (rank is a node property).

  (a) Jean le_3 lui_4 donne (John gives it to him)
  (b) *Jean lui_4 le_3 donne (*John gives to him it)
  (c) *Jean le_3 la_3 donne (*John gives it it)

17 / 32
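
Both halves of the clitic generalisation amount to requiring strictly increasing ranks before the verb. The sketch below is a hedged illustration: the function name `clitic_sequence_ok` is hypothetical, and the rank table contains only the three clitics mentioned on the slide.

```python
# Ranks taken from the slide: le and la have rank 3, lui has rank 4.
RANK = {"le": 3, "la": 3, "lui": 4}


def clitic_sequence_ok(clitics, rank=RANK):
    """True if the preverbal clitics appear in strictly increasing rank.

    Strict increase covers both conditions at once: the fixed order
    (examples a-b) and the ban on two clitics of equal rank (example c).
    """
    ranks = [rank[c] for c in clitics]
    return all(a < b for a, b in zip(ranks, ranks[1:]))
```

Applied to the three examples, `["le", "lui"]` is accepted, while `["lui", "le"]` and `["le", "la"]` are rejected.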

slide-18
SLIDE 18

Language-dependent constraints (2 / 2)

S N↓ V’ ≺+ (Jean)

V’ Cl↓3 V ≺+ (le)

V’ Cl↓4 V ≺+ (lui)

S V’ V⋄ (donne)

S N↓ V’ Cl↓3 Cl↓4 V⋄ (Jean le lui donne)

S N↓ V’ Cl↓4 Cl↓3 V⋄ (Jean lui le donne)

18 / 32

slide-19
SLIDE 19

Theoretical principles

◮ Language-independent principles related to the grammatical formalism described.

◮ For TAG, such a principle may be the Principle of Predicate-Argument Co-occurrence.

◮ NB: such principles are not yet implemented within the XMG system.

19 / 32

slide-20
SLIDE 20

Outline

eXtensible MetaGrammar
Constraining admissible structures
An efficient implementation of constraints
Some features of the XMG system
Conclusion

20 / 32

slide-21
SLIDE 21

An efficient implementation of constraints

◮ A 3-step metagrammar compilation:

  1. translation of the descriptions into intermediate code for a specific virtual machine (WAM-based),
  2. execution of this code and accumulation of partial tree descriptions,
  3. solving of tree descriptions.

◮ The third step is performed by a tree description solver implemented using the Constraint Satisfaction approach.

◮ In this context, the constraints introduced above can be expressed naturally.

21 / 32

slide-22
SLIDE 22

Solving Tree Descriptions (1 / 3)

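1. Setting the constraint framework:

◮ Each node in the input description is associated with an integer.

◮ Then, we use an abstract data type to refer to a node of a valid model in terms of the nodes being equal to it, above it, below it, or to its side:

[Figure: the regions Eq, Up, Down, Left and Right around a node]

\[ N^i_x ::= \mathrm{node}(\mathrm{Eq}: \{\mathit{ints}\},\ \mathrm{Up}: \{\mathit{ints}\},\ \mathrm{Down}: \{\mathit{ints}\},\ \mathrm{Left}: \{\mathit{ints}\},\ \mathrm{Right}: \{\mathit{ints}\}) \]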

22 / 32
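
To make the five-field representation concrete, the sketch below computes the fields for every node of an already solved tree. It is an illustration under assumptions, not the solver: the names `NodeSets` and `node_sets` are hypothetical, each description variable is taken to denote its own model node (so Eq is a singleton), and the sets are computed from a known tree rather than found by constraint propagation.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class NodeSets:
    eq: set
    up: set
    down: set
    left: set
    right: set


def node_sets(children, root):
    """Compute Eq/Up/Down/Left/Right for each node of an ordered tree.

    `children` maps a node (an integer) to the ordered list of its daughters.
    """
    pre, post = {}, {}
    pre_counter, post_counter = count(), count()

    def visit(node):
        pre[node] = next(pre_counter)
        for child in children.get(node, []):
            visit(child)
        post[node] = next(post_counter)

    visit(root)

    result = {}
    for n in pre:
        sets = NodeSets({n}, set(), set(), set(), set())
        for m in pre:
            if m == n:
                continue
            if pre[m] < pre[n] and post[m] > post[n]:
                sets.up.add(m)        # m is a proper ancestor of n
            elif pre[m] > pre[n] and post[m] < post[n]:
                sets.down.add(m)      # m is a proper descendant of n
            elif pre[m] < pre[n]:
                sets.left.add(m)      # m precedes n
            else:
                sets.right.add(m)     # m follows n
        result[n] = sets
    return result


# Tree S(N, V) with S = 1, N = 2, V = 3.
sets = node_sets({1: [2, 3]}, root=1)
```

For every node the five sets are disjoint and together cover all the nodes of the tree, which is the property the set-constraint encoding relies on.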

slide-23
SLIDE 23

Solving Tree Descriptions (2 / 3)

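◮ The input description is converted into relation constraints on node sets. For instance, the dominance relation x → y can be translated as:

\[ N^i_x \to N^j_y \equiv [\, N^i_x.\mathrm{EqUp} \subseteq N^j_y.\mathrm{Up} \;\land\; N^i_x.\mathrm{Down} \supseteq N^j_y.\mathrm{EqDown} \;\land\; N^i_x.\mathrm{Left} \subseteq N^j_y.\mathrm{Left} \;\land\; N^i_x.\mathrm{Right} \subseteq N^j_y.\mathrm{Right} \,] \]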

23 / 32
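
The four set inclusions of the dominance translation can be checked directly on hand-written node sets. In this sketch the function name `dominance_holds` is hypothetical, EqUp and EqDown are assumed to stand for Eq ∪ Up and Eq ∪ Down, and the sets describe the tree S(N, V) with S = 1, N = 2, V = 3.

```python
def dominance_holds(nx, ny):
    """Check the four inclusions stated for 'x dominates y'."""
    eq_up_x = nx["Eq"] | nx["Up"]        # assumed reading of EqUp
    eq_down_y = ny["Eq"] | ny["Down"]    # assumed reading of EqDown
    return (
        eq_up_x <= ny["Up"]              # EqUp(x) is a subset of Up(y)
        and nx["Down"] >= eq_down_y      # Down(x) is a superset of EqDown(y)
        and nx["Left"] <= ny["Left"]     # Left(x) is a subset of Left(y)
        and nx["Right"] <= ny["Right"]   # Right(x) is a subset of Right(y)
    )


S = {"Eq": {1}, "Up": set(), "Down": {2, 3}, "Left": set(), "Right": set()}
N = {"Eq": {2}, "Up": {1}, "Down": set(), "Left": set(), "Right": {3}}
V = {"Eq": {3}, "Up": {1}, "Down": set(), "Left": {2}, "Right": set()}
```

With these sets, `dominance_holds(S, N)` and `dominance_holds(S, V)` succeed, whereas `dominance_holds(N, V)` fails because N is a sister of V, not an ancestor.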

slide-24
SLIDE 24

Solving Tree Descriptions (2 / 3)

\[ N^i_x.\mathrm{Down} \supseteq N^j_y.\mathrm{EqDown} \]

24 / 32

slide-25
SLIDE 25

Solving Tree Descriptions (3 / 3)

2. Searching for the solutions to the problem:

◮ The solutions are the assignments for each of the node sets associated with the nodes of the input description.

◮ A distribution strategy is used to explore the consistent assignments for these node sets.

◮ The implementation of the solver follows the ideas of (Duchier and Niehren, 2000) and uses the constraint programming support of the Oz/Mozart system.

25 / 32

slide-26
SLIDE 26

Extension to specific constraints (1 / 4)

◮ This constraint framework can be extended relatively easily to solve specific constraints, such as those introduced previously.

◮ The idea:

  1. extension of the node representation (tuples whose fields contain sets of nodes),
  2. definition of additional constraints on these fields, reflecting the syntactic constraints we want to express.

26 / 32

slide-27
SLIDE 27

Extension to specific constraints (2 / 4)

1st example: Clitic uniqueness.

In a valid model φ, there is at most one node having a given property p.

◮ For each node x in the description, we add to its representation a field containing a boolean variable p_x indicating whether the node denoting x in the model has this property or not:

\[ p_x \equiv (N^i_x.\mathrm{Eq} \cap V^\varphi_p) \neq \emptyset \]

◮ Finally, reading true as 1 and false as 0:

\[ \sum_{x \in \varphi} p_x \leq 1 \]

27 / 32
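
The uniqueness constraint can be pictured as counting booleans. The sketch below makes one assumption explicit: the sum is taken over the nodes of the model, each given as the set of description variables identified with it (its Eq set). The names `property_flags` and `unique_property` are hypothetical.

```python
def property_flags(model_nodes, v_p):
    """One boolean per model node: does its Eq set meet V_p?"""
    return [bool(eq & v_p) for eq in model_nodes]


def unique_property(model_nodes, v_p):
    """True if at most one model node carries the property (sum of flags <= 1)."""
    return sum(property_flags(model_nodes, v_p)) <= 1


# Model with three nodes; variables 1 and 4 were identified, as were 3 and 5.
MODEL = [{1, 4}, {2}, {3, 5}]
```

If only variable 2 carries the property, the model is accepted; if variables 2 and 3 both carry it, two distinct model nodes have the property and the model is rejected.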

slide-28
SLIDE 28

Extension to specific constraints (3 / 4)

2nd example: Colouring constraint.

◮ During description solving, coloured nodes are identified according to the following rules:

  ◦w + ◦w = ◦w
  •b + ◦w = •b
  •b + •b = ⊥
  •r + {◦w ; •b ; •r} = ⊥

◮ A valid model in this context is a saturated tree, i.e. where nodes are either black (possibly resulting from identifications) or red.

28 / 32

slide-29
SLIDE 29

Extension to specific constraints (4 / 4)

◮ First, the tuples representing nodes are extended by adding an integer field RB referring to the red or black node with which the node has been identified.

◮ Then, we define the following constraints:

\[ x \in V_R \Rightarrow N^i_x.\mathrm{RB} = i \land N^i_x.\mathrm{Eq} = \{i\} \quad (a) \]

\[ x \in V_B \Rightarrow N^i_x.\mathrm{RB} = i \quad (b) \]

\[ x \in V_W \Rightarrow N^i_x.\mathrm{RB} \in V^\varphi_B \quad (c) \]

29 / 32
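
Constraints (a) to (c) can be checked on a candidate identification as follows. This is a sketch under assumptions: the function name `satisfies_rb` and the dictionary encoding are hypothetical, and the check is applied to a complete candidate rather than propagated by a solver. The example numbers the nodes of SubjectCan as 1 (S, white), 2 (N, red), 3 (V, white) and those of Active as 4 (S, black), 5 (V, black).

```python
def satisfies_rb(colour, rb, eq):
    """Check constraints (a), (b) and (c) for every description node i."""
    for i, c in colour.items():
        if c == "red" and not (rb[i] == i and eq[i] == {i}):
            return False     # (a) a red node stands alone
        if c == "black" and rb[i] != i:
            return False     # (b) a black node is its own RB
        if c == "white" and colour.get(rb[i]) != "black":
            return False     # (c) a white node is identified with a black node
    return True


COLOUR = {1: "white", 2: "red", 3: "white", 4: "black", 5: "black"}

# Candidate model: S nodes 1 and 4 identified, V nodes 3 and 5 identified.
RB = {1: 4, 2: 2, 3: 5, 4: 4, 5: 5}
EQ = {1: {1, 4}, 2: {2}, 3: {3, 5}, 4: {1, 4}, 5: {3, 5}}
```

Redirecting the white S node to the red N node, for example with `RB[1] = 2`, violates (c) and the candidate is rejected.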

slide-30
SLIDE 30

Outline

eXtensible MetaGrammar
Constraining admissible structures
An efficient implementation of constraints
Some features of the XMG system
Conclusion

30 / 32

slide-31
SLIDE 31

Some features of the XMG system

◮ The XMG system has been used successfully to compute a core TAG for French (6,000+ trees computed from a description containing 293 classes).

◮ This metagrammar has been designed relatively quickly, as the description language is intuitive, as advocated in (Crabbé, 05).

◮ The compilation of a TAG with more than 6,000 trees takes about 15 min on a 2.6 GHz P4 processor with 1 GB RAM.

◮ XMG is released under the terms of the GPL-like CeCILL license and can be freely downloaded at http://sourcesup.cru.fr/xmg

31 / 32

slide-32
SLIDE 32

Conclusion

◮ The XMG formalism includes a fully declarative language.

◮ On top of describing tree-based grammars, XMG allows higher-level constraints to be expressed, such as operational constraints (colours) and language-dependent constraints (clitic uniqueness).

◮ These specific constraints are processed using Constraint Programming. This paradigm makes it easy to extend the library of constraints.

◮ Up to now, the constraints implemented within XMG are:

  ◮ Colouring.
  ◮ Uniqueness.
  ◮ Rank (ordering).
  ◮ TAG labelling constraints.
  ◮ Node arity.

32 / 32