SLIDE 1

Grammar Implementation with Lexicalized Tree Adjoining Grammars and Frame Semantics

Grammar implementation with XMG Laura Kallmeyer, Timm Lichte, Rainer Osswald & Simon Petitjean

University of Düsseldorf

DGfS Fall School, September 18, 2017

SFB 991

SLIDE 2

Outline

1 Overview: Last week, this week
2 What is grammar implementation?
3 Two ways of tree template implementation
    Metarules
    Metagrammars
4 eXtensible Metagrammar (XMG)
5 Lexicon and parser
6 XMG 2: tutorial
7 Principles
8 Summary

Kallmeyer, Lichte, Osswald & Petitjean (HHU Düsseldorf) 2 2

SLIDE 3

Last week

Mon: introduction to LTAG
Tue: syntactic analyses with LTAG I, derivation trees, feature structures
Wed: syntactic analyses with LTAG II, introduction to LTAG semantics
Thu: introduction to frame semantics
Fri: putting things together

SLIDE 4

This week

Mon: introduction to grammar engineering and XMG
Tue: implementing syntax with XMG
Wed: implementing semantics with XMG
Thu: parsing implemented grammars with TuLiPA
Fri: conclusion

SLIDE 5

Grammar engineering: the task

grammar sketches, example analyses (incomplete)
↓
implemented grammars, digital resource (complete)
↓
grammar in action, parsing (i.a. usable in NLP)

SLIDE 6

Grammar engineering: the problem

How to factorize the set of templates?
⇒ express lexical generalizations, e.g. active-passive diathesis
⇒ define tree families
How to turn this into an electronic resource?
How to plug it into a lexicon and use it?

SLIDE 8

Two kinds of grammar implementation

[Diagram with the labels: grammar / linguistic theory; "implementation"; specifications in accordance with a grammar formalism; evaluation of the theory]

"As is frequently pointed out but cannot be overemphasized, an important goal of formalization in linguistics is to enable subsequent researchers to see the defects of an analysis as clearly as its merits; only then can progress be made efficiently." (Dowty 1979: 322)

SLIDE 9

Two kinds of grammar implementation

[Diagram with the labels: grammar / linguistic theory; "implementation"; specifications in accordance with a grammar formalism; evaluation of the theory; "implementation"; grammar resource; computational application]

SLIDE 10

What kind of grammar resource?

[Figure: the tree template S(NP, VP(V⋄, NP)); lexical insertion places the anchor "repairs" at V⋄]

SLIDE 11

The implementation task for LTAG

General task: Implement a large-coverage LTAG, i.e. based on the XTAG grammar!
Subtasks:
1 Generate unlexicalized trees (= tree templates)!
2 Generate a database of lexical anchors (= the lexicon)!
3 Connect the tree templates with the lexicon (= lexical insertion)!

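The third subtask, lexical insertion, can be illustrated with a minimal Python sketch. It is not how XTAG or TuLiPA implement it: the nested-tuple tree encoding and the names `TEMPLATE`, `lexical_insertion` and `bracket` are assumptions made for this illustration only. The anchor node is recognized by the trailing ⋄ in its label.

```python
# A tree is a (label, children) pair; the anchor is marked by a trailing "⋄".
TEMPLATE = ("S", [("NP", []), ("VP", [("V⋄", []), ("NP", [])])])


def lexical_insertion(tree, word):
    """Return a copy of the template with `word` attached below the anchor node."""
    label, children = tree
    if label.endswith("⋄"):
        # Drop the anchor mark and hang the lexical item under the node.
        return (label[:-1], [(word, [])])
    return (label, [lexical_insertion(child, word) for child in children])


def bracket(tree):
    """Render a tree in bracket notation, e.g. S(NP, VP(V, NP))."""
    label, children = tree
    if not children:
        return label
    return f"{label}({', '.join(bracket(child) for child in children)})"


if __name__ == "__main__":
    print(bracket(TEMPLATE))
    print(bracket(lexical_insertion(TEMPLATE, "repairs")))
```

The template itself is left untouched, so the same tree template can be reused for every lexical anchor of its family.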
SLIDE 12

Two ways of grammar implementation with TAG

Two existing toolkits:

XTAG tools[23]
1 implementation tools ⇒ metarule approach
2 editor/viewer for MorphDB and SynDB
3 parser

XMG + lexConverter + TuLiPA
1 XMG: eXtensible MetaGrammar[9] ⇒ metagrammar approach
2 lexConverter (LEX2ALL)
3 TuLiPA: Tübingen Linguistic Parsing Architecture[16]

SLIDE 14

The situation

12 tree templates for intransitive verbs, 39 tree templates for transitive verbs
[Figure: example templates S(NP, VP(V)) and S(NP, VP(V, NP)), together with S(NP, S(NP(ε), VP(V))) and S(NP, S(NP(ε), VP(V, NP))), ...]
Basically, XTAG defines a set of 1008 unrelated tree templates.

SLIDE 15

Metarules for LTAG

Idea from GPSG[12], later applied to XTAG[2,3,19]
[Diagram with the labels: tree fragments; accumulation; core grammar (tree templates); metarules; expanded grammar (tree templates)]
Metarules connect tree templates of a tree family.

SLIDE 17

Metarules for LTAG: Example

extraction (flattened tree figures): S NP γ ⟹ S NP S NP ε γ

passivization (flattened tree figures): S NP0 VP VP V⋄ NP1 γ ⟹ S NP1 VP VP V⋄ PP P by NP0 γ

SLIDE 18

Metarules for LTAG: Example

[Diagram: the tree family Tnx0nx1 contains the templates αnx0Vnx1, αW0nx0Vnx1, αnx1Vbynx0 and αW1nx1Vbynx0; the templates are connected by arrows labelled extraction, passivization and extraction]

SLIDE 19

Metarules for LTAG: Problems[2]

Metarules are very powerful:
  deletion, copying, recursive application, metavariables over trees
  order sensitive
  in the unrestricted case: undecidable[21]
Restrictions (GPSG):[20]
  finite closure: apply every metarule at most once! ⇒ still NP-complete
  biclosure: apply at most two metarules in a row! ⇒ insufficient for LTAG metarules[2]
  explicit rule ordering (by means of finite state automata)[19]

SLIDE 20

Metagrammars for LTAG

Candito (1996)[8,9,22]
[Diagram with the labels: tree fragments; accumulation of descriptions; tree templates; arbitrary disjunction; tree families]

SLIDE 23

Metagrammars for LTAG: Tree descriptions

LD: Description language for trees
Let n1 and n2 be node variables:

Description := n1 → n2 | n1 →+ n2 | n1 →∗ n2 | n1 ≺ n2 | n1 ≺+ n2 | n1 ≺∗ n2 | n1 = n2 | Description ∧ Description

Examples:
The tree S(NP, VP) corresponds to nS → nNP ∧ nS → nVP ∧ nNP ≺ nVP
The underspecified tree in which S immediately dominates one NP, properly dominates (+) a second NP, and the first NP precedes the second (≺∗) corresponds to nS → nNP1 ∧ nS →+ nNP2 ∧ nNP1 ≺∗ nNP2

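To make the description language concrete, here is a small Python sketch that checks whether a given tree satisfies a description under a given assignment of node variables. It is an illustration, not XMG's solver. The assumptions are: the ASCII operators `->`, `->+`, `->*`, `>>`, `>>+`, `>>*` stand in for →, →+, →∗, ≺, ≺+, ≺∗; precedence is only evaluated between sisters; and the names `TREE`, `DESCRIPTION`, `holds` and `satisfies` are invented for this example.

```python
# A tree maps each (uniquely named) node to the ordered list of its daughters.
TREE = {"S": ["NP", "VP"], "NP": [], "VP": ["V"], "V": []}

# nS -> nNP  and  nS -> nVP  and  nNP >> nVP
DESCRIPTION = [("nS", "->", "nNP"), ("nS", "->", "nVP"), ("nNP", ">>", "nVP")]


def dominates(tree, a, b, strict):
    """True if a dominates b; with strict=True, a and b must be distinct."""
    if a == b:
        return not strict
    return any(dominates(tree, child, b, strict=False) for child in tree[a])


def sister_positions(tree, a, b):
    """Positions of a and b under a common mother, or None if not sisters."""
    for daughters in tree.values():
        if a in daughters and b in daughters:
            return daughters.index(a), daughters.index(b)
    return None


def holds(tree, constraint, assignment):
    """Check a single constraint (left, operator, right)."""
    left, op, right = constraint
    a, b = assignment[left], assignment[right]
    if op == "=":
        return a == b
    if op == "->":
        return b in tree[a]
    if op == "->+":
        return dominates(tree, a, b, strict=True)
    if op == "->*":
        return dominates(tree, a, b, strict=False)
    if op in (">>", ">>+", ">>*"):
        if a == b:
            return op == ">>*"  # only the reflexive variant allows identity
        positions = sister_positions(tree, a, b)
        if positions is None:
            return False  # simplification: precedence only between sisters
        i, j = positions
        return j - i == 1 if op == ">>" else i < j
    raise ValueError(f"unknown operator: {op}")


def satisfies(tree, description, assignment):
    """A conjunction of constraints holds if every conjunct holds."""
    return all(holds(tree, c, assignment) for c in description)


if __name__ == "__main__":
    g = {"nS": "S", "nNP": "NP", "nVP": "VP"}
    print(satisfies(TREE, DESCRIPTION, g))
```

The second example description above is satisfied by the same tree when both NP variables denote the same node, which is exactly the kind of underspecification that descriptions allow.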
SLIDE 25

Metagrammars for LTAG: Example

Minimal model of tree descriptions: You may add edges but not nodes!
[Figure: a set of tree fragments (flattened: S NP S S NP NP ε VP VP V⋄ VP NP) and one of its minimal models, S(NP, VP(V⋄))]

SLIDE 26

Metagrammars for LTAG: Example

Minimal model of tree descriptions: You may add edges but not nodes!
[Figure: the same set of tree fragments (flattened: S NP S S NP NP ε VP VP V⋄ VP NP) and another of its minimal models (flattened: S NP S NP ε VP V⋄ NP)]

SLIDE 30

Metagrammars for LTAG: Example

Minimal model of tree descriptions: You may add edges but not nodes!
[Figure: two descriptions of S(NP, VP) |= S(NP, VP); three descriptions over S, NP, VP |= further minimal models (flattened: S NP S NP VP VP ...)]

SLIDE 31

Metagrammars for LTAG: Properties

no deletion, no copying, no recursion
declarative, order insensitive
The number of minimal models is finite.
BUT: the number of minimal models can grow factorially (O(n!)) in the number of described nodes.
Does it suffice? How to express passivization?

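The O(n!) growth can be seen in the simplest case: n daughters of one mother whose linear order is left unspecified. The sketch below enumerates the admissible sister orders by brute force. It deliberately ignores node identification (merging), which a real tree solver also has to consider, and the function name `orderings` is an assumption made for this example.

```python
from itertools import permutations


def orderings(nodes, precedes=()):
    """All linear orders of `nodes` consistent with the (a, b) pairs 'a before b'."""
    result = []
    for perm in permutations(nodes):
        position = {node: i for i, node in enumerate(perm)}
        if all(position[a] < position[b] for a, b in precedes):
            result.append(perm)
    return result


if __name__ == "__main__":
    for n in range(1, 6):
        nodes = [f"n{i}" for i in range(n)]
        print(n, len(orderings(nodes)))  # unconstrained: n! orders
    print(len(orderings("ABC", [("A", "C")])))  # one precedence constraint
```

Every precedence constraint added to the description prunes this space, which is one reason to state ordering constraints wherever the grammar writer knows them.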
SLIDE 34

Metagrammars for LTAG: Passivization

S(NP, VP) ∧ VP(V⋄) ∧ ( VP(V⋄, NP) ∨ VP(V⋄, PP(P(by), NP)) )

disjunction does the trick!

|= S(NP, VP(V⋄, NP)) and S(NP, VP(V⋄, PP(P(by), NP)))

Tnx0nx1

SLIDE 35

Metagrammar for LTAG: Classes

Tree descriptions are bundled into so-called classes:
LC: Description language for the combination of tree descriptions

Class := Name : Content
Content := Description | Name | Content ∨ Content | Content ∧ Content

Upon instantiating/using a class:
  Node variables are replaced by fresh ones.
  Node variables are known to the instantiating class.
  The class name is replaced by the content in the instantiating class.
⇒ Classes can be reused!

SLIDE 38

Metagrammar for LTAG: Classes

Tnx0Vnx1: S(NP, VP) ∧ VP(V⋄) ∧ ( VP(V⋄, NP) ∨ VP(V⋄, PP(P(by), NP)) )
Tnx0Vnx1: Subject ∧ VerbProjection ∧ (Object ∨ by-Phrase)
Tnx0Vnx1: nx0V ∧ (Object ∨ by-Phrase)
[Diagram: class hierarchy over Subject, VerbProjection, Object, by-Phrase, nx0V and Tnx0Vnx1]

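How a class with conjunction and disjunction unfolds into alternative bundles of descriptions can be sketched in a few lines of Python. This is only an illustration of the LC grammar: the tuple encoding and the names `CLASSES` and `alternatives` are assumptions, leaf names such as `Subject` stand for tree descriptions that are not spelled out, and the renaming of node variables on instantiation is ignored.

```python
from itertools import product

# Content is a class/description name, ("and", c1, c2, ...) or ("or", c1, c2, ...).
CLASSES = {
    "nx0V": ("and", "Subject", "VerbProjection"),
    "Tnx0Vnx1": ("and", "nx0V", ("or", "Object", "by-Phrase")),
}


def alternatives(content, classes):
    """Expand content into a list of alternatives, each a list of leaf descriptions."""
    if isinstance(content, str):
        if content in classes:
            # The class name is replaced by its content.
            return alternatives(classes[content], classes)
        return [[content]]
    op, *parts = content
    expanded = [alternatives(part, classes) for part in parts]
    if op == "or":
        # Disjunction: collect the alternatives of every disjunct.
        return [alt for alts in expanded for alt in alts]
    if op == "and":
        # Conjunction: accumulate one alternative from each conjunct.
        return [sum(combo, []) for combo in product(*expanded)]
    raise ValueError(f"unknown operator: {op}")


if __name__ == "__main__":
    for alt in alternatives("Tnx0Vnx1", CLASSES):
        print(" ∧ ".join(alt))
```

Each resulting alternative is a purely conjunctive description, so the active and the by-phrase variant of the family are solved separately.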
SLIDE 39

Metagrammar for LTAG: Class hierarchies

There are very many possible class hierarchies ...

[Diagram: one possible class hierarchy over Subject, WhNP+EmptyWord, Object, BaseSubject, WhSubject, WhObject, BaseObject, VerbProjection, alphanx0V, alphaW0nx0V, Tnx0V, alphaW0nx0Vnx1, alphanx0Vnx1, alphaW1nx0Vnx1 and Tnx0Vnx1]

SLIDE 40

Metagrammar for LTAG: Class hierarchies

There are very many possible class hierarchies ...

[Diagram: another class hierarchy over Subject, VerbProjection, Object, nx0V, alphanx0V, WhNP+EmptyWord, nx0Vnx1, Wnx0Vnx1, alphanx0Vnx1, alphaW0nx0V, alphaW0nx0Vnx1, alphaW1nx0Vnx1, Tnx0V and Tnx0Vnx1]

SLIDE 41

Metagrammar for LTAG: Class hierarchies

There are very many possible class hierarchies ...

[Diagram: class hierarchy over WhNP+EmptyWord, BaseSubject, WhSubject, WhObject, BaseObject, Subject, VerbProjection, Object, Tnx0V and Tnx0Vnx1] [1]

SLIDE 42

Metagrammar for LTAG: Class hierarchies

There are very many possible class hierarchies ...

[Diagram: class hierarchy over Subject, WhNP+EmptyWord, Object, BaseSubject, WhSubject, WhObject, BaseObject, VerbProjection, Tnx0V and Tnx0Vnx1]

SLIDE 43

Metagrammar for LTAG: Class hierarchies

...but not everything is possible:
[Diagram: alphanx0Vnx1, alphaW0nx0Vnx1, alphanx1Vbynx0, alphaW1nx1Vbynx0 and Tnx0nx1]

SLIDE 47

eXtensible Metagrammar (XMG): Background

developed at LORIA, Nancy, LIFO, Orléans and HHU, Düsseldorf[9]
written in YAP and Python (as of XMG2; XMG 1 was written in Oz/Mozart)
available at dokufarm.phil.hhu.de/xmg

Why “eXtensible”?
  highly modularized[17]
  dimensions with dedicated description languages and compilers (<syn>, <sem>, <frame>, <morph>, ...)
  interface using shared variables

Some existing implementations using XMG:
  French: FrenchTAG[8]
  English: XTAG with XMG[1]
  German: GerTT[14]

SLIDE 48

eXtensible Metagrammar (XMG): Example

[Figure: S(NP↓, VP)]

class Subject
export ?S
declare ?S ?NP ?VP
{ <syn>{
    node ?S [cat=s]{
      node ?NP (mark=subst) [cat=np]
      node ?VP [cat=vp]
    }
  }
}

SLIDE 49

eXtensible Metagrammar (XMG): Example

[Figure: S(NP↓, VP)]

class Subject
export ?S
declare ?S ?NP ?VP
{ <syn>{
    node ?S [cat=s];
    node ?NP (mark=subst) [cat=np];
    node ?VP [cat=vp];
    ?S -> ?NP;
    ?S -> ?VP;
    ?NP >> ?VP
  }
}

SLIDE 50

eXtensible Metagrammar (XMG): Example

[Figure: S(NP↓, VP) and VP(V⋄) |= S(NP↓, VP(V⋄))]

class alphanx0v
import VerbProjection[]
declare ?Subj
{
  ?Subj = Subject[];
  ?Subj.?VP = ?VP
}

SLIDE 51

eXtensible Metagrammar (XMG): Example with <frame>

(Lichte & Petitjean)[15]
[Figure: the tree S(NP[I=1], VP[E=0](V⋄[E=0])) paired with the frame 0[event, actor: 1]]

class Subj
...
<syn>{
  node ?S [cat=s];
  node ?SUBJ [cat=np, top=[i=?1]];
  node ?VP [cat=vp, bot=[e=?0]];
  node ?V (mark=anchor) [cat=v, top=[e=?0]];
  ?S -> ?SUBJ;
  ?S -> ?VP;
  ?VP ->* ?V;
  ?SUBJ >> ?VP
};
<frame>{
  ?0[event, actor:?1]
}
...

SLIDE 53

Lexicon and parser

[Diagram of the implementational cycle, with the labels: metagrammar; XMG compiler; compiled metagrammar; 2-layered lexicon; input sentence; TuLiPA parser; parsing result]

SLIDE 55

Lexicon and parser: A 2-layered lexicon

[Diagram: loves ("morphological lexicon") love ("lemma lexicon") Tnx0Vnx1]

Morphological lexicon: maps an (inflected) token to some base form (= lemma), while preserving morphological information in a feature structure.

  loves   love    [pos=v; num=sing; pers=3;]
  Peter   Peter   [pos=n; num=sing; pers=3; case=nom|acc;]

Interface with tree templates: feature unification during lexical insertion

SLIDE 56

Lexicon and parser: A 2-layered lexicon

[Diagram: loves ("morphological lexicon") love ("lemma lexicon") Tnx0Vnx1]

Lemma lexicon: maps a lemma onto tree tuple families, while also containing selectional restrictions (e.g., case assignment).

  *ENTRY: love
  *CAT: v
  *SEM:
  *ACC: 1
  *FAM: Tnx0Vnx1
  *FILTERS: []
  *EX:
  *EQUATIONS:
  NParg1 -> case = nom
  NParg2 -> case = acc
  *COANCHORS:

Interface with tree templates:
  EQUATIONS → nodes of tree templates
  FILTERS → selection of tree templates

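The two lexicon layers can be pictured as two chained lookups. The Python sketch below hard-codes the entries shown on the slides. The dictionary layout and the names `MORPH`, `LEMMAS` and `lookup` are assumptions for illustration and do not reflect the file formats that lexConverter or TuLiPA actually read.

```python
# Layer 1: inflected token -> (lemma, morphological features)
MORPH = {
    "loves": ("love", {"pos": "v", "num": "sing", "pers": "3"}),
}

# Layer 2: lemma -> tree family plus equations on nodes of the templates
LEMMAS = {
    "love": {
        "cat": "v",
        "fam": "Tnx0Vnx1",
        "equations": {"NParg1": {"case": "nom"}, "NParg2": {"case": "acc"}},
    },
}


def lookup(token):
    """Follow token -> lemma -> tree family; return None for unknown entries."""
    if token not in MORPH:
        return None
    lemma, features = MORPH[token]
    entry = LEMMAS.get(lemma)
    if entry is None:
        return None
    return {
        "lemma": lemma,
        "features": features,        # unified with the anchor on insertion
        "family": entry["fam"],      # selects the tree templates
        "equations": entry["equations"],
    }


if __name__ == "__main__":
    print(lookup("loves"))
```

Keeping the layers apart means that a new inflected form only needs a morphological entry, while a new verb only needs a lemma entry pointing to an existing family.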
SLIDE 57

Lexicon and parser: The TuLiPA parser

(Parmentier et al.)[16]
The Tübingen Linguistic Parsing Architecture (TuLiPA) uses Range Concatenation Grammar (RCG) as a pivot formalism.
Components:
1 TAG-to-RCG converter (on-line)
2 RCG parser → RCG derivation forest → TAG derivation forest
3 Parse viewer (derived tree, derivation tree, dependency view, semantic representation)
Availability of TuLiPA: written in Java and released under the GNU GPL (http://sourcesup.cru.fr/tulipa/)

SLIDE 59

How does it work?

XMG processing steps are as follows:
1 The metagrammar is compiled: the metagrammatical language is translated into executable code
2 The generated code is executed: descriptions are accumulated into the dimensions
3 Descriptions are solved: every dimension comes with a dedicated solver
4 Models are converted into the output language (XML)

SLIDE 60

Tools

XMG-1
  eXtensible (?) Metagrammar
  Only 3 dimensions
XMG-2
  Arbitrarily many dimensions, with DSLs
  Modular assembly of DSL, using bricks
  Methodology to generate a whole processing chain

SLIDE 61

XMG-2: Architecture

SLIDE 62

XMG-2: Architecture (relevant part for us)

SLIDE 63

Installing XMG 2

Three options, provided by the documentation (dokufarm.phil.hhu.de/xmg):
  Follow the steps (Ubuntu), or
  Install VirtualBox and get the XMG image
  Use the online compiler(s): http://xmg.phil.hhu.de/index.php/upload/compile_grammar

SLIDE 64

Installing contributions

XMG bricks are distributed as contributions
Making a contribution available is done with the install command

  xmg@xmg:~/xmg-ng$ cd contributions
  xmg@xmg:~/xmg-ng/contributions$ xmg install core
  xmg@xmg:~/xmg-ng/contributions$ xmg install treemg
  xmg@xmg:~/xmg-ng/contributions$ xmg install compat
  xmg@xmg:~/xmg-ng/contributions$ xmg install synsemCompiler

SLIDE 65

Installing compilers

A set of already assembled compilers is available
Building one of them can be done with the build command

  xmg@xmg:~/xmg-ng$ cd contributions/synsemCompiler/
  xmg@xmg:~/xmg-ng/.../synsemCompiler$ cd compilers/synsem/
  xmg@xmg:~/xmg-ng/.../synsem$ xmg build

To avoid these steps: scripts (reinstall.sh)

SLIDE 66

Compiling a first metagrammar

The compile command takes two arguments:
  The compiler which will be used
  The metagrammar

  xmg@xmg:~/xmg-ng$ xmg compile synsem MetaGrammars/synsem/TagExample.mg

SLIDE 67

Drawing trees

The output of XMG2 can be given to a parser or a generator, but can also be inspected with a tree viewer
XMG comes with a built-in tree viewer:

  xmg@xmg:~/xmg-ng$ xmg gui tag

Pytreeview (https://gitlab.com/parmenti/pytreeview) is a lightweight tree viewer installed on the VirtualBox distribution of XMG2:

  xmg@xmg:~/xmg-ng$ pytreeview --mode WEB -i input-file.xml

A tree and frame viewer is available online: http://xmg.phil.hhu.de/index.php/upload/xmg_viewer

SLIDE 68

The control language

XMG descriptions:
  Associate a content with an identifier (abstraction)
  Describe structures inside dimensions, with dedicated languages
  Use other abstractions (classes)
  Combine contents in a disjunctive or a conjunctive way

Class := Name → Content
Content := Dimension{Description} | Name | Content ∨ Content | Content ∧ Content

SLIDE 69

Describing trees

The <syn> dimension
Declaring nodes: keyword node, optional node variable, optional features and properties

  node ?S [cat=s]

Expressing constraints between nodes: dominance operators (->, ->+, ->*) and precedence operators (>>, >>+, >>*)
Combining these statements: with logical operators (; and |)
Example:

  node ?S [cat=s];
  node ?VP [cat=vp];
  node ?V (mark=anchor) [cat=v];
  node ?NP (mark=subst) [cat=n];
  ?S -> ?VP;
  ?VP -> ?V;
  ?S -> ?NP;
  ?NP >> ?VP

SLIDE 70

Alternative syntax: bracket notation

The <syn> dimension
Declaring nodes: same as for the standard notation
Expressing dominance and precedence constraints by means of bracketing, with special operators for non-immediate relations (..., ...+, ,,, and ,,,+)

  node ?S [cat=s]{
    node ?NP (mark=subst) [cat=np]
    node ?VP [cat=vp]{
      node ?V (mark=anchor) [cat=v]
    }
  }

SLIDE 71

Using dimensions

Contributing descriptions
  Descriptions (constraints) are accumulated into dimensions
  Every dimension is associated with a solver (sometimes identity)

<syn>: a tree solver generates all minimal models

  <syn>{
    node ?S [cat=s];
    node ?VP [cat=vp];
    node ?V (mark=anchor) [cat=v];
    node ?NP (mark=subst) [cat=n];
    ?S -> ?VP;
    ?VP -> ?V;
    ?S -> ?NP;
    ?NP >> ?VP
  }

SLIDE 72

Syntactic nodes

Two nodes can be unified if:
  their feature structures can be unified
  their properties can be unified
Unification of nodes happens at two different stages:
  During the execution of the code (“explicit” unification: unification instruction = or reuse of a variable)
  After solving: some nodes may be merged to obtain a minimal model

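The two conditions on node unification can be sketched as follows. This is a simplified Python model, not XMG's implementation: feature structures are flat (no nested top/bot structures), a disjunctive value such as nom|acc is represented as a set, and the names `unify` and `unify_nodes` are assumptions.

```python
def as_set(value):
    """Treat an atomic value as a one-element disjunction."""
    return set(value) if isinstance(value, (set, frozenset)) else {value}


def unify(fs1, fs2):
    """Unify two flat feature structures; return None on a clash."""
    result = {}
    for key in fs1.keys() | fs2.keys():
        if key in fs1 and key in fs2:
            common = as_set(fs1[key]) & as_set(fs2[key])
            if not common:
                return None  # incompatible values for the same feature
            result[key] = common
        else:
            result[key] = as_set(fs1[key] if key in fs1 else fs2[key])
    return result


def unify_nodes(node1, node2):
    """Two nodes unify only if both their features and their properties unify."""
    feats = unify(node1["feats"], node2["feats"])
    props = unify(node1["props"], node2["props"])
    if feats is None or props is None:
        return None
    return {"feats": feats, "props": props}


if __name__ == "__main__":
    subject = {"feats": {"cat": "np", "case": {"nom", "acc"}}, "props": {"mark": "subst"}}
    equation = {"feats": {"case": "nom"}, "props": {}}
    print(unify_nodes(subject, equation))
```

In this model a case equation from the lemma lexicon narrows nom|acc down to nom, while a node marked as anchor cannot be merged with one marked as substitution node.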
SLIDE 73

Minimal models

A minimal model is a model of the description where:
  no constraint is violated
  no additional node is created
What are the minimal models for the following sets of constraints?

  ?S ->+ ?A ; ?S -> ?B

  ?S -> ?A ; ?S -> ?B ; ?S -> ?C ; ?A >>* ?C

Which set of constraints leads to the following minimal models?
[Figure: two trees over the nodes S, A, B, C, D (flattened: S A B C D and S A C B D)]

SLIDE 74

Defining abstractions

Classes make it possible to:
  Control the scope of variables
  Make (parametrized) abstractions
Examples (just headers):

  class kicked_the_bucket
  import nx0Vnx1[]
  declare ?X0 ?X1

  class nx0Vnx1
  export ?S ?NP_Subj ?VP ?V ?NP_Obj
  declare ?S ?NP_Subj ?VP ?V ?NP_Obj ?X0 ?X1

SLIDE 75

Defining abstractions

  class Intransitive
  declare ?S ?NP ?VP ?V
  {
    <syn>{
      node ?S [cat=s];
      node ?VP [cat=vp];
      node ?V (mark=anchor) [cat=v];
      node ?NP (mark=subst) [cat=n];
      ?S -> ?VP; ?VP -> ?V;
      ?S -> ?NP; ?NP >> ?VP
    }
  }

Valuation
To specify for which classes models have to be computed (the axioms), the instruction value has to be used after the class definitions.

  value Intransitive
  value Transitive

SLIDE 76

Using abstractions

Classes can be used by other classes by two means:
  Importing the class in the header: all the (exported) variables are added to the scope, all the constraints from the class are added to the current set of constraints
  Calling the class in the body: variables are not added to the scope
Calling classes has two advantages:
  alternatives are possible (disjunction)
  parameters can be used
Examples:

  CanObj[] | RelObj[]

  ?C=Class[?X]

SLIDE 77

Classes: examples (1)

  class a
  export ?A
  declare ?A ?S
  {
    <syn>{
      ?S -> ?A
    }
  }

  class b
  import a[]
  declare ?B
  {
    <syn>{
      ?B -> ?A
    }
  }

SLIDE 78

Classes: examples (2)

  class a
  export ?S
  declare ?A ?S
  {
    <syn>{
      ?S -> ?A
    }
  }

  class b
  import a[]
  declare ?A
  {
    <syn>{
      ?S -> ?A
    }
  }

SLIDE 79

Definition of types and constants

Everything inside the metagrammar has a type: values, feature structures, nodes, dimensions...
Four ways to define new types:
  Enumerated type: type T={a,b,c,d}
  Structured type: type T=[a1:t1,...,an:tn]
  Interval type: type T=[1..3]
  Unspecified type: type T!

SLIDE 80

Definition of types and constants

We can now specify the types of features and properties:

  type CAT = {np,vp,s,n,v,det}
  type MARK = {lex,anchor,subst}
  type LABEL !
  type PERS = [1..3]
  type GEN = {m,f}
  type NUM = {sg,pl}
  type AGR = [gen:GEN, num:NUM]

  feature cat: CAT
  feature e: LABEL
  feature pers: PERS
  feature agr: AGR

  property mark: MARK

SLIDE 82

Principles: motivation

As fragments become more numerous, controlling their combination (and the scope of variables) gets difficult
Idea: adding new constraints on top of dominance and precedence
Principles: sets of additional constraints for the solver (Crabbé & Duchier 2004)

SLIDE 83

A set of principles

XMG offers several sets of additional constraints over the models (principles):
  colors: polarities for node unification
  rank: linear order constraints on nodes
  unicity: uniqueness of a feature inside a model

SLIDE 84

Rank: Clitics ordering

The ordering of clitic pronouns (in Spanish or French for example) is known to be problematic when formalizing a grammar
In a metagrammar, when combining fragments, nodes representing these clitics have to come in a specific order
  Pedro nos la da
  *Pedro la nos da
  Je le lui laisse
  *Je lui le laisse

SLIDE 85

Rank: Clitics ordering (in French)

Every produced model has to satisfy the order constraint

SLIDE 86

Using principles: rank

  use rank with () dims (syn)
  type RANK=[1..7]
  property rank: RANK

  class CliticIobjectII
  import nonReflexiveClitic[]
  {
    <syn>{
      node xCl(rank=2)
        [top=[func=iobj, pers = @{1,2}]]
    }
  }

SLIDE 87

Using principles: unicity

  use unicity with (rank=1) dims (syn)
  use unicity with (rank=2) dims (syn)
  use unicity with (rank=3) dims (syn)
  use unicity with (rank=4) dims (syn)
  use unicity with (rank=5) dims (syn)
  use unicity with (rank=6) dims (syn)
  use unicity with (rank=7) dims (syn)

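What the rank and unicity principles check on a candidate model can be sketched in Python. The sketch makes several assumptions that are not taken from the slides: the rank values 3 for le and 4 for lui are hypothetical, ranked sisters are required to appear in non-decreasing rank order, and the function names are invented. XMG enforces these principles inside its solver rather than as a filter over finished models.

```python
def satisfies_rank(sisters):
    """sisters: (label, rank or None) pairs in linear order.

    Ranked nodes must appear in non-decreasing rank order (assumption);
    unranked nodes are ignored.
    """
    ranks = [rank for _, rank in sisters if rank is not None]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))


def satisfies_unicity(nodes, rank):
    """At most one node of the model may carry the given rank value."""
    return sum(1 for _, r in nodes if r == rank) <= 1


if __name__ == "__main__":
    good = [("je", None), ("le", 3), ("lui", 4), ("laisse", None)]  # hypothetical ranks
    bad = [("je", None), ("lui", 4), ("le", 3), ("laisse", None)]
    print(satisfies_rank(good), satisfies_rank(bad))
    print(all(satisfies_unicity(good, r) for r in range(1, 8)))
```

Declaring unicity for every rank value from 1 to 7, as on the slide, corresponds to the final loop: no two clitics may occupy the same slot.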
SLIDE 88

Using principles: colors

Colors are a solution to guide the combination of fragments
A color is assigned to every node
New constraints on node unification (⊥ = unification fails):

        •b    •r    ◦w
  •b    ⊥     ⊥     •b
  •r    ⊥     ⊥     ⊥
  ◦w    •b    ⊥     ◦w

Valid models only have red and black nodes

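The color constraints can be written down as a small function. The Python sketch below encodes the usual reading of the color principle: red never unifies, black unifies only with white, and white behaves as the neutral element. The constant and function names are assumptions for illustration.

```python
BLACK, RED, WHITE = "black", "red", "white"


def unify_colors(c1, c2):
    """Return the color of the merged node, or None if unification fails."""
    if RED in (c1, c2):
        return None          # red nodes never unify with anything
    if c1 == WHITE:
        return c2            # white + white = white, white + black = black
    if c2 == WHITE:
        return c1            # black + white = black
    return None              # black + black fails


def is_valid_model(colors):
    """A model is valid if no white node is left over."""
    return all(color in (BLACK, RED) for color in colors)


if __name__ == "__main__":
    print(unify_colors(BLACK, WHITE))
    print(unify_colors(BLACK, BLACK))
    print(is_valid_model([BLACK, RED, WHITE]))
```

Because a white node must end up merged with exactly one black node, colors let a fragment state which of its nodes are provided by itself and which must be supplied by another fragment.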
SLIDE 90

Combination with polarities

[Figure: tree fragments with colored nodes. CanSubj: S◦W, N•B, V◦W. RelObj: N•R, N•R, S•R, C•R, Wh•R, S◦W, V◦W. Active: S•B, V•B]

SLIDE 92

Combination with polarities

[Figure: tree fragments with colored nodes. nx0Vnx1: s•B, np↓•B, vp•B, v◦W, np◦W. kick: v⋄•B, np↓•B. kick_the_bucket: v•B (kicked), np•B, det•B (the), n•B (bucket)]

SLIDE 93

Using principles: colors

  use color with () dims (syn)
  type COLOR={red,black,white}
  property color: COLOR

  class nx0Vnx1
  declare ?S ?NP_Subj ?VP ?V ?NP_Obj
  {
    <syn>{
      node ?S (color=red) [cat=s] {
        node ?NP_Subj (color=black, mark=subst) [cat=np]
        node ?VP (color=black) [cat=vp] {
          node ?V (color=white) [cat=v]
          node ?NP_Obj (color=white) [cat=np]
        }
      }
    }
  }

SLIDE 97

Summary

A metagrammar contains descriptions of unanchored elementary trees.
Metagrammar descriptions are declarative and multidimensional.
Metagrammar descriptions make up an inheritance hierarchy.
The metagrammar allows one to express and implement lexical generalizations, e.g. active-passive diathesis.

Hot topics:
  parsing with metagrammars [7]
  use metagrammars for morphological descriptions [11, 18]

Adjacent topics:
  grammar induction from treebanks [5, 6, 13, 22]


slide-98
SLIDE 98

[1] Alahverdzhieva, Katya. 2008. XTAG using XMG. A core Tree-Adjoining Grammar for English. University of Nancy 2 / University of Saarland Master’s Thesis. http://homepages.inf.ed.ac.uk/s0896251/pubs/msc-sb2008.pdf.

[2] Becker, Tilman. 1994. Hytag: a new type of tree adjoining grammars for hybrid syntactic representations of free word order languages. Universität des Saarlandes dissertation. http://www.dfki.de/~becker/becker.diss.ps.gz.

[3] Becker, Tilman. 2000. Patterns in metarules for TAG. In Anne Abeillé & Owen Rambow (eds.), Tree Adjoining Grammars: Formalisms, linguistic analyses and processing (CSLI Lecture Notes 107), 331–342. Stanford, CA: CSLI Publications.

[4] Candito, Marie-Hélène. 1996. A principle-based hierarchical representation of LTAGs. In Proceedings of the 16th international Conference on Computational Linguistics (COLING 96). Copenhagen. http://aclweb.org/anthology-new/C/C96/C96-1034.pdf.

[5] Chen, John, Srinivas Bangalore & K. Vijay-Shanker. 2006. Automated extraction of Tree-Adjoining Grammars from treebanks. Natural Language Engineering 12. 251–299.

[6] Chiang, David. 2000. Statistical parsing with an automatically-extracted Tree Adjoining Grammar. In Proceedings of the 38th annual meeting of the Association for Computational Linguistics, 456–463. Hong Kong.

[7] de la Clergerie, Éric Villemonte. 2013. Exploring beam-based shift-reduce dependency parsing with DyALog: results from the SPMRL 2013 shared task. In 4th workshop on statistical parsing of morphologically rich languages (SPMRL’2013). Seattle. http://hal.inria.fr/docs/00/87/91/29/PDF/dyalogsr.pdf.

[8] Crabbé, Benoît. 2005. Représentation informatique de grammaires d’arbres fortement lexicalisées: Le cas de la grammaire d’arbres adjoints. Université Nancy 2 dissertation.

slide-99
SLIDE 99

[9] Crabbé, Benoit, Denys Duchier, Claire Gardent, Joseph Le Roux & Yannick Parmentier. 2013. XMG: eXtensible MetaGrammar. Computational Linguistics 39(3). 1–66. http://hal.archives-ouvertes.fr/hal-00768224/en/.

[10] Dowty, David R. 1979. Word meaning and Montague Grammar. Reprinted 1991 by Kluwer Academic Publishers. Dordrecht: D. Reidel Publishing Company.

[11] Duchier, Denys, Brunelle Magnana Ekoukou, Yannick Parmentier, Simon Petitjean & Emmanuel Schang. 2012. Describing morphologically rich languages using metagrammars: A look at verbs in Ikota. In Workshop on language technology for normalisation of less-resourced languages (SALTMIL 8 – AfLaT 2012), 55–59. http://www.tshwanedje.com/publications/SaLTMiL8-AfLaT2012.pdf#page=67.

[12] Gazdar, Gerald. 1981. Unbounded dependencies and coordinated structure. Linguistic Inquiry 12. 155–182.

[13] Kaeshammer, Miriam & Vera Demberg. 2012. German and English treebanks and lexica for Tree-Adjoining Grammars. In Nicoleta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk & Stelios Piperidis (eds.), Proceedings of the eighth international Conference on Language Resources and Evaluation (LREC’12). Istanbul, Turkey: European Language Resources Association (ELRA).

[14] Kallmeyer, Laura, Timm Lichte, Wolfgang Maier, Yannick Parmentier & Johannes Dellert. 2008. Developing a TT-MCTAG for German with an RCG-based parser. In European Language Resources Association (ELRA) (ed.), Proceedings of the sixth international Conference on Language Resources and Evaluation (LREC’08). Marrakech, Morocco.

slide-100
SLIDE 100

[15] Lichte, Timm & Simon Petitjean. 2015. Implementing semantic frames as typed feature structures with XMG. Journal of Language Modelling 3(1). 185–228. http://jlm.ipipan.waw.pl/index.php/JLM/article/view/96.

[16] Parmentier, Yannick, Laura Kallmeyer, Wolfgang Maier, Timm Lichte & Johannes Dellert. 2008. TuLiPA: A syntax-semantics parsing environment for mildly context-sensitive formalisms. In Proceedings of the ninth international workshop on Tree Adjoining Grammars and related formalisms (TAG+9), 121–128. Tübingen, Germany.

[17] Petitjean, Simon. 2014. Génération Modulaire de Grammaires Formelles. Orléans, France: Université d’Orléans Thèse de Doctorat. https://tel.archives-ouvertes.fr/tel-01163150/.

[18] Petitjean, Simon, Younes Samih & Timm Lichte. 2015. Une métagrammaire de l’interface morpho-sémantique dans les verbes en arabe. In Actes de la 22e conférence sur le Traitement Automatique des Langues Naturelles, 473–479. Caen, France. http://www.atala.org/taln_archives/TALN/TALN-2015/taln-2015-court-024.

[19] Prolo, Carlos A. 2002. Generating the XTAG English grammar using metarules. In Proceedings of the 19th international Conference on Computational Linguistics (COLING 2002), 814–820. Taipei, Taiwan.

[20] Ristad, Eric Sven. 1987. Revised General Phrase Structure Grammar. In Proceedings of the 25th annual meeting of the Association for Computational Linguistics, 243–250. Stanford, CA. http://www.aclweb.org/anthology/P87-1034.

[21] Uszkoreit, Hans & Stanley Peters. 1987. On some formal properties of metarules. In Walter J. Savitch, Emmon Bach, William Marsh & Gila Safran-Naveh (eds.), The formal complexity of natural language (Studies in Linguistics and Philosophy 33), 227–250. Dordrecht, The Netherlands: D. Reidel Publishing. http://dx.doi.org/10.1007/978-94-009-3401-6_9.

slide-101
SLIDE 101

[22] Xia, Fei. 2001. Automatic grammar generation from two different perspectives. University of Pennsylvania dissertation. http://faculty.washington.edu/fxia/papers_from_penn/thesis.pdf.

[23] XTAG Research Group. 2001. A Lexicalized Tree Adjoining Grammar for English. Tech. rep. Philadelphia, PA: Institute for Research in Cognitive Science, University of Pennsylvania.