Joint Learning of Syntactic and Semantic Dependencies
Xavier Lluís and Lluís Màrquez, TALP Research Center, Technical University of Catalonia


SLIDE 1

Outline: Introduction · Difficulties · Joint model · Results and discussion · Future work

Joint Learning of Syntactic and Semantic Dependencies

Xavier Lluís and Lluís Màrquez, TALP Research Center, Technical University of Catalonia

Barcelona, December 9, 2008

SLIDE 2

Introduction

Joint parsing is the simultaneous processing of the syntactic and semantic structure of a sentence.

SLIDE 3

Syntactic and semantic parsing: syntax

A sample sentence

SLIDE 4

Syntactic and semantic parsing: syntax

Syntactic dependencies

SLIDE 5

Syntactic and semantic parsing: semantics

Predicate completed

SLIDE 6

Syntactic and semantic parsing: semantics

Semantic dependencies for completed

SLIDE 7

Syntactic and semantic parsing: semantics

Predicate acquisition

SLIDE 8

Syntactic and semantic parsing: semantics

Semantic dependencies for acquisition

SLIDE 9

Syntactic and semantic parsing: semantics

Predicate announced

SLIDE 10

Syntactic and semantic parsing: semantics

Semantic dependencies for announced

SLIDE 11

Syntactic and semantic parsing: semantics

Semantic dependencies for all predicates

SLIDE 12

Mainstream approach

The pipeline approach:

1. Syntactic parsing ⇒ a parser (Eisner, shift-reduce)
2. Semantic role labeling ⇒ a simpler (non-structured) classifier

SLIDE 13

Pipeline strategy

Drawbacks of the pipeline approach:

1. Propagation or amplification of errors
2. Assumes an order of increasing difficulty
3. Dependencies between layers are hard to capture

SLIDE 14

Joint approach

Design a joint model to:

1. Overcome the drawbacks of the pipeline approach
2. Build a simple and feasible system from scratch

SLIDE 15

Design a joint model

A joint approach: extend a syntactic parsing model to jointly parse semantics.

1. Syntactic parsing ⇒ a parser (Eisner, shift-reduce)
2. Semantic role labeling ⇒ a simpler (non-structured) classifier

SLIDE 16

Design a joint model

A joint approach: extend the Eisner algorithm to jointly parse semantics.

- O(n³) algorithm
- Based on the CKY algorithm
- Bottom-up parser

SLIDE 17

The Eisner algorithm

Bottom-up dependency parsing

SLIDE 18

The Eisner algorithm

Bottom-up dependency parsing

SLIDE 19

The Eisner algorithm

Bottom-up dependency parsing

SLIDE 20

The Eisner algorithm

Bottom-up dependency parsing

SLIDE 21

The Eisner algorithm

Bottom-up dependency parsing

SLIDE 22

The Eisner algorithm

Bottom-up dependency parsing

SLIDE 23

The Eisner algorithm

Bottom-up dependency parsing

SLIDE 24

The Eisner algorithm

Score of a dependency. A dependency d = ⟨h, m, l⟩ of a sentence x is scored by:

score(d, x) = φ(h, m, l, x) · w

where φ is a feature extraction function and w is a weight vector.
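As a minimal sketch, this linear score is a sparse dot product between a binary feature map and the weight vector. The feature templates below are illustrative placeholders, not the system's actual feature set:

```python
def score_dependency(head, mod, label, sentence, weights):
    """score(d, x) = phi(h, m, l, x) . w for one dependency.

    phi is a sparse binary feature map; the templates below are
    hypothetical stand-ins for the real feature extraction function.
    """
    h_word, m_word = sentence[head], sentence[mod]
    features = [
        f"head_word={h_word}|label={label}",
        f"mod_word={m_word}|label={label}",
        f"pair={h_word}_{m_word}|label={label}",
        f"distance={head - mod}|label={label}",
    ]
    # dot product, with an implicit zero weight for unseen features
    return sum(weights.get(f, 0.0) for f in features)
```

Only features that fire for the candidate arc contribute, so scoring stays cheap even with millions of weights.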

SLIDE 25

The Eisner algorithm

Best tree. We are interested in the best-scoring tree among all trees Y(x):

best_tree(x) = argmax_{y ∈ Y(x)} score_tree(y, x)

Eisner algorithm. The Eisner algorithm is an exact search algorithm that computes the best first-order factorized tree.
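For reference, a minimal sketch of the unlabeled, arc-factored Eisner dynamic program follows (labels can be folded in by taking the best label per arc beforehand). It returns only the optimal score; a real parser also stores backpointers to recover the tree:

```python
NEG = -1e9  # effectively minus infinity

def eisner_best_score(scores):
    """Best projective tree score under a first-order factorization.

    scores[h][m] is the score of arc h -> m; token 0 is an artificial
    root. O(n^3) time, O(n^2) space.
    """
    n = len(scores)
    # [s][t][d]: span s..t; d = 0 means head at t, d = 1 means head at s
    inc = [[[NEG, NEG] for _ in range(n)] for _ in range(n)]
    com = [[[NEG, NEG] for _ in range(n)] for _ in range(n)]
    for i in range(n):
        com[i][i][0] = com[i][i][1] = 0.0
    for k in range(1, n):
        for s in range(n - k):
            t = s + k
            # incomplete items: add the arc spanning s..t
            best = max(com[s][r][1] + com[r + 1][t][0] for r in range(s, t))
            inc[s][t][0] = best + scores[t][s]   # arc t -> s
            inc[s][t][1] = best + scores[s][t]   # arc s -> t
            # complete items: merge an incomplete item with a complete one
            com[s][t][0] = max(com[s][r][0] + inc[r][t][0] for r in range(s, t))
            com[s][t][1] = max(inc[s][r][1] + com[r][t][1] for r in range(s + 1, t + 1))
    return com[0][n - 1][1]
```

For a toy 3-token input (root plus two words) the maximum over the three projective trees is recovered exactly.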

SLIDE 26

The Eisner algorithm

Score of a tree. A syntactic tree y for a sentence x is scored by:

score_tree(y, x) = Σ_{⟨h, m, l⟩ ∈ y} score(h, m, l, x)

Arc factorization. The first-order factorization is the sum of independent scores for each dependency of the tree.

SLIDE 27

Extension of the Eisner algorithm

Joint parsing point of view: simultaneous prediction of the syntactic and semantic labels.

SLIDE 28

Extension of the Eisner algorithm: an example

The complete syntactic and semantic structure.

SLIDE 29

Extension of the Eisner algorithm: an example

Overlapping syntactic and semantic dependencies.

SLIDE 30

Extension of the Eisner algorithm: an example

Overlapping syntactic and semantic dependencies.

SLIDE 31

Extension of the Eisner algorithm: an example

Non-overlapping semantic dependencies.

SLIDE 32

Syntax and Semantics overlapping

1. Are syntax and semantics overlapping?

36.4% of argument-predicate relations do not exactly overlap with modifier-head syntactic relations.

Proposed solution: attach the semantic label to the syntactic dependency.

SLIDE 33

Difficulties: non-overlapping semantics

Any given syntactic dependency

SLIDE 34

Difficulties: non-overlapping semantics

The related semantic dependencies

SLIDE 35

Difficulties: non-overlapping semantics

The overlapping A0 dependency

SLIDE 36

Difficulties: non-overlapping semantics

The overlapping A0 dependency will be jointly annotated

SLIDE 37

Difficulties: non-overlapping semantics

The non-overlapping A0 dependency

SLIDE 38

Difficulties: non-overlapping semantics

The non-overlapping A0 dependency will also be jointly annotated

SLIDE 39

Difficulties: non-overlapping semantics

Solution. An extended dependency is:

d = ⟨h, m, l_syn, l_sem_p1, . . . , l_sem_pq⟩

where h is the head, m the modifier, l_syn the syntactic label, and l_sem_pi one semantic label for each sentence predicate pi.
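As a data-structure sketch (field names are illustrative, not the paper's code), an extended dependency carries one syntactic label plus a tuple with one slot per sentence predicate:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class ExtendedDependency:
    """d = <h, m, l_syn, l_sem_p1, ..., l_sem_pq>.

    sem_labels holds one entry per sentence predicate, in predicate
    order; None marks "no semantic relation to that predicate".
    """
    head: int
    modifier: int
    syn_label: str
    sem_labels: Tuple[Optional[str], ...]
```

A row such as "SBJ, A0, _, A0" from the example slide would then be encoded (with hypothetical token indices) as `ExtendedDependency(head=3, modifier=1, syn_label="SBJ", sem_labels=("A0", None, "A0"))`.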

SLIDE 40

Proposed solution

OBJ, _, _, Su
OBJ, A1, A1, _
AMOD, _, AM-TMP, _
NMOD, _, _, _
NMOD, _, _, _
SBJ, A0, _, A0

A dependency has its syntactic and semantic labels

SLIDE 41

Proposed solution: unavailable features

A dependency with semantic labels

SLIDE 42

Proposed solution: unavailable features

The first A0 is an overlapping semantic dependency

SLIDE 43

Proposed solution: unavailable features

The first A0 is an overlapping semantic dependency

SLIDE 44

Proposed solution: unavailable features

The second A0 is a non-overlapping semantic dependency

SLIDE 45

Proposed solution: unavailable features

The second A0 is a non-overlapping semantic dependency

SLIDE 46

Proposed solution: unavailable features

The second A0 is a non-overlapping semantic dependency

SLIDE 47

Proposed solution: unavailable features

The syntactic relation is not yet processed

SLIDE 48

Problems inherited from traditional pipeline design

2. Problems not appearing in pipeline systems

- State-of-the-art SRL systems strongly rely on syntactic path features.
- Only a partial view of the syntax, restricted to the current sentence span, is available during parsing.
- A distant argument-predicate relation can occur.

Proposed solution: pre-parse the sentence and extract predicate-modifier syntactic paths.
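A minimal sketch of extracting a modifier-predicate path from a pre-parsed tree follows. Real SRL path features concatenate the dependency labels along the path; here, as a simplification, only the up/down steps are recorded:

```python
def syntactic_path(mod, pred, heads):
    """Up/down path from modifier to predicate in a pre-parsed tree.

    heads[i] is the head of token i, with 0 denoting the root; tokens
    are 1-indexed. Returns e.g. 'U*D': one step up, the lowest common
    ancestor, one step down.
    """
    def chain(tok):
        # climb from tok to the root, recording the visited tokens
        seq = [tok]
        while heads[seq[-1]] != 0:
            seq.append(heads[seq[-1]])
        return seq

    up, down = chain(mod), chain(pred)
    common = next(t for t in up if t in down)  # lowest common ancestor
    ups = up[:up.index(common)]
    downs = down[:down.index(common)]
    return "U" * len(ups) + "*" + "D" * len(downs)
```

Because the tree comes from a cheap pre-parse, these path features are available even when the joint parser has only partial visibility of the final syntax.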

SLIDE 49

Joint Model

Joint model formalization

SLIDE 50

Joint Model

The joint model extends, and is based on, the first-order syntactic model.

Best joint tree:

best_tree(x, w, y′) = argmax_{y ∈ Y(x)} score_tree(y, x, w, y′)

The argmax is computed using the Eisner algorithm, where x is the input sentence, y the syntactic-semantic tree, y′ the pre-parsed syntactic tree, and w the weight vector.

SLIDE 51

Joint model

First-order factorization:

score_tree(y, x, w, y′) = Σ_{⟨h, m, l_syn, l⟩ ∈ y} score(h, m, l_syn, l, x, w, y′)

where x is the input sentence, y the syntactic-semantic tree, y′ the pre-parsed syntactic tree, w the weight vector, and l = l_sem_p1, . . . , l_sem_pq the semantic labels for the predicates pi.

SLIDE 52

Scoring

score(h, m, l_syn, l, x, w, y′) = syntactic_score(h, m, l_syn, x, w) + semantic_score(h, m, l_sem_p1, . . . , l_sem_pq, x, w, y′)

The score of a dependency is the syntactic score (as usual) plus the semantic score of the assigned semantic label (if any) for each predicate, l = l_sem_p1, . . . , l_sem_pq.

SLIDE 53

Semantic Scoring

Semantic scoring function:

semantic_score(h, m, l_sem_p1, . . . , l_sem_pq, x, w, y′) = (1/q) Σ_{l_sem_pi} φ_sem(h, m, l_sem_pi, pi, x, y′) · w(l_sem_pi)

where y′ is the precomputed syntax tree used for feature extraction and l_sem_pi is the semantic label of m for predicate pi.
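The syntactic-plus-semantic decomposition above can be sketched as follows; `syn_score` and `sem_score` are assumed callbacks standing in for the learned linear scorers (φ · w), not the paper's actual implementation:

```python
def joint_dependency_score(head, mod, syn_label, sem_labels, predicates,
                           syn_score, sem_score):
    """Joint score of one extended dependency.

    sem_labels holds one semantic label (or None) per predicate; the
    semantic contribution is averaged over the q sentence predicates.
    """
    total = syn_score(head, mod, syn_label)
    q = len(predicates)
    for pred, sem_label in zip(predicates, sem_labels):
        if sem_label is not None:  # no semantic term for unassigned slots
            total += sem_score(head, mod, sem_label, pred) / q
    return total
```

Because each term depends only on the single dependency (plus the fixed pre-parse), this score plugs directly into the arc-factored Eisner search.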

SLIDE 54

System summary

Core

Averaged perceptron learning + Eisner algorithm inference

(Collins, 2002; Eisner, 1996; based on Carreras et al., 2006)

Features: state-of-the-art features adapted to the dependency formalism. Syntax: McDonald et al. (2005) and Carreras et al. (2006). Semantics: Xue and Palmer (2004) and Surdeanu et al. (2007).
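The learning core, averaged perceptron over exact inference (Collins, 2002), can be sketched as below. `decode` and `feats` are assumed callbacks for the Eisner-based inference and the feature extraction:

```python
from collections import defaultdict

def averaged_perceptron(train, decode, feats, epochs=2):
    """Structured averaged perceptron sketch (Collins, 2002).

    train: list of (x, gold) pairs; decode(x, w) returns the best
    structure under weights w; feats(x, y) returns sparse feature
    counts of structure y. Returns the averaged weight vector.
    """
    w = defaultdict(float)      # current weights
    total = defaultdict(float)  # running sum, for averaging
    steps = 0
    for _ in range(epochs):
        for x, gold in train:
            pred = decode(x, w)
            if pred != gold:
                # reward gold features, penalize predicted ones
                for f, v in feats(x, gold).items():
                    w[f] += v
                for f, v in feats(x, pred).items():
                    w[f] -= v
            steps += 1
            for f, v in w.items():
                total[f] += v
    return {f: v / steps for f, v in total.items()}
```

Averaging the weights over all updates is what makes the perceptron stable enough for structured prediction at this scale.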

SLIDE 55

Results

The system was presented at the CoNLL-2008 shared task.

SLIDE 56

Learning curve (development)

SLIDE 57

Learning curve (development)

SLIDE 58

Discussion

Could semantics hurt syntax? Analyze the effects of semantics ⇒ syntax.

The semantic score increases the overall dependency score, and the overall dependency score determines the syntax.

SLIDE 59

The syntactic and semantic scores on a dependency

A correct syntactic dependency with its syntactic score

SLIDE 60

The syntactic and semantic scores on a dependency

A correct syntactic dependency with its score increased by the semantic score: improved ↑ syntax

SLIDE 61

What if the semantic score is not so dependent on syntax?

An incorrect competing dependency

SLIDE 62

What if the semantic score is not so dependent on syntax?

An incorrect competing dependency with its syntactic score

SLIDE 63

What if the semantic score is not so dependent on syntax?

An incorrect competing dependency with its score increased by the semantic score: hurt ↓ syntax

SLIDE 64

Discussion

Why could an incorrect syntactic dependency have a high semantic score?

The semantic score is almost independent of the correct syntactic dependency.

SLIDE 65

Discussion

The semantic score is almost independent of the correct syntactic dependency: it mainly relies on features extracted from the modifier-predicate pair.

semantic_score(h, m, l_sem_pi, x, w, y′) = φ_sem(h, m, pi, x, y′) · w(l_sem_pi)

Features are extracted by φ_sem from: h (head), m (modifier), pi (predicate), ⟨h, m⟩ (modifier-head), ⟨m, pi⟩ (modifier-predicate), and ⟨h, pi⟩ (head-predicate).

SLIDE 66

Post-evaluation results

Group | Name | WSJ+Brown | WSJ | Brown
Lund (*) | Johansson (*) | 85.49 | 86.61 | 76.34
Yahoo! (*) | Ciaramita (*) | 82.69 | 83.83 | 73.51
HIT-IR | Che | 82.66 | 83.78 | 73.57
Hong Kong (*) | Zhao (*) | 82.24 | 83.41 | 72.70
Geneva (*) | Henderson (*) | 80.48 | 81.53 | 71.93
Koc | Yuret | 79.84 | 80.97 | 70.55
GSLT ML2 | Samuelsson | 79.79 | 80.92 | 70.49
DFKI 2 | Zhang | 79.32 | 80.41 | 70.48
NAIST | Watanabe | 79.10 | 80.30 | 69.29
Antwerp | Morante | 78.43 | 79.52 | 69.55
HIT-ICR | Li | 78.35 | 79.38 | 70.01
UPC (*) | Lluís (*) | 78.11 | 79.16 | 69.84
UT Austin | Baldridge | 77.49 | 78.57 | 68.53
Koc | Yatbaz | 77.45 | 78.43 | 69.61
USTC | Chen | 77.00 | 77.95 | 69.23
Korea | Lee | 76.90 | 77.96 | 68.34
Peking | Sun | 76.28 | 77.10 | 69.58
Colorado | Choi | 71.23 | 72.22 | 63.44
UAIC | Trandabat | 63.45 | 64.21 | 57.41
DFKI 1 | Neumann | 19.93 | 20.13 | 18.14

Reasonable results for a system built from scratch. It is one of the most efficient systems.

SLIDE 67

Future and Ongoing Work

1. Higher degree of joint processing: joint predicate identification; no previous dependency parsing
2. Higher-order dependencies
3. Improvement of the semantic classifier component
4. Projectivization techniques
5. Feature engineering and system tuning
6. Alternative joint models

SLIDE 68

Ongoing Work

Jointparser demo http://www.lsi.upc.edu/~xlluis/jointparser

SLIDE 69

The end. Thank you.

SLIDE 70

For further reading

Xavier Carreras, Mihai Surdeanu and Lluís Màrquez. Projective dependency parsing with perceptron. Proceedings of CoNLL-2006, 2006.

Ryan McDonald, Koby Crammer and Fernando Pereira. Online large-margin training of dependency parsers. Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, 2005.

James Henderson, Paola Merlo, Gabriele Musillo and Ivan Titov. A Latent Variable Model of Synchronous Parsing for Syntactic and Semantic Dependencies. Proceedings of CoNLL-2008, 2008.

SLIDE 71

For further reading

Mihai Surdeanu, Lluís Màrquez, Xavier Carreras and Pere R. Comas. Combination strategies for semantic role labeling. Journal of Artificial Intelligence Research, 2007.

Nianwen Xue and Martha Palmer. Calibrating features for semantic role labeling. Proceedings of Empirical Methods in Natural Language Processing, 2004.

Xavier Lluís and Lluís Màrquez. A Joint Model for Parsing Syntactic and Semantic Dependencies. Proceedings of CoNLL-2008, 2008.