Learning Joint Semantic Parsers from Disjoint Data




SLIDE 1

Learning Joint Semantic Parsers from Disjoint Data

Hao Peng¹, Sam Thomson², Swabha Swayamdipta², Noah A. Smith¹

¹University of Washington ²Carnegie Mellon University

@NAACL, June 4, 2018

SLIDE 2

Motivations

❖ Larger data → (almost) better performance
❖ Overlaps among different theories

SLIDE 3

Overview

Learning Joint Semantic Parsers from Disjoint Data

FrameNet vs. semantic dependencies
Different structures; no parallel annotations

SLIDE 4

Overview

Learning Joint Semantic Parsers from Disjoint Data

FrameNet vs. semantic dependencies
Different structures; no parallel annotations
Joint decoding
Latent variables

SLIDE 5

Outline

❖ Parsing semantic spans and dependencies
❖ Joint parsing
❖ Learning with latent variables
❖ Empirical results

SLIDE 6

Parsing FrameNet Structures

Input: A few books fell in the room .
  Target (token span): fell
  Lexical unit (lemma.pos): fall.v

Baker et al. (1998)

SLIDE 7

Parsing FrameNet Structures

Input: A few books fell in the room .
  Target (token span): fell
  Lexical unit (lemma.pos): fall.v

Output:
  Frame: Motion Directional
  Arguments (span + semantic role): Theme = "A few books", Place = "in the room"

Semantic roles answer: who, what, when, where, …

Baker et al. (1998)

SLIDE 8

Parsing FrameNet Structures

Input: A few books fell in the room .
  Lexical unit: fall.v

Score:

$F(\text{FrameNet structure}) = f_{\text{frame}}(\text{Motion Directional}) + f_{\text{arg}}(\text{Theme}) + f_{\text{arg}}(\text{Place})$

SLIDE 9

Parsing FrameNet Structures

Input: A few books fell in the room .
  Lexical unit: fall.v

Score:

$F(\text{FrameNet structure}) = f_{\text{frame}}(\text{Motion Directional}) + f_{\text{arg}}(\text{Theme}) + f_{\text{arg}}(\text{Place})$

Scoring functions: BiLSTM+MLPs
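The decomposition above can be illustrated with a minimal sketch. The score tables `f_frame` and `f_arg` are hypothetical stand-ins for the BiLSTM+MLP outputs, the numbers are made up, and the span indices are an assumed tokenization of the example sentence.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # inclusive token indices (assumed convention)


def framenet_score(frame: str,
                   args: List[Tuple[Span, str]],
                   f_frame: Dict[str, float],
                   f_arg: Dict[Tuple[Span, str], float]) -> float:
    """F = f_frame(frame) + sum over arguments of f_arg(span, role)."""
    return f_frame[frame] + sum(f_arg[(span, role)] for span, role in args)


# Tokens: A(0) few(1) books(2) fell(3) in(4) the(5) room(6) .(7)
# Hypothetical scores; a real model would compute them with BiLSTM+MLPs.
f_frame = {"Motion Directional": 2.0}
f_arg = {((0, 2), "Theme"): 1.5, ((4, 6), "Place"): 0.5}

score = framenet_score("Motion Directional",
                       [((0, 2), "Theme"), ((4, 6), "Place")],
                       f_frame, f_arg)
print(score)
```

Because the score is a sum of independent part scores, changing one argument only changes one term, which is what makes exact decoding tractable.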

SLIDE 10

Parsing FrameNet Structures

Input: A few books fell in the room .
  Lexical unit: fall.v
  Unknown: frame? arg1? arg2? arg3?

Decoding:

$\max_{\text{frame},\ \text{args}} F(\text{FrameNet structure})$ s.t.
  • non-overlapping
  • consistency

Dynamic program: Kong et al. (2016); Swayamdipta et al. (2017)
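A dynamic program under the non-overlapping constraint can be sketched as follows. This is a generic semi-Markov recursion over candidate spans, not the decoder of the cited papers. The function name `decode_args` and all scores are hypothetical, and the consistency constraint is left out.

```python
from typing import Dict, List, Optional, Tuple

Arg = Tuple[int, int, str]  # (start, end inclusive, role); assumed encoding


def decode_args(n: int, scores: Dict[Arg, float]) -> Tuple[float, List[Arg]]:
    """Pick the highest-scoring set of non-overlapping labeled spans.

    best[k] is the best total score using only tokens 0..k-1.
    """
    by_end: Dict[int, List[Arg]] = {}
    for arg in scores:
        by_end.setdefault(arg[1], []).append(arg)

    best = [0.0] * (n + 1)
    back: List[Optional[Arg]] = [None] * (n + 1)
    for k in range(1, n + 1):
        best[k], back[k] = best[k - 1], None  # leave token k-1 uncovered
        for arg in by_end.get(k - 1, []):     # or end a span at token k-1
            cand = best[arg[0]] + scores[arg]
            if cand > best[k]:
                best[k], back[k] = cand, arg

    chosen: List[Arg] = []
    k = n
    while k > 0:                              # follow back-pointers
        arg = back[k]
        if arg is None:
            k -= 1
        else:
            chosen.append(arg)
            k = arg[0]
    return best[n], chosen[::-1]


# Hypothetical candidate scores for the 8-token example sentence.
scores = {
    (0, 2, "Theme"): 1.5,
    (1, 2, "Theme"): 1.0,
    (2, 4, "Place"): 0.7,   # overlaps the first Theme candidate
    (4, 6, "Place"): 0.5,
    (5, 6, "Goal"): -0.2,
}
total, args = decode_args(8, scores)
print(total, args)
```

The recursion runs in time linear in the number of candidate spans, and candidates with negative scores are dropped automatically because leaving a token uncovered costs nothing.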

SLIDE 11

Parsing Semantic Dependencies

MRS-derived dependencies (DM); Oepen et al. (2015)

Input: A few books fell in the room .
Output: labeled arcs over the same sentence, each with a head, a modifier, and a role label
  Arc labels in the example: arg2, arg1, mwe, arg1, arg1, BV, top

Semantic roles answer: who, what, when, where, …

SLIDE 12

Parsing Semantic Dependencies

Input: A few books fell in the room .
  Arc labels in the example: arg2, arg1, mwe, arg1, arg1, BV, top

Score:

$G(\text{DM graph}) = \sum_{\text{labeled arcs}} g(\text{head}, \text{mod}, \text{role})$

Scoring function: BiLSTM+MLPs
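The arc-factored sum can be written out as a minimal sketch. The table `g` is a hypothetical stand-in for the BiLSTM+MLP arc scorer, and the arcs and numbers below are illustrative only, not the gold DM graph of the example.

```python
from typing import Dict, Iterable, Tuple

Arc = Tuple[int, int, str]  # (head index, modifier index, role label)


def dm_score(arcs: Iterable[Arc], g: Dict[Arc, float]) -> float:
    """G = sum of g(head, mod, role) over the labeled arcs in the graph."""
    return sum(g[arc] for arc in arcs)


# Tokens: A(0) few(1) books(2) fell(3) in(4) the(5) room(6) .(7)
# Hypothetical arc scores.
g = {
    (3, 2, "arg1"): 1.0,
    (1, 2, "arg1"): 0.5,
    (3, 6, "arg2"): -0.25,
}
graph_score = dm_score([(3, 2, "arg1"), (1, 2, "arg1")], g)
print(graph_score)
```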

SLIDE 13

Parsing Semantic Dependencies

Decoding:

$\max_{\text{labeled arcs}} G(\text{DM graph})$ s.t.
  • consistency
  • determinism

Candidate arcs in the example: few, books (compound?); few, books (arg1?); fell, room (arg2?); …

Linear program, solved with AD3; Martins et al. (2011)

SLIDE 14

Outline

❖ Parsing semantic spans and dependencies
❖ Joint parsing
❖ Learning with latent variables
❖ Empirical results

SLIDE 15

Joint Parsing

Sharing parameters: Swayamdipta et al. (2016); Hershcovich et al. (2018)

  • F: FrameNet structure for "A few books fell in the room ." (fall.v; Motion Directional; Theme, Place)
  • G: DM graph for the same sentence (arc labels arg2, arg1, mwe, arg1, arg1, BV, top)
  • F and G are computed on top of shared LSTMs
SLIDE 16

Joint Parsing

Sharing parameters: Swayamdipta et al. (2016); Hershcovich et al. (2018)

  • F: FrameNet structure for "A few books fell in the room ." (fall.v; Motion Directional; Theme, Place)
  • G: DM graph for the same sentence (arc labels arg2, arg1, mwe, arg1, arg1, BV, top)
  • F and G are computed on top of shared LSTMs

This work, joint decoding:

  • H: a single score over the FrameNet structure and the DM graph together

SLIDE 17

Joint Parsing

Sharing parameters: Swayamdipta et al. (2016); Hershcovich et al. (2018)

  • F: FrameNet structure for "A few books fell in the room ." (fall.v; Motion Directional; Theme, Place)
  • G: DM graph for the same sentence (arc labels arg2, arg1, mwe, arg1, arg1, BV, top)
  • F and G are computed on top of shared LSTMs

This work, joint decoding:

  • H: a single score over the FrameNet structure and the DM graph together

Parameter sharing and joint decoding are orthogonal.

SLIDE 18

Joint Parsing

Input: A few books fell in the room .
  Lexical unit: fall.v

Score:

$H(\text{FrameNet structure}, \text{DM graph})$

The joint structure combines the FrameNet analysis (fall.v; Motion Directional; Theme, Place) with the DM arcs (arg2, arg1, mwe, arg1, arg1, BV, top).

SLIDE 19

Joint Parsing

Input: A few books fell in the room .
  Lexical unit: fall.v

Score:

$H(\text{FrameNet structure}, \text{DM graph}) = F(\text{FrameNet structure}) + G(\text{DM graph})$

  • F: FrameNet score
  • G: DM score

SLIDE 20

Joint Parsing

Input: A few books fell in the room .
  Lexical unit: fall.v

Score:

$H(\text{FrameNet structure}, \text{DM graph}) = F(\text{FrameNet structure}) + G(\text{DM graph}) + h_{\text{joint}}(?)$

  • F: FrameNet score
  • G: DM score
  • $h_{\text{joint}}$: affinities between them

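SLIDE 21

Span vs. Dependencies

  • If both were spans: $h_{\text{joint}}(\text{role}_1, \text{role}_2)$ over a shared span; Finkel and Manning (2009)
  • If both were dependencies: $h_{\text{joint}}(\text{head}, \text{mod}, \text{role}_1, \text{role}_2)$ over a shared arc; Lluís et al. (2013); Peng et al. (2017)
  • One span-based and one dependency-based: $h_{\text{joint}}(?)$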

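SLIDE 22

Span vs. Dependencies

  • If both were spans: $h_{\text{joint}}(\text{role}_1, \text{role}_2)$ over a shared span; Finkel and Manning (2009)
  • If both were dependencies: $h_{\text{joint}}(\text{head}, \text{mod}, \text{role}_1, \text{role}_2)$ over a shared arc; Lluís et al. (2013); Peng et al. (2017)
  • One span-based and one dependency-based: $h_{\text{joint}}(?)$

Structural divergence

[Figure: "A few books fell" with the FrameNet analysis (fall.v; Motion Directional; Theme span) above and the DM arcs (arg1, mwe, arg1) below]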

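SLIDE 23

Span vs. Dependencies

Structural divergence

[Figure: "A few books fell" with the FrameNet analysis (fall.v; Motion Directional; Theme span) above and the DM arcs (arg1, mwe, arg1) below]

Designate a head for each span
  PropBank dependencies; Surdeanu et al. (2008)

[Figure: "A few books fell" with fall.v and the Theme argument attached to a head token]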

SLIDE 24

Span vs. Dependencies

Structural divergence

[Figure: "A few books fell" with the FrameNet analysis (fall.v; Motion Directional; Theme span) above and the DM arcs (arg1, mwe, arg1) below]

Designate a head for each span
  PropBank dependencies; Surdeanu et al. (2008)
  Head selected by syntax; Collins (2003)

[Figure: "A few books fell" with fall.v and the Theme argument attached to a head token]

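SLIDE 25

Span vs. Dependencies

Structural divergence

[Figure: "A few books fell" with the FrameNet analysis (fall.v; Motion Directional; Theme span) above and the DM arcs (arg1, mwe, arg1) below]

Designate a head for each span
  PropBank dependencies; Surdeanu et al. (2008)

[Figure: "A few books fell" with fall.v and the Theme argument attached to a head token, aligned with the arg1 arc]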

SLIDE 26

Span vs. Dependencies

Structural divergence

[Figure: "A few books fell" with the FrameNet analysis (fall.v; Motion Directional; Theme span) above and the DM arcs (arg1, mwe, arg1) below]

This work

[Figure: three copies of "A few books fell", each with target fall.v and the Theme argument]
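One possible reading of the three-copy figure, offered as an assumption rather than as the authors' stated definition, is that no single head is designated: every token inside the argument span is a candidate modifier of an arc from the target. The sketch below enumerates such candidate pairings. The function name `candidate_joint_parts` and the tuple encoding are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]            # inclusive token indices
Arc = Tuple[int, int, str]        # (head, modifier, dependency label)
JointPart = Tuple[str, Span, Arc]


def candidate_joint_parts(target: int, span: Span, role: str,
                          arc_labels: List[str]) -> List[JointPart]:
    """Pair a frame argument with every arc from the target into its span.

    Assumption for illustration: each token in the span may serve as the
    modifier, so no syntactic head rule is needed.
    """
    start, end = span
    return [(role, span, (target, mod, label))
            for mod in range(start, end + 1)
            for label in arc_labels]


# Tokens: A(0) few(1) books(2) fell(3); target "fell", Theme = tokens 0..2.
parts = candidate_joint_parts(3, (0, 2), "Theme", ["arg1"])
for part in parts:
    print(part)
```

With one arc label this yields one candidate per token of "A few books". Which candidates are actually active is then left to decoding.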

SLIDE 27

Span vs. Dependencies

Score:

$H(\text{FrameNet structure}, \text{DM graph}) = F(\text{FrameNet structure}) + G(\text{DM graph}) + h_{\text{joint}}(\text{fall.v}, \text{Motion Directional}, \text{Theme}, \text{arg1})$

  • F: FrameNet score
  • G: DM score
  • $h_{\text{joint}}$: affinities between them, computed with a multilinear mapping
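The slide names a multilinear mapping without giving its form, so the sketch below assumes the simplest one: a sum of elementwise products of three embedding vectors, which is linear in each argument separately. The embeddings, their dimension, and the helper names are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def multilinear(*vectors: Sequence[float]) -> float:
    """Sum over dimensions of the product of the vectors' components."""
    total = 0.0
    for components in zip(*vectors):
        product = 1.0
        for c in components:
            product *= c
        total += product
    return total


def joint_score(F: float, G: float,
                joint_parts: List[Tuple[str, str, str]],
                emb: Dict[str, List[float]]) -> float:
    """H = F + G + sum of h_joint over active (frame, role, label) parts."""
    h = sum(multilinear(emb[frame], emb[role], emb[label])
            for frame, role, label in joint_parts)
    return F + G + h


# Hypothetical 2-dimensional embeddings.
emb = {
    "Motion Directional": [1.0, 0.0],
    "Theme": [0.5, 2.0],
    "arg1": [2.0, 1.0],
}
h = multilinear(emb["Motion Directional"], emb["Theme"], emb["arg1"])
H = joint_score(4.0, 1.5, [("Motion Directional", "Theme", "arg1")], emb)
print(h, H)
```

A factored form like this keeps the number of parameters linear in the label inventories instead of growing with every frame, role, and label combination.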

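SLIDE 28

Span vs. Dependencies

Input: A few books fell in the room .
  Lexical unit: fall.v
  Unknown: frame? arg1? arg2? arg3? and the arc labels (arg1? arg2? BV?)

Decoding:

$\max_{\text{frame},\ \text{args},\ \text{labeled arcs},\ \text{joint parts}} H(\text{FrameNet structure}, \text{DM graph})$

Linear program
Speed up by promoting sparsity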

SLIDE 29

Outline

❖ Parsing semantic spans and dependencies
❖ Joint parsing
❖ Learning with latent variables
❖ Empirical results

SLIDE 30

Learning with Latent Variables

FrameNet data
DM data

SLIDE 31

Learning with Latent Variables

[Figure: two panels, FrameNet data and DM data. Each shows the joint part over "A few books fell" (fall.v; Theme span; arc with head, mod, and role) and marks which pieces carry supervision.]

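SLIDE 32

Learning with Latent Variables

Latent structured hinge; Yu and Joachims (2009)

On FrameNet data:

$L = -\max_{\text{labeled arcs},\ \text{joint parts}} H(\text{gold FrameNet structure}, \text{DM graph}) + \max_{\text{frame},\ \text{args},\ \text{labeled arcs},\ \text{joint parts}} \left[ H(\text{FrameNet structure}, \text{DM graph}) + \delta \right]$

[Figure: "A few books fell in the room ." twice. In the first term the frame and arguments are fixed (fall.v; Motion Directional; Theme, Place) and only the arcs are unknown (arg1? arg2? BV?). In the second term everything is unknown (frame? arg1? arg2? arg3? and the arcs).]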

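SLIDE 33

Learning with Latent Variables

Latent structured hinge; Yu and Joachims (2009)

On FrameNet data:

$L = -\max_{\text{labeled arcs},\ \text{joint parts}} H(\text{gold FrameNet structure}, \text{DM graph}) + \max_{\text{frame},\ \text{args},\ \text{labeled arcs},\ \text{joint parts}} \left[ H(\text{FrameNet structure}, \text{DM graph}) + \delta \right]$

  • Second term: prediction
  • $\delta$: cost

[Figure: "A few books fell in the room ." twice. In the first term the frame and arguments are fixed (fall.v; Motion Directional; Theme, Place) and only the arcs are unknown (arg1? arg2? BV?). In the second term everything is unknown (frame? arg1? arg2? arg3? and the arcs).]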

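SLIDE 34

Learning with Latent Variables

Latent structured hinge; Yu and Joachims (2009)

On FrameNet data:

$L = -\max_{\text{labeled arcs},\ \text{joint parts}} H(\text{gold FrameNet structure}, \text{DM graph}) + \max_{\text{frame},\ \text{args},\ \text{labeled arcs},\ \text{joint parts}} \left[ H(\text{FrameNet structure}, \text{DM graph}) + \delta \right]$

  • First term: gold FN output, with the arcs and joint parts left latent

[Figure: "A few books fell in the room ." twice. In the first term the frame and arguments are fixed (fall.v; Motion Directional; Theme, Place) and only the arcs are unknown (arg1? arg2? BV?). In the second term everything is unknown (frame? arg1? arg2? arg3? and the arcs).]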

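SLIDE 35

Learning with Latent Variables

Latent structured hinge; Yu and Joachims (2009)

On FrameNet data:

$L = -\max_{\text{labeled arcs},\ \text{joint parts}} H(\text{gold FrameNet structure}, \text{DM graph}) + \max_{\text{frame},\ \text{args},\ \text{labeled arcs},\ \text{joint parts}} \left[ H(\text{FrameNet structure}, \text{DM graph}) + \delta \right]$

[Figure: "A few books fell in the room ." twice. In the first term the frame and arguments are fixed (fall.v; Motion Directional; Theme, Place) and only the arcs are unknown (arg1? arg2? BV?). In the second term everything is unknown (frame? arg1? arg2? arg3? and the arcs).]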
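The latent structured hinge can be made concrete on a toy problem where the whole output space is small enough to enumerate. In this sketch `y` stands for the observed FrameNet output and `z` for the unobserved DM side. The score table, the 0/1 cost, and the labels are hypothetical, and a real system would replace both enumerations with the joint decoder.

```python
from typing import Callable, Dict, Hashable, Tuple


def latent_hinge(H: Dict[Tuple[Hashable, Hashable], float],
                 gold_y: Hashable,
                 cost: Callable[[Hashable, Hashable], float]) -> float:
    """Cost-augmented best prediction minus best completion of the gold y.

    H maps (y, z) pairs to scores; z is latent and maximized out in both
    terms.
    """
    gold_best = max(s for (y, z), s in H.items() if y == gold_y)
    augmented_best = max(s + cost(y, gold_y) for (y, z), s in H.items())
    return augmented_best - gold_best


def zero_one(y: Hashable, gold_y: Hashable) -> float:
    return 0.0 if y == gold_y else 1.0


# Hypothetical joint scores: y is a role for one span, z an arc choice.
H = {
    ("Theme", "arc_a"): 2.0,
    ("Theme", "arc_b"): 1.0,
    ("Place", "arc_a"): 1.5,
    ("Place", "arc_b"): 0.5,
}
loss = latent_hinge(H, "Theme", zero_one)
print(loss)
```

The loss is never negative, because every completion of the gold output also appears in the cost-augmented maximization with zero cost. It reaches zero once the gold output beats every alternative by at least that alternative's cost.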

SLIDE 36

Outline

❖ Parsing semantic spans and dependencies
❖ Joint parsing
❖ Learning with latent variables
❖ Empirical results

SLIDE 37

FrameNet Results

Compared models:
  • FitzGerald et al. (2015)
  • Open-SESAME; Swayamdipta et al. (2017)
  • Yang & Mitchell (2017)
  • This work

Labels on the slide: Basic; w/o joint decoding; Frame & Arg. ID.; Multitask Learning; Joint Decoding

SLIDE 38

FrameNet Results

Compared models:
  • FitzGerald et al. (2015)
  • Open-SESAME; Swayamdipta et al. (2017)
  • Yang & Mitchell (2017)
  • This work

Labels on the slide: Basic; w/o joint decoding; Frame & Arg. ID.; Multitask Learning; Joint Decoding

Callouts: Pipeline; Predict both frames and arguments

SLIDE 39

FrameNet Results

Compared models:
  • FitzGerald et al. (2015)
  • Open-SESAME; Swayamdipta et al. (2017)
  • Yang & Mitchell (2017)
  • This work

Labels on the slide: Basic; w/o joint decoding; Frame & Arg. ID.; Multitask Learning; Joint Decoding

Callouts: FrameNet & PropBank; FrameNet & Syntax; FrameNet & DM

SLIDE 40

FrameNet Results

Compared models:
  • FitzGerald et al. (2015)
  • Open-SESAME; Swayamdipta et al. (2017)
  • Yang & Mitchell (2017)
  • This work

Labels on the slide: Basic; w/o joint decoding; Frame & Arg. ID.; Multitask Learning; Joint Decoding

Callouts: FrameNet only; Share LSTMs & embeddings; Joint decoding

SLIDE 41

FrameNet Results

[Bar chart: frame and argument F1 on the FrameNet 1.5 test set (axis 60 to 80), single models and ensembles. Systems: FitzGerald et al. (2015); Open-SESAME, Swayamdipta et al. (2017); Yang & Mitchell (2017); This work. Labels: Basic; w/o joint decoding; Frame & Arg. ID.; Multitask Learning; Joint Decoding.]

SLIDE 42

FrameNet Results

[Bar chart: frame and argument F1 on the FrameNet 1.5 test set (axis 60 to 80), single models and ensembles. Systems: FitzGerald et al. (2015); Open-SESAME, Swayamdipta et al. (2017); Yang & Mitchell (2017); This work. Labels: Basic; w/o joint decoding; Frame & Arg. ID.; Multitask Learning; Joint Decoding. Bar annotations: 2×, 2×, 5×, 10×.]

SLIDE 43

FrameNet Results

[Bar chart: frame and argument F1 on the FrameNet 1.5 test set (axis 60 to 80), single models and ensembles. Systems: FitzGerald et al. (2015); Open-SESAME, Swayamdipta et al. (2017); Yang & Mitchell (2017); This work. Labels: Basic; w/o joint decoding; Frame & Arg. ID.; Multitask Learning; Joint Decoding. Bar annotations: 2×, 5×, 10×, 2×.]
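For readers unfamiliar with the metric, the sketch below computes a plain exact-match F1 over predicted and gold parts. This is an assumption for illustration: the evaluation behind the chart may weight frames and arguments differently or award partial credit, and the example parts are made up.

```python
from typing import Hashable, Set


def f1(gold: Set[Hashable], pred: Set[Hashable]) -> float:
    """Harmonic mean of precision and recall over exactly matching parts."""
    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold and predicted parts for the example sentence.
gold = {
    ("frame", "Motion Directional"),
    ("arg", (0, 2), "Theme"),
    ("arg", (4, 6), "Place"),
}
pred = {
    ("frame", "Motion Directional"),
    ("arg", (0, 2), "Theme"),
    ("arg", (5, 6), "Place"),   # wrong span boundary, so no credit
}
value = f1(gold, pred)
print(round(value, 3))
```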

SLIDE 44

[Bar chart: frame ID. accuracy on the FrameNet 1.5 test set (axis 82 to 92). Systems: Hartmann; Y & M; Hermann; This work.]

SLIDE 45

[Bar chart: frame ID. accuracy on the FrameNet 1.5 test set (axis 82 to 92). Systems: Hartmann; Y & M; Hermann; This work.]

[Bar chart: DM labeled F1 on the SemEval '15 test set (axis 82 to 92), in-domain and out-of-domain. Systems: This work; NeurboParser, Peng et al. (2017).]

SLIDE 46

Conclusion

Problem

SLIDE 47

Conclusion

Problem Method

SLIDE 48

Conclusion

Problem Method Results

SLIDE 49

[Figure: "I thank you" annotated in both formalisms. FrameNet: thank.v evokes Judgement Direct Address, with Communicator "I" and Addressee "you". DM: arg1 to "I", arg2 to "you".]