A Corpus and Model Integrating Multiword Expressions and Supersenses



slide-1
SLIDE 1

NAACL-HLT • June 3, 2015, Denver

A Corpus and Model Integrating Multiword Expressions and Supersenses

Nathan Schneider Noah Smith

slide-2
SLIDE 2

Given a sentence, find & categorize minimal units of meaning cheaply, with broad coverage

slide-3
SLIDE 3


Noam_Chomsky refused to give_in_to the vicious daddy_longlegs .


slide-7
SLIDE 7


Alan_Black refused to give_in_to the vicious daddy_longlegs .

Lexical segmentation

Noam_Chomsky refused to give_in_to the vicious daddy_longlegs .

multiword expressions

slide-8
SLIDE 8

Jonathan Huang

Noam_Chomsky

daddy_longlegs

give_in_to

slide-9
SLIDE 9

Supersense tagging

Noam_Chomsky refused to give_in_to the vicious daddy_longlegs .

  Noam_Chomsky     N:PERSON
  refused          V:COGNITION
  give_in_to       V:SOCIAL
  the              –
  vicious          –
  daddy_longlegs   N:ANIMAL

slide-10
SLIDE 10

Outline

  • Background
    • multiword expressions
    • supersenses
  • Dataset
  • Joint model
  • Results

slide-12
SLIDE 12

Definition

  • Multiword expression (MWE): 2 or more orthographic words/lexemes that function together as an idiomatic whole
  • idiomatic = not fully predictable in form, function, and/or frequency
    • unusual morphosyntax: Me/*Him neither; by and large; plural of daddy longlegs?
    • non- or semi-compositional: ice cream, daddy longlegs, pay attention
    • statistically collocated: p(highly unlikely) > p(strongly unlikely)

SPECIALLY LEARNED

(Baldwin & Kim, 2010; Schneider et al., LREC 2014)
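The "statistically collocated" criterion can be illustrated with a small sketch. It estimates bigram probabilities by relative frequency and compares two candidate pairings. The toy corpus and the function name `bigram_probabilities` are invented for illustration; the slides do not say how the probabilities would be estimated.

```python
from collections import Counter


def bigram_probabilities(tokens):
    """Relative-frequency estimate of p(w1 w2) over all adjacent token pairs."""
    bigrams = list(zip(tokens, tokens[1:]))
    counts = Counter(bigrams)
    total = len(bigrams)
    return {bigram: count / total for bigram, count in counts.items()}


# Toy corpus invented for illustration only.
toy_corpus = (
    "it is highly unlikely . that is highly unlikely . "
    "this seems strongly unlikely . it is highly unlikely ."
).split()

probs = bigram_probabilities(toy_corpus)
p_highly = probs.get(("highly", "unlikely"), 0.0)
p_strongly = probs.get(("strongly", "unlikely"), 0.0)
print(p_highly > p_strongly)
```

On real data one would want much larger counts and probably an association measure rather than raw frequency; this only shows the shape of the comparison.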

slide-13
SLIDE 13

  • Noam Chomsky
  • daddy longlegs, hot dog
  • dry out
  • depend on, come across
  • pay attention (to)
  • put up with, give in (to)
  • under the weather
  • cut and dry
  • in spite of
  • pick up where __ left off
  • easy as pie
  • You’re welcome.
  • To each his own.
  • The structure of this paper is as follows.

  • dry the clothes out
  • pay close attention (to)
  • pick up where they left off
  • no attention was paid (to)

slide-14
SLIDE 14

The CMWE Corpus

  • The entire REVIEWS subsection of the English Web Treebank (Bies et al. 2012), comprehensively annotated for MWEs
    • 723 reviews
    • 3,800 sentences
    • 55,000 words
  • found 3,500 MWE instances
  • 57% of all sentences (72% of sentences longer than 10 words) contain an MWE

(Schneider et al., LREC 2014)

slide-15
SLIDE 15

CMWE Example

They gave me the run around and missing paperwork only to call back to tell me someone else wanted her and I would need to come in and put down a deposit .


(Schneider et al., LREC 2014)

slide-16
SLIDE 16

CMWE Example

They gave_ me _the_run_around and missing paperwork only to call_back to tell me someone else wanted her and I would need to come_in and put_down a deposit .


Simplified a bit for presentational purposes (we also made a strong/weak distinction)

(Schneider et al., LREC 2014)
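The underscore markup in this example can be read mechanically. Below is a minimal sketch, assuming the conventions visible on the slide: `a_b` joins contiguous tokens, a trailing `_` opens a gap, and a leading `_` resumes the open expression. The function name `parse_mwe_markup` is hypothetical, and tokens that contain literal underscores are not handled.

```python
def parse_mwe_markup(sentence):
    """Return (tokens, groups); each group lists the token indices of one MWE."""
    tokens, groups = [], []
    open_group = None  # indices of a gappy MWE still waiting for its continuation
    for chunk in sentence.split():
        continues = chunk.startswith("_") and open_group is not None
        opens_gap = chunk.endswith("_") and len(chunk) > 1
        words = [w for w in chunk.split("_") if w]
        indices = list(range(len(tokens), len(tokens) + len(words)))
        tokens.extend(words)
        if continues:
            open_group.extend(indices)
            if not opens_gap:
                groups.append(open_group)
                open_group = None
        elif opens_gap:
            open_group = indices
        elif len(words) > 1:
            groups.append(indices)
    return tokens, groups


tokens, groups = parse_mwe_markup(
    "They gave_ me _the_run_around and put_down a deposit ."
)
for group in groups:
    print([tokens[i] for i in group])
```

Groups are emitted in the order in which they close, so an expression nested inside a gap appears before the gappy expression that contains it.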

slide-17
SLIDE 17

I ’m in the green_room getting_ready for my panel #textworld

supersenses:

  I               –
  ’m              V:STATIVE
  in              –
  the             –
  green_room      N:LOCATION
  getting_ready   V:COGNITION
  for             –
  my              –
  panel           N:GROUP

slide-18
SLIDE 18

noun: NATURAL OBJECT, ARTIFACT, LOCATION, PERSON, GROUP, SUBSTANCE, TIME, RELATION, QUANTITY, FEELING, MOTIVE, COMMUNICATION, COGNITION, STATE, ATTRIBUTE, ACT, EVENT, PROCESS, PHENOMENON, SHAPE, POSSESSION, FOOD, BODY, PLANT, ANIMAL, OTHER

verb: BODY, CHANGE, COGNITION, COMMUNICATION, COMPETITION, CONSUMPTION, CONTACT, CREATION, EMOTION, MOTION, PERCEPTION, POSSESSION, SOCIAL, STATIVE, WEATHER

sewer

slide-19
SLIDE 19

Supersenses

  • Semantic classes originally defined by WordNet
  • Can be inferred from WordNet annotations in SemCor (Miller et al. 1993)
  • …or annotated directly (Schneider et al. 2012: Arabic Wikipedia; this work)
    • also Johannsen et al. 2014: English Twitter
  • automatic tagging (Ciaramita & Altun 2006; Paaß & Reichartz 2009; Schneider et al. 2013; Johannsen et al. 2014)
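Because supersenses originate as WordNet lexicographer file names (for example `noun.person` or `verb.cognition`), inferring a supersense from a WordNet sense is largely a renaming step. The sketch below shows that step on plain strings. The function name and the two special cases in `RENAMES` are assumptions made to match the labels listed in this talk, not a documented mapping.

```python
# Assumed renamings so that output matches the label inventory shown in the talk.
RENAMES = {
    "noun.Tops": "N:OTHER",
    "noun.object": "N:NATURAL OBJECT",
}


def lexname_to_supersense(lexname):
    """Map a WordNet lexicographer file name to an 'N:LABEL' or 'V:LABEL' string."""
    if lexname in RENAMES:
        return RENAMES[lexname]
    pos, _, category = lexname.partition(".")
    prefix = {"noun": "N", "verb": "V"}.get(pos)
    if prefix is None or not category:
        return None  # adjective and adverb files carry no noun/verb supersense
    return f"{prefix}:{category.upper()}"


for name in ("noun.person", "verb.social", "noun.animal", "adj.all"):
    print(name, lexname_to_supersense(name))
```

With a WordNet API at hand, the input string would come from the lexicographer file name of the chosen synset.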

slide-20
SLIDE 20

Outline

  ✓ Background
    • multiword expressions
    • supersenses
  • Dataset
  • Joint model
  • Results

slide-21
SLIDE 21

STREUSLE Corpus

Supersense Tagged Repository of English with a Unified Semantics for Lexical Expressions

slide-22
SLIDE 22

STREUSLE Corpus

  • Annotated with
    • comprehensive MWEs
    • noun+verb supersenses

slide-23
SLIDE 23


I googled restaurants in the area and Fuji_Sushi came_up and reviews were great so I made_ a carry_out _order –

V:COMMUNICATION N:GROUP N:LOCATION N:GROUP V:COMMUNICATION N:COMMUNICATION V:COMMUNICATION N:POSSESSION


slide-29
SLIDE 29

STREUSLE Annotation

  • Starting point: CMWE corpus
  • 2 main phases:
    • noun supersenses
    • verb supersenses
  • Some sentences were reserved for combined noun+verb annotation

slide-30
SLIDE 30

STREUSLE Annotation

  • Preexisting conventions for noun supersenses that were applied to Arabic Wikipedia (Schneider et al., 2012)
  • This work: New conventions for verb supersenses

slide-32
SLIDE 32

STREUSLE Annotation: Verbs

slide-34
SLIDE 34

STREUSLE IAA

  • We estimated inter-annotator F1 of supersense labels at the end of each phase of annotation.
    • Nouns-only phase: 76%
    • Verbs-only phase: 93%
    • Combined phase: 88%
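
One way such an agreement figure could be computed is sketched below: treat one annotator's labels as the reference, score the other's against it, and take the harmonic mean. The slides do not spell out the exact procedure, so the data layout (a dict from token index to label) and the name `label_f1` are assumptions.

```python
def label_f1(annotator_a, annotator_b):
    """F1 between two label assignments, each a dict {token_index: label}."""
    matches = sum(
        1 for unit, label in annotator_b.items() if annotator_a.get(unit) == label
    )
    precision = matches / len(annotator_b) if annotator_b else 0.0
    recall = matches / len(annotator_a) if annotator_a else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented annotations for illustration only.
a = {0: "N:PERSON", 1: "V:COGNITION", 3: "V:SOCIAL", 6: "N:ANIMAL"}
b = {0: "N:PERSON", 1: "V:COMMUNICATION", 3: "V:SOCIAL"}
print(round(label_f1(a, b), 3))
```

F1 is symmetric in the two annotators, so it does not matter which one is treated as the reference.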
slide-35
SLIDE 35

Outline

  ✓ Background
    • multiword expressions
    • supersenses
  ✓ Dataset
  • Joint model
  • Results

slide-37
SLIDE 37
Gappy sequence tagging

  • Contiguous MWE identification resembles chunking, so we can use the familiar BIO scheme (Ramshaw & Marcus 1995):

      a   routine   oil_change   .
      O   O         B I          O

  • 3 new tags for gaps: o, b, i

      My   wife   had   taken_   her   '07_Ford_Fusion   _in
      O    O      O     B        o     b i i             I

  • Assumption: no more than 1 level of nesting
  • Evaluation: MWE precision/recall
    • Link-based: partial credit for partial overlap

(Schneider et al., TACL 2014)
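To make the tagging scheme concrete, here is a minimal decoding sketch that turns a tag sequence over O, B, I, o, b, i back into MWE groups, assuming at most one level of nesting as stated above. The function name `decode_gappy_tags` is hypothetical and validation is deliberately light.

```python
def decode_gappy_tags(tags):
    """Return MWE groups (lists of token indices) encoded by a gappy tag sequence."""
    groups = []
    outer = None  # MWE currently open at the top level
    inner = None  # MWE currently open inside a gap
    for idx, tag in enumerate(tags):
        if tag in ("I", "o", "b", "i") and outer is None:
            raise ValueError(f"tag {tag!r} at position {idx} needs an open MWE")
        if tag == "O":
            outer = inner = None
        elif tag == "B":
            outer, inner = [idx], None
            groups.append(outer)
        elif tag == "I":
            outer.append(idx)
            inner = None
        elif tag == "o":
            inner = None
        elif tag == "b":
            inner = [idx]
            groups.append(inner)
        elif tag == "i":
            if inner is None:
                raise ValueError(f"tag 'i' at position {idx} needs a preceding 'b'")
            inner.append(idx)
        else:
            raise ValueError(f"unknown tag {tag!r}")
    return groups


words = "My wife had taken her '07 Ford Fusion in".split()
tags = ["O", "O", "O", "B", "o", "b", "i", "i", "I"]
for group in decode_gappy_tags(tags):
    print([words[i] for i in group])
```

The lowercase tags are only meaningful while an uppercase expression is open, which is why they are rejected otherwise.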

slide-38
SLIDE 38
Gappy sequence tagging

  • Standard supervised learning with the enriched tagging scheme
  • Structured perceptron (Collins 2002)
    • Discriminative
    • 1st-order Markov assumption
    • Averaging
    • Fast to train

(Schneider et al., TACL 2014)
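For readers unfamiliar with the learner, the following is a generic sketch of an averaged structured perceptron with first-order Viterbi decoding. It is not the authors' implementation: the two-feature `local_features` function, the `<s>` start symbol, and the toy data are all placeholders chosen to keep the example short.

```python
from collections import defaultdict


def local_features(words, i, prev_tag, tag):
    """Hypothetical minimal features: one emission and one tag-bigram indicator."""
    return [("word", words[i].lower(), tag), ("trans", prev_tag, tag)]


def viterbi(words, tags, weights):
    """Best tag sequence for a non-empty sentence under a first-order model."""
    n = len(words)
    best = [{} for _ in range(n)]  # best[i][t] = (score, previous tag)
    for i in range(n):
        for t in tags:
            candidates = []
            for p in (["<s>"] if i == 0 else tags):
                base = 0.0 if i == 0 else best[i - 1][p][0]
                local = sum(weights.get(f, 0.0) for f in local_features(words, i, p, t))
                candidates.append((base + local, p))
            best[i][t] = max(candidates)
    tag = max(tags, key=lambda t: best[n - 1][t][0])
    path = [tag]
    for i in range(n - 1, 0, -1):  # follow back-pointers
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]


def train(data, tags, epochs=5):
    """Perceptron updates on mistakes; returns weights averaged over all steps."""
    weights, totals, steps = defaultdict(float), defaultdict(float), 0
    for _ in range(epochs):
        for words, gold in data:
            pred = viterbi(words, tags, weights)
            if pred != gold:
                for seq, delta in ((gold, 1.0), (pred, -1.0)):
                    prev = "<s>"
                    for i, t in enumerate(seq):
                        for f in local_features(words, i, prev, t):
                            weights[f] += delta
                        prev = t
            steps += 1
            for f, w in weights.items():  # naive averaging: fine for a sketch
                totals[f] += w
    return {f: total / steps for f, total in totals.items()}


# Toy data invented for illustration only.
toy_data = [(["oil", "change"], ["B", "I"]), (["a", "routine"], ["O", "O"])]
model = train(toy_data, ["O", "B", "I"])
print(viterbi(["oil", "change"], ["O", "B", "I"], model))
```

Summing the full weight vector after every example is slow on real data; practical implementations track update timestamps instead, but the averaged result is the same.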

slide-39
SLIDE 39
Gappy sequence tagging

  • Basic features adapted from Constant et al. (2012):
    • word: current & context, unigrams & bigrams
    • POS: current & context, unigrams & bigrams
    • capitalization; word shape
    • prefixes, suffixes up to 4 characters
    • has digit; non-alphanumeric characters
    • lemma + context lemma if one is a V and the other is ∈ {N, V, Adj., Adv., Prep., Part.}
  • Lexicon features: WordNet & other lexicons

(Schneider et al., TACL 2014)
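A few of the basic feature templates listed above can be sketched as follows. The feature names, the exact shape definition, and the context window are assumptions for illustration; the lemma and lexicon features are omitted because they need external resources.

```python
import re


def word_shape(word):
    """Collapse a word to a coarse shape, e.g. 'Fuji' -> 'Aa', "'07" -> "'0"."""
    shape = re.sub(r"[A-Z]", "A", word)
    shape = re.sub(r"[a-z]", "a", shape)
    shape = re.sub(r"[0-9]", "0", shape)
    return re.sub(r"(.)\1+", r"\1", shape)  # squeeze repeated symbols


def basic_features(words, pos_tags, i):
    """Return a set of string-valued indicator features for token i."""
    word = words[i]
    lower = word.lower()
    prev_word = words[i - 1].lower() if i > 0 else "<s>"
    next_word = words[i + 1].lower() if i + 1 < len(words) else "</s>"
    feats = {
        f"w={lower}",
        f"w-1={prev_word}",
        f"w+1={next_word}",
        f"w-1,w={prev_word}_{lower}",
        f"pos={pos_tags[i]}",
        f"shape={word_shape(word)}",
    }
    if word[:1].isupper():
        feats.add("capitalized")
    if any(c.isdigit() for c in word):
        feats.add("has_digit")
    if any(not c.isalnum() for c in word):
        feats.add("has_non_alnum")
    for k in range(1, min(4, len(word)) + 1):  # affixes up to 4 characters
        feats.add(f"prefix{k}={lower[:k]}")
        feats.add(f"suffix{k}={lower[-k:]}")
    return feats


print(sorted(basic_features(["her", "'07", "Ford"], ["PRP$", "CD", "NNP"], 2)))
```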

slide-40
SLIDE 40

Joint Tag Encoding

  • Augment the MWE tags with supersense labels

              MWE only   Joint
    My        O          O
    wife      O          O'PERSON
    had       O          O'`a
    taken     B          B'motion
    her       o          o
    ’07       b          b'ARTIFACT
    Ford      i          i
    Fusion    i          i
    in        I          I

  • supersense label only at beginning of lexical segment
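
The joint encoding can be sketched as a simple combination step. The separator `'` follows what this transcript shows and may be an extraction artifact, so it is a parameter here; the function name `joint_tags` and the use of `None` for "no label" are assumptions.

```python
SEGMENT_INITIAL = {"O", "o", "B", "b"}  # tags that begin a lexical segment


def joint_tags(mwe_tags, labels, sep="'"):
    """Attach a supersense label to segment-initial MWE tags; None means no label."""
    joint = []
    for tag, label in zip(mwe_tags, labels):
        if label is None:
            joint.append(tag)
        elif tag in SEGMENT_INITIAL:
            joint.append(f"{tag}{sep}{label}")
        else:
            raise ValueError(f"label {label!r} on non-initial tag {tag!r}")
    return joint


mwe = ["O", "O", "O", "B", "o", "b", "i", "i", "I"]
sst = [None, "PERSON", "`a", "motion", None, "ARTIFACT", None, None, None]
print(joint_tags(mwe, sst))
```

Restricting labels to segment-initial tags keeps the joint tagset much smaller than the full cross product of MWE tags and supersenses.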
slide-41
SLIDE 41

AMALGrAM

  • Tagger trained on STREUSLE: jointly predicts MWEs and supersenses
    • |tagset| = 146
  • Same structured prediction setup as Schneider et al. (TACL 2014): first-order structured perceptron
  • Evaluation: separate scores for
    • MWE identification
    • supersense tagging (first tag of each lexical segment)

slide-42
SLIDE 42

AMALGrAM

  • This tagger allows us to measure:
    • the impact of joint tagging on MWE performance
    • the value of word clusters, new features
    • the tagger’s resilience to ambiguity (see the paper)
  • Baseline for future supersense tagging studies in the reviews domain

slide-43
SLIDE 43

Outline

  ✓ Background
    • multiword expressions
    • supersenses
  ✓ Dataset
  ✓ Joint model
  • Results

slide-44
SLIDE 44

Does joint tagging hurt MWE identification?

                                       P    R    F1
  • MWE-only baseline (8 tags):        73   56   63
  • Simplest joint model (146 tags):   68   56   61

  • …so it hurts a bit in precision, but not drastically

Link-based MWE score
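
The slides do not define the link-based score, so the sketch below is one reading of it: an MWE is a chain of links between its consecutive tokens, precision is the share of predicted links whose two tokens fall in the same gold expression, and recall swaps the roles. Function names are hypothetical and the strong/weak distinction is ignored.

```python
def links(groups):
    """Links between consecutive tokens within each MWE group."""
    return [
        (g[k], g[k + 1]) for g in map(sorted, groups) for k in range(len(g) - 1)
    ]


def same_group(a, b, groups):
    """True if tokens a and b belong to one common group."""
    return any(a in g and b in g for g in groups)


def link_based_scores(gold_groups, pred_groups):
    """Return (precision, recall, F1) with partial credit for partial overlap."""
    pred_links, gold_links = links(pred_groups), links(gold_groups)
    precision = (
        sum(same_group(a, b, gold_groups) for a, b in pred_links) / len(pred_links)
        if pred_links else 0.0
    )
    recall = (
        sum(same_group(a, b, pred_groups) for a, b in gold_links) / len(gold_links)
        if gold_links else 0.0
    )
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Gold: gave_ ... _the_run_around (tokens 1, 3, 4, 5); prediction misses "gave".
print(link_based_scores([[1, 3, 4, 5]], [[3, 4, 5]]))
```

An exact-match metric would give this prediction no credit at all, which is the motivation for scoring links instead of whole expressions.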

slide-45
SLIDE 45

AMALGrAM: New features

  • aux verb feature: verb (adverb)? verb
  • WordNet features adapted from (Ciaramita & Altun, 2006). E.g.:
    • has-supersense (in any matching synset)
    • supersense of 1st synset of longest lemma match
    • (if a common noun, verb, or adjective): supersense of 1st synset matching the following noun
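The auxiliary-verb pattern "verb (adverb)? verb" can be checked directly on a POS sequence. This sketch assumes Penn Treebank tags, where verb tags start with `VB` and adverb tags with `RB`; whether modals (`MD`) should count as verbs is not stated on the slide, so they are excluded here. The function name is hypothetical.

```python
def aux_verb_feature(pos_tags, i):
    """True if token i is a verb followed by an optional adverb and then a verb."""
    if not pos_tags[i].startswith("VB"):
        return False
    j = i + 1
    if j < len(pos_tags) and pos_tags[j].startswith("RB"):
        j += 1  # skip a single intervening adverb
    return j < len(pos_tags) and pos_tags[j].startswith("VB")


# "My wife had taken ...": 'had' (VBD) precedes 'taken' (VBN).
print(aux_verb_feature(["PRP$", "NN", "VBD", "VBN"], 2))
```

A feature like this could help the tagger prefer the auxiliary label (shown as `` `a `` in the joint encoding example) over a content-verb supersense.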

slide-46
SLIDE 46

Impact of new features on supersense labeling

                                       P    R    F1
  • Simplest joint model (146 tags):   65   67   66
  • + clusters                         66   68   67
  • + new features                     69   72   71

Supersense score

slide-47
SLIDE 47

Does joint tagging hurt MWE identification?

                                       P    R    F1
  • MWE-only baseline (8 tags):        73   56   63
  • Simplest joint model (146 tags):   68   56   61
  • + clusters                         69   57   62
  • + new features                     71   56   63

Link-based MWE score

slide-48
SLIDE 48

Conclusion

  • Corpus of English web reviews annotated for MWEs + supersenses (STREUSLE)
  • Tagger for this corpus attains 63% F1 for MWEs and 71% F1 for supersenses (with gold POS)

slide-49
SLIDE 49

Possible Extensions

  • More genres & languages. Already have:
    • supersenses in English Twitter (Johannsen et al., 2014), Arabic Wikipedia (Schneider et al., 2012), Italian (Dei Rossi et al., 2013), …
    • some MWEs in English Wikipedia (Vincze et al., 2011), French news (Abeillé et al., 2003), …
  • More kinds of supersenses
    • adjectives (Tsvetkov et al., 2014)
    • prepositions (Schneider et al., LAW 2015)
  • Application to sentiment analysis, semantic parsing, machine translation, …

slide-50
SLIDE 50

Links

  • Downloads: tiny.cc/streusle
  • Ideas for improving on this task?
    • “DiMSUM” shared task, SemEval 2016. Subscribe to mailing list for further announcements.

slide-51
SLIDE 51

Many_thanks ! (*Several thanks)

Thanks_a_million (*Thanks a thousand)

Thanks_a_lot (?Lots of thanks)

social social social