Semantically Constrained Multilayer Annotation: The Case of Coreference


slide-1
SLIDE 1

Semantically Constrained Multilayer Annotation

The Case of Coreference

Jakob Prange, Nathan Schneider, and Omri Abend

slide-2
SLIDE 2

How to annotate coreference?

DMR, August 1, 2019 Prange, Schneider, Abend 2

slide-3
SLIDE 3

Did anyone else have these fears ?

slide-6
SLIDE 6

Did anyone else have these fears ? How did you get over them ?

slide-8
SLIDE 8

Representing Coreference Is Not Trivial

What to annotate as mentions?

  • Syntactic criteria (e.g., all nouns)
  • Semantic criteria (e.g., all events)
  • Singletons?

How to annotate mention spans?

  • Minimum spans (only head words)
  • Maximum spans (plus args & mods)
  • Hybrid?

How to annotate coreference?

  • Identity, Bridging, …
  • Unordered clusters or ordered chains
  • Tricky linguistic phenomena (e.g., coordination)
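The design choices listed on this slide can be made concrete with a small data-structure sketch. Everything here is illustrative: the `Mention` class, its field names, and the reading that "these fears" and "them" corefer are assumptions made for the example, not part of any annotation scheme discussed in the talk.

```python
from dataclasses import dataclass

# The running example from the slides, already tokenized.
TOKENS = "Did anyone else have these fears ? How did you get over them ?".split()


@dataclass(frozen=True)
class Mention:
    """Hypothetical mention record holding both span conventions."""
    start: int  # first token of the maximum span (inclusive)
    end: int    # end of the maximum span (exclusive)
    head: int   # index of the head word, i.e. the minimum span

    def max_text(self) -> str:
        return " ".join(TOKENS[self.start:self.end])

    def min_text(self) -> str:
        return TOKENS[self.head]


fears = Mention(start=4, end=6, head=5)    # "these fears"
them = Mention(start=12, end=13, head=12)  # "them"

# Unordered cluster: a set of mentions that share one referent.
cluster = frozenset({fears, them})

# Ordered chain: the same mentions in text order, each linked to its antecedent.
chain = sorted(cluster, key=lambda m: m.start)
links = list(zip(chain[1:], chain[:-1]))  # (anaphor, antecedent) pairs
```

Storing the head index next to the full span lets one record serve minimum-span and maximum-span conventions alike, which is the "hybrid" option raised on the slide.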

slide-9
SLIDE 9

Representing Coreference Is Not Trivial

  • Many of these decisions seem arbitrary
  • Many different approaches with different guidelines [Poesio et al., 2016]
  • Problematic for evaluation [Moosavi and Strube, 2016; Moosavi et al., 2019]
  • Seldom integrated with other layers of meaning

slide-11
SLIDE 11

Our Approach: Build upon a basic framework for semantic units that can be shared among many higher-level meaning representations.

slide-14
SLIDE 14

Semantic Multilayering

UCCA

Coreference

World Knowledge

slide-19
SLIDE 19

Semantic Multilayering with UCCA [Abend & Rappoport, 2013]

  • Identify and relate “scenes” (events) and participants
    → Basic semantic units
  • No assumptions about grammar or lexicon
    → Cross-linguistically applicable

slide-20
SLIDE 20

UCoref: UCCA + Coreference

Did anyone else have these fears ? How did you get over them ?

[Figure: the example sentences with UCCA units labeled Scene-evoker and Participant]

slide-24
SLIDE 24

UCoref: Streamlined Annotation

  • UCCA units can be filtered in preprocessing
  • All scene and participant units are automatically considered mentions

slide-25
SLIDE 25

UCoref: Streamlined Annotation

Did anyone else have these fears ? How did you get over them ?

slide-27
SLIDE 27

UCoref: Streamlined Annotation

  • UCCA units can be filtered in preprocessing
  • Human annotators identify remaining mentions and coreference clusters
  • Semantic heads serve as minimum span versions of complex mentions
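The two automatic steps described for UCoref, preselecting scene and participant units as mentions and using the semantic head as a minimum span, might look roughly like the sketch below. The `Unit` class, the `kind` labels, and the `head` pointer are hypothetical simplifications invented for this example; they are not the data model of the UCoref code release.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Unit:
    """Hypothetical, simplified stand-in for a UCCA unit."""
    kind: str                      # assumed labels: "scene", "participant", "other"
    tokens: list
    head: Optional["Unit"] = None  # assumed pointer to the semantic head child


def preselect(units):
    """Keep only units that count as mention candidates automatically."""
    return [u for u in units if u.kind in {"scene", "participant"}]


def min_span(unit):
    """Follow head pointers down to the innermost head and return its tokens."""
    while unit.head is not None:
        unit = unit.head
    return unit.tokens


fears_head = Unit(kind="other", tokens=["fears"])
these_fears = Unit(kind="participant", tokens=["these", "fears"], head=fears_head)
question_mark = Unit(kind="other", tokens=["?"])

candidates = preselect([these_fears, fears_head, question_mark])
```

Under these assumptions the annotator only has to judge `candidates`, and the maximum span ("these fears") and minimum span ("fears") of a mention both come from the same unit.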

slide-28
SLIDE 28

UCoref: Streamlined Annotation

Did anyone else have these fears ? How did you get over them ?

slide-31
SLIDE 31

Comprehensive Representation

✓ Flexible mention spans
✓ Event and entity coreference in one framework
✓ Anchored in predicate-argument structure, which…
    … helps humans disambiguate
    … can be utilized in automatic resolution (as features, or in a joint or multi-task learning (MTL) setup)

slide-32
SLIDE 32

Drawback: Need UCCA Annotations First

  • Efficiently annotatable by non-experts
  • Can use automatic parsers
  • It’s worth it!

slide-33
SLIDE 33

Pilot Annotation

  • Small samples from 3 English coreference corpora
      • OntoNotes [Hovy et al., 2006]: Blog posts
      • GUM [Zeldes, 2017]: WikiHow articles
      • RED [O’Gorman et al., 2016]: Forum discussions
  • Similar genres, different guidelines

Hypothesis: UCoref covers (most of) what existing schemes cover, and more.

slide-34
SLIDE 34

                   OntoNotes   GUM      RED         UCoref
Anchored in        syntax      syntax   tokens      semantics
Mention criteria   syntax      syntax   semantics   semantics
Mention spans      max         max      min         flexible
Events             ✗           (✓)      ✓           ✓
Singletons         ✗           (✓)      ✓           ✓

Annotation guidelines point in this direction.

slide-35
SLIDE 35

            mentions       UCoref mentions   referents       UCoref referents
OntoNotes   40         <   128               20          <   96
GUM         288        <   466               155         <   291

Numbers of mentions and referents confirm: UCoref covers more than other schemes.

slide-36
SLIDE 36

      mentions       UCoref mentions   referents       UCoref referents
RED   120        ≈   117               82          ≈   78

RED is very similar to UCoref in terms of coverage.

slide-37
SLIDE 37

High Recall

[Figure: Exact Match and Partial Match (%) for Mentions and Referents on OntoNotes, GUM, and RED; axis ticks at 25, 50, 75, 100]

Iterative greedy 1-to-1 alignment based on Dice coefficient.

Mention spans are crucial for evaluation.

Moosavi et al. [2019] automatically extract min spans.
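A greedy one-to-one alignment by Dice coefficient, as named on this slide, could be sketched as follows. This is a minimal illustration, not the authors' evaluation script: representing each span as a set of token indices, the function names `dice` and `greedy_align`, the tie-breaking order, and the toy spans are all assumptions.

```python
def dice(a: set, b: set) -> float:
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) between two token-index sets."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def greedy_align(reference: list, candidate: list) -> list:
    """Pair spans one-to-one, always taking the best-scoring pair still free.

    Returns (reference_index, candidate_index, score) triples. Pairs with
    zero overlap are never aligned. Ties are broken by input order.
    """
    scored = sorted(
        ((dice(r, c), i, j)
         for i, r in enumerate(reference)
         for j, c in enumerate(candidate)),
        key=lambda t: -t[0],
    )
    used_ref, used_cand, alignment = set(), set(), []
    for score, i, j in scored:
        if score == 0:
            break
        if i in used_ref or j in used_cand:
            continue
        used_ref.add(i)
        used_cand.add(j)
        alignment.append((i, j, score))
    return alignment


# Toy spans over token indices: a maximum span {4, 5} versus a minimum span {5}.
reference = [{4, 5}, {12}, {1, 2}]
candidate = [{5}, {12}, {9}]
alignment = greedy_align(reference, candidate)

exact = sum(1 for _, _, s in alignment if s == 1.0)
partial = len(alignment)
```

In this toy case only one pair matches exactly, while two match partially, which illustrates why the choice between minimum and maximum spans matters so much for evaluation.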

slide-41
SLIDE 41

Conclusions

  • Scattered coreference research needs to be unified
  • Related: Universal Coreference Initiative
  • Semantic representations should be semantically anchored and modular
  • UCoref is a first step in that direction
  • Main advantages: efficiency, consistency, and reusability

slide-42
SLIDE 42

Approaches To Semantic Multilayering

[Figure: approaches arranged by anchoring (Token, Sentence, Syntax, Semantics) and by modularity (Modular, Highly modular, 1 Layer)]

  • FrameNet [Baker et al., 1998]
  • RED [O’Gorman et al., 2016]
  • PropBank [Palmer et al., 2005]
  • Decompositional Semantics [White et al., 2016]
  • AMR [Banarescu et al., 2013]
  • Multi-sentence AMR [O’Gorman et al., 2018]
  • Prague [Böhmová et al., 2003]
  • OntoNotes [Hovy et al., 2006]
  • GUM [Zeldes, 2017]

slide-43
SLIDE 43

Thank you! Questions?

Jakob Prange, Nathan Schneider, and Omri Abend

jakob@cs.georgetown.edu

Data & Code: https://github.com/jakpra/UCoref
Paper: https://arxiv.org/abs/1906.00663

slide-44
SLIDE 44

Appendix

slide-45
SLIDE 45

Representing Coreference

A: Did anyone else have these fears ?
A: How did you get over them ?

slide-46
SLIDE 46

Representing Coreference

A: Did anyone else have these fears ?
B: Yes , me .
A: How did you get over them ?

slide-47
SLIDE 47

Representing Coreference

A: Did anyone else have these fears ?
A: Did anyone else have other fears ?

slide-48
SLIDE 48

A Quick Introduction to UCCA

  • Main philosophy: Identifying “scenes” (events) and their participants
  • Foundational layer is a DAG over tokens
  • Terminals (tokens) evoke pre-terminals
  • Non-terminals form larger units of meaning, such as elaborations and scenes
  • Edge labels indicate the role of a child within its parent
  • “Remote edges” and “implicit units” handle zero anaphora and constructional null-instantiation
  • Additional layers are anchored in the foundational layer
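The structure described in these bullets, a DAG over tokens with labeled edges plus remote edges and implicit units, can be sketched in a few lines. The `Node` and `Edge` classes and the `terminals` helper are hypothetical and do not mirror any UCCA toolkit. The toy sentence "John wants to leave" is not from the talk, and the labels on it (A for Participant, P for Process, F for Function) are meant as an illustration, not as a gold annotation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Edge:
    label: str            # role of the child within its parent, e.g. "A" or "P"
    child: "Node"
    remote: bool = False  # True if the child is shared from another unit


@dataclass
class Node:
    token: Optional[str] = None  # set for terminals only
    implicit: bool = False       # unit with no corresponding text
    edges: list = field(default_factory=list)


def terminals(node: Node, follow_remote: bool = False) -> list:
    """Collect the tokens under a unit, optionally crossing remote edges."""
    if node.token is not None:
        return [node.token]
    tokens = []
    for edge in node.edges:
        if edge.remote and not follow_remote:
            continue
        tokens.extend(terminals(edge.child, follow_remote))
    return tokens


john = Node(token="John")
leave_scene = Node(edges=[
    Edge("F", Node(token="to")),
    Edge("P", Node(token="leave")),
    Edge("A", john, remote=True),  # the one leaving is John, stated only once
])
root = Node(edges=[
    Edge("A", john),
    Edge("P", Node(token="wants")),
    Edge("A", leave_scene),
])
```

Because `john` has two parents, the graph is a DAG and not a tree. Skipping remote edges recovers each unit's own text span, while following them recovers its full set of participants.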

slide-49
SLIDE 49

A Quick Introduction to UCCA

  • Agnostic about lexicon and syntax
  • Semantic and usage-based guidelines for annotation

→ Efficient annotation across languages, even by non-experts

slide-50
SLIDE 50

Automatic Mention Preselection simplifies manual annotation.

                     OntoNotes   GUM    RED
Sentences            17          70     24
Tokens               303         1180   302
UCCA units           336         1436   379
Mention candidates   195         911    186

Mention candidates < UCCA units in all three samples.
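To see how much the preselection narrows the annotator's search space, the share of UCCA units that remain as mention candidates can be computed directly from the counts on this slide. The dictionary layout and variable names are assumptions for the example. The numbers are the ones reported above.

```python
# Counts copied from the slide: (UCCA units, mention candidates) per sample.
counts = {
    "OntoNotes": (336, 195),
    "GUM": (1436, 911),
    "RED": (379, 186),
}

# Fraction of UCCA units that survive preselection as mention candidates.
share = {
    corpus: candidates / units
    for corpus, (units, candidates) in counts.items()
}

for corpus, value in share.items():
    print(f"{corpus}: {value:.1%} of UCCA units are mention candidates")
```

Roughly half to two thirds of the units remain in each sample, so annotators can skip the rest without inspecting them.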

slide-51
SLIDE 51

Semantic Multilayering: An Overview

  • Much recent work on designing meaning representations and sembanking
  • Can be summarized along two axes:
      • Anchoring: How is annotation guided and constrained by underlying structure (tokens, syntax, …)?
      • Modularity: To what extent are multiple kinds of information encoded in separate structures/layers?