SLIDE 1

Latent Structures for Coreference Resolution

Sebastian Martschat and Michael Strube

Heidelberg Institute for Theoretical Studies gGmbH

SLIDE 2

Coreference Resolution

Coreference resolution is the task of determining which mentions in a text refer to the same entity.

SLIDE 3

An Example

Vicente del Bosque admits it will be difficult for him to select David de Gea in Spain’s squad if the goalkeeper remains on the sidelines at Manchester United. de Gea’s long-anticipated transfer to Real Madrid fell through Monday due to miscommunication between the Spanish club and United and he will stay at Old Trafford until at least January.

SLIDE 5

Outline

  • Motivation
  • Structures for Coreference Resolution
  • Experiments and Analysis
  • Conclusions and Future Work

SLIDE 7

General Paradigm

Vicente del Bosque admits it will be difficult for him to select David de Gea in Spain’s squad if the goalkeeper remains on the sidelines at Manchester United. de Gea’s long-anticipated transfer to Real Madrid fell through Monday due to miscommunication between the Spanish club and United and he will stay at Old Trafford until at least January.

Consolidate pairwise decisions for anaphor-antecedent pairs

SLIDE 9

Mention Pairs

Vicente del Bosque admits it will be difficult for him to select David de Gea in Spain’s squad if the goalkeeper remains on the sidelines at Manchester United. de Gea’s long-anticipated transfer to Real Madrid fell through Monday due to miscommunication between the Spanish club and United and he will stay at Old Trafford until at least January.

[Figure: pairwise decisions for anaphor m7, paired with each of m1–m6 and labeled + (coreferent) or −]

SLIDE 10

Mention Ranking

Vicente del Bosque admits it will be difficult for him to select David de Gea in Spain’s squad if the goalkeeper remains on the sidelines at Manchester United. de Gea’s long-anticipated transfer to Real Madrid fell through Monday due to miscommunication between the Spanish club and United and he will stay at Old Trafford until at least January.

[Figure: anaphor m7 with its candidate antecedents m1–m6, which compete against each other]

SLIDE 11

Antecedent Trees

Vicente del Bosque admits it will be difficult for him to select David de Gea in Spain’s squad if the goalkeeper remains on the sidelines at Manchester United. de Gea’s long-anticipated transfer to Real Madrid fell through Monday due to miscommunication between the Spanish club and United and he will stay at Old Trafford until at least January.

[Figure: an antecedent tree spanning all mentions m1–m19 of the document]

SLIDE 12

Unifying Approaches

  • these approaches operate on structures that are not annotated in the training data
  • we can view these structures as latent structures


→ devise a unified representation of these approaches in terms of their latent structures
SLIDE 14

Outline

  • Motivation
  • Structures for Coreference Resolution
  • Experiments and Analysis
  • Conclusions and Future Work

SLIDE 15

Final Goal

Learn a mapping f : X → H × Z

  • x ∈ X : structured input; documents containing mentions and linguistic information
  • h ∈ H : the document-level latent structure we actually predict (mention pairs, antecedent trees, ...); we employ graph-based latent structures
  • z ∈ Z : a mapping of mentions to entity identifiers, inferred via the latent h ∈ H

Latent structures are a subclass of directed labeled graphs G = (V, A, L):

  • Nodes V : the mentions, plus a dummy mention m0 for anaphoricity detection
  • Arcs A : a subset of all backward arcs
  • Labels L : labels for the arcs
  • the graph can be split into substructures which are handled individually

[Figure: example graph over m0, m1, m2, m3 with arcs labeled +]
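To make the graph view concrete, here is a minimal Python sketch (hypothetical types and names, not the authors' cort API) of a latent structure h and of reading off the entity mapping z from it:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Arc:
        anaphor: int      # index of the later mention
        antecedent: int   # index of an earlier mention; 0 is the dummy m0
        label: str = "+"  # arc label from L

    def entities_from(arcs, num_mentions):
        """Infer z (mention -> entity id) from a latent structure h:
        mentions connected by non-dummy "+" arcs share an entity."""
        parent = {m: m for m in range(1, num_mentions + 1)}

        def find(m):  # union-find with path compression
            while parent[m] != m:
                parent[m] = parent[parent[m]]
                m = parent[m]
            return m

        for a in arcs:
            if a.antecedent != 0 and a.label == "+":
                parent[find(a.anaphor)] = find(a.antecedent)
        return {m: find(m) for m in parent}

    # Example graph over m0..m3: m1 is discourse-new, m2 and m3 chain back.
    h = [Arc(1, 0), Arc(2, 1), Arc(3, 2)]
    print(entities_from(h, 3))  # {1: 1, 2: 1, 3: 1}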
SLIDE 24

Linear Models

Employ an edge-factored linear model:

f(x) = argmax_{(h,z) ∈ H_x × Z_x} ∑_{a ∈ h} ⟨θ, φ(x, a, z)⟩

  • each arc a in a structure h is described by features φ(x, a, z), e.g. sentDist=2, anaType=PRO
  • the weight vector θ scores every arc; a structure's score is the sum of its arc scores
  • prediction selects the highest-scoring structure

[Figure: candidate structures for anaphor m3 over m0, m1, m2, annotated with features (sentDist=2, anaType=PRO) and scores (0.5, 0.8, 3.7, 10)]
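A minimal sketch of this arc scoring and argmax for one substructure, with toy string features and weights (my illustration, not cort's feature set):

    def arc_score(theta, feats):
        """<theta, phi(x, a, z)> as a sparse dot product over binary features."""
        return sum(theta.get(f, 0.0) for f in feats)

    def decode(theta, phi, anaphor, candidates):
        """argmax over the candidate antecedent arcs of one anaphor."""
        return max(candidates,
                   key=lambda ante: arc_score(theta, phi(anaphor, ante)))

    theta = {"sentDist=2": 0.5, "anaType=PRO": 0.8}  # toy weight vector
    # Toy features: mention-index distance as a stand-in for sentence distance.
    phi = lambda ana, ante: [f"sentDist={ana - ante}", "anaType=PRO"]
    print(decode(theta, phi, anaphor=3, candidates=[1, 2]))  # -> 1 (score 1.3)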
SLIDE 31

Learning: Perceptron

Input: Training set D, cost function c, number of epochs n
function PERCEPTRON(D, c, n)
    Set θ = (0, ..., 0)
    for epoch = 1, ..., n do
        for (x, z) ∈ D do
            ĥ_opt = argmax_{h ∈ H_{x,z}} ⟨θ, φ(x, h, z)⟩
            (ĥ, ẑ) = argmax_{(h,z) ∈ H_x × Z_x} ⟨θ, φ(x, h, z)⟩ + c(x, h, ĥ_opt, z)
            if ĥ does not encode z then
                Set θ = θ + φ(x, ĥ_opt, z) − φ(x, ĥ, ẑ)
Output: Weight vector θ

  • ĥ_opt is the highest-scoring structure consistent with the gold entity mapping z
  • (ĥ, ẑ) is the best cost-augmented prediction over all structures
  • rewarding solutions with high cost during search yields a large-margin approach

[Figure: the update illustrated step by step on an example graph over m0–m5]
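A runnable sketch of this latent perceptron, instantiated for mention ranking so that each anaphor's substructure is decoded and updated independently (the feature function and document layout are illustrative assumptions, not cort's interface):

    from collections import defaultdict

    def phi(doc, ana, ante):
        """Hypothetical arc features; ante == 0 is the dummy mention m0."""
        if ante == 0:
            return ["dummy", f"anaType={doc['type'][ana]}"]
        return [f"dist={min(ana - ante, 5)}",
                f"types={doc['type'][ante]}|{doc['type'][ana]}"]

    def best(theta, doc, ana, candidates, cost):
        """argmax over one substructure: candidate antecedent arcs of `ana`."""
        return max(candidates,
                   key=lambda m: sum(theta[f] for f in phi(doc, ana, m)) + cost(m))

    def perceptron(docs, epochs=5):
        theta = defaultdict(float)
        for _ in range(epochs):
            for doc in docs:
                z = doc["entity"]  # gold mapping: mention index -> entity id
                for ana in range(1, len(z) + 1):
                    gold = [m for m in range(1, ana) if z[m] == z[ana]] or [0]
                    # h_opt: best structure consistent with z (no cost term)
                    h_opt = best(theta, doc, ana, gold, cost=lambda m: 0.0)
                    # cost-augmented argmax over all candidate arcs
                    h_hat = best(theta, doc, ana, [0] + list(range(1, ana)),
                                 cost=lambda m: 0.0 if m in gold else 1.0)
                    if h_hat not in gold:  # prediction does not encode z
                        for f in phi(doc, ana, h_opt):
                            theta[f] += 1.0
                        for f in phi(doc, ana, h_hat):
                            theta[f] -= 1.0
        return theta

    # Toy document with mentions m1..m4; m1/m3 and m2/m4 corefer.
    doc = {"type": {1: "NAM", 2: "NAM", 3: "PRO", 4: "PRO"},
           "entity": {1: 0, 2: 1, 3: 0, 4: 1}}
    weights = perceptron([doc])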
SLIDE 47

Mention Pair

[Figure: mention-pair structure over m1–m4; every pair of mentions carries a + or − label]

Soon et al. (2001), Ng and Cardie (2002), Bengtson and Roth (2008), ...

  • latent structure: the labeled mention pairs of a document
  • substructure: a single mention pair
  • no costs (use training data resampling)
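The resampling mentioned here is the classic Soon et al. (2001) strategy; a brief sketch of how such training pairs could be extracted (the helper name is mine):

    def training_pairs(mentions, entity):
        """For each anaphor: one positive pair with its closest antecedent,
        and negative pairs with every mention in between (Soon et al., 2001)."""
        pairs = []
        for j in range(len(mentions)):
            closest = max((i for i in range(j) if entity[i] == entity[j]),
                          default=None)
            if closest is None:
                continue  # discourse-new mention: no pairs extracted
            pairs.append((mentions[closest], mentions[j], "+"))
            pairs.extend((mentions[i], mentions[j], "-")
                         for i in range(closest + 1, j))
        return pairs

    # m1/m3 and m2/m4 corefer
    print(training_pairs(["m1", "m2", "m3", "m4"], [0, 1, 0, 1]))
    # [('m1', 'm3', '+'), ('m2', 'm3', '-'), ('m2', 'm4', '+'), ('m3', 'm4', '-')]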
SLIDE 52

Mention Ranking

[Figure: ranking structure; anaphor m4 considers the candidate antecedents m0–m3, including the dummy m0]

Denis and Baldridge (2008), Chang et al. (2013), ...

  • latent structure: one antecedent decision for every anaphor in the document
  • substructure: the antecedent decision for a single anaphor
  • cost function on erroneous arcs (costs 2 and 1 in the figure), following Durrett and Klein (2013) and Fernandes et al. (2014)
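A sketch of such a cost function for ranking; the per-error weights below are illustrative assumptions (the cited papers tune them per error type), chosen to echo the 2 and 1 in the figure:

    def arc_cost(ante, gold_antecedents):
        """Cost c for choosing antecedent `ante`; 0 is the dummy mention m0,
        and gold_antecedents == {0} means the anaphor is non-anaphoric.
        The weights 2.0 / 1.0 are illustrative assumptions."""
        if ante in gold_antecedents:
            return 0.0   # correct decision
        if ante == 0:
            return 2.0   # wrongly marked non-anaphoric (missed antecedent)
        return 1.0       # wrong link, or linked a non-anaphoric mention

    print(arc_cost(0, {2}))  # 2.0: the anaphor did have an antecedent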
SLIDE 56

Outline

  • Motivation
  • Structures for Coreference Resolution
  • Experiments and Analysis
  • Conclusions and Future Work

SLIDE 57

Data

  • conduct analysis and experiments on the English data from the CoNLL-2012 shared task on multilingual coreference resolution
  • evaluate with the CoNLL scorer (the average of three widely used evaluation metrics: MUC, B³ and CEAFe)

SLIDE 58

Results on Test Data

[Figure: bar chart of CoNLL F1 (range 56–64) for HOTCoref, nn_coref, Pair, Rank1, Rank2 and Tree]

  • HOTCoref: state-of-the-art system based on antecedent trees with non-local features (Björkelund and Kuhn, 2014)
  • Pair: mention pair model with the standard strategy for training data balancing and a rich lexical feature set
  • Rank1: ranking with the closest antecedent as gold antecedent; mainly gains in precision
  • Rank2: ranking with a latent antecedent as gold antecedent; slight gains in precision and recall
  • Tree: antecedent trees; higher precision, but lower recall
  • nn_coref: mention ranking with feature combinations learned via neural networks (Wiseman et al., 2015)
SLIDE 65

Analysis Tools

Employ our coreference resolution error analysis framework (Martschat and Strube, EMNLP 2014)

  • extract precision and recall errors on development data
  • compare the errors made to assess strengths and weaknesses of the approaches

SLIDE 66

Analysis

  • ranking vs. mention pair
    • mainly better anaphoricity determination
    • antecedent competition is useful for pronouns
  • latent ranking vs. ranking with closest antecedents
    • mainly fewer precision errors for hard cases
  • antecedent trees vs. ranking
    • document-level modeling leads to more cautious updates → higher precision at the expense of recall

SLIDE 67

Outline

  • Motivation
  • Structures for Coreference Resolution
  • Experiments and Analysis
  • Conclusions and Future Work

SLIDE 68

Conclusions

  • coreference resolution approaches can be represented by the latent structures they operate on
  • devised a framework and implemented mention pair, mention ranking and antecedent tree models in it
  • mention ranking performs best, mainly due to anaphoricity modeling

SLIDE 69

Future Work

  • apply framework to entity-centric approaches
  • analyze more approaches
  • devise new models in the framework

SLIDE 70

Thanks!

Python implementation, state-of-the-art models, tutorials available at:

http://github.com/smartschat/cort

This work has been funded by the Klaus Tschira Foundation.


Thank you for your attention!

SLIDE 72

Entity-centric Approaches

[Figure: entity-centric structure grouping mentions m1–m8 into entities]

SLIDE 73

All Results

CoNLL-2012 English development data

Model    MUC R   MUC P   MUC F1     B³ R    B³ P    B³ F1      CEAFe R  CEAFe P  CEAFe F1   Avg
Pair     66.68   71.71   69.10      53.57   62.44   57.67      52.56    53.87    53.21      59.99
Rank1    67.85   76.66   71.99∗     55.33   65.45   59.97∗     53.16    61.28    56.93∗     62.96
Rank2    68.02   76.73   72.11⋄×    55.61   66.91   60.74†⋄    54.48    61.36    57.72†⋄×   63.52
Tree     65.91   77.92   71.41      52.72   67.98   59.39      52.13    60.82    56.14      62.31

CoNLL-2012 English test data

Model    MUC R   MUC P   MUC F1     B³ R    B³ P    B³ F1      CEAFe R  CEAFe P  CEAFe F1   Avg
Pair     67.16   71.48   69.25      51.97   60.55   55.93      51.02    51.89    51.45      58.88
Rank1    67.96   76.61   72.03∗     54.07   64.98   59.03∗     51.45    59.02    54.97∗     62.01
Rank2    68.13   76.72   72.17⋄     54.22   66.12   59.58†⋄    52.33    59.47    55.67†⋄    62.47
Tree     65.79   78.04   71.39      50.92   67.76   58.15      50.55    58.34    54.17      61.24

SLIDE 74

Analysis: Recall Errors

                 Name/noun                         Anaphor pronoun
Model    Both name   Mixed   Both noun    I/you/we   he/she   it/they    Rem.
Max      3579        948     2063         2967       1990     2471       591
Pair     815         657     1074         394        373      1005       549
Rank1    879         637     1221         348        247      806        557
Rank2    857         647     1158         370        251      822        566
Tree     911         686     1258         441        247      863        572

SLIDE 75

Analysis: Precision Errors

                 Name/noun                         Anaphor pronoun
Model    Both name   Mixed   Both noun    I/you/we   he/she   it/they    Rem.
Pair     885         83      1055         836        289      864        175
         2673        79      1098         2479       1546     1408       115
Rank1    587         93      494          873        324      844        121
         2620        96      960          2521       1692     1510       97
Rank2    640         92      567          862        318      835        42
         2664        102     1038         2461       1692     1594       43
Tree     595         57      442          836        318      757        37
         2628        82      924          2398       1691     1557       36
