SLIDE 1

Explicit and Implicit Discourse Relations: An Extrinsic Evaluation

Peter Bourgonje and Manfred Stede Applied Computational Linguistics Universität Potsdam

Workshop on Coherence Relations Humboldt-Universität zu Berlin January 17-18, 2020

SLIDE 2

Overview

  • Explicit vs. Implicit
  • Classification Setup & Results
  • English vs. German
  • Conclusions & Future Work

Humboldt-Universität zu Berlin, January 17-18, 2020

SLIDE 3

Explicit vs. Implicit

  • “the assumption of implicitness of the discourse connector as a sign of expectation of the discourse relation” (Asr & Demberg, 2012)

  • “Training on marked examples alone will work only if two conditions are fulfilled: First, there has to be a certain amount of redundancy between the discourse marker and the general linguistic context” (Sporleder & Lascarides, 2008)

SLIDE 4

Explicit vs. Implicit

  • Mary quit her job. The commute was too long.
  • Mary quit her job. The commute was too long, anyway.

SLIDE 5

Explicit vs. Implicit

  • “This notice must not be removed from the software, and in the event that the software is divided, it should be attached to every part.” (conditional relation)

  • “Some entrepreneurs say the red tape they most love to hate is red tape they would also hate to lose. They concede that much of the government meddling that torments them is essential to the public good.” (concession relation)

  • “Insisting that they are protected by the Voting Rights Act, a group of whites brought a federal suit in 1987 to demand that the city abandon at-large voting for the nine-member City Council.” (circumstance relation)

All examples taken from Taboada (2009)

SLIDE 6

Explicit vs. Implicit

  • Using the PDTB (2.0), we adopt its definition of explicit/implicit relations.

  • Alternative signaling that contributes to the expectancy of a discourse relation can be anything but an explicit discourse connective.

  • Language models deal with the expectancy, or likelihood, of an utterance by predicting its probability given its context.

  • Using language modelling, a classifier should be able to pick up on these signals.

  • Moreover, when holding back the connective for explicit relations, we expect implicit relations to be easier to classify than explicit relations.
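
To make "predicting its probability given its context" concrete, here is a minimal sketch of a toy bigram language model with add-one smoothing. It is not the model used in this work (the experiments use BERT); the function names and the three-sentence toy corpus are assumptions chosen purely for illustration.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count unigrams and bigrams over lower-cased, whitespace-split sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed log-probability of a sentence: each token is scored
    given the token that precedes it (its 'context' in this toy model)."""
    vocab_size = len(unigrams)
    tokens = ["<s>"] + sentence.lower().split()
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total

# Hypothetical toy corpus, loosely based on the "Mary quit her job" example.
toy_corpus = [
    "mary quit her job",
    "the commute was too long",
    "mary quit her job because the commute was too long",
]
unigrams, bigrams = train_bigram(toy_corpus)
expected = log_prob("mary quit her job", unigrams, bigrams)
unexpected = log_prob("job her quit mary", unigrams, bigrams)
```

A word order the model has seen receives a higher log-probability than a scrambled one; a contextualized model such as BERT captures far richer signals than adjacent word pairs.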

SLIDE 7

Classification Setup

  • BERT, state-of-the-art in language modelling, using contextualized vector representations for token sequences.

  • Training separate classifiers for all implicit and explicit relations in the PDTB, predicting the relation sense for both types.

  • Adopting the BERT MRPC (paraphrase detection) classifier.
  • Implicit classifier input:

Sense | #1 String | #2 String
8 | It's a horrible machine\, | I'm ashamed I own the stupid thing

  • Explicit classifier input:

Sense | #1 String | #2 String
10 | bringing the message is a crime | I'm guilty of it
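
As a rough illustration of the input format above, the following sketch writes relations to a three-column, tab-separated layout (sense id, first argument, second argument) and simply never writes the connective. The dictionary keys, the function names, the way sense ids are assigned and the example relations are all assumptions, not the preprocessing actually used here.

```python
import csv
import io

def build_label_ids(relations):
    """Map each sense string to an integer id (assumed scheme: sorted order)."""
    senses = sorted({rel["sense"] for rel in relations})
    return {sense: index for index, sense in enumerate(senses)}

def to_tsv(relations, label_ids):
    """Serialise relations as 'Sense<TAB>#1 String<TAB>#2 String' rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["Sense", "#1 String", "#2 String"])
    for rel in relations:
        # The connective, if present, is held back: only the two arguments
        # and the sense id are written out.
        writer.writerow([label_ids[rel["sense"]], rel["arg1"], rel["arg2"]])
    return buffer.getvalue()

# Hypothetical relations for illustration only.
example_relations = [
    {"sense": "Contingency.Cause.Reason", "arg1": "Mary quit her job",
     "arg2": "the commute was too long", "connective": "because"},
    {"sense": "Expansion.Restatement", "arg1": "It's a horrible machine",
     "arg2": "I'm ashamed I own the stupid thing", "connective": None},
]
example_tsv = to_tsv(example_relations, build_label_ids(example_relations))
```

Implicit and explicit relations end up in the same format, so the same classifier code can be trained on either set.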

SLIDE 8

Classification Results

  • The hypothesis that implicit relations are easier to classify is rejected:
  • Explicits f1-score: 47.44
  • Implicits f1-score: 46.08
  • Explicit: 16,894* instances
  • Implicit: 14,886* instances

* CoNLL-2016 Shared Task version of the data
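
For reference, a minimal sketch of how an f1-score can be computed from gold and predicted sense labels. The slides do not state which averaging scheme was used, so the macro average shown here is an assumption, and the function names are illustrative.

```python
from collections import Counter

def per_class_f1(gold, predicted):
    """Return {label: f1} computed from parallel lists of gold and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label was wrong
            fn[g] += 1  # gold label was missed
    scores = {}
    for label in set(gold) | set(predicted):
        predicted_total = tp[label] + fp[label]
        gold_total = tp[label] + fn[label]
        precision = tp[label] / predicted_total if predicted_total else 0.0
        recall = tp[label] / gold_total if gold_total else 0.0
        denominator = precision + recall
        scores[label] = 2 * precision * recall / denominator if denominator else 0.0
    return scores

def macro_f1(gold, predicted):
    """Unweighted mean of the per-class f1 scores (assumed averaging scheme)."""
    scores = per_class_f1(gold, predicted)
    return sum(scores.values()) / len(scores)
```

Calling `macro_f1(gold_senses, predicted_senses)` on two equally long label lists gives a value between 0 and 1; the scores on this slide are reported on a 0 to 100 scale.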

SLIDE 9

Classification Results

[Figure: number of instances per top-level sense (Comparison, Contingency, Expansion, Temporal), Implicit vs. Explicit]

[Figure: f1-score per top-level sense (Comparison, Contingency, Expansion, Temporal), Implicit vs. Explicit]

SLIDE 10

Classification Results

  • Distinction by continuous/discontinuous and causality, following Asr & Demberg (2012).

  • Continuity: same frame of reference, no shift in reference with regard to events or entities talked about (Segal et al., 1991)

  • Causality: X because Y
  • Continuous relations are implicit more often than discontinuous ones
  • Causal relations are implicit more often than non-causal relations

SLIDE 11

Classification Results

[Figure: number of instances per sense, grouped as causal and continuous, Implicit vs. Explicit]

SLIDE 12

Classification Results

[Figure: f1-score per sense (Contingency.Cause.Reason, Contingency.Cause.Result, Expansion.Instantiation, Expansion.Restatement), Implicit vs. Explicit]

SLIDE 13

Classification Results

[Figure: f-score and support (*0.0001) for continuous, discontinuous, causal and non-causal relations]

SLIDE 14

What about German?

  • The Potsdam Commentary Corpus (Stede & Neumann, 2014) contains a layer of connectives and their arguments (hence only explicits, rendering comparison to implicits impossible).

  • Work in progress (under review @LREC2020):
  • Senses for explicit relations in the PCC (conforming to the PDTB 3.0 hierarchy)
  • Explicit classifier performance: f1-score of 35.99 (compared to 47.44 for PDTB)
  • New implicit, AltLex, EntRel and NoRel relations

SLIDE 15

Potsdam Commentary Corpus 2.2

Relation type   PCC 2.2
AltLex              122
EntRel               56
Explicit          1,120
Implicit            887
NoRel                35
Total             2,220

SLIDE 16

Potsdam Commentary Corpus 2.2

SLIDE 17

Potsdam Commentary Corpus 2.2

  • Largely following PDTB 3.0 guidelines, but excluding intra-sentential implicit relations.

  • Future work:
  • Including intra-sentential implicit relations.
  • Investigating disagreement:
  • Senses for pre-existing explicit relations: Cohen's Kappa of 0.74
  • New relation types: Cohen's Kappa of 0.28
  • New relation senses: Cohen's Kappa of 0.30
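
For readers unfamiliar with the agreement measure reported above, this is a minimal sketch of Cohen's Kappa for two annotators: observed agreement corrected for the agreement expected by chance. The function name and the toy labels used in the usage note are illustrative and not taken from the PCC annotation.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's Kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Proportion of items on which the annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's own label distribution.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)
```

For example, with annotator A assigning `["Explicit", "Implicit", "Implicit", "EntRel"]` and annotator B assigning `["Explicit", "Implicit", "EntRel", "EntRel"]`, observed agreement is 0.75, chance agreement is 0.3125, and Kappa is 7/11 (about 0.64).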

SLIDE 18

Conclusions

  • No evidence that senses for implicits are easier to classify than senses for explicits*.

*without their connective

  • Separating the data by continuity and causality seems more informative.

  • Preliminary results for German only on explicits. More data is already annotated and will be released in the near future.

SLIDE 19

Future Work

  • Training specialised classifiers for (dis)continuous/(non-)causal instead of explicit/implicit.

  • Extracting the most informative signals for the classifier.
  • The current classifier (paraphrase detection) is meant for binary classification; experiment with different parameters for a multi-class setup.

  • The same experiments on German (PCC) can validate the findings for another language.

SLIDE 20

References

  • Asr, F. and Demberg, V. (2012). Implicitness of discourse relations. 24th International Conference on Computational Linguistics: Proceedings of COLING 2012, Mumbai, pp. 2669–2684.

  • Segal, E., Duchan, J., and Scott, P. (1991). The role of interclausal connectives in narrative structuring: Evidence from adults’ interpretations of simple stories. Discourse Processes, 14(1):27–54.

  • Sporleder, C. and Lascarides, A. (2008). Using automatically labelled examples to classify rhetorical relations: An assessment. Natural Language Engineering, 14:369–416.

  • Stede, M. and Neumann, A. (2014). Potsdam Commentary Corpus 2.0: Annotation for discourse research. Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14), European Language Resources Association (ELRA), Reykjavik, Iceland.

  • Taboada, M. (2009). Implicit and explicit coherence relations. Discourse, of Course.

SLIDE 21

Thank you! Questions?
