
SLIDE 1

Joint prediction in MST style discourse parsing for argumentation mining

Andreas Peldszus Manfred Stede

Applied Computational Linguistics, University of Potsdam

EMNLP 20.09.2015

Peldszus, Stede (Uni Potsdam) Joint prediction for argumentation mining EMNLP 2015 1 / 37

SLIDE 2

Outline

1 Argumentation Mining
2 Dataset & Scheme
3 Models
4 Results for the attachment task
5 Results for all tasks

SLIDE 4

What is argumentation mining?

Health insurance companies should naturally cover alternative medical treatments. Not all practices and approaches that are lumped together under this term may have been proven in clinical trials, yet it's precisely their positive effect when accompanying conventional 'western' medical therapies that's been demonstrated as beneficial. Besides many general practitioners offer such counselling and treatments in parallel anyway - and who would want to question their broad expertise?

SLIDE 7

What is argumentation mining?

[e1] Health insurance companies should naturally cover alternative medical treatments.
[e2] Not all practices and approaches that are lumped together under this term may have been proven in clinical trials,
[e3] yet it's precisely their positive effect when accompanying conventional 'western' medical therapies that's been demonstrated as beneficial.
[e4] Besides many general practitioners offer such counselling and treatments in parallel anyway -
[e5] and who would want to question their broad expertise?

[Figure: argumentation graph with nodes 1 to 5, one per segment]

Tasks:

EDU segmentation
ADU segmentation
resp. argumentative relevance
ADU type classification
Relation identification
Relation type classification


SLIDE 9

Dataset: argumentative microtexts

Texts:

112 texts: collected in a controlled text generation experiment
with professional parallel translation to English
annotated with argumentation structures
see [Peldszus and Stede, to appear]
freely available, CC-by-nc-sa license
https://github.com/peldszus/arg-microtexts

Properties:

about 5 segments long
each segment is argumentatively relevant
explicit main claim
at least one possible objection considered

SLIDE 10

Scheme

[Figure: argumentation graph for the example text, segments e1 to e5]

Freeman’s theory, revised & slightly generalized:
[Freeman, 1991, 2011] [Peldszus and Stede, 2013]

node types = argumentative role

proponent (presents and defends claims)
opponent (critically questions)

link types = argumentative function

support own claims (normally, by example)
attack other’s claims (rebut, undercut)

SLIDE 12

Preprocessing: Graph reduction

From complex structures ...

[Figure: full argumentation graph for the example text, segments e1 to e5]

... to simple support-attack graphs.

[Figure: reduced support-attack graph over the same five segments]

SLIDE 13

Tasks tackled in this paper:

[Figure: argumentation graph for the example text, segments e1 to e5]

attachment (at): 464 pairs yes, 2000 pairs no
central claim (cc): 112 yes, 451 no
role (ro): 451 proponent, 125 opponent
function (fu): 290 support, 174 attacks
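The attachment counts (464 yes plus 2000 no) are consistent with classifying every ordered pair of distinct segments within a text. The sketch below illustrates that reading for a single five-segment text. The function name `candidate_pairs` and the gold edges are hypothetical and chosen only for illustration.

```python
from itertools import permutations

def candidate_pairs(n_segments, gold_edges):
    """Return (source, target, attached) for every ordered pair of distinct segments."""
    gold = set(gold_edges)
    segments = range(1, n_segments + 1)
    return [(s, t, (s, t) in gold) for s, t in permutations(segments, 2)]

# Hypothetical gold structure: each non-root segment attaches to exactly one target.
gold_edges = [(2, 1), (3, 2), (4, 1), (5, 4)]
pairs = candidate_pairs(5, gold_edges)
n_yes = sum(attached for _, _, attached in pairs)
n_no = len(pairs) - n_yes
print(len(pairs), n_yes, n_no)  # 20 candidate pairs: 4 attached, 16 not attached
```

A text with n segments yields n(n-1) candidates but only n-1 positive pairs, which explains the class imbalance in the attachment task.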

SLIDE 17

Models: Overview

Baseline 1
Baseline 2
Simple classifier 1
Simple classifier 2
Evidence graph 1
Evidence graph 2
MSTparser 1
MSTparser 2
MSTparser 3
MSTparser 4
MSTparser 5


SLIDE 18

Models: Baseline 1

BL-first: attach to first

tendency to state the central claim first in English-speaking debating tradition
covers convergent argumentation, but not serial argumentation
in 50 of the 112 texts (44.6%) the first segment is the central claim
176 of the 464 relations (37.9%) attach to the first segment

SLIDE 20

Models: Baseline 2

BL-preced.: attach to preceding

strong baseline, usually hard to beat [Muller et al., 2012]
covers serial argumentation, but not convergent argumentation
210 of all 464 relations (45.3%) attach to the preceding segment
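The two baselines can be sketched in a few lines. This is a minimal illustration, assuming segments are numbered from 1 and an edge is a (source, target) pair. The helper names and the gold structure are hypothetical.

```python
def bl_first(n_segments):
    """BL-first: attach every later segment to the first segment."""
    return [(s, 1) for s in range(2, n_segments + 1)]

def bl_preceding(n_segments):
    """BL-preced.: attach every segment to the one directly before it."""
    return [(s, s - 1) for s in range(2, n_segments + 1)]

def attachment_hits(predicted, gold):
    """Count predicted edges that also occur in the gold structure."""
    return len(set(predicted) & set(gold))

# Hypothetical gold structure mixing convergent and serial argumentation.
gold = [(2, 1), (3, 2), (4, 1), (5, 4)]
hits_first = attachment_hits(bl_first(5), gold)
hits_preceding = attachment_hits(bl_preceding(5), gold)
print(hits_first, hits_preceding)  # 2 3
```

Each baseline always yields a tree, but neither one can recover a text that mixes both attachment patterns.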

SLIDE 26

Models: simple classifier

Simple:

pair-wise classification
log-loss linear model, SGD training

Procedure:

predict edge score
apply classification threshold

Might not be a tree!


SLIDE 27

Models: features

features from source and target segment:

lemma unigrams (*) and bigrams
first three lemmas
POS tags (*)
main verb mood & tense
dependency parse triples
discourse connectives & relations [Stede, 2002] (*)
position in text

(*) also extracted from the adjacent segments

features of the source-target pair:

linearity (forward, backward)
absolute distance
relative distance

SLIDE 31

Models: simple classifier

Simple:

pair-wise classification
log-loss linear model, SGD training

Procedure:

predict edge score
apply classification threshold

Not a tree for 85% of the texts! Too many edges.
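Why thresholding alone can fail is easy to see in a small sketch. The edge scores below are invented for illustration, and `threshold_edges` and `is_tree` are hypothetical helper names. A result counts as a tree here if exactly one segment has no outgoing edge and every other segment has exactly one outgoing edge whose path ends at that root.

```python
def threshold_edges(scores, threshold=0.5):
    """Keep every (source, target) pair whose predicted score reaches the threshold."""
    return [edge for edge, p in scores.items() if p >= threshold]

def is_tree(n_segments, edges):
    """Check that the edges form a tree directed towards a single root."""
    target_of = {}
    for source, target in edges:
        if source in target_of:
            return False  # a segment with two outgoing edges: too many edges
        target_of[source] = target
    segments = range(1, n_segments + 1)
    if sum(1 for v in segments if v not in target_of) != 1:
        return False  # no root or several roots
    for v in segments:
        seen = set()
        while v in target_of:
            if v in seen:
                return False  # cycle
            seen.add(v)
            v = target_of[v]
    return True

# Invented scores for a five-segment text.
scores = {(2, 1): 0.8, (3, 2): 0.7, (3, 1): 0.6, (4, 1): 0.9, (5, 4): 0.55}
edges = threshold_edges(scores)
print(is_tree(5, edges))  # False: segment 3 is attached twice
```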


SLIDE 36

Models: simple classifier + MST-decoding

Procedure:

predict edge score
apply minimum spanning tree algorithm

[Chu and Liu, 1965, Edmonds, 1967] [McDonald et al., 2005b] [Baldridge et al., 2007, Muller et al., 2012]

Always a tree!
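To make the decoding step concrete, the sketch below picks the best-scoring tree by exhaustive search, which is feasible for texts of about five segments. It is not the Chu-Liu/Edmonds algorithm cited above, only a brute-force stand-in that returns the same kind of result. It treats scores as probabilities and maximizes their sum, which corresponds to minimizing a cost such as the negated score. All names and scores are hypothetical.

```python
from itertools import product

def reaches_root(target_of, root):
    """True if following the edges from every segment ends at the root without cycles."""
    for v in target_of:
        seen = set()
        while v != root:
            if v in seen:
                return False
            seen.add(v)
            v = target_of[v]
    return True

def best_tree(n_segments, scores):
    """Try every root and every assignment of one target per non-root segment."""
    nodes = list(range(1, n_segments + 1))
    best = None
    for root in nodes:
        others = [v for v in nodes if v != root]
        choices = [[t for t in nodes if t != s] for s in others]
        for targets in product(*choices):
            edges = list(zip(others, targets))
            if not reaches_root(dict(edges), root):
                continue
            total = sum(scores.get(edge, 0.0) for edge in edges)
            if best is None or total > best[2]:
                best = (root, edges, total)
    return best

# Invented scores; pairs that are not listed count as 0.0.
scores = {(2, 1): 0.8, (3, 2): 0.7, (3, 1): 0.6, (4, 1): 0.9, (5, 4): 0.55}
root, edges, total = best_tree(5, scores)
print(root, sorted(edges))
```

Because only valid trees are ever considered, the output is a tree by construction, whatever the scores are.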


SLIDE 38

Attachment classification: results for German

            BL-first  BL-preced.  simple  simple+MST
F1 macro    .618      .662        .679    .688
attach F1   .380      .452        .504    .494
κ           .236      .325        .365    .377

Average scores of 10 repetitions of 5-fold CV.
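For readers unfamiliar with the reported metrics, the sketch below computes macro-averaged F1 and κ on a toy prediction. It assumes that κ denotes Cohen's kappa and that "attach F1" is the F1 score of the positive (attached) class. The labels are invented and unrelated to the reported results.

```python
def f1(gold, pred, label):
    """F1 score for one class label."""
    tp = sum(g == label and p == label for g, p in zip(gold, pred))
    fp = sum(g != label and p == label for g, p in zip(gold, pred))
    fn = sum(g == label and p != label for g, p in zip(gold, pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(gold) | set(pred))
    return sum(f1(gold, pred, label) for label in labels) / len(labels)

def cohen_kappa(gold, pred):
    """Observed agreement corrected for the agreement expected by chance."""
    n = len(gold)
    labels = set(gold) | set(pred)
    p_observed = sum(g == p for g, p in zip(gold, pred)) / n
    p_expected = sum((gold.count(l) / n) * (pred.count(l) / n) for l in labels)
    return (p_observed - p_expected) / (1 - p_expected)

# Toy data: 1 = attached, 0 = not attached.
gold = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
print(f1(gold, pred, 1), macro_f1(gold, pred), cohen_kappa(gold, pred))
```

On imbalanced data such as the attachment task, both macro F1 and κ penalize a classifier that simply predicts the majority class.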

SLIDE 39

Attachment classification: results for English

            BL-first  BL-preced.  simple  simple+MST
F1 macro    .618      .662        .663    .674
attach F1   .380      .452        .478    .470
κ           .236      .325        .333    .347

Average scores of 10 repetitions of 5-fold CV.

SLIDE 40

Why restrict to only one feature of the graph?

[Figure: argumentation graph for the example text, segments e1 to e5]

Argumentation aspects annotated in the graph:

attachment (at)
central claim (cc)
role (ro)
function (fu)


SLIDE 41

Models: Joint prediction in one evidence graph

Procedure:

predict attachment probability
predict role, function, cc probability
combine predictions in one score
apply MST algorithm
identify central claim
derive final role class for each segment
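The last two steps of the procedure can be sketched as follows. The sketch assumes that the central claim is the root of the decoded tree and belongs to the proponent, that a support edge preserves the role of its target, and that an attack edge switches it. The function names and the example tree are hypothetical.

```python
def find_central_claim(segments, edges):
    """The central claim is taken to be the only segment without an outgoing edge."""
    roots = [v for v in segments if v not in edges]
    assert len(roots) == 1, "a decoded tree has exactly one root"
    return roots[0]

def derive_roles(root, edges):
    """edges maps source -> (target, function), with function 'support' or 'attack'."""
    roles = {root: "proponent"}

    def role_of(v):
        if v not in roles:
            target, function = edges[v]
            target_role = role_of(target)
            if function == "support":
                roles[v] = target_role  # role preserved
            else:
                roles[v] = "opponent" if target_role == "proponent" else "proponent"
        return roles[v]

    for v in edges:
        role_of(v)
    return roles

# Hypothetical decoded tree with function labels.
edges = {2: (1, "attack"), 3: (2, "attack"), 4: (1, "support"), 5: (4, "support")}
central_claim = find_central_claim(range(1, 6), edges)
roles = derive_roles(central_claim, edges)
print(central_claim, roles)
```

In this example segment 3 attacks an opponent segment and therefore ends up with the proponent role again.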

SLIDE 49

Models: Joint prediction in one evidence graph

How to move segment-wise predictions into the edge weights?


SLIDE 50

Models: Joint prediction in one evidence graph

$$w_{i,j} = \frac{\phi_1 RO + \phi_2 FU + \phi_3 CC + \phi_4 AT}{\sum_n \phi_n}$$

SLIDE 51

Models: Joint prediction in one evidence graph

AT: probability of attachment from source to target
FU: probability of proper function per edge type
CC: probability of not being the central claim
RO: probability of preserved/switched role for sup/att edges
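As a rough illustration of how the four base-classifier probabilities could be merged into one edge weight, the sketch below computes a weighted average. It assumes that the weights are normalized by their sum. The function name `edge_weight` and the probabilities are invented for illustration.

```python
def edge_weight(ro, fu, cc, at, phi=(1.0, 1.0, 1.0, 1.0)):
    """Weighted average of the four probabilities for one candidate edge.

    ro, fu, cc, at: base-classifier probabilities for the edge
    phi: weights for (RO, FU, CC, AT); equal weights by default
    """
    weighted = phi[0] * ro + phi[1] * fu + phi[2] * cc + phi[3] * at
    return weighted / sum(phi)

# Invented probabilities for a single source-target pair.
w_equal = edge_weight(ro=0.6, fu=0.7, cc=0.9, at=0.8)
w_attachment_only = edge_weight(ro=0.6, fu=0.7, cc=0.9, at=0.8, phi=(0, 0, 0, 1))
print(w_equal, w_attachment_only)
```

With all weight on AT the score reduces to the plain attachment probability, so the attachment-only setting is a special case of the combined score.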

SLIDE 56

Models: Joint prediction in one evidence graph

EG-equal:

all base-classifiers weighted equally

EG-best:

optimize base-classifier weighting with a simple evolutionary search
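The slides do not specify the search procedure, so the following is only one possible reading of "simple evolutionary search": a mutate-and-keep-if-better loop over the four weights, starting from equal weighting. The fitness function is a toy stand-in. In practice it would be an evaluation score of the decoded structures on held-out data. All names and parameters are hypothetical.

```python
import random

def evolve_weights(fitness, n_weights=4, generations=200, sigma=0.1, seed=0):
    """Mutate the current best weighting and keep the mutant only if it scores higher."""
    rng = random.Random(seed)
    best = [1.0] * n_weights  # start from equal weighting
    best_fit = fitness(best)
    for _ in range(generations):
        candidate = [max(0.0, w + rng.gauss(0.0, sigma)) for w in best]
        if sum(candidate) == 0:
            continue  # weights cannot be normalized
        candidate_fit = fitness(candidate)
        if candidate_fit > best_fit:
            best, best_fit = candidate, candidate_fit
    return best, best_fit

def toy_fitness(phi):
    """Toy objective: negative squared distance of the normalized weights to a made-up target."""
    target = (0.1, 0.2, 0.3, 0.4)
    total = sum(phi)
    return -sum((w / total - t) ** 2 for w, t in zip(phi, target))

start_fit = toy_fitness([1.0, 1.0, 1.0, 1.0])
weights, final_fit = evolve_weights(toy_fitness)
print(final_fit >= start_fit)  # True: the loop never accepts a worse weighting
```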

SLIDE 59

Attachment classification: results for German

            BL-first  BL-preced.  simple  simple+MST  EG equal  EG best
F1 macro    .618      .662        .679    .688        .712      .710
attach F1   .380      .452        .504    .494        .533      .530
κ           .236      .325        .365    .377        .424      .421

SLIDE 60

Attachment classification: results for English

            BL-first  BL-preced.  simple  simple+MST  EG equal  EG best
F1 macro    .618      .662        .663    .674        .692      .693
attach F1   .380      .452        .478    .470        .501      .502
κ           .236      .325        .333    .347        .384      .386

SLIDE 61

Models: Comparison with MSTParser pipelines

Background:

1-best MIRA structured learning for non-projective dependency parsing [McDonald et al., 2005a] [Baldridge et al., 2007]
same feature sets

Procedure:

predict edge score
apply MST decoding
apply internal or external relation labeler
derive final role class for each segment

SLIDE 67

Models: Comparison with MSTParser pipelines

MP:

normal features
internal relation labeler

MP+p:

normal features plus base classifier predictions
internal relation labeler

MP+p+r:

normal features plus base classifier predictions
external relation labeler

SLIDE 71

Attachment classification: results for German

            BL-first  BL-preced.  simple  simple+MST  EG equal  EG best  MP    MP+p
F1 macro    .618      .662        .679    .688        .712      .710     .724  .728
attach F1   .380      .452        .504    .494        .533      .530     .553  .559
κ           .236      .325        .365    .377        .424      .421     .449  .456

SLIDE 72

Attachment classification: results for English

            BL-first  BL-preced.  simple  simple+MST  EG equal  EG best  MP    MP+p
F1 macro    .618      .662        .663    .674        .692      .693     .707  .720
attach F1   .380      .452        .478    .470        .501      .502     .524  .546
κ           .236      .325        .333    .347        .384      .386     .414  .440

SLIDE 79

Results for all levels: German

            simple  EG equal  EG best  MP    MP+p  MP+p+r
at   maF1   .679    .712      .710     .724  .728  .728
     κ      .365    .424      .421     .449  .456  .456
cc   maF1   .849    .879      .890     .825  .855  .855
     κ      .698    .759      .780     .650  .710  .710
ro   maF1   .755    .737      .734     .464  .477  .669
     κ      .511    .477      .472     .014  .022  .340
fu   maF1   .703    .735      .736     .499  .527  .723
     κ      .528    .573      .570     .293  .326  .557
avg  maF1   .747    .766      .768     .628  .647  .744
     κ      .526    .558      .561     .352  .379  .516

SLIDE 80

Results for all levels: English

            simple  EG equal  EG best  MP    MP+p  MP+p+r
at   maF1   .663    .692      .693     .707  .720  .720
     κ      .333    .384      .386     .414  .440  .440
cc   maF1   .817    .860      .869     .780  .831  .831
     κ      .634    .720      .737     .559  .661  .661
ro   maF1   .750    .721      .720     .482  .475  .638
     κ      .502    .445      .442     .024  .015  .280
fu   maF1   .671    .707      .710     .489  .514  .681
     κ      .475    .529      .530     .254  .296  .491
avg  maF1   .725    .745      .748     .615  .635  .718
     κ      .486    .520      .524     .313  .353  .468

SLIDE 82

Impact of joint prediction

[Figure: four panels with curves for ro, fu, cc and at; x-axis from 0% to 100%, y-axis from 0.3 to 1.0]

Simulations of better base classifiers for English (dashed levels artificially improved):

x: number of predictions overwritten with ground truth
y: average κ score in 10 iterations of 5-fold CV

SLIDE 84

Contributions:

First data-driven model optimizing argumentation structure globally.
First model for argumentation mining jointly tackling segment type classification, relation identification and relation type classification.

That’s it!

Check out the corpus:

https://github.com/peldszus/arg-microtexts

Check out some evaluation scripts, parameters and (soon) predictions:

https://github.com/peldszus/emnlp2015

SLIDE 85

References

Jason Baldridge, Nicholas Asher, and Julie Hunter. Annotation for and robust parsing of discourse structure on unrestricted texts. Zeitschrift für Sprachwissenschaft, 26:213–239, 2007.

Y. J. Chu and T. H. Liu. On the shortest arborescence of a directed graph. Science Sinica, 14:1396–1400, 1965.

Jack Edmonds. Optimum Branchings. Journal of Research of the National Bureau of Standards, 71B:233–240, 1967.

James B. Freeman. Dialectics and the Macrostructure of Argument. Foris, Berlin, 1991.

James B. Freeman. Argument Structure: Representation and Theory. Argumentation Library (18). Springer, 2011.

Ryan McDonald, Koby Crammer, and Fernando Pereira. Online large-margin training of dependency parsers. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL ’05), pages 91–98, Ann Arbor, Michigan, June 2005a. Association for Computational Linguistics.

Ryan McDonald, Fernando Pereira, Kiril Ribarov, and Jan Hajic. Non-projective dependency parsing using spanning tree algorithms. In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, pages 523–530, Vancouver, British Columbia, Canada, October 2005b. Association for Computational Linguistics.

Philippe Muller, Stergos Afantenos, Pascal Denis, and Nicholas Asher. Constrained decoding for text-level discourse parsing. In Proceedings of COLING 2012, pages 1883–1900, Mumbai, India, December 2012. The COLING 2012 Organizing Committee. URL http://www.aclweb.org/anthology/C12-1115.

Andreas Peldszus and Manfred Stede. From argument diagrams to automatic argument mining: A survey. International Journal of Cognitive Informatics and Natural Intelligence (IJCINI), 7(1):1–31, 2013.

Andreas Peldszus and Manfred Stede. An annotated corpus of argumentative microtexts. In Proceedings of the First European Conference on Argumentation: Argumentation and Reasoned Action, Lisbon, Portugal, June 2015, to appear.

Manfred Stede. DiMLex: A Lexical Approach to Discourse Markers. In Vittorio Di Tomaso and Alessandro Lenci, editors, Exploring the Lexicon - Theory and Computation. Edizioni dell’Orso, Alessandria, Italy, 2002.