Mining Rhetorical Devices by means of Natural Language Processing. Viorel Morari (viorel.morari@uni-weimar.de), Chair of Web Technology and Information Systems (Prof. Dr. Benno Stein). Master Thesis Defense, January 23rd, 2018. Advisor: Khalid Al-Khatib.


slide-1
SLIDE 1

Mining Rhetorical Devices by means of Natural Language Processing

Chair of Web Technology and Information Systems

  • Prof. Dr. Benno Stein

Viorel Morari (viorel.morari@uni-weimar.de)
Master Thesis Defense, January 23rd, 2018
Advisor: Khalid Al-Khatib

slide-2
SLIDE 2

What is Rhetoric?

Bob

2

slide-3
SLIDE 3

What is Rhetoric?

PERSUASION

Bob

3

slide-5
SLIDE 5

What is a Rhetorical Device?

PERSUASION

  • …40% whiter than the other leading brands
  • This is a must-see movie
  • …the biggest, the brightest and the best.
  • If you buy it now…

5

slide-6
SLIDE 6

Classification

Rhetoric Invention Arrangement Style Memory Delivery

6

slide-7
SLIDE 7

Classification

Rhetoric Invention Arrangement Style Memory Delivery

Happiness, Sad, Drink, Bottle, Feelings

Open happiness. Grab a bottle. Don’t be sad!

Bob

7

slide-8
SLIDE 8

Classification

Rhetoric Invention Arrangement Style Memory Delivery

Don’t be sad! Grab a bottle. Open happiness.

Open happiness. Grab a bottle. Don’t be sad!

Bob

8

slide-9
SLIDE 9

Classification

Rhetoric Invention Arrangement Style Memory Delivery

Don’t be sad! Grab a bottle. Open happiness.

Feeling down? Open a bottle, open happiness!

Rhetorical question, Repetition, Balance

Bob

9

slide-10
SLIDE 10

Classification

Rhetoric Invention Arrangement Style Memory Delivery

Feeling down? Open a bottle, open happiness! Feeling down? Open a bottle, open happiness! Feeling down? Open a bottle, open happiness!

Bob

10

slide-11
SLIDE 11

Classification

Rhetoric Invention Arrangement Style Memory Delivery

Feeling down? Open a bottle, open happiness!

Bob

11

slide-13
SLIDE 13

Classification

Rhetoric: Invention, Arrangement, Style, Memory, Delivery

Style: Rhetorical Devices

  • Schemes (syntax)
  • Tropes (semantic)
  • Figures of thought

13

slide-15
SLIDE 15

Envisioned Applications

Rhetoric-based NLG system

15

slide-16
SLIDE 16

Envisioned Applications

Rhetorical style suggestion system

16

slide-20
SLIDE 20

Research Questions

Genres Topics

1 2 3

Authors

20

slide-21
SLIDE 21

Detection of Rhetorical Devices

1

21

slide-22
SLIDE 22

Pipeline – UIMA

input

22

slide-25
SLIDE 25

Pipeline – UIMA Ruta

  • UIMA Rule-based Text Annotation (Ruta): an intuitive and flexible domain-specific language for defining patterns of annotations (Klügl et al. [2016]).
  • Example: … . This is a sample sentence. …

slide-26
SLIDE 26

Pipeline – Stanford CoreNLP

  • Stanford CoreNLP: a suite of tools for linguistic analysis.
  • We use:
      ▪ Stanford Parser
      ▪ Stanford Dependencies

26

slide-29
SLIDE 29

Pipeline – UIMA

Output

But I must explain to you how all this mistaken idea of denouncing pleasure and praising pain was born and I will give you a complete account of the system, and expound the actual teachings of the great explorer of the truth, the master-builder of human happiness. No one rejects, dislikes, or avoids pleasure itself, because it is pleasure.

Device A, Device B, Device C

29

slide-33
SLIDE 33

Rhetorical Devices

Balance schemes

  • Interplay between equivalent ideas
  • Control the rhythm of thought

Omission schemes

  • Deliberate omission of intuitive words
  • Cause incompleteness

Repetition schemes

  • Repetition of key words/ideas
  • Used for emphasis or amplification
  • Key to persuasion (according to Aristotle)

Custom schemes

  • Informal rhetorical devices
  • Strong emotional effect
  • Includes causality, comparatives and voice

33

slide-34
SLIDE 34

Rhetorical Devices

Balance schemes

  • Enumeration
  • Pysma
  • Isocolon (bicolon, tricolon, tetracolon)

Omission schemes

  • Asyndeton
  • Hypozeugma
  • Epizeugma

Repetition schemes

  • Epanalepsis
  • Mesarchia
  • Epiphoza
  • Mesodiplosis
  • Anadiplosis
  • Diacope
  • Epizeuxis
  • Polysyndeton

Custom schemes

  • If-conditional 0
  • If-conditional 1
  • If-conditional 2
  • If-conditional 3
  • If-counterfactual
  • Unless-cond.
  • Whether-cond.
  • Comparative Adjectives/Adverbs
  • Superlative Adjectives/Adverbs

34

slide-36
SLIDE 36

Balance: Enumeration

Old farmer had a pig, a dog, a cow and a horse.

UIMA Ruta

Old farmer had a pig, a dog, a cow and a horse.

36

Enumeration - a rhetorical device used to list a series of details, words or phrases. (literarydevices.net)
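A minimal sketch of how such a pattern could be matched, written in Python rather than UIMA Ruta. The regular expression and the name `find_enumeration` are assumptions made for illustration; the rule used in the thesis is not shown on the slide.

```python
import re

# Hypothetical pattern (not the thesis's Ruta rule): an item is an optional
# article plus one word; an enumeration is two or more comma-separated items
# followed by "and"/"or" and one final item.
ITEM = r"(?:(?:a|an|the)\s+)?\w+"
ENUMERATION = re.compile(
    rf"{ITEM}(?:\s*,\s*{ITEM})+\s*,?\s+(?:and|or)\s+{ITEM}",
    re.IGNORECASE,
)


def find_enumeration(sentence: str) -> list[str]:
    """Return every span of the sentence that looks like an enumeration."""
    return [match.group(0) for match in ENUMERATION.finditer(sentence)]


if __name__ == "__main__":
    print(find_enumeration("Old farmer had a pig, a dog, a cow and a horse."))
```

A surface pattern like this only sees one-word items; multi-word phrases would need the token and part-of-speech annotations that a pipeline such as the one described here provides.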

slide-49
SLIDE 49

Omission: Hypozeugma

A rooster, a prince and a lion walk into a bar…

Hypozeugma - placing last, in a construction containing several words or phrases of equal value, the word or words on which all of them depend. (Silva Rhetoricae)

49

slide-52
SLIDE 52

Omission: Hypozeugma

A rooster, a prince and a lion walk into a bar…

Hypozeugma - placing last, in a construction containing several words or phrases of equal value, the word or words on which all of them depend. (Silva Rhetoricae)

Stanford Dependencies: A rooster, a prince and a lion walk into a bar…

governor-dependent relation
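A rough Python sketch of the governor-dependent idea, assuming the parser output is already available as (relation, governor, dependent) triples over token indices. The `Dep` type, the function name and the hand-written triples are hypothetical; `nsubj` and `conj` are Stanford Dependencies relation names, but the parse shown for this sentence is assumed, not taken from the slides.

```python
from typing import NamedTuple


class Dep(NamedTuple):
    relation: str   # dependency label, e.g. "nsubj" or "conj"
    governor: int   # token index of the head word
    dependent: int  # token index of the dependent word


def is_hypozeugma(deps: list[Dep]) -> bool:
    """True if a governing word comes after a subject and all its conjuncts."""
    for subj in deps:
        if subj.relation != "nsubj":
            continue
        # Words coordinated with the subject ("a prince", "a lion").
        conjuncts = [
            d.dependent
            for d in deps
            if d.relation == "conj" and d.governor == subj.dependent
        ]
        members = [subj.dependent] + conjuncts
        # The shared governor must be placed last, after every member.
        if conjuncts and subj.governor > max(members):
            return True
    return False


# Assumed parse. Tokens: A(0) rooster(1) ,(2) a(3) prince(4) and(5)
# a(6) lion(7) walk(8) into(9) a(10) bar(11)
EXAMPLE = [Dep("nsubj", 8, 1), Dep("conj", 1, 4), Dep("conj", 1, 7)]

if __name__ == "__main__":
    print(is_hypozeugma(EXAMPLE))
```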

52

slide-53
SLIDE 53

Omission: Hypozeugma

A rooster, a prince and a lion walk into a bar…

Hypozeugma - placing last, in a construction containing several words or phrases of equal value, the word or words on which all of them depend. (Silva Rhetoricae)

Stanford Dependencies: A rooster, a prince and a lion walk into a bar…

UIMA Ruta: rooster, a prince and a lion walk

53

slide-60
SLIDE 60

Repetition: Epanalepsis

Our eyes saw it, but we could not believe our eyes.

Epanalepsis - repeats the beginning word of a sentence at the end.

60

slide-62
SLIDE 62

Repetition: Epanalepsis

Our eyes saw it, but we could not believe our eyes.

Epanalepsis - repeats the beginning word of a sentence at the end.

Our eyes saw … believe our eyes.
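The definition above can be sketched in a few lines of Python. This is only an illustration under simple assumptions (lower-cased ASCII word tokens, a phrase of at most three words); the function name `epanalepsis_span` and the `max_len` parameter are hypothetical and not part of the thesis's rules.

```python
import re


def epanalepsis_span(sentence: str, max_len: int = 3) -> list[str]:
    """Return the longest opening phrase repeated at the end, or []."""
    words = re.findall(r"[a-z']+", sentence.lower())
    # Try longer phrases first; limiting n to half the sentence keeps the
    # opening and closing occurrences from overlapping.
    for n in range(min(max_len, len(words) // 2), 0, -1):
        if words[:n] == words[-n:]:
            return words[:n]
    return []


if __name__ == "__main__":
    print(epanalepsis_span("Our eyes saw it, but we could not believe our eyes."))
```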

62

slide-69
SLIDE 69

Custom: If-conditional 2

If I were president, I would cut taxes.

If-conditional 2 - expresses consequences that are totally unrealistic or will not likely happen in the future.

69

slide-71
SLIDE 71

Custom: If-conditional 2

If I were president, I would cut taxes.

If-conditional 2 - expresses consequences that are totally unrealistic or will not likely happen in the future.

If I were president, I would cut taxes.

Stanford Dependencies

governor-dependent relations

71

slide-73
SLIDE 73

Custom: If-conditional 2

If I were president, I would cut taxes.

If-conditional 2 - expresses consequences that are totally unrealistic or will not likely happen in the future.

If I were president, I would cut taxes.

Stanford Dependencies

P-clause Q-clause
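A hedged Python sketch of splitting such a sentence into its P-clause and Q-clause. The thesis relies on Stanford Dependencies and Ruta rules that are not shown here, so the surface cues below (an "if" clause with a past form, a "would" clause without "have") and the names `P_CLAUSE`, `Q_CLAUSE` and `split_second_conditional` are assumptions for illustration only.

```python
import re
from typing import Optional

# Assumed cue for the P-clause: starts with "if" and contains a past form
# ("were", "was", "had" or a word ending in "-ed") before the comma.
P_CLAUSE = re.compile(r"\bif\b[^,]*\b(?:were|was|had|\w+ed)\b[^,]*", re.IGNORECASE)
# Assumed cue for the Q-clause: subject + "would" + a verb other than "have"
# ("would have ..." would point to a different conditional type).
Q_CLAUSE = re.compile(r"\b\w+\s+would\s+(?!have\b)\w+[^,.]*", re.IGNORECASE)


def split_second_conditional(sentence: str) -> Optional[tuple[str, str]]:
    """Return (p_clause, q_clause) if both cues are present, else None."""
    p_match = P_CLAUSE.search(sentence)
    q_match = Q_CLAUSE.search(sentence)
    if p_match is None or q_match is None:
        return None
    return p_match.group(0).strip(), q_match.group(0).strip()


if __name__ == "__main__":
    print(split_second_conditional("If I were president, I would cut taxes."))
```

Because the two clauses are searched independently, the sketch also accepts the reversed order ("I would cut taxes if I were president").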

73

slide-74
SLIDE 74

Custom: If-conditional 2

If I were president, I would cut taxes.

If-conditional 2 - expresses consequences that are totally unrealistic or will not likely happen in the future.

Stanford Dependencies: If I were president, I would cut taxes.

UIMA Ruta: If I were president / I would cut

74

slide-81
SLIDE 81

Evaluation dataset

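Rhetorical Devices: Bible, Literature, Political speeches, Commercials, Expert websites

Evaluation measures

$$\text{Precision} = \frac{tp}{tp + fp} \qquad \text{Recall} = \frac{tp}{tp + fn} \qquad F_1\text{-score} = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$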
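Precision, recall and F1-score can be computed from true-positive (tp), false-positive (fp) and false-negative (fn) counts as in the short Python sketch below. The function names are illustrative, and returning 0.0 for an empty denominator is a convention assumed here, not something stated in the slides.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute (precision, recall, F1) from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_score(precision, recall)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(precision_recall_f1(tp=3, fp=1, fn=1))
    # Precision 0.25 and recall 0.95 are the values reported for Asyndeton.
    print(round(f1_score(0.25, 0.95), 2))
```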

81

slide-84
SLIDE 84

Evaluation Results

Omission schemes (F1-score)

  • Asyndeton: 0.40 (precision 0.25, recall 0.95)
  • Epizeugma: 0.69
  • Hypozeugma: 0.69

Balance schemes (F1-score)

  • Enumeration: 0.84
  • Isocolon: 0.68
  • Pysma: 1.00

Old farmer had a pig, a dog, a cow and a horse. (Asyndeton, Enumeration)

84

slide-87
SLIDE 87

Evaluation Results F1-Score

Repetition schemes

  • Anadiplosis: 0.74
  • Diacope: 0.74
  • Epanalepsis: 0.72
  • Epiphoza: 0.74
  • Epizeuxis: 0.78
  • Mesarchia: 0.59
  • Mesodiplosis: 0.40
  • Polysyndeton: 0.73

Omission schemes

  • Asyndeton: 0.40
  • Epizeugma: 0.69
  • Hypozeugma: 0.69

Custom schemes

  • If-conditional 0: 0.73
  • If-conditional 1: 0.78
  • If-conditional 2: 0.78
  • If-conditional 3: 0.74
  • If-counterfactual: 0.85
  • Passive Voice: 0.87
  • Comparative Adjective: 0.56
  • Comparative Adverb: 0.61
  • Superlative Adjective: 0.67
  • Superlative Adverb: 0.56
  • Unless-conditional: 1.00
  • Whether-conditional: 0.91

Balance schemes

  • Enumeration: 0.84
  • Isocolon: 0.68
  • Pysma: 1.00

87

slide-90
SLIDE 90

Analysis of Rhetorical Devices

2

90

slide-92
SLIDE 92

Pipeline

Output

  • Frequency
  • Combinations
  • Group distribution

92

slide-93
SLIDE 93

Pipeline

Output

Confidence interval: n * precision(rd), n / recall(rd)

Al-Khatib et al. [2017]

Table columns: Device, Precision, Recall
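One reading of this slide, offered as an assumption rather than as the thesis's definition, is that for a device rd detected n times the two expressions bound the true number of occurrences: n * precision(rd) counts the detections expected to be correct, and n / recall(rd) is the total that would exist if every detection were correct. A small Python sketch of that reading, with a hypothetical function name and made-up inputs:

```python
def count_interval(n_detected: int, precision: float, recall: float) -> tuple[float, float]:
    """Return (n * precision, n / recall) for a device detected n times."""
    if not 0.0 <= precision <= 1.0:
        raise ValueError("precision must lie in [0, 1]")
    if not 0.0 < recall <= 1.0:
        raise ValueError("recall must lie in (0, 1]")
    lower = n_detected * precision  # detections expected to be correct
    upper = n_detected / recall     # total if all detections were correct
    return lower, upper


if __name__ == "__main__":
    # Hypothetical numbers, not taken from the evaluation.
    print(count_interval(100, precision=0.5, recall=0.8))
```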

93

slide-94
SLIDE 94

Pipeline

Output

  • Significance Test: A, B
  • Effect-size Test: A, B

slide-95
SLIDE 95

Pipeline

Output

Type Topic

95

slide-97
SLIDE 97

Analysis Experiments

Data Preparation

3

97

slide-98
SLIDE 98

Experiments: datasets

The New York Times US Presidential Debates 2016

Ben Wiseman [2016]

98

slide-99
SLIDE 99

Data dimensionality

  • Language: English
  • Mode: Written
  • Communication: Monological
  • Medium: Newspaper
  • Type: Descriptive, Argumentative
  • Genre: Editorial, Review, Biography, Debate
  • Topic: Education, Science, Art, Politics
  • Author Identity
  • Audience

U.S. Presidential Debates

99

slide-100
SLIDE 100

NYT Experiment: data subsampling

NYT Dataset

  • Random: 1000 articles
  • Article-length based: 600 articles
  • Matching: 343 articles

100

slide-102
SLIDE 102

NYT Experiment: Findings

Rhetoric Detection System applied to the “Random” dataset and the “Article-length based” dataset:

  • Articles cover multiple dimensions
  • Hard to deduce particular styles

CONFOUNDING

102

slide-104
SLIDE 104

NYT Experiment: Confounding

Author (independent variable)

Genre (confounding variable)

Rhetorical style (dependent variable)

MATCHING

104

slide-108
SLIDE 108

NYT Experiment: Matching

Genre 1, Genre 2, Genre 3, Genre 4

Articles written by author X on topic Y across genres
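The matching step described here (fix the author and the topic, then compare across genres) could be sketched in Python as follows. The record fields, the function name `match_across_genres` and the `min_genres` threshold are assumptions for illustration; the slides do not show how the 343 matched articles were actually selected.

```python
from collections import defaultdict


def match_across_genres(articles: list[dict], min_genres: int = 2) -> dict:
    """Group articles by (author, topic); keep groups spanning several genres."""
    groups = defaultdict(list)
    for article in articles:
        groups[(article["author"], article["topic"])].append(article)
    # Within a kept group author and topic are constant, so differences in
    # rhetorical style can be attributed to genre.
    return {
        key: members
        for key, members in groups.items()
        if len({member["genre"] for member in members}) >= min_genres
    }


if __name__ == "__main__":
    # Hypothetical records, not taken from the NYT dataset.
    sample = [
        {"author": "X", "topic": "Y", "genre": "Editorial"},
        {"author": "X", "topic": "Y", "genre": "Review"},
        {"author": "Z", "topic": "Y", "genre": "Review"},
    ]
    print(list(match_across_genres(sample)))
```

The same grouping works for the other two comparisons by swapping which field is left free, for example grouping by (genre, topic) to compare authors.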

slide-113
SLIDE 113

NYT Experiment: Matching

Genre 1 Genre 2 Author 1 Author 2

113

Topic 1 Topic 2

slide-114
SLIDE 114

Analysis Experiments

Findings

3

114

slide-115
SLIDE 115

NYT Experiment: Frequency

Bar charts of device frequency per genre (value axis 0.00 to 35.00). Devices, in axis order: Anadiplosis, Asyndeton, comparativeAdj, comparativeAdv, Diacope, Epanalepsis, Epiphoza, Epizeugma, Epizeuxis, Hypozeugma, IfCondOne, IfCondThree, IfCondTwo, IfCondZero, IfCounterfactual, Isocolon, Mesarchia, Mesodiplosis, PassiveVoice, Polysyndeton, Pysma, superlativeAdj, superlativeAdv, TripleEnumeration, Unless, Whether

Genres: Review distribution (values as extracted): 6.28, 17.92, 6.75, 3.26, 4.34, 6.05, 8.53, 6.05, 0.62, 13.65, 0.85, 0.08, 0.39, 0.93, 0.47, 5.04, 4.03, 28.08, 15.52, 0.78, 4.27, 2.02, 10.78, 0.23, 0.47

Genres: Editorial distribution (values as extracted): 5.19, 16.99, 10.28, 4.98, 3.79, 2.71, 10.28, 7.79, 0.43, 15.48, 1.95, 0.22, 0.76, 1.62, 0.76, 6.17, 3.57, 24.35, 14.61, 0.76, 0.43, 6.28, 1.62, 10.71, 0.32, 0.32

slide-116
SLIDE 116

NYT Experiment: Findings

Style-based frequency of rhetorical devices

116

slide-117
SLIDE 117

NYT Experiment: Findings

Authors

117

slide-118
SLIDE 118

NYT Experiment: Findings

! Same pattern across all articles

Authors

118

slide-119
SLIDE 119

NYT Experiment: Findings

Authors

Good Job, Lewis!

119

slide-120
SLIDE 120

NYT Experiment: Findings

Authors

120

slide-121
SLIDE 121

NYT Experiment: Findings

Genres

Comparatives

121

author topic

slide-122
SLIDE 122

NYT Experiment: Findings

Genres

122

slide-123
SLIDE 123

NYT Experiment: Findings

Genres: tests’ results

123

slide-124
SLIDE 124

NYT Experiment: Findings

Topics

124

author genre

slide-125
SLIDE 125

NYT Experiment: Findings

Style-based frequency of rhetorical devices Characteristic style patterns within each dimension

125

slide-126
SLIDE 126

NYT Experiment: Findings

Topics: tests’ results

126

slide-127
SLIDE 127

NYT Experiment: Findings

Style is more author- and genre-dependent

127

Style-based frequency of rhetorical devices Characteristic style patterns within each dimension

slide-128
SLIDE 128

Presidential Debates: Datasets

CLINTON TRUMP REST

VS.

CLINTON

VS.

CLINTON TRUMP

VS.

REST

VS.

TRUMP

128

slide-129
SLIDE 129

Presidential Debates: Findings

129

slide-130
SLIDE 130

Presidential Debates: Findings

Asyndeton = clarity and rhythm

130

slide-131
SLIDE 131

Presidential Debates: Findings

Acceptance Speech Analysis by Huffington Post

131

slide-135
SLIDE 135

Presidential Debates: Findings

Trump doesn’t change his style

Significance Test

135

slide-139
SLIDE 139

Summary

Conclusions

  • Vague style patterns across random and article-length based subsampling: Confounding
  • Better style identification with Matching
  • Rhetorical style depends more on the author and genre of writings than on their topics
  • Debates: candidates employ different styles
  • Debates: domain experience trains an adaptive rhetorical style

Resources

  • Novel framework for detecting rhetorical devices
  • System for rhetorical style identification in high-quality text documents
  • Rule-based algorithms for detection of RD
  • Comprehensive dataset for evaluation of rhetoric detection systems
  • Elaborative style patterns and intriguing findings

Efficiency

  • Initialization → 1.7 sec.
  • 1st sentence → 5.8 sec.
  • 2nd sentence → 0.4 sec.

Future Work

  • Larger dataset for analysis
  • Focus on semantic rhetoric
  • Analysis measures such as placement and flow of rhetorical devices

slide-140
SLIDE 140

Thank you!

140

slide-141
SLIDE 141

References

  • Ben Wiseman, New York Times. https://www.nytimes.com/2016/09/25/opinion/campaign-stops/my-debate-nightmare-a-duller-donald-trump.html
  • Peter Kluegl and Martin Atzmueller. Textmarker: A tool for rule-based information extraction, 2009.
  • Khalid Al-Khatib, Henning Wachsmuth, Matthias Hagen, and Benno Stein. Patterns of Argumentation Strategies across Topics. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 17), pages 1362–1368. Association for Computational Linguistics, September 2017. URL http://aclweb.org/anthology/D17-1142.
  • https://literarydevices.net/enumeration/
  • G. Burton. The forest of rhetoric (silva rhetoricae), 2007.

141

slide-142
SLIDE 142

References - Icons and Images

142

  • advertise by David from the Noun Project
  • buy by Arthur Shlain from the Noun Project
  • Money by Desbenoit from the Noun Project
  • Idea by MRFA from the Noun Project
  • arrange by Gregor Cresnar from the Noun Project
  • font style by iconsmind.com from the Noun Project
  • memories by Henning Gross from the Noun Project
  • what by Paffi from the Noun Project
  • Translation by Mun May Tee from the Noun Project
  • analysis by Chameleon Design from the Noun Project
  • like by Bluetip Design from the Noun Project
  • analysis by Chameleon Design from the Noun Project
  • Folder by AlfredoCreates.com/Icons from the Noun Project
  • different by AlfredoCreates.com/Icons from the Noun Project
  • Flag by Hare Krishna from the Noun Project
  • Map Marker by shashank singh from the Noun Project
  • Icon by Llisole from the Noun Project
  • Icons made by Freepik on flaticon.com
  • Icons made by Becris on flaticon.com
  • Icons made by Vectors Market on flaticon.com
  • Jar by S. Salinas from the Noun Project
  • Check mark designed by Freepik
  • Icons made by Smashicons on flaticon.com
  • http://community.wikia.com/wiki/File:Aristotle-17.jpg
  • https://www.washingtonpost.com/graphics/politics/2016-election/presidential-debate-schedule/

slide-143
SLIDE 143

Existing research

  • Gawryjołek et al. [2009] – authorship identification system based on rhetorical style.
  • Strommer [2011] – authorial intent detection system based on anaphora usage.
  • Java [2015] – machine-learning-based authorship identification system using rhetorical devices (based on Gawryjołek et al. [2009]).

slide-144
SLIDE 144

Evaluation results

Device                  Total No.  Precision  Recall  F1-score
Anadiplosis             60         0.76       0.73    0.74
Asyndeton               60         0.25       0.95    0.40
Comparative Adjective   67         0.51       0.61    0.56
Comparative Adverb      71         0.60       0.62    0.61
Diacope                 60         0.75       0.73    0.74
Enumeration             60         0.76       0.93    0.84
Epanalepsis             60         0.63       0.83    0.72
Epiphoza                60         0.61       0.93    0.74
Epizeugma               60         0.68       0.70    0.69
Epizeuxis               60         0.79       0.77    0.78
Hypozeugma              60         0.61       0.80    0.69
If Conditional One      60         0.78       0.78    0.78
If Conditional Three    60         0.86       0.65    0.74
If Conditional Two      60         0.82       0.75    0.78
If Conditional Zero     60         0.71       0.76    0.73
If Counterfactual       60         0.84       0.87    0.85
Isocolon                180        0.57       0.83    0.68
Mesarchia               20         0.45       0.85    0.59
Mesodiplosis            40         0.28       0.68    0.40
Passive Voice           60         0.79       0.98    0.87
Polysyndeton            60         0.77       0.70    0.73
Pysma                   60         1.00       1.00    1.00
Superlative Adjective   70         0.62       0.73    0.67
Superlative Adverb      70         0.63       0.50    0.56
Unless Conditional      60         1.00       1.00    1.00
Whether Conditional     60         1.00       0.83    0.91

  • Balance schemes
  • Omission schemes
  • Repetition schemes
  • Custom schemes
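
As a reading aid for the F1-score column, the short sketch below recomputes F1 as the harmonic mean of precision and recall for a few rows copied from the evaluation results. The function name `f1_score` and the choice of rows are illustrative assumptions, not part of the thesis code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# A few (precision, recall) pairs copied from the evaluation results.
rows = {
    "Asyndeton": (0.25, 0.95),
    "Passive Voice": (0.79, 0.98),
    "Whether Conditional": (1.0, 0.83),
}

for device, (p, r) in rows.items():
    print(f"{device}: F1 = {f1_score(p, r):.2f}")
```

Rounded to two decimals, the recomputed values agree with the reported F1-scores for these rows. The Asyndeton row shows how a high recall (0.95) cannot compensate for a low precision (0.25) under the harmonic mean.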


slide-149
SLIDE 149

If-conditional Detection

If P, then Q

Stanford Dependencies Governors

If … governor … governor

4 tokens

P-clause Q-clause If-conditional

grammatical rules

slide-150
SLIDE 150

If-counterfactual Detection

If P, then Q

Identify P and Q

P → past tense (VBN/VBD)

Q → modals (would/could…)

If-counterfactual
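
The rule on this slide can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the input is assumed to be already tokenized and tagged with Penn Treebank POS tags, and the P/Q split is done naively at the first comma rather than with the dependency-based clause identification described for if-conditionals. The names `split_if_clauses`, `is_if_counterfactual`, `PAST_TAGS`, and `MODALS` are hypothetical.

```python
from typing import List, Tuple

Tagged = List[Tuple[str, str]]  # (token, Penn Treebank POS tag)

PAST_TAGS = {"VBD", "VBN"}   # past tense / past participle, as named on the slide
MODALS = {"would", "could"}  # the slide's list continues with "…"; extend as needed


def split_if_clauses(tagged: Tagged) -> Tuple[Tagged, Tagged]:
    """Naive split of 'If P, (then) Q' at the first comma after a leading 'if'.

    Returns (P, Q); both are empty when the sentence does not fit the pattern.
    """
    if not tagged or tagged[0][0].lower() != "if":
        return [], []
    for i, (token, _) in enumerate(tagged):
        if token == ",":
            p, q = tagged[1:i], tagged[i + 1:]
            if q and q[0][0].lower() == "then":
                q = q[1:]  # drop an optional leading 'then'
            return p, q
    return [], []


def is_if_counterfactual(tagged: Tagged) -> bool:
    """True when P contains a past-tense verb and Q contains a listed modal."""
    p, q = split_if_clauses(tagged)
    p_is_past = any(tag in PAST_TAGS for _, tag in p)
    q_has_modal = any(token.lower() in MODALS for token, _ in q)
    return p_is_past and q_has_modal


counterfactual = [("If", "IN"), ("I", "PRP"), ("had", "VBD"), ("known", "VBN"),
                  (",", ","), ("I", "PRP"), ("would", "MD"), ("have", "VB"),
                  ("stayed", "VBN"), (".", ".")]
plain = [("If", "IN"), ("it", "PRP"), ("rains", "VBZ"), (",", ","),
         ("we", "PRP"), ("stay", "VBP"), ("home", "RB"), (".", ".")]

print(is_if_counterfactual(counterfactual))
print(is_if_counterfactual(plain))
```

Keeping the tag set and modal list as module-level constants makes the rule easy to tighten or loosen; a comma-based split will miss sentences where the Q-clause precedes the if-clause, which is one reason a dependency-based split is the more robust choice.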

slide-151
SLIDE 151

Presidential Debates: Findings

Comparatives
