Detecting Protagonists in German Plays around 1800 as a Classification Task


slide-1
SLIDE 1

Detecting Protagonists in German Plays around 1800 as a Classification Task

Nils Reiter, Benjamin Krautter, Janis Pagel, Marcus Willand

slide-2
SLIDE 2

Disclaimer

  • We have progressed in our work
  • We will present the current state of our research
  • Not only what is in the submitted paper
  • Updated results
  • Talk includes analysis of dramas from 1700 to 1900
  • See also Krautter et al. 2018: Titelhelden und Protagonisten – Interpretierbare Figurenklassifikation in deutschsprachigen Dramen. LitLab Pamphlets, vol. 7, November 2018.

slide-3
SLIDE 3

Quantitative Drama Analytics (QuaDramA)

  • Cooperation between German literary studies and computational linguistics
  • Analyse (German) dramatic texts computationally
  • Historical perspective
  • Investigate character types
  • Coreference resolution for dramatic texts

slide-8
SLIDE 8

1. Introduction

slide-9
SLIDE 9

What is a Drama?

  • Use of action and dialogue
  • Typically divided into acts and scenes
  • Typically designed to be performed (on stage)
  • Stage directions
  • Cast
  • Dramatic conflict

slide-15
SLIDE 15

Excerpt from Lessing’s Emilia Galotti

Fünfter Auftritt
Der Prinz. Emilia. Marinelli.
DER PRINZ. Wo ist sie? wo? – Wir suchen Sie überall, schönstes Fräulein. – Sie sind doch wohl? – Nun so ist alles wohl! Der Graf, Ihre Mutter, –
EMILIA. Ah, gnädigster Herr! wo sind sie? Wo ist meine Mutter?

<div type="scene">
  <div>
    <desc>
      <title>5. Auftritt</title>
    </desc>
  </div>
  <div type="text">
    <div type="h4">
      <head>Fünfter Auftritt</head>
      <stage>
        <hi>Der Prinz. Emilia. Marinelli.</hi>
      </stage>
      <sp who="#der_prinz">
        <speaker>DER PRINZ.</speaker>
        <p> Wo ist sie? wo? – Wir suchen Sie überall, schönstes Fräulein.– Sie sind doch wohl?– Nun so ist alles wohl! Der Graf, Ihre Mutter, –</p>
      </sp>
      <sp who="#emilia">
        <speaker>EMILIA.</speaker>
        <l> Ah, gnädigster Herr! wo sind sie? Wo ist meine Mutter?</l>
      </sp>
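The markup above attaches every speech (`<sp>`) to a figure through its `who` attribute, which is what makes per-figure statistics possible. A minimal sketch of counting spoken tokens per figure follows. It assumes a shortened, well-formed copy of the excerpt (closing tags added for illustration) and plain whitespace tokenisation; neither is taken from the authors' pipeline.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Shortened, well-formed copy of the excerpt (closing tags added for illustration).
TEI_SNIPPET = """
<div type="scene">
  <head>Fünfter Auftritt</head>
  <stage><hi>Der Prinz. Emilia. Marinelli.</hi></stage>
  <sp who="#der_prinz">
    <speaker>DER PRINZ.</speaker>
    <p>Wo ist sie? wo? – Wir suchen Sie überall, schönstes Fräulein.</p>
  </sp>
  <sp who="#emilia">
    <speaker>EMILIA.</speaker>
    <l>Ah, gnädigster Herr! wo sind sie? Wo ist meine Mutter?</l>
  </sp>
</div>
"""

def tokens_per_speaker(xml_text):
    """Count whitespace-separated tokens of spoken text per `who` id."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for sp in root.iter("sp"):
        who = sp.get("who", "").lstrip("#")
        # Speech is everything inside <sp> except the <speaker> label.
        for child in sp:
            if child.tag != "speaker":
                counts[who] += len("".join(child.itertext()).split())
    return counts

counts = tokens_per_speaker(TEI_SNIPPET)
total = sum(counts.values())
# Share of all spoken tokens per figure (assumed reading of "normalised on whole text").
shares = {who: n / total for who, n in counts.items()}
```

Dividing by the total number of spoken tokens is one plausible reading of "normalised on whole text"; the slides do not spell out the denominator.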

slide-17
SLIDE 17

Goals

  • Classify all figures in a play into the classes Protagonist / Not Protagonist
  • Analyse results w.r.t. literary interpretation

slide-19
SLIDE 19

Definition of Being a Protagonist

  • Difficult from theoretical point of view
  • We settled on:

Protagonist

  • Causes or solves the central dramatic conflict
  • This can happen actively or passively

From this follows:

  • There can be more than one protagonist per drama
  • Not only “heroes” in a positive sense, but also “anti-heroes” allowed

slide-22
SLIDE 22

2. Experiments

slide-30
SLIDE 30

Features

Feature Name     Description
Tokens           Character speech, normalised on whole text
Centrality       Different measures: (weighted) degree, closeness, betweenness, eigenvector
Topic Modelling  10 topics trained on the dramas
Actives          Number of scenes where figure is present
Passives         Number of scenes where figure is mentioned
LastAct          Is figure present in the last act?
NumberFig        Number of figures in the respective drama
Genre            e.g. Weimar Classicism, Bourgeois Tragedy, Naturalism, etc.
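The presence features can be sketched from per-scene records. The scene data below is invented for illustration, and dividing by the number of scenes is an assumption (the example values later in the deck, such as actives=0.16, look like fractions rather than raw counts). A mention counts as passive presence only when the figure is not on stage, as stated in the caption of the Presence figure.

```python
# Hypothetical scene records: who is on stage and who is mentioned in the dialogue.
SCENES = [
    {"act": 1, "present": {"prinz", "marinelli"}, "mentioned": {"emilia", "appiani"}},
    {"act": 2, "present": {"emilia", "claudia"}, "mentioned": {"prinz"}},
    {"act": 5, "present": {"emilia", "odoardo", "prinz"}, "mentioned": {"marinelli", "emilia"}},
]

def presence_features(scenes, figure):
    """Return actives, passives and lastAct for one figure."""
    n_scenes = len(scenes)
    last_act = max(scene["act"] for scene in scenes)
    active = [s for s in scenes if figure in s["present"]]
    # Passive presence: mentioned in a scene without being on stage in it.
    passive = [s for s in scenes
               if figure in s["mentioned"] and figure not in s["present"]]
    return {
        "actives": len(active) / n_scenes,    # assumed normalisation by scene count
        "passives": len(passive) / n_scenes,  # assumed normalisation by scene count
        "lastAct": int(any(s["act"] == last_act for s in active)),
    }

emilia = presence_features(SCENES, "emilia")
marinelli = presence_features(SCENES, "marinelli")
```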


slide-31
SLIDE 31

Centrality

Figure 1: A) Betweenness centrality, B) Closeness centrality, C) Eigenvector centrality, D) Degree centrality. Source: https://en.wikipedia.org/wiki/Centrality
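Three of these measures can be sketched in plain Python on a small figure network. The slides do not say how the network is built; the sketch assumes an undirected co-presence graph (an edge between two figures who share a scene, weighted by the number of shared scenes) and uses invented scene casts.

```python
from collections import defaultdict, deque
from itertools import combinations

# Hypothetical casts of four scenes.
SCENES = [
    {"prinz", "marinelli"},
    {"emilia", "claudia"},
    {"emilia", "odoardo", "prinz"},
    {"prinz", "marinelli"},
]

def copresence_graph(scenes):
    """Undirected graph; edge weight = number of scenes two figures share."""
    graph = defaultdict(dict)
    for cast in scenes:
        for a, b in combinations(sorted(cast), 2):
            graph[a][b] = graph[a].get(b, 0) + 1
            graph[b][a] = graph[b].get(a, 0) + 1
    return graph

def degree(graph, node):
    """Number of neighbours, normalised by the number of other figures."""
    return len(graph[node]) / (len(graph) - 1)

def weighted_degree(graph, node):
    """Sum of the weights of all edges at the node."""
    return sum(graph[node].values())

def closeness(graph, node):
    """Reachable figures divided by the sum of shortest-path lengths to them."""
    dist = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search over unweighted edges
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

graph = copresence_graph(SCENES)
```

Betweenness and eigenvector centrality need more machinery (all shortest paths, power iteration) and are left out of this sketch.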


slide-32
SLIDE 32

Experimental Setup

  • 114 dramas in our corpus
  • Three annotators
  • Each data point represents a character in a play
  • Arrays of feature values
  • Class: P (Title character) / C (Not title character)
  • Example Emilia: {tokens=0.09, actives=0.16, ..., class=P}
  • Random forest
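The slides name the classifier but give no implementation or hyperparameters. As a rough illustration of the idea only, the sketch below builds a deliberately tiny random-forest-style ensemble: each tree is a depth-1 split trained on a bootstrap sample with one randomly chosen feature, and the forest predicts by majority vote. The data points are invented, and only two of the features are used.

```python
import random
from collections import Counter

def most_common(labels):
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows, labels, feature):
    """Depth-1 tree: the threshold on `feature` with the fewest training errors."""
    fallback = most_common(labels)
    best = (len(labels) + 1, None, fallback, fallback)
    values = sorted({row[feature] for row in rows})
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        left_label, right_label = most_common(left), most_common(right)
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    return (feature,) + best[1:]

def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    features = sorted(rows[0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        feature = rng.choice(features)                     # random feature per tree
        forest.append(fit_stump([rows[i] for i in sample],
                                [labels[i] for i in sample], feature))
    return forest

def predict(forest, row):
    """Majority vote over all trees."""
    votes = Counter()
    for feature, threshold, left_label, right_label in forest:
        goes_left = threshold is None or row[feature] <= threshold
        votes[left_label if goes_left else right_label] += 1
    return votes.most_common(1)[0][0]

# Invented toy data: P = protagonist, C = not protagonist.
ROWS = [
    {"tokens": 0.20, "actives": 0.40}, {"tokens": 0.22, "actives": 0.44},
    {"tokens": 0.15, "actives": 0.30}, {"tokens": 0.18, "actives": 0.35},
    {"tokens": 0.02, "actives": 0.05}, {"tokens": 0.03, "actives": 0.08},
    {"tokens": 0.01, "actives": 0.02}, {"tokens": 0.04, "actives": 0.10},
]
LABELS = ["P", "P", "P", "P", "C", "C", "C", "C"]

forest = fit_forest(ROWS, LABELS)
```

A production setup would use full-depth trees and an established library; this sketch is only meant to show the bagging-plus-voting structure.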

slide-38
SLIDE 38

3. Results

slide-39
SLIDE 39

Results

               Precision  Recall  F1    Accuracy
Majority BL    -          0.00    -     0.86
Tokens BL      0.62       0.99    0.76  0.93
Random Forest  0.72       1.00    0.83  0.95

Table 1: Results for classification of protagonists plus baselines. Shown are the average values of the experiments for each annotator.
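The averaging mentioned in the caption can be retraced for the Random Forest row from the per-annotator values listed in Table 10 of the appendix. This is a sketch that averages the already-rounded table entries, so the last digit can differ from averages computed on unrounded values.

```python
from statistics import mean

# Random Forest rows of Table 10 (Experiment 1), one per annotator.
RANDOM_FOREST = {
    "A1": {"precision": 0.84, "recall": 1.00, "f1": 0.91, "accuracy": 0.97},
    "A2": {"precision": 0.80, "recall": 1.00, "f1": 0.89, "accuracy": 0.96},
    "A3": {"precision": 0.51, "recall": 1.00, "f1": 0.68, "accuracy": 0.93},
}

def average_over_annotators(rows):
    """Mean of each metric across annotators, rounded to two decimals."""
    metrics = next(iter(rows.values())).keys()
    return {m: round(mean(r[m] for r in rows.values()), 2) for m in metrics}

averaged = average_over_annotators(RANDOM_FOREST)
```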


slide-40
SLIDE 40

Feature Distribution

Going a step back...

[Plot, data set A1: distribution of each feature's value by class (C vs. P). Panels: actives, passives, lastAct, T7, T8, T9, T10, T3, T4, T5, T6, between, eigen, T1, T2, tokens, degree, wdegree, close. Axis labels: Value, Class.]

Figure 2: Feature distribution for one annotator’s data set.

slide-41
SLIDE 41

Feature Importance

Going back to the model...

[Plot, model A1: relative feature importance. Axis labels: Importance (ticks 50 to 250), Feature. Features in axis order: AUF BT POP ROM VM WK NAT WM passives close lastAct wdegree eigen degree T4 T6 T3 between T7 T5 nfig actives SD T8 T1 T10 T2 T9 tokens.]

Figure 3: Relative Feature Importance for one model.

slide-42
SLIDE 42

4. Character Analysis

slide-43
SLIDE 43

Analysis of Single Characters

Example: Emilia Galotti by Gotthold Ephraim Lessing

  • Bourgeois tragedy
  • Emilia: Young bourgeois woman
  • Der Prinz: “The prince”, ruler of the region
  • Marinelli: Chamberlain of the prince
  • Odoardo & Claudia: Emilia’s parents
  • Appiani & Orsina: Count/Countess

Summary of plot

  • Emilia is engaged to Count Appiani
  • The prince wants Emilia for himself
  • His chamberlain Marinelli assassinates Appiani
  • Orsina was the prince’s mistress and plots to kill him
  • Odoardo kills Emilia at her wish
  • The prince blames Marinelli for her death

slide-55
SLIDE 55

Tokens

Figure 4: Top seven figures with the highest token number in Emilia Galotti.

slide-56
SLIDE 56

Presence

Figure 5: Active and passive presence in Emilia Galotti. A figure is only passively present in a scene if they are not actively present.


slide-57
SLIDE 57

Shapley Graphs

What has the model learnt about these characters?

[Plot: Emilia (P − P). Axis labels: phi, Feature. Feature values in axis order: tokens=0.09, T2=0.01, between=0.11, lastAct=1, T3=0.06, eigen=0.56, T4=0.01, passives=0.09, T8=0, BT=1, degree=0.46, T9=0, VM=0, AUF=0, close=0.38, NAT=0, POP=0, ROM=0, SD=0, T10=0.01, T5=0.83, wdegree=13, WK=0, WM=0, actives=0.16, T1=0.03, nfig=15, T6=0.01, T7=0.04.]

Figure 6: Shapley graph for single figures in Emilia Galotti. Brackets mean: (Actual class – Predicted class).
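Each bar in such a graph is a Shapley value phi: the average marginal contribution of one feature value to the prediction, taken over all subsets of the other features. A minimal exact computation is sketched below. The linear model, its weights and the all-zero baseline are hypothetical stand-ins for the trained classifier, and only three of Emilia's feature values are used.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, instance, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    features = list(instance)
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = {g: instance[g] if g in subset else baseline[g]
                           for g in features}
                with_f = dict(without, **{f: instance[f]})
                total += weight * (predict(with_f) - predict(without))
        phi[f] = total
    return phi

def toy_model(x):
    # Hypothetical linear score, not the trained random forest.
    return 0.5 * x["tokens"] + 0.3 * x["actives"] + 0.2 * x["lastAct"]

emilia = {"tokens": 0.09, "actives": 0.16, "lastAct": 1}
baseline = {"tokens": 0.0, "actives": 0.0, "lastAct": 0}
phi = shapley_values(toy_model, emilia, baseline)
```

The values sum to the difference between the prediction for the instance and the prediction for the baseline. Enumerating all subsets is exponential in the number of features, so with the 29 features shown above one would rely on sampling-based approximations.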


slide-58
SLIDE 58

Shapley Graphs

What has the model learnt about these characters?

[Plot: Marinelli (C − P). Axis labels: phi, Feature. Feature values in axis order: tokens=0.22, T2=0, BT=1, actives=0.44, lastAct=1, T4=0.01, T3=0.07, VM=0, wdegree=30, passives=0.16, T7=0.01, AUF=0, between=0.13, degree=0.77, NAT=0, nfig=15, POP=0, ROM=0, SD=0, T8=0.01, WK=0, WM=0, eigen=1, T1=0.03, T9=0, close=0.32, T10=0.01, T6=0.01, T5=0.86.]

Figure 7: Shapley graph for single figures in Emilia Galotti. Brackets mean: (Actual class – Predicted class).

slide-59
SLIDE 59

Shapley Graphs

What has the model learnt about these characters?

[Plot: Prinz (P − P). Axis labels: phi, Feature. Feature values in axis order: tokens=0.22, T2=0, actives=0.4, lastAct=1, T8=0.01, VM=0, BT=1, close=0.45, passives=0.37, T3=0.07, T7=0.01, SD=0, T9=0, wdegree=20, AUF=0, between=0.47, degree=0.62, eigen=0.79, NAT=0, nfig=15, POP=0, ROM=0, T10=0, WK=0, WM=0, T4=0.01, T6=0.01, T1=0.02, T5=0.87.]

Figure 8: Shapley graph for single figures in Emilia Galotti. Brackets mean: (Actual class – Predicted class).

slide-60
SLIDE 60

Shapley Graphs

What has the model learnt about these characters?

[Plot: Odoardo (P − P). Axis labels: phi, Feature. Feature values in axis order: tokens=0.13, T2=0, T3=0.07, lastAct=1, T5=0.85, T8=0, eigen=0.61, passives=0.02, degree=0.46, between=0.08, BT=1, close=0.38, NAT=0, POP=0, ROM=0, SD=0, T1=0.01, T4=0.02, T6=0.01, T9=0, WK=0, WM=0, actives=0.28, AUF=0, nfig=15, wdegree=15, VM=0, T10=0.01, T7=0.01.]

Figure 9: Shapley graph for single figures in Emilia Galotti. Brackets mean: (Actual class – Predicted class).

slide-61
SLIDE 61

Shapley Graphs

What has the model learnt about these characters?

[Plot: Conti (C − C). Axis labels: phi, Feature. Feature values in axis order: T8=0.01, T3=0.04, T6=0.01, actives=0.05, AUF=0, BT=1, close=0.25, degree=0.08, eigen=0.1, NAT=0, POP=0, ROM=0, SD=0, T10=0.01, T4=0.03, T5=0.78, T7=0.03, T9=0.01, VM=0, WK=0, WM=0, between=0, nfig=15, passives=0, T1=0.04, T2=0.03, tokens=0.03, wdegree=2, lastAct=0.]

Figure 10: Shapley graph for single figures in Emilia Galotti. Brackets mean: (Actual class – Predicted class).

slide-62
SLIDE 62

Shapley Graphs

What has the model learnt about these characters?

[Plot: Orsina (C − C). Axis labels: phi, Feature. Feature values in axis order: tokens=0.12, T2=0.01, passives=0.12, T9=0.01, BT=1, eigen=0.39, close=0.34, T3=0.07, T7=0.01, degree=0.31, AUF=0, NAT=0, POP=0, ROM=0, WM=0, actives=0.14, SD=0, WK=0, nfig=15, T1=0.02, between=0.03, T5=0.86, wdegree=8, T8=0.01, VM=0, T6=0.02, lastAct=0, T4=0, T10=0.01.]

Figure 11: Shapley graph for single figures in Emilia Galotti. Brackets mean: (Actual class – Predicted class).

slide-63
SLIDE 63

Shapley Graphs

What has the model learnt about these characters?

[Plot: Appiani (C − P). Axis labels: phi, Feature. Feature values in axis order: tokens=0.04, T2=0.02, T8=0.01, passives=0.23, BT=1, T6=0.01, T1=0.04, close=0.32, degree=0.31, eigen=0.3, T3=0.06, T9=0.01, VM=0, AUF=0, NAT=0, POP=0, ROM=0, SD=0, T4=0.03, WK=0, WM=0, actives=0.12, between=0.04, T5=0.77, wdegree=8, T10=0.02, T7=0.03, nfig=15, lastAct=0.]

Figure 12: Shapley graph for single figures in Emilia Galotti. Brackets mean: (Actual class – Predicted class).

slide-64
SLIDE 64

Take-away

  • High performance for protagonist classification
  • Tendency to produce false positives
  • Tokens feature is strong but not sufficient
  • Analysis of single characters yields interesting insights

slide-68
SLIDE 68

5. Appendix

slide-69
SLIDE 69

Results

Experiment 2

               Precision  Recall  F1    Accuracy
Majority BL    -          0.00    -     0.97
Tokens BL      0.38       1.00    0.55  0.95
Random Forest  0.46       1.00    0.63  0.96

Table 2: Results for classification of title characters plus baselines.

slide-70
SLIDE 70

Results

Experiment 3

            Precision  Recall  F1    Accuracy
A1woTokens  0.82       0.98    0.89  0.96
A2woTokens  0.78       1.00    0.88  0.96
A3woTokens  0.51       1.00    0.67  0.93
TFwoTokens  0.37       1.00    0.54  0.95

Table 3: Results without using tokens feature.

slide-71
SLIDE 71

Centrality Correlation

[Plots: pairwise correlation matrices of the centrality features (degree, wdegree, between, close, eigen), colour scale from −1 to 1, one panel per data set. Coefficients are listed in extraction order; their cell positions are not recoverable.]

A1: 0.58 0.29 0.26 0.66 0.45 −0.02 0.88 0.67 0.32 0.52
A2: 0.6 0.4 0.03 0.45 0.33 0.16 0.84 0.55 0.32 0.27
A3: 0.81 0.24 −0.04 0.79 0.71 −0.02 0.89 0.83 0.2 0.63
TF: 0.69 0.22 0.02 0.81 0.61 −0.01 0.9 0.73 0.21 0.66

Figure 13: Correlation for centrality features.

slide-72
SLIDE 72

Feature Distribution

[Plots, data sets A1, A2, A3 and TF: distribution of each feature's value by class (C vs. P). Panels per data set: actives, passives, lastAct, T7, T8, T9, T10, T3, T4, T5, T6, between, eigen, T1, T2, tokens, degree, wdegree, close. Axis labels: Value, Class.]

Figure 14: Feature distribution.

slide-73
SLIDE 73

Feature Importance

[Plots: relative feature importance, one panel per data set. Axis labels: Importance, Feature. Features in axis order:]

A1: AUF BT POP ROM VM WK NAT WM passives close lastAct wdegree eigen degree T4 T6 T3 between T7 T5 nfig actives SD T8 T1 T10 T2 T9 tokens
A2: NAT POP VM WK WM AUF BT lastAct ROM nfig T9 T10 wdegree degree actives passives close T5 T4 between eigen T7 T8 T2 T6 T1 T3 SD tokens
A3: AUF NAT ROM WM WK POP BT SD VM nfig wdegree passives close degree between eigen actives T1 T6 T9 T4 T3 T7 T5 lastAct T10 T2 T8 tokens
TF: NAT POP WK WM SD AUF ROM VM BT lastAct passives T6 degree T3 close T4 between T1 T7 nfig wdegree T9 eigen T5 T10 actives T8 T2 tokens

Figure 15: Relative Feature Importance.

slide-74
SLIDE 74

Confusion Matrix

         Ref C  Ref P
Pred C   878
Pred P   32     171

Table 4: Confusion matrix for A1.

         Ref C  Ref P
Pred C   883
Pred P   45     176

Table 5: Confusion matrix for A2.

         Ref C  Ref P
Pred C   1196
Pred P   100    106

Table 6: Confusion matrix for A3.

         Ref C  Ref P
Pred C   1456
Pred P   57     49

Table 7: Confusion matrix for TF.
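Precision, recall, F1 and accuracy follow directly from these counts. The sketch below assumes that the empty cell (predicted C, reference P) stands for zero false negatives; for A1 to A3 this is consistent with the class totals in Table 8.

```python
# Counts read from Tables 4-7; fn=0 is an assumption for the empty cell.
CONFUSION = {
    "A1": {"tp": 171, "fp": 32, "fn": 0, "tn": 878},
    "A2": {"tp": 176, "fp": 45, "fn": 0, "tn": 883},
    "A3": {"tp": 106, "fp": 100, "fn": 0, "tn": 1196},
    "TF": {"tp": 49, "fp": 57, "fn": 0, "tn": 1456},
}

def metrics(tp, fp, fn, tn):
    """Precision, recall, F1 and accuracy for the positive class P."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {name: round(value, 2) for name, value in
            [("precision", precision), ("recall", recall),
             ("f1", f1), ("accuracy", accuracy)]}

results = {name: metrics(**counts) for name, counts in CONFUSION.items()}
```

With zero false negatives every error is a false positive, which matches the tendency noted in the take-away.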


slide-75
SLIDE 75

Annotation

Three annotators with overlapping and unique dramas

Annotator  # Dramas  # Protagonists (%)  # Non-Protagonists (%)  # Figures Total
A1         34        171 (16)            910 (84)                1081
A2         37        176 (16)            928 (84)                1104
A3         36        106 (8)             1296 (92)               1402

Table 8: Distribution of annotations.

Combination  # Dramas  Cohen’s κ
A1+A2        6         0.83
A1+A3        6         0.46
A2+A3        7         0.43

Table 9: Cohen’s κ for different combinations of annotations.
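Cohen's κ corrects the observed agreement between two annotators for the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). A minimal sketch with invented labels for four figures:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: both annotators pick the same class independently.
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / n ** 2
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical annotations of four figures (P = protagonist, C = not).
annotator_1 = ["P", "P", "C", "C"]
annotator_2 = ["P", "C", "C", "C"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

Here the annotators agree on three of four figures (p_o = 0.75), chance agreement is p_e = 0.5, and κ comes out at 0.5.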

slide-77
SLIDE 77

Results

Experiment 1

                   Data  Precision  Recall  F1    Accuracy
Majority Baseline  A1    -          0.00    -     0.84
                   A2    -          0.00    -     0.84
                   A3    -          0.00    -     0.92
Tokens Baseline    A1    0.72       1.00    0.84  0.94
                   A2    0.70       0.99    0.82  0.93
                   A3    0.44       1.00    0.61  0.91
Random Forest      A1    0.84       1.00    0.91  0.97
                   A2    0.80       1.00    0.89  0.96
                   A3    0.51       1.00    0.68  0.93

Table 10: Precision, Recall and F1 for classifying protagonists and accuracy.
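The majority-baseline rows can be retraced from the class counts in Table 8. Always predicting the more frequent class C never finds a protagonist, so recall is 0 and precision is undefined, while accuracy equals the share of non-protagonists. A short sketch:

```python
# Class counts per annotator, taken from Table 8.
ANNOTATIONS = {
    "A1": {"P": 171, "C": 910},
    "A2": {"P": 176, "C": 928},
    "A3": {"P": 106, "C": 1296},
}

def majority_baseline(counts, positive="P"):
    """Always predict the most frequent class; report recall on P and accuracy."""
    majority = max(counts, key=counts.get)
    accuracy = counts[majority] / sum(counts.values())
    recall = 1.0 if majority == positive else 0.0
    return majority, recall, round(accuracy, 2)

baselines = {name: majority_baseline(counts) for name, counts in ANNOTATIONS.items()}
```

The high accuracy of this baseline shows why accuracy alone says little on data this imbalanced and why precision, recall and F1 are reported alongside it.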
