Detecting Protagonists in German Plays around 1800 as a Classification Task
Nils Reiter, Benjamin Krautter, Janis Pagel, Marcus Willand
Disclaimer
- We have progressed in our work
- We will present the current state of our research
- Not only what is in the submitted paper
- Updated results
- Talk includes analysis on dramas from 1700 to 1900
- See also Krautter et al. 2018: Titelhelden und Protagonisten – Interpretierbare Figurenklassifikation in deutschsprachigen [...], November 2018.
- Cooperation between German literary studies and computational linguistics
- Analyse (German) dramatic texts computationally
- Historical perspective
- Investigate character types
- Coreference resolution for dramatic texts
- Use of action and dialogue
- Typically divided into acts and scenes
- Typically designed to be performed (on stage)
- Stage directions
- Cast
- Dramatic conflict
Fünfter Auftritt
Der Prinz. Emilia. Marinelli.
DER PRINZ. Wo ist sie? wo? – Wir suchen Sie überall, schönstes Fräulein. – Sie sind doch wohl? – Nun so ist alles wohl! Der Graf, Ihre Mutter, –
EMILIA. Ah, gnädigster Herr! wo sind sie? Wo ist meine Mutter?

<div type="scene">
  <div>
    <desc>
      <title>5. Auftritt</title>
    </desc>
  </div>
  <div type="text">
    <div type="h4">
      <head>Fünfter Auftritt</head>
      <stage>
        <hi>Der Prinz. Emilia. Marinelli.</hi>
      </stage>
      <sp who="#der_prinz">
        <speaker>DER PRINZ.</speaker>
        <p> Wo ist sie? wo? – Wir suchen Sie überall, schönstes Fräulein.– Sie sind doch wohl?– Nun so ist alles wohl! Der Graf, Ihre Mutter, –</p>
      </sp>
      <sp who="#emilia">
        <speaker>EMILIA.</speaker>
        <l> Ah, gnädigster Herr! wo sind sie? Wo ist meine Mutter?</l>
      </sp>
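The speaker attribution in the markup is what makes per-character features such as the token share computable. The following is a minimal sketch, not the authors' pipeline: it assumes a well-formed fragment that follows the same `<sp who="...">` convention as the excerpt, splits tokens on whitespace, and ignores XML namespaces (full TEI files usually declare one). The name `token_share` and the sample fragment are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, well-formed fragment modelled on the excerpt.
TEI = """
<div type="scene">
  <sp who="#der_prinz">
    <speaker>DER PRINZ.</speaker>
    <p>Wo ist sie? wo?</p>
  </sp>
  <sp who="#emilia">
    <speaker>EMILIA.</speaker>
    <l>Ah, gnädigster Herr! wo sind sie?</l>
  </sp>
</div>
"""

def token_share(xml_text):
    """Return each speaker's share of all spoken tokens (whitespace-split)."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for sp in root.iter("sp"):
        who = sp.get("who", "").lstrip("#")
        for child in sp:
            if child.tag == "speaker":
                continue  # the speaker label is not part of the speech
            counts[who] += len("".join(child.itertext()).split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {who: n / total for who, n in counts.items()}

shares = token_share(TEI)
print(shares)
```

Dividing by the token count of the whole text corresponds to the "normalised on whole text" wording of the Tokens feature; the exact tokenisation used in the study is not stated on the slides.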
- Classify all figures in a play into one of two classes: Protagonist / Not Protagonist
- Analyse the results with respect to literary interpretation
- Difficult from a theoretical point of view
- We settled on:
Protagonist
- Causes or solves the central dramatic conflict
- This can happen actively or passively
From this follows:
- There can be more than one protagonist per drama
- Not only “heroes” in a positive sense, but also “anti-heroes” are allowed
Feature Name     Description
Tokens           Character speech, normalised on whole text
Centrality       Different measures: (weighted) degree, closeness, betweenness, eigenvector
Topic Modelling  10 topics trained on the dramas
Actives          Number of scenes where figure is present
Passives         Number of scenes where figure is mentioned
LastAct          Is figure present in the last act?
NumberFig        Number of figures in the respective drama
Genre            e.g. Weimar Classicism, Bourgeois Tragedy, Naturalism, etc.
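Several of these features can be derived from scene-level presence alone. The sketch below is illustrative only: it assumes each scene is given as a pair (figures on stage, figures mentioned), builds a co-presence network from it, and reports raw degree and weighted degree. The toy scenes, the function name `scene_features`, and the normalisation of actives and passives by the number of scenes are assumptions, not taken from the slides.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy play: (figures on stage, figures mentioned) per scene.
scenes = [
    ({"prinz", "marinelli"}, {"emilia"}),
    ({"prinz", "emilia", "marinelli"}, set()),
    ({"emilia", "odoardo"}, {"prinz"}),
]

def scene_features(scenes):
    """Compute actives, passives, degree and weighted degree per figure."""
    n = len(scenes)
    active = defaultdict(int)
    passive = defaultdict(int)
    weight = defaultdict(int)  # co-presence edge -> number of shared scenes
    for on_stage, mentioned in scenes:
        for fig in on_stage:
            active[fig] += 1
        # A figure counts as passively present only if it is not on stage.
        for fig in mentioned - on_stage:
            passive[fig] += 1
        for a, b in combinations(sorted(on_stage), 2):
            weight[(a, b)] += 1
    feats = {}
    for fig in set(active) | set(passive):
        edges = [w for pair, w in weight.items() if fig in pair]
        feats[fig] = {
            "actives": active[fig] / n,    # assumed normalisation
            "passives": passive[fig] / n,  # assumed normalisation
            "degree": len(edges),          # number of distinct co-present figures
            "wdegree": sum(edges),         # total number of shared scenes
        }
    return feats

feats = scene_features(scenes)
for fig, values in sorted(feats.items()):
    print(fig, values)
```

Closeness, betweenness and eigenvector centrality need shortest paths or an eigen-decomposition on the same network and are left out of this sketch.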
Figure 1: A) Betweenness centrality, B) Closeness centrality, C) Eigenvector centrality, D) Degree centrality. Source: https://en.wikipedia.org/wiki/Centrality
- 114 dramas in our corpus
- Three annotators
- Each data point represents a character in a play
- Arrays of feature values
- Class: P (Title character) / C (Not title character)
- Example Emilia: {tokens=0.09, actives=0.16, ..., class=P}
- Random forest
               Precision  Recall  F1    Accuracy
Majority BL
Tokens BL      0.62       0.99    0.76  0.93
Random Forest  0.72       1.00    0.83  0.95
Table 1: Results for classification of protagonists plus baselines. Shown are the average values of the experiments for each annotator.
Going a step back...
[Plot, panel A1: Value by Class (C, P), one facet per feature: actives, passives, lastAct, T1 to T10, between, eigen, tokens, degree, wdegree, close]
Figure 2: Feature distribution for one annotator’s data set.
Going back to the model...
[Plot, panel A1: Importance (axis ticks 50 to 250) per Feature. Features in source order: AUF, BT, POP, ROM, VM, WK, NAT, WM, passives, close, lastAct, wdegree, eigen, degree, T4, T6, T3, between, T7, T5, nfig, actives, SD, T8, T1, T10, T2, T9, tokens]
Figure 3: Relative Feature Importance for one model.
Example: Emilia Galotti by Gotthold Ephraim Lessing
- Bourgeois tragedy
- Emilia: Young bourgeois woman
- Der Prinz: “The prince”, ruler of the region
- Marinelli: Chamberlain of the prince
- Odoardo & Claudia: Emilia’s parents
- Appiani & Orsina: Count/Countess
Summary of plot
- Emilia is engaged to Count Appiani
- The prince wants Emilia for himself
- His chamberlain Marinelli assassinates Appiani
- Orsina was the prince’s mistress and plots to kill him
- Odoardo kills Emilia at her wish
- The prince blames Marinelli for her death
Figure 4: Top seven figures with the highest token number in Emilia Galotti.
Figure 5: Active and passive presence in Emilia Galotti. A figure is only passively present in a scene if they are not actively present.
What has the model learnt about these characters?
[Plot, Emilia (P − P): phi (axis ticks 0.0 to 0.6) per Feature. Feature values in source order: tokens=0.09, T2=0.01, between=0.11, lastAct=1, T3=0.06, eigen=0.56, T4=0.01, passives=0.09, T8=0, BT=1, degree=0.46, T9=0, VM=0, AUF=0, close=0.38, NAT=0, POP=0, ROM=0, SD=0, T10=0.01, T5=0.83, wdegree=13, WK=0, WM=0, actives=0.16, T1=0.03, nfig=15, T6=0.01, T7=0.04]
Figure 6: Shapley graph for single figures in Emilia Galotti. Brackets mean: (Actual class – Predicted class).
[Plot, Marinelli (C − P): phi (axis ticks 0.0 to 0.6) per Feature. Feature values in source order: tokens=0.22, T2=0, BT=1, actives=0.44, lastAct=1, T4=0.01, T3=0.07, VM=0, wdegree=30, passives=0.16, T7=0.01, AUF=0, between=0.13, degree=0.77, NAT=0, nfig=15, POP=0, ROM=0, SD=0, T8=0.01, WK=0, WM=0, eigen=1, T1=0.03, T9=0, close=0.32, T10=0.01, T6=0.01, T5=0.86]
Figure 7: Shapley graph for single figures in Emilia Galotti. Brackets mean: (Actual class – Predicted class).
[Plot, Prinz (P − P): phi (axis ticks 0.0 to 0.6) per Feature. Feature values in source order: tokens=0.22, T2=0, actives=0.4, lastAct=1, T8=0.01, VM=0, BT=1, close=0.45, passives=0.37, T3=0.07, T7=0.01, SD=0, T9=0, wdegree=20, AUF=0, between=0.47, degree=0.62, eigen=0.79, NAT=0, nfig=15, POP=0, ROM=0, T10=0, WK=0, WM=0, T4=0.01, T6=0.01, T1=0.02, T5=0.87]
Figure 8: Shapley graph for single figures in Emilia Galotti. Brackets mean: (Actual class – Predicted class).
[Plot, Odoardo (P − P): phi (axis ticks 0.0 to 0.6) per Feature. Feature values in source order: tokens=0.13, T2=0, T3=0.07, lastAct=1, T5=0.85, T8=0, eigen=0.61, passives=0.02, degree=0.46, between=0.08, BT=1, close=0.38, NAT=0, POP=0, ROM=0, SD=0, T1=0.01, T4=0.02, T6=0.01, T9=0, WK=0, WM=0, actives=0.28, AUF=0, nfig=15, wdegree=15, VM=0, T10=0.01, T7=0.01]
Figure 9: Shapley graph for single figures in Emilia Galotti. Brackets mean: (Actual class – Predicted class).
[Plot, Conti (C − C): phi (axis ticks −0.04 to 0.02) per Feature. Feature values in source order: T8=0.01, T3=0.04, T6=0.01, actives=0.05, AUF=0, BT=1, close=0.25, degree=0.08, eigen=0.1, NAT=0, POP=0, ROM=0, SD=0, T10=0.01, T4=0.03, T5=0.78, T7=0.03, T9=0.01, VM=0, WK=0, WM=0, between=0, nfig=15, passives=0, T1=0.04, T2=0.03, tokens=0.03, wdegree=2, lastAct=0]
Figure 10: Shapley graph for single figures in Emilia Galotti. Brackets mean: (Actual class – Predicted class).
[Plot, Orsina (C − C): phi (axis ticks −0.2 to 0.4) per Feature. Feature values in source order: tokens=0.12, T2=0.01, passives=0.12, T9=0.01, BT=1, eigen=0.39, close=0.34, T3=0.07, T7=0.01, degree=0.31, AUF=0, NAT=0, POP=0, ROM=0, WM=0, actives=0.14, SD=0, WK=0, nfig=15, T1=0.02, between=0.03, T5=0.86, wdegree=8, T8=0.01, VM=0, T6=0.02, lastAct=0, T4=0, T10=0.01]
Figure 11: Shapley graph for single figures in Emilia Galotti. Brackets mean: (Actual class – Predicted class).
[Plot, Appiani (C − P): phi (axis ticks −0.1 to 0.4) per Feature. Feature values in source order: tokens=0.04, T2=0.02, T8=0.01, passives=0.23, BT=1, T6=0.01, T1=0.04, close=0.32, degree=0.31, eigen=0.3, T3=0.06, T9=0.01, VM=0, AUF=0, NAT=0, POP=0, ROM=0, SD=0, T4=0.03, WK=0, WM=0, actives=0.12, between=0.04, T5=0.77, wdegree=8, T10=0.02, T7=0.03, nfig=15, lastAct=0]
Figure 12: Shapley graph for single figures in Emilia Galotti. Brackets mean: (Actual class – Predicted class).
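The phi values in the Shapley graphs are per-feature contributions to a single prediction. The slides do not say which implementation or approximation produced them, so the following is only a sketch of the exact Shapley computation for a handful of features. The scoring function `toy_model` and the all-zero baseline that stands in for "feature absent" are hypothetical; the three feature values are the ones shown for Emilia.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, instance, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        x = {f: (instance[f] if f in coalition else baseline[f]) for f in names}
        return model(x)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = value(set(subset) | {f})
                without_f = value(set(subset))
                total += weight * (with_f - without_f)
        phi[f] = total
    return phi

# Hypothetical score with an interaction between two features.
def toy_model(x):
    return 2.0 * x["tokens"] + x["actives"] * x["lastAct"]

instance = {"tokens": 0.09, "actives": 0.16, "lastAct": 1.0}
baseline = {"tokens": 0.0, "actives": 0.0, "lastAct": 0.0}
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

The contributions sum to the difference between the score for the instance and the score for the baseline, which is why a plot of phi values can be read as a decomposition of one prediction. Exact enumeration grows exponentially with the number of features, so for the 29 features listed in the plots a sampling-based approximation would be needed.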
- High performance for protagonist classification
- Tendency to produce false positives
- Tokens feature is strong but not sufficient
- Analysis of single characters yields interesting insights
Experiment 2
               Precision  Recall  F1    Accuracy
Majority BL
Tokens BL      0.38       1.00    0.55  0.95
Random Forest  0.46       1.00    0.63  0.96
Table 2: Results for classification of title characters plus baselines.
Experiment 3
            Precision  Recall  F1    Accuracy
A1woTokens  0.82       0.98    0.89  0.96
A2woTokens  0.78       1.00    0.88  0.96
A3woTokens  0.51       1.00    0.67  0.93
TFwoTokens  0.37       1.00    0.54  0.95
Table 3: Results without using tokens feature.
[Correlation plots, panels A1, A2, A3 and TF: pairwise correlation of degree, wdegree, between, close and eigen on a scale from −1 to 1. Cell values in source order:]
A1: 0.58 0.29 0.26 0.66 0.45 −0.02 0.88 0.67 0.32 0.52
A2: 0.6 0.4 0.03 0.45 0.33 0.16 0.84 0.55 0.32 0.27
A3: 0.81 0.24 −0.04 0.79 0.71 −0.02 0.89 0.83 0.2 0.63
TF: 0.69 0.22 0.02 0.81 0.61 −0.01 0.9 0.73 0.21 0.66
Figure 13: Correlation for centrality features.
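A pairwise correlation of this kind can be sketched as follows. The slides do not name the coefficient, so Pearson's r is an assumption here. The input values are the degree, wdegree and between values shown in the Shapley graphs for Emilia, Marinelli, Prinz, Conti and Orsina, so this covers five figures of one play rather than a full annotator data set and will not reproduce the numbers in Figure 13.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Order of figures: Emilia, Marinelli, Prinz, Conti, Orsina.
features = {
    "degree": [0.46, 0.77, 0.62, 0.08, 0.31],
    "wdegree": [13, 30, 20, 2, 8],
    "between": [0.11, 0.13, 0.47, 0.00, 0.03],
}

# One coefficient per unordered pair of features.
corr = {
    (a, b): pearson(features[a], features[b])
    for a, b in combinations(features, 2)
}
for pair, r in corr.items():
    print(pair, round(r, 2))
```

Strongly correlated centrality measures carry overlapping information, which is one reason to look at the correlation matrix before interpreting the importance of any single centrality feature.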
[Plots, panels A1, A2, A3 and TF: Value by Class (C, P), one facet per feature: actives, passives, lastAct, T1 to T10, between, eigen, tokens, degree, wdegree, close]
Figure 14: Feature distribution.
[Plots, panels A1, A2, A3 and TF: Importance per Feature. Features in source order:]
A1 (axis ticks 50 to 250): AUF, BT, POP, ROM, VM, WK, NAT, WM, passives, close, lastAct, wdegree, eigen, degree, T4, T6, T3, between, T7, T5, nfig, actives, SD, T8, T1, T10, T2, T9, tokens
A2 (axis ticks 100 to 400): NAT, POP, VM, WK, WM, AUF, BT, lastAct, ROM, nfig, T9, T10, wdegree, degree, actives, passives, close, T5, T4, between, eigen, T7, T8, T2, T6, T1, T3, SD, tokens
A3 (axis ticks 50 to 100): AUF, NAT, ROM, WM, WK, POP, BT, SD, VM, nfig, wdegree, passives, close, degree, between, eigen, actives, T1, T6, T9, T4, T3, T7, T5, lastAct, T10, T2, T8, tokens
TF (axis ticks 20 to 80): NAT, POP, WK, WM, SD, AUF, ROM, VM, BT, lastAct, passives, T6, degree, T3, close, T4, between, T1, T7, nfig, wdegree, T9, eigen, T5, T10, actives, T8, T2, tokens
Figure 15: Relative Feature Importance.
        Ref C  Ref P
Pred C  878    0
Pred P  32     171
Table 4: Confusion matrix for A1.

        Ref C  Ref P
Pred C  883    0
Pred P  45     176
Table 5: Confusion matrix for A2.

        Ref C  Ref P
Pred C  1196   0
Pred P  100    106
Table 6: Confusion matrix for A3.

        Ref C  Ref P
Pred C  1456   0
Pred P  57     49
Table 7: Confusion matrix for TF.
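The reported scores follow directly from such a confusion matrix. A minimal sketch, treating P as the positive class and using the counts of Table 4; the function name `metrics` is hypothetical, and `fn=0` is an assumption for the cell left empty on the slide (it matches the class totals given for A1 in Table 8).

```python
def metrics(tp, fp, fn, tn):
    """Precision, recall and F1 for the positive class P, plus accuracy."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy

# Counts from Table 4 (A1): 171 true positives, 32 false positives,
# 878 true negatives; fn=0 is assumed for the empty cell.
p, r, f1, acc = metrics(tp=171, fp=32, fn=0, tn=878)
print(f"{p:.2f} {r:.2f} {f1:.2f} {acc:.2f}")  # prints "0.84 1.00 0.91 0.97"
```

With no false negatives, every error is a false positive, so precision is the only score that falls below 1.00.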
Three annotators with overlapping and unique dramas
Annotator  # Dramas  # Protagonists (%)  # Non-Protagonists (%)  # Figures Total
A1         34        171 (16)            910 (84)                1081
A2         37        176 (16)            928 (84)                1104
A3         36        106 (8)             1296 (92)               1402
Table 8: Distribution of annotations.

Combination  # Dramas  Cohen’s κ
A1+A2        6         0.83
A1+A3        6         0.46
A2+A3        7         0.43
Table 9: Cohen’s κ for different combinations of annotations.
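Cohen's κ corrects the observed agreement p_o between two annotators for the agreement p_e expected by chance: κ = (p_o − p_e) / (1 − p_e). A minimal sketch with made-up labels for ten figures; the function name `cohens_kappa` and the label sequences are hypothetical and unrelated to the annotations behind Table 9.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    classes = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in classes) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical annotations: P = protagonist, C = not protagonist.
a1 = list("PPPCCCCCCC")
a2 = list("PPCCCCCCCP")
kappa = cohens_kappa(a1, a2)
print(round(kappa, 2))
```

Because most figures belong to class C, chance agreement is high, so raw agreement overstates how well two annotators agree on the rare class P; κ accounts for that.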
Experiment 1
                   Data  Precision  Recall  F1    Accuracy
Majority Baseline  A1
                   A2
                   A3
Tokens Baseline    A1    0.72       1.00    0.84  0.94
                   A2    0.70       0.99    0.82  0.93
                   A3    0.44       1.00    0.61  0.91
Random Forest      A1    0.84       1.00    0.91  0.97
                   A2    0.80       1.00    0.89  0.96
                   A3    0.51       1.00    0.68  0.93
Table 10: Precision, Recall and F1 for classifying protagonists and accuracy.