1/13
German in Flux: Detecting Metaphoric Change via Word Entropy
August 4, 2017
Dominik Schlechtweg, Stefanie Eckmann, Enrico Santus, Sabine Schulte im Walde, Daniel Hole
dominik.schlechtweg@gmx.de, stefanie.eckmann@campus.lmu.de, esantus@mit.edu
2/13
Introduction
◮ Our aim:
  ◮ overall: build a computational model detecting semantic change
  ◮ in this paper: distinguish metaphoric change from semantic stability
◮ How we do it:
  ◮ exploit the idea of semantic generality from hypernym detection
  ◮ apply entropy to distributional semantic model
  ◮ sample language German
  ◮ introduce the first resource for evaluation of models of metaphoric change
3/13
Shortcomings of Related Work
◮ Previous work includes mainly:
  (i) spatial displacement models
  (ii) word sense induction models
◮ quantify the degree of overall change rather than being able to qualify different types
◮ do not examine metaphoric change
4/13
Metaphoric Change
◮ frequent and important type of semantic change
◮ source and target concept are related by similarity or a reduced comparison (cf. Koch, 2016, p. 47)

earlier: ... muß ich mich vmbweltzen / vnd kan keinen schlaff in meine augen bringen
‘... I have to turn around and cannot bring sleep into my eyes.’
later: Kinadon wollte den Staat umwälzen ...
‘Kinadon wanted to revolutionize the state ...’

(i) creates polysemy
(ii) often results in more abstract or general meanings
→ assumption: (i) and (ii) imply extension and dispersion in the range of linguistic contexts
5/13
Corpus
◮ Deutsches Textarchiv (erweitert) (DTA)
◮ large: provides more than 2447 lemmatized and POS-tagged texts (with more than 140M tokens)
◮ covers long time period: late 15th to the early 20th century
◮ balanced: includes literary and scientific texts as well as functional writings
6/13
Word Entropy
◮ corresponds to entropy of word vector
◮ is assumed to reflect semantic generality in hypernym detection
◮ is given by

$$H(C) = -\sum_{i=1}^{n} P(c_i \mid w)\,\log_2 P(c_i \mid w)$$

where $P(c_i \mid w)$ is the occurrence probability of context word $c_i$ given target word $w$
◮ measures the unpredictability of w’s co-occurrences
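
The entropy formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `cooccurrence_counts` and `word_entropy`, the symmetric context window, and its size are hypothetical and not taken from the slides; P(ci | w) is estimated by relative co-occurrence frequency.

```python
import math
from collections import Counter


def cooccurrence_counts(sentences, target, window=2):
    """Count context words within +/- `window` tokens of `target`.

    The window size is an illustrative assumption, not a value from the slides.
    """
    counts = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


def word_entropy(context_counts):
    """Entropy in bits of the context distribution of one target word."""
    total = sum(context_counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in context_counts.values():
        if count > 0:
            p = count / total  # relative-frequency estimate of P(c_i | w)
            entropy -= p * math.log2(p)
    return entropy


# Toy counts: one context word only vs. four equally likely context words.
narrow = Counter({"storm": 4})
broad = Counter({"storm": 1, "anger": 1, "boss": 1, "surprise": 1})
print(word_entropy(narrow))  # 0.0
print(word_entropy(broad))   # 2.0
```

A word whose contexts are spread evenly over more context words gets a higher value, which is the dispersion the slides expect after metaphoric change.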
7/13
Evaluation
◮ no standard test set of semantic or metaphoric change
◮ we create a first, small test set via annotation (28 items)
◮ annotators judged 560 context pairs for a metaphorical relation

Workflow:
(i) preselect 14 changing words
(ii) add 14 stable distractors
(iii) identify a date of change
(iv) extract 20 contexts for each target from before and after date of change
(v) for each word combine contexts between time periods randomly
(vi) annotation of context pairs
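
Step (v) of the workflow can be sketched as follows. The one-to-one pairing is an assumption: it is consistent with the reported 560 pairs (28 targets × 20 pairs), but the slides do not spell out the exact procedure. The function name `pair_contexts`, the fixed seed, and the placeholder context strings are hypothetical.

```python
import random


def pair_contexts(earlier, later, seed=0):
    """Randomly pair each earlier context with exactly one later context.

    Assumes both lists have the same length (20 per period on the slides).
    """
    if len(earlier) != len(later):
        raise ValueError("expected the same number of contexts per period")
    rng = random.Random(seed)  # fixed seed so the pairing is reproducible
    shuffled = list(later)
    rng.shuffle(shuffled)
    return list(zip(earlier, shuffled))


# Placeholder data: 28 targets with 20 contexts before and after the change date.
targets = [f"word{i}" for i in range(28)]
all_pairs = []
for target in targets:
    earlier = [f"{target}-early-{k}" for k in range(20)]
    later = [f"{target}-late-{k}" for k in range(20)]
    all_pairs.extend(pair_contexts(earlier, later))

print(len(all_pairs))  # 560
```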
8/13
Annotation
◮ steps to identify metaphoric relation of C1 to C2:
  1. Does any of these hold?
     ◮ C1 is less concrete than C2
     ◮ C1 is less human-oriented than C2
     ◮ C1 is not related to bodily action in contrast to C2
     ◮ C1 is less precise than C2
  2. If yes: does C1 contrast with C2 while still being understandable in comparison with it?
◮ agreement: κ (Fleiss’ Kappa) between .40 and .46
◮ result is gold ranking of targets for strength of metaphoric change
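
For reference, the agreement measure named above, Fleiss' kappa, can be computed as in the sketch below. It uses the standard textbook definition; the function name `fleiss_kappa`, the number of annotators, and the toy ratings are hypothetical and are not the annotation data behind the reported values.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a table ratings[i][j]:
    number of annotators who assigned item i to category j.
    Every item must be rated by the same number of annotators.
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    if any(sum(row) != n_raters for row in ratings):
        raise ValueError("every item needs the same number of ratings")
    n_cats = len(ratings[0])

    # Share of all assignments that went to each category.
    p_cat = [
        sum(row[j] for row in ratings) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    # Observed agreement per item, then averaged over items.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy example: 4 context pairs, 3 annotators, categories (metaphoric, not metaphoric).
perfect = [[3, 0], [0, 3], [3, 0], [0, 3]]
print(fleiss_kappa(perfect))  # 1.0
```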
9/13
Annotation Results
target         POS  type  date  meaning                              score
Donnerwetter   N    met   1805  thunderstorm > thunderstorm, blowup  0.78
...
Unhöflichkeit  N    sta   1605  discourtesy                          0.1
...

Table 1: Sample of test set items ordered by their annotated degree of metaphoric change.
10/13
Results
           1700-1800  1800-1900  all
entropy    .64***     .10        .39*
frequency  .29        -.07       .26

Table 2: Correlation (ρ) between predicted and gold ranks. Significance is determined with a t-test.
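
A rank correlation of the kind reported in Table 2 can be computed as in the sketch below: Spearman's ρ as the Pearson correlation of average ranks, plus the usual t statistic for a correlation coefficient (n − 2 degrees of freedom). That the authors used exactly this form of the t-test is an assumption; the function names and the toy scores are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def t_statistic(rho, n):
    """t statistic for a correlation coefficient with n - 2 degrees of freedom."""
    return rho * math.sqrt((n - 2) / (1.0 - rho ** 2))


# Made-up toy scores for four targets (not values from the test set).
gold = [0.9, 0.6, 0.3, 0.1]
predicted = [3.1, 3.4, 2.0, 1.5]
rho = spearman_rho(gold, predicted)
print(round(rho, 2))  # 0.8
```

The resulting t value would then be compared against a t distribution to obtain the significance levels marked with asterisks in the table.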
11/13
Result Analysis
◮ ausstechen
1605: Von einem Bawren / welcher einem Kalbskopff die Augen außstach.
‘About a farmer / who cut out the eyes of a calf’s head.’
1869: Sie wollen ihre Aufgabe nicht nur lösen, sondern auch elegant, d. h. rasch lösen, um Nebenbuhler auszustechen.
‘They want not only to solve their task, but to solve it elegantly, i.e., quickly, in order to outdo rivals.’
◮ gold rank: 12/28, entropy: 13, frequency: 17
◮ Donnerwetter
1631: Die Lufft ist heiß / vnd gibt viel Blitzen vnd Donnerwetter ...
‘The air is hot / and there are many lightnings and thunderstorms ...’
1893: Potz Donnerwetter!
‘Man alive!’
◮ gold rank: 1/28, entropy: 27, frequency: 15
12/13