German in Flux: Detecting Metaphoric Change via Word Entropy

German in Flux: Detecting Metaphoric Change via Word Entropy August 4, 2017 Dominik Schlechtweg, Stefanie Eckmann, Enrico Santus, Sabine Schulte im Walde, Daniel Hole dominik.schlechtweg@gmx.de , stefanie.eckmann@campus.lmu.de , esantus@mit.edu


SLIDE 1

German in Flux: Detecting Metaphoric Change via Word Entropy

August 4, 2017

Dominik Schlechtweg, Stefanie Eckmann, Enrico Santus, Sabine Schulte im Walde, Daniel Hole

dominik.schlechtweg@gmx.de, stefanie.eckmann@campus.lmu.de, esantus@mit.edu, schulte@ims.uni-stuttgart.de, holedan@gmail.com

SLIDE 2

Introduction

◮ Our aim:
  ◮ overall: build a computational model that detects semantic change
  ◮ in this paper: distinguish metaphoric change from semantic stability
◮ How we do it:
  ◮ exploit the idea of semantic generality from hypernym detection
  ◮ apply entropy to a distributional semantic model
  ◮ sample language: German
  ◮ introduce the first resource for evaluating models of metaphoric change

SLIDE 3

Shortcomings of Related Work

◮ Previous work mainly includes:
  (i) spatial displacement models
  (ii) word sense induction models
◮ these models quantify the degree of overall change but cannot distinguish different types of change
◮ they do not examine metaphoric change

SLIDE 4

Metaphoric Change

◮ frequent and important type of semantic change
◮ source and target concept are related by similarity or a reduced comparison (cf. Koch, 2016, p. 47)

earlier: ... muß ich mich vmbweltzen / vnd kan keinen schlaff in meine augen bringen
‘... I have to turn around and cannot bring sleep into my eyes.’
later: Kinadon wollte den Staat umwälzen ...
‘Kinadon wanted to revolutionize the state ...’

(i) creates polysemy
(ii) often results in more abstract or general meanings
→ assumption: (i) and (ii) imply extension and dispersion in the range of linguistic contexts

SLIDE 5

Corpus

◮ Deutsches Textarchiv (erweitert) (DTA)
◮ large: provides more than 2447 lemmatized and POS-tagged texts (with more than 140M tokens)
◮ covers a long time period: late 15th to early 20th century
◮ balanced: includes literary and scientific texts as well as functional writings

SLIDE 6

Word Entropy

◮ corresponds to the entropy of the word vector
◮ is assumed to reflect semantic generality in hypernym detection
◮ is given by

H(C) = -\sum_{i=1}^{n} P(c_i \mid w) \log_2 P(c_i \mid w)

where P(c_i \mid w) is the occurrence probability of context word c_i given target word w

◮ measures the unpredictability of w’s co-occurrences
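
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the raw-count input, and the symmetric context window are all assumptions made for the example.

```python
import math
from collections import Counter


def cooccurrence_counts(tokens, target, window=2):
    """Count context words within a symmetric window around each
    occurrence of `target` (window size is an assumption)."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


def word_entropy(context_counts):
    """Entropy in bits of the context distribution P(c_i | w),
    estimated from co-occurrence counts of one target word."""
    total = sum(context_counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in context_counts.values():
        if count > 0:
            p = count / total  # P(c_i | w)
            entropy -= p * math.log2(p)
    return entropy
```

A word that co-occurs evenly with many different context words receives a high value; a word that always appears with the same context word receives 0.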

SLIDE 7

Evaluation

◮ no standard test set of semantic or metaphoric change
◮ we create a small but first test set via annotation (28 items)
◮ annotators judged 560 context pairs for a metaphorical relation

Workflow:

(i) preselect 14 changing words
(ii) add 14 stable distractors
(iii) identify a date of change
(iv) extract 20 contexts for each target from before and after date of change
(v) for each word combine contexts between time periods randomly
(vi) annotation of context pairs
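
Steps (iv) to (vi) can be sketched as follows. The sketch assumes that each target has 20 contexts per period and that "combine randomly" means a random one-to-one pairing, which is consistent with 28 × 20 = 560 pairs but is not spelled out on the slide. All names are hypothetical.

```python
import random


def pair_contexts(before, after, rng):
    """Randomly pair each earlier context with exactly one later context."""
    if len(before) != len(after):
        raise ValueError("expected the same number of contexts per period")
    shuffled = list(after)
    rng.shuffle(shuffled)
    return list(zip(before, shuffled))


def build_annotation_items(contexts_by_target, seed=0):
    """Return (target, earlier_context, later_context) triples.

    `contexts_by_target` maps a target word to a pair of lists:
    contexts from before and from after its date of change.
    """
    rng = random.Random(seed)  # fixed seed keeps the pairing reproducible
    items = []
    for target in sorted(contexts_by_target):
        before, after = contexts_by_target[target]
        for earlier, later in pair_contexts(before, after, rng):
            items.append((target, earlier, later))
    return items
```

With 28 targets and 20 contexts per period, `build_annotation_items` yields 560 triples for annotation.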

SLIDE 8

Annotation

◮ steps to identify a metaphoric relation of C1 to C2:
  1. Do any of these hold?
     ◮ C1 is less concrete than C2
     ◮ C1 is less human-oriented than C2
     ◮ C1 is not related to bodily action, in contrast to C2
     ◮ C1 is less precise than C2
  2. If yes: does C1 contrast with C2 but can be understood in comparison with it?
◮ agreement: κ (Fleiss’ Kappa) between .40 and .46
◮ result is a gold ranking of targets by strength of metaphoric change
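
For reference, Fleiss' kappa can be computed as in the sketch below. It uses the standard textbook definition; the input layout (one row per context pair, one column per label, cells holding the number of annotators who chose that label) is an assumption, and the slides do not say how many annotators took part.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a table where ratings[i][j] is the number of
    annotators who assigned item i to category j."""
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")
    n_cats = len(ratings[0])

    # Share of all assignments that went to each category.
    total = n_items * n_raters
    p_cat = [sum(row[j] for row in ratings) / total for j in range(n_cats)]

    # Observed agreement per item, then averaged over items.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_observed = sum(p_item) / n_items

    # Agreement expected by chance.
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)
```

A value of 1 means perfect agreement and 0 means agreement at chance level.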

SLIDE 9

Annotation Results

target          POS   type   date   meaning                                score
Donnerwetter    N     met    1805   thunderstorm > thunderstorm, blowup    0.78
...
Unhöflichkeit   N     sta    1605   discourtesy                            0.1
...

Table 1: Sample of test set items ordered by their annotated degree of metaphoric change.

SLIDE 10

Results

            1700-1800   1800-1900   all
entropy     .64***      .10         .39*
frequency   .29         .07         .26

Table 2: Correlation (ρ) between predicted and gold ranks. Significance is determined with a t-test.
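
A correlation of this kind can be sketched as follows, assuming ρ denotes Spearman's rank correlation (the slide only says "ρ" and "ranks"). The t statistic uses the usual approximation with n − 2 degrees of freedom; function names are hypothetical, and the p-value lookup is left out.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def t_statistic(rho, n):
    """t value for testing rho against zero, with n - 2 degrees of freedom."""
    if abs(rho) >= 1.0:
        return math.copysign(math.inf, rho)
    return rho * math.sqrt((n - 2) / (1.0 - rho * rho))
```

Here `x` would hold the model's scores (entropy or frequency) for the targets and `y` the annotated scores; the t value is then compared against a t-distribution to obtain significance levels.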

SLIDE 11

Result Analysis

◮ ausstechen

1605: Von einem Bawren / welcher einem Kalbskopff die Augen außstach.
‘About a farmer / who cut out the eyes of a calf’s head.’
1869: Sie wollen ihre Aufgabe nicht nur lösen, sondern auch elegant, d. h. rasch lösen, um Nebenbuhler auszustechen.
‘They want not only to solve their task, but to solve it elegantly, i.e., fast, in order to outdo rivals.’

  ◮ gold rank: 12/28, entropy: 13, frequency: 17

◮ Donnerwetter

1631: Die Lufft ist heiß / vnd gibt viel Blitzen vnd Donnerwetter ...
‘The air is hot / and there is much lightning and many thunderstorms ...’
1893: Potz Donnerwetter!
‘Man alive!’

  ◮ gold rank: 1/28, entropy: 27, frequency: 15

SLIDE 12

Conclusions

◮ you can annotate semantic change in a corpus (so do it)
◮ entropy correlates strongly and significantly with degree of metaphoric change
◮ frequency correlates moderately, but non-significantly on the small data set
◮ annotation and model are generalizable to different types of semantic change

https://github.com/Garrafao/MetaphoricChange