


SLIDE 1

Approximating Compound Compositionality based on Word Alignments

Fabienne Cap

March 9th, 2017

SLIDE 2

Overview

  • Motivation
  • Methodology
  • Results

Fabienne Cap: Approximating Compound Compositionality based on Word Alignments

SLIDE 9

Background

Several previous works have used alignments to identify MWEs:

  • Medeiros de Caseli et al. (2010) used alignment asymmetries to identify MWEs in Brazilian Portuguese.
  • Salehi and Cook (2013) compared the translations of English MWEs with the translations of their parts. (vpart, nn)
  • Salehi et al. (2014) measured distributional similarity of English and German MWEs and their translations. (vpart, nn)
  • Villada Moirón and Tiedemann (2006) used the variance of MWE alignments to identify idiomatic MWEs in Dutch. (v+pp)

We used this approach and applied it to determine the compositionality of German noun-noun compounds.

SLIDE 19

Why alignment variance?

Motivation 1: missing counterparts

  • The meaning of non-compositional compounds is lexicalised
  • A lexical counterpart might be missing in the other language
  • Translators have to work around it → highly likely that these “work-arounds” will differ
  • Example: Herzblut: passion, commitment, dedication

How about compositional constructions?

  • Ad-hoc created compounds might also lack a counterpart
  • But: due to their compositional meaning, the translator is likely to create the same compound in the other language
  • Example: Blutbus: blood bus

SLIDE 25

Why alignment variance?

Motivation 2: contexts

Some compounds have both a compositional and a non-compositional meaning, depending on their context.

  • compositional: Die Blütezeit der Kirschbäume. “The flowering period of the cherry trees.”
  • non-compositional: Die Blütezeit der Dampfmaschine. “The heyday of the steam machine.”

Translations thus differ considerably, which adds variance.

How about compositional constructions? → less variation of contexts

SLIDE 30

Why alignment variance?

Motivation 3: part of larger idioms

  • Some non-compositional compounds occur only/mostly within larger idioms
  • Translation variations shown by previous works on MWEs
  • Example 1: von der Bildfläche verschwinden: “disappear”, “vanishing into thin air”
  • Example 2: Dreh- und Angelpunkt sein: “the crux of the matter”, “the key element”, ...

SLIDE 38

Methodology

[Figure: word alignment between “The font size should be large enough” and “Die Schriftgröße sollte groß genug sein”, with the compound split into “Schrift” + “größe”]

  • Parallel Corpus
  • Statistical Word Alignment
  • Compound Splitting
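The three steps listed above can be sketched in a few lines of Python. This is only an illustration under assumptions: the slides do not name the alignment tool or data format, so the `i-j` link notation (source index, target index), the function name `collect_part_translations`, and the index pairs in the example are all hypothetical. The sentence pair is the one shown in the figure, with the compound already split.

```python
from collections import Counter, defaultdict


def collect_part_translations(src_tokens, tgt_tokens, alignment, parts):
    """Count which target phrases each compound part is aligned to.

    alignment: space-separated "i-j" links (source index - target index).
    parts: set of source tokens that are compound parts of interest.
    The link format and this helper are assumptions for illustration.
    """
    links = defaultdict(list)
    for pair in alignment.split():
        i, j = pair.split("-")
        links[int(i)].append(int(j))

    counts = defaultdict(Counter)
    for i, token in enumerate(src_tokens):
        if token in parts and links[i]:
            # Several aligned target words form one phrase, e.g. "relative size".
            phrase = " ".join(tgt_tokens[j] for j in sorted(links[i]))
            counts[token][phrase] += 1
    return counts


# Example sentence pair from the slide, after compound splitting.
src = "Die Schrift größe sollte groß genug sein".split()
tgt = "The font size should be large enough".split()
# Hypothetical alignment links for this pair.
alignment = "0-0 1-1 2-2 3-3 4-5 5-6 6-4"

counts = collect_part_translations(src, tgt, alignment, {"Schrift", "größe"})
print(dict(counts["Schrift"]), dict(counts["größe"]))
```

Summing such counters over every sentence pair in which the compound occurs would give per-part translation counts of the kind shown on the Word Alignment slides.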

SLIDE 41

Word Alignment

(a) compositional: Schriftgröße (102 occurrences)

Word Alignments

Schrift = font (65), text (7), fonts (3), size (3), type (2), character (2), sizes (2), font text (1), record (1), (... 16 more singletons ...)

Größe = size (74), sizes (13), relative size (1), (... 14 more singletons ...)

(b) non-compositional: Schriftzug (89 occurrences)

Word Alignments

Schrift = lettering (10), logo (6), label (5), logotype (4), text (3), writing (3), texts (3), inscription (2), sticker (2), etched (2), word (1), imprints (1), (... 47 more singletons ...)

Zug = lettering (10), label (5), logo (5), logotype (4), of (4), inscription (3), sticker (2), letters (2), writings (1), nameplate (1), handwriting (1), (... 51 more singletons ...)

SLIDE 44

Variance

(a) compositional: Schriftgröße

Word Alignments

Schrift = font (65), text (7), fonts (3), size (3), type (2), character (2), sizes (2), font text (1), record (1), (... 16 more singletons ...)

Größe = size (74), sizes (13), relative size (1), (... 14 more singletons ...)

(b) non-compositional: Schriftzug

Word Alignments

Schrift = lettering (10), logo (6), label (5), logotype (4), text (3), writing (3), texts (3), inscription (2), sticker (2), etched (2), word (1), imprints (1), (... 47 more singletons ...)

Zug = lettering (10), label (5), logo (5), logotype (4), of (4), inscription (3), sticker (2), letters (2), writings (1), nameplate (1), handwriting (1), (... 51 more singletons ...)

Calculation of translational entropy:

$$H(T_s \mid s) = -\sum_{t \in T_s} P(t \mid s) \log P(t \mid s)$$

Schriftgröße: 1.451
Schriftzug: 3.827
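As a rough illustration of the entropy formula, the sketch below computes H(T_s|s) from a table of translation counts, estimating P(t|s) as a relative frequency. Several things here are assumptions rather than details from the slides: the logarithm base (2 is used), the function name `translational_entropy`, and the toy counts, which are invented and do not reproduce the scores reported above (the slides omit the singleton translations and do not say how the part-level values are combined).

```python
import math
from collections import Counter


def translational_entropy(counts, base=2.0):
    """H(T_s|s) = -sum_t P(t|s) log P(t|s), with P(t|s) = count(t) / total.

    The log base is an assumption; the slides do not state it.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no aligned translations")
    entropy = 0.0
    for c in counts.values():
        if c > 0:
            p = c / total
            entropy -= p * math.log(p) / math.log(base)
    return entropy


# Invented toy counts: one dominant translation vs. an even spread.
concentrated = Counter({"size": 8, "sizes": 2})
spread = Counter({"lettering": 2, "logo": 2, "label": 2, "text": 2, "writing": 2})

h_concentrated = translational_entropy(concentrated)
h_spread = translational_entropy(spread)
print(h_concentrated < h_spread)
```

A part that is almost always translated the same way yields a low value, while a part whose translations are scattered over many alternatives yields a high one, which is the contrast the two example compounds are meant to show.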

SLIDE 45

Ranking

Compound          Freq.     TE
Seilbahn            561  3.809
Sonnenschirm        594  3.315
Seemann              76  3.114
Armband             178  3.058
Stereoanlage        119  2.899
Wasserhahn           50  2.778
Kaffeemaschine      333  2.725
Hausboot             34  2.718
Bettwäsche          842  2.670
Telefonzelle         26  2.602
Gewächshaus         165  2.584
Schlauchboot         56  2.524
Mülleimer            61  2.500
Kopfkissen           83  2.481
Handtuch            911  2.463
Mülltonne            34  2.459
Schachbrett          66  2.408
Tintenfisch          75  2.394
Sessellift          134  2.368

SLIDE 52

Results

  • Calculate Spearman rank correlation coefficient (ρ-value) on the von der Heide/Borgwaldt dataset
  • Compare to vector-based approach by Schulte im Walde (2016)

vdHB minimal frequency        5       10       25       50      100
#compounds                  143      110       76       43       18
mod.vector               0.5839   0.5478   0.5237   0.4713   0.2301
mod.te                  -0.0175   -0.043  -0.0524  -0.0663  -0.0877
head.vector              0.5942   0.5871   0.5946   0.4804   0.4634
head.te                  0.1268   0.1205   0.1643   0.3392   0.4407
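The evaluation described above (filter compounds by a minimal corpus frequency, then correlate system scores with human ratings) might look roughly like the following. This is a self-contained sketch, not the author's evaluation script: the helper names and the toy data are hypothetical, and ties are handled with average ranks, which is a common convention but not something the slides specify.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither list is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def spearman_above_frequency(items, min_freq):
    """items: (frequency, gold_rating, system_score) triples (hypothetical layout)."""
    kept = [(gold, score) for freq, gold, score in items if freq >= min_freq]
    gold, score = zip(*kept)
    return spearman(list(gold), list(score))


# Invented toy data, not taken from the dataset.
toy = [(120, 5.0, 0.9), (80, 4.0, 0.1), (150, 2.0, 0.5), (200, 1.0, 0.2), (10, 3.0, 0.3)]
rho_all = spearman_above_frequency(toy, 5)
rho_frequent = spearman_above_frequency(toy, 100)
print(rho_all, rho_frequent)
```

Raising the frequency threshold shrinks the evaluated set, as the #compounds row shows, so correlations at the highest thresholds rest on very few items.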

SLIDE 53

A closer look at the results: rankings for the 18 most frequent compounds

vdHB mod ranking   mod.te ranking    vdHB head ranking   head.te ranking
Handtuch           Sonnenschirm      Bettwäsche          Seilbahn
Visitenkarte       Seilbahn          Stereoanlage        Sonnenschirm
Nachttisch         Armband           Seilbahn            Armband
Haselnuss          Gewächshaus       Wasserfall          Stereoanlage
Sonnenblume        Visitenkarte      Eisberg             Kaffeemaschine
Stereoanlage       Bettwäsche        Armband             Bettwäsche
Sessellift         Papierkorb        Papierkorb          Gewächshaus
Kreditkarte        Nachttisch        Kreditkarte         Handtuch
Armband            Sessellift        Gewächshaus         Sessellift
Seilbahn           Stereoanlage      Kaffeemaschine      Wasserfall
Papierkorb         Handtuch          Nachttisch          Papierkorb
Postkarte          Postkarte         Sessellift          Nachttisch
Eisberg            Kaffeemaschine    Postkarte           Postkarte
Sonnenschirm       Wasserfall        Sonnenschirm        Visitenkarte
Bettwäsche         Haselnuss         Handtuch            Haselnuss
Gewächshaus        Sonnenblume       Visitenkarte        Kreditkarte
Kaffeemaschine     Eisberg           Haselnuss           Eisberg
Wasserfall         Kreditkarte       Sonnenblume         Sonnenblume

SLIDE 62

Conclusion and Future Work

Conclusions:

  • Alignment variance is weakly correlated with compositionality
  • Head variance better indicator than modifier variance (!)
  • Problem: data sparsity

Future Work:

  • Weighting of TE scores
  • Combine alignments into different languages
  • Combine with other existing alignment-based scores
  • Combine with monolingual approaches