Semantic Knowledge Acquisition using Frequency Based Patterns Roy - - PowerPoint PPT Presentation


Semantic Knowledge Acquisition using Frequency Based Patterns Roy Schwartz and Ari Rappoport School of Computer Science and Engineering, The Hebrew University of Jerusalem, February 2015 The Catalonia-Israel Symposium on Lexical Semantics and


slide-1
SLIDE 1

Semantic Knowledge Acquisition using Frequency Based Patterns

Roy Schwartz and Ari Rappoport

School of Computer Science and Engineering, The Hebrew University of Jerusalem, February 2015

The Catalonia-Israel Symposium on Lexical Semantics and Grammatical Structure

slide-2
SLIDE 2

The Goal: Acquire (Lexical) Semantic Knowledge

Semantic Knowledge Acquisition using Frequency Based Patterns @ Schwartz and Rappoport 2


slide-7
SLIDE 7

Toolkit


slide-11
SLIDE 11

Disclaimer

  • We present a highly effective computational method
  • We do not attempt to make any linguistic or cognitive claim
    – Nevertheless, there are some related issues, e.g., in construction grammar theories

slide-12
SLIDE 12

Overview

  • Introduction
    – Bag-of-words models
    – Lexico-syntactic Patterns
    – Lexico-syntactic Patterns 2.0: Flexible Patterns
  • Latest results
    – Interpretable Word Embeddings Using Pattern Features (Schwartz, Reichart and Rappoport, under review)

slide-13
SLIDE 13

Bag-of-Words Models

John gave a present to Mary


slide-15
SLIDE 15

Bag-of-Words Models

John gave a present to Mary

Mary present gave John


slide-16
SLIDE 16

Bag-of-Words Models

John gave a present to Mary

Distributional Semantics (Harris, 1954)

Words that occur in similar contexts are likely to have similar meanings
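The order-insensitivity of a bag-of-words representation can be illustrated with a minimal Python sketch. The function name `bag_of_words` and the whitespace tokenization are assumptions made for illustration; they are not part of the talk.

```python
from collections import Counter


def bag_of_words(sentence: str) -> Counter:
    """Map a sentence to an unordered multiset of lowercased tokens."""
    return Counter(sentence.lower().split())


original = bag_of_words("John gave a present to Mary")
swapped = bag_of_words("Mary gave a present to John")

# Both sentences yield the same bag, although giver and receiver are swapped.
same_bag = original == swapped
```

The two sentences describe different events, yet their bags are identical, which is the kind of information loss that the pattern-based methods later in the deck aim to address.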

slide-17
SLIDE 17

Bag-of-Words Applications

  • Represent words using their surrounding (word) contexts
    – Word similarity / association
    – Word clustering / classification
    – …
  • Represent phrases / sentences by the words that they contain
    – Sentiment analysis
    – Spam filters

slide-18
SLIDE 18

Missing: Context

John gave a present to Mary


slide-21
SLIDE 21

Missing: Context

John’s car broke down
John and Mary got married
Workers like John are an asset to every organization


slide-25
SLIDE 25

Lexico-syntactic Patterns

Hearst, 1992

  • Patterns of the form “X is a country”, “X such as Y”, etc.


slide-26
SLIDE 26

Lexico-syntactic Patterns

Hearst, 1992

  • Patterns potentially capture the context in which a word participates

slide-27
SLIDE 27

Lexico-syntactic Patterns

Hearst, 1992

  • For example:
    – A dog participates in patterns (contexts) such as:
    – “X barks”, “X has a tail”, “X and cats”, …


slide-29
SLIDE 29

Semantic Knowledge Acquisition using Patterns

  • Extracting country names
    – “X is a country”
    – Canada is a country in North America
    – There's a sense in America that France is a country of culture


slide-31
SLIDE 31

Semantic Knowledge Acquisition using Patterns

  • Extracting hyponymy relations
    – “X such as Y”
    – Cut the stems of boxed flowers such as roses
    – I am responsible for preparing a range of fruits such as apples
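As a rough illustration of how a fixed lexico-syntactic pattern can be applied, the sketch below matches “X such as Y” with a regular expression over the two example sentences. The regex, the single-word restriction on X and Y, and the function name `extract_hyponym_pairs` are simplifying assumptions, not the method used in the cited work.

```python
import re
from typing import List, Tuple

# Hypothetical single-word version of the "X such as Y" pattern.
SUCH_AS = re.compile(r"\b(\w+) such as (\w+)\b")


def extract_hyponym_pairs(text: str) -> List[Tuple[str, str]]:
    """Return (hypernym candidate, hyponym candidate) pairs found in text."""
    return [(m.group(1), m.group(2)) for m in SUCH_AS.finditer(text)]


sentences = [
    "Cut the stems of boxed flowers such as roses",
    "I am responsible for preparing a range of fruits such as apples",
]
pairs = [pair for s in sentences for pair in extract_hyponym_pairs(s)]
```

On these two sentences the sketch yields the candidate pairs (flowers, roses) and (fruits, apples). Real text would also need handling of multi-word terms and lists such as “X such as Y, Z and W”.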

slide-32
SLIDE 32

Pattern Applications

  • Acquiring the semantics of single words
    – Building semantic lexicons (Riloff and Shepherd, 1997; Roark and Charniak, 1998)
    – Semantic class learning (Kozareva et al., 2008)
  • Acquiring the semantics of relationships between words
    – Discovering hyponymy (Hearst, 1992)
    – Discovering meronymy (Berland and Charniak, 1999)
    – Discovering antonymy (Lin et al., 2003)

slide-33
SLIDE 33

Symmetric Patterns (SPs)

  • X and Y
    – cats and dogs, dogs and cats
    – France and England, England and France
  • X as well as Y
    – friends as well as colleagues, colleagues as well as friends
    – apples and oranges, oranges and apples

slide-34
SLIDE 34

Symmetric Patterns (SPs)

  • Words that co-occur in symmetric patterns are likely to be similar to one another
    – Widdows and Dorow, 2002; Dorow et al., 2005; Davidov et al., 2006; Schwartz et al., 2014

slide-35
SLIDE 35

Limitations of Patterns

  • The early works that adopted lexico-syntactic patterns used a set of manually created patterns
    – Require human (expert) labor
    – Language-specific

slide-36
SLIDE 36

Patterns 2.0: Flexible Patterns

  • Patterns that are extracted automatically


slide-37
SLIDE 37

Patterns 2.0: Flexible Patterns

  • Instead of defining a set of fixed patterns, we define meta-patterns
    – Structures of (potential) patterns
    – High frequency words (HFWs) are used instead of fixed words
    – Content words (CWs) appear as wildcards
    – E.g., “HFW1 X HFW2 Y”

slide-38
SLIDE 38

Patterns 2.0: Flexible Patterns

Frequent and informative patterns are automatically selected
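One way to picture the meta-pattern “HFW1 X HFW2 Y” is the sketch below: words are labeled as high frequency words or content words purely by relative corpus frequency, and every four-word window that fits the meta-pattern contributes one count to a concrete pattern. The thresholds `hfw_min` and `cw_max`, the function name, and the toy corpus are assumptions for illustration; the talk does not give the actual thresholds or the selection criterion for informative patterns.

```python
from collections import Counter
from typing import List


def extract_flexible_patterns(tokens: List[str], hfw_min: float, cw_max: float) -> Counter:
    """Count instantiations of the meta-pattern "HFW1 X HFW2 Y".

    A word is treated as an HFW if its relative frequency is >= hfw_min
    and as a content word (wildcard) if it is <= cw_max (assumed thresholds).
    """
    counts = Counter(tokens)
    total = len(tokens)
    rel = {w: c / total for w, c in counts.items()}
    patterns: Counter = Counter()
    for i in range(len(tokens) - 3):
        h1, x, h2, y = tokens[i:i + 4]
        if (rel[h1] >= hfw_min and rel[h2] >= hfw_min
                and rel[x] <= cw_max and rel[y] <= cw_max):
            patterns[f"{h1} X {h2} Y"] += 1
    return patterns


toy_corpus = "from paris to rome and from london to berlin and from tokyo to kyoto".split()
patterns = extract_flexible_patterns(toy_corpus, hfw_min=0.2, cw_max=0.1)
```

In this toy corpus only “from X to Y” is found, three times. On a real corpus, the resulting counts would still have to be filtered for frequency and informativeness.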


slide-39
SLIDE 39

Extracted Flexible Patterns

“HFW1 X HFW2 Y”

  • as X as Y
  • the X the Y
  • an X from Y
  • from X to Y
  • a X has Y
  • to X big Y
  • in X the Y
  • an X do Y
  • to X and Y


slide-40
SLIDE 40

Extracted Flexible Patterns

“HFW1 X HFW2 Y”

  • as X as Y
  • from X to Y
  • a X has Y
  • to X and Y


slide-41
SLIDE 41

Benefits of using Flexible Patterns

  • Flexible patterns are computed automatically
    – Based solely on word frequencies
    – Do not require expert knowledge
    – Language and domain independent
    – Large coverage

slide-42
SLIDE 42

Automatic Discovery of Symmetric Patterns

Davidov and Rappoport, ACL 2006

  • An algorithm for extracting symmetric patterns from plain text (symmetric flexible patterns)
    – “X and Y”, “X as well as Y”, “neither X nor Y”
    – Input: a large corpus of plain text
    – Output: a set of symmetric patterns

slide-43
SLIDE 43

Automatic Discovery of Symmetric Patterns

Davidov and Rappoport, ACL 2006

  • Application: cluster nouns into meaningful semantic groups
    – Discovered categories include chemical elements, university names, languages, fruits, fishing baits, …


slide-47
SLIDE 47

DR06 Example

Pattern: X and Y

[Figure: graph over the words Dog, Cat, House, Camera, Rat and Computer, built from instances of the pattern such as “house and computer”; legend: symmetric edges, asymmetric edges]

M = #symmetric_edges / #all_edges
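The ratio M can be sketched as follows. This is a minimal illustration that assumes each observed instance “x PATTERN y” contributes a directed edge (x, y), that repeated instances are collapsed into one edge, and that an edge is symmetric when its reverse was also observed. The function name and the toy instances are hypothetical, and the DR06 paper may define the measure differently in detail.

```python
from typing import Iterable, Set, Tuple


def symmetry_measure(instances: Iterable[Tuple[str, str]]) -> float:
    """Return #symmetric_edges / #all_edges over distinct directed edges."""
    edges: Set[Tuple[str, str]] = set(instances)
    if not edges:
        return 0.0
    symmetric = sum(1 for (x, y) in edges if (y, x) in edges)
    return symmetric / len(edges)


# Toy instances of "X and Y": four edges have a reverse, one does not.
and_instances = [
    ("dog", "cat"), ("cat", "dog"),
    ("cat", "rat"), ("rat", "cat"),
    ("house", "computer"),
]
m_and = symmetry_measure(and_instances)
```

Here four of the five distinct edges are symmetric, so M is 0.8. A pattern whose instances rarely occur in both orders would score close to 0.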

slide-48
SLIDE 48

Resulting Set of Patterns

  • “X and Y”
  • “X or Y”
  • “X as well as Y”
  • “X nor Y”
  • “X and the Y”
  • “X or the Y”
  • “X or a Y”
  • “X and one Y”
  • “from X to Y”
  • “X rather than Y”


slide-49
SLIDE 49

Minimally Supervised Noun Classification

Schwartz, Reichart and Rappoport, Coling 2014

  • Classify nouns into semantic categories
    – Animals, edibles, tools, …

slide-50
SLIDE 50

Minimally Supervised Noun Classification

Schwartz, Reichart and Rappoport, Coling 2014

  • For each semantic category, start with a small set of positive and negative examples
    – Typically only two positive examples and two negative examples

slide-51
SLIDE 51

Minimally Supervised Noun Classification

Schwartz, Reichart and Rappoport, Coling 2014

  • Link words that co-occur in symmetric flexible patterns



slide-57
SLIDE 57

Example: Animate class

[Figure: graph over the words Dog, Cat, House, Couch, Purse, Rat, Car, Mole, Chair, Hammer, Computer, Owl, Apple and Whale, annotated with numeric weights ranging from .03 to .89]

82-94% accuracy on 450 words

slide-58
SLIDE 58

Interpretable Word Embeddings Using Pattern Features

Roy Schwartz, Roi Reichart and Ari Rappoport

(Under Revision)

slide-59
SLIDE 59

Vector Space Models

  • Representations of words as vectors of features (numbers)

[Figure: an example vector Vdog with entries 0.5, 0.76, 0.12, 0.76, 0.51, …]

slide-60
SLIDE 60

Vector Space Models

  • Features are usually bag-of-words counts
    – Directly or via some mathematical transformation

slide-61
SLIDE 61

Vector Space Models

  • In recent years, deep neural network models have been applied to generate accurate vector representations (aka word embeddings)
    – Bengio, 2003; Collobert, 2008 & 2011; word2vec (Mikolov 2013{a,b,c})

slide-62
SLIDE 62

Word Embeddings (Cool!) Properties

  • (accurate) Word similarity

[Figure: the angle Θ between the vectors Vdog and Vcar]

slide-63
SLIDE 63

Word Embeddings (Cool!) Properties

  • Word analogy (Mikolov et al., 2013)

slide-64
SLIDE 64

Word Embeddings Applications

  • Information Retrieval
  • Document Classification
  • Question Answering
  • Named Entity Recognition
  • Parsing


slide-65
SLIDE 65

Word Embeddings Limitations

  • Resulting vectors are highly uninterpretable
    – Sequences of several hundred numbers
    – Not clear what each number represents

slide-66
SLIDE 66

Word Embeddings Limitations

  • Restricted to a limited set of relations
    – Similarity/Relatedness, some analogies
    – Other relations are not supported: hyponymy (animal → dog), antonymy (big/tall), etc.

slide-67
SLIDE 67

Symmetric Patterns to Word Embeddings

  • Input: a large corpus C (8G words)


slide-68
SLIDE 68

Symmetric Patterns to Word Embeddings

  • Extract a set of SPs P using the DR06 algorithm


slide-69
SLIDE 69

Symmetric Patterns to Word Embeddings

  • Traverse C, extract all instances of all p in P
    – cats and dogs
    – House and the rooms
    – from France to England
    – …

slide-70
SLIDE 70

Symmetric Patterns to Word Embeddings (2)

  • For each word w in the lexicon, build a count vector (Vw) of all other words that co-occur with w in SPs

slide-71
SLIDE 71

Symmetric Patterns to Word Embeddings (2)

  • orange
    1. … apples and oranges …
    2. … oranges as well as grapes …
    K. … neither banana nor orange
  • China
    1. … Japan or China …
    2. … China rather than Korea …
    M. … Vietnam and China …
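Building the count vectors from such instances might look like the sketch below. The three regular expressions stand in for the automatically extracted pattern set P, and the function name `build_count_vectors` is hypothetical. Surface forms are not lemmatized here, so “orange” and “oranges” get separate vectors, which a real pipeline may or may not want.

```python
import re
from collections import Counter, defaultdict
from typing import DefaultDict, List

# Hand-written stand-ins for the extracted symmetric patterns (an assumption).
SYMMETRIC_PATTERNS = [
    re.compile(r"\b(\w+) and (\w+)\b"),
    re.compile(r"\b(\w+) as well as (\w+)\b"),
    re.compile(r"\bneither (\w+) nor (\w+)\b"),
]


def build_count_vectors(sentences: List[str]) -> DefaultDict[str, Counter]:
    """Map each word w to a sparse count vector of its symmetric-pattern partners."""
    vectors: DefaultDict[str, Counter] = defaultdict(Counter)
    for sentence in sentences:
        for pattern in SYMMETRIC_PATTERNS:
            for match in pattern.finditer(sentence.lower()):
                x, y = match.group(1), match.group(2)
                # Count the co-occurrence in both directions.
                vectors[x][y] += 1
                vectors[y][x] += 1
    return vectors


vectors = build_count_vectors([
    "apples and oranges",
    "oranges as well as grapes",
    "neither banana nor orange",
])
```

Each vector is stored sparsely as a `Counter`, which fits the observation later in the deck that only a small fraction of the dimensions are nonzero for a typical word.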

slide-72
SLIDE 72

Symmetric Patterns to Word Embeddings (3)

  • Compute the Positive Pointwise Mutual Information (PPMI) between each pair of words

$$\mathrm{PMI}(w_i, w_j) = \log \frac{p(w_i, w_j)}{p(w_i)\, p(w_j)}$$

$$\mathrm{PPMI}(w_i, w_j) = \begin{cases} \mathrm{PMI}(w_i, w_j) & \text{if } \mathrm{PMI}(w_i, w_j) > 0 \\ 0 & \text{otherwise} \end{cases}$$
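A small sketch of the PPMI computation over pair counts follows. It assumes that the joint and marginal probabilities are maximum-likelihood estimates taken from the co-occurrence counts themselves, and that the natural logarithm is used; the talk does not specify either choice. The function name and the toy counts are made up for illustration.

```python
import math
from collections import Counter
from typing import Dict, Tuple


def ppmi_scores(pair_counts: Dict[Tuple[str, str], int]) -> Dict[Tuple[str, str], float]:
    """Compute PPMI(wi, wj) = max(PMI(wi, wj), 0) for every observed pair."""
    total = sum(pair_counts.values())
    first: Counter = Counter()   # marginal counts of wi
    second: Counter = Counter()  # marginal counts of wj
    for (wi, wj), count in pair_counts.items():
        first[wi] += count
        second[wj] += count
    scores = {}
    for (wi, wj), count in pair_counts.items():
        p_joint = count / total
        p_i = first[wi] / total
        p_j = second[wj] / total
        pmi = math.log(p_joint / (p_i * p_j))
        scores[(wi, wj)] = max(pmi, 0.0)
    return scores


toy_counts = {
    ("dog", "cat"): 8, ("dog", "house"): 1,
    ("car", "house"): 8, ("car", "cat"): 1,
}
scores = ppmi_scores(toy_counts)
```

In the toy counts, (dog, cat) co-occurs more often than chance and gets a positive score, while (dog, house) co-occurs less often than chance and is clipped to 0.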


slide-74
SLIDE 74

The Result: Interpretable Word Embeddings based on Symmetric Patterns

$$V^{SP}_{dog} = \big(\mathrm{PPMI}(dog, house),\ \mathrm{PPMI}(dog, mouse),\ \mathrm{PPMI}(dog, zebra),\ \mathrm{PPMI}(dog, wine),\ \mathrm{PPMI}(dog, cat),\ \mathrm{PPMI}(dog, dolphin),\ \mathrm{PPMI}(dog, bottle),\ \mathrm{PPMI}(dog, pen),\ \ldots\big)$$

$$|V^{SP}_w| \approx 500\text{K}$$

$$E_w\big(|\mathrm{nonzero}(V^{SP}_w)|\big) \approx 50$$

slide-75
SLIDE 75

Interpretability

  • Our model is interpretable in two different manners


slide-76
SLIDE 76

Interpretability

  • We understand what each feature represents

– The similarity score between the target word and another word w


slide-77
SLIDE 77

Interpretability

  • We understand how the value of each feature is generated

– The co-occurrence score of the target word and w in symmetric patterns


slide-78
SLIDE 78

Interpretability

  • Interpretability can be exploited to improve our model



slide-80
SLIDE 80

Antonyms

big / small

  • Antonyms appear in similar contexts
    – Here is a X car
    – I live in a X house

slide-81
SLIDE 81

Antonyms

big / small

  • In typical word embeddings, cos(Vbig, Vsmall) is high

slide-82
SLIDE 82

Antonyms

big / small

  • Some symmetric patterns are indicative of antonymy*
    – “either X or Y” (either big or small), “from X to Y” (from poverty to richness)

* Lin et al. (2003)

slide-83
SLIDE 83

Antonyms

  • A variant of our model that assigns dissimilar vectors to antonym pairs

slide-84
SLIDE 84

Antonyms

  • For each word w, compute $V^{AP}_w$ similarly to $V^{SP}_w$, but using the set of antonym patterns

$$V^{AP'}_w = V^{SP}_w - \beta \cdot V^{AP}_w$$

  • β is tuned using a development set
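The antonym-sensitive variant can be sketched with sparse vectors stored as dictionaries. The sketch assumes that the combination subtracts β times the antonym-pattern vector from the symmetric-pattern vector, dimension by dimension; the function name, the toy vectors and the value of `beta` are made up for illustration.

```python
from typing import Dict


def antonym_sensitive_vector(
    v_sp: Dict[str, float], v_ap: Dict[str, float], beta: float
) -> Dict[str, float]:
    """Return v_sp - beta * v_ap over the union of their nonzero dimensions."""
    keys = set(v_sp) | set(v_ap)
    return {k: v_sp.get(k, 0.0) - beta * v_ap.get(k, 0.0) for k in keys}


# Toy scores for the word "big": "small" co-occurs with it in antonym patterns.
v_sp_big = {"large": 2.0, "small": 1.5}
v_ap_big = {"small": 1.0}
v_big = antonym_sensitive_vector(v_sp_big, v_ap_big, beta=2.0)
```

In this toy case the dimension for “small” drops from 1.5 to -0.5 while “large” is unchanged, so the resulting vector no longer treats “small” as a similar word.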

slide-85
SLIDE 85

Experiments

  • Word similarity task
    – Experiments with the SimLex999 dataset (Hill et al., 2014)
    – 999 word pairs, each assigned a similarity score by human annotators
    – $f_{model}(w_i, w_j) = \cos(V^{model}_{w_i}, V^{model}_{w_j})$
    – The evaluation result is the Spearman’s ρ score between model and human judgments
    – Numbers are average scores of 10 folds of 25% (dev) / 75% (test) partitions
    – Baselines: 2 interpretable baselines, 3 state-of-the-art, non-interpretable baselines
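The evaluation described here can be sketched in plain Python: cosine similarity over sparse vectors gives the model score for a word pair, and Spearman's ρ is computed as the Pearson correlation of the two rank lists. All function names and the toy scores are assumptions for illustration; a library routine such as SciPy's `spearmanr` could be used instead.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(val * v.get(k, 0.0) for k, val in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the two rank lists."""
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in ra))
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in rb))
    return cov / (sd_a * sd_b) if sd_a and sd_b else 0.0


model_scores = [0.9, 0.1, 0.5]   # hypothetical model similarities
human_scores = [9.0, 1.0, 6.0]   # hypothetical human ratings
rho = spearman(model_scores, human_scores)
```

Because the toy model ranks the three pairs in the same order as the human ratings, ρ comes out at (numerically) 1. Only the ranking matters, not the scale of the scores.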


slide-88
SLIDE 88

Antonyms


slide-89
SLIDE 89

Joint Model

$$f_{joint}(w_i, w_j) = \gamma \cdot f_{SP}(w_i, w_j) + (1 - \gamma) \cdot f_{skip\text{-}gram}(w_i, w_j)$$

γ determined using a development set
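The interpolation and the tuning of γ can be sketched as a simple grid search. The `evaluate` callable stands for whatever development-set metric is used (Spearman's ρ in the talk); the negative squared error used in the toy example, the grid, and all names are placeholders chosen for illustration.

```python
from typing import Callable, List, Sequence


def joint_score(f_sp: float, f_sg: float, gamma: float) -> float:
    """Linear interpolation of the SP score and the skip-gram score."""
    return gamma * f_sp + (1.0 - gamma) * f_sg


def tune_gamma(
    dev_sp: Sequence[float],
    dev_sg: Sequence[float],
    evaluate: Callable[[List[float]], float],
    grid: Sequence[float],
) -> float:
    """Return the gamma in grid whose joint scores maximize evaluate on dev data."""
    return max(
        grid,
        key=lambda g: evaluate([joint_score(a, b, g) for a, b in zip(dev_sp, dev_sg)]),
    )


# Toy development set: the SP scores match the gold scores exactly.
gold = [1.0, 0.0]
dev_sp = [1.0, 0.0]
dev_sg = [0.0, 1.0]


def neg_squared_error(scores: List[float]) -> float:
    return -sum((s - g) ** 2 for s, g in zip(scores, gold))


best_gamma = tune_gamma(dev_sp, dev_sg, neg_squared_error, grid=[0.0, 0.5, 1.0])
```

In this toy setup the search selects γ = 1.0, because the SP scores alone reproduce the gold scores. With real data the best γ would typically lie strictly between 0 and 1.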


slide-91
SLIDE 91

Patterns

  • Highly effective computational tool
    – High quality results (either unsupervised or very weakly supervised)
  • Simple to understand and implement
    – Can be implemented in computer hardware

slide-92
SLIDE 92

Patterns and Language Acquisition

  • Children learn linguistic structures, among others, through pattern-finding in their discourse interactions with others (Tomasello, 2003)

slide-93
SLIDE 93

Summary

  • Patterns are useful for extracting semantic information
  • Symmetric patterns are at least as useful as (in fact, more useful than) state-of-the-art word embeddings in modeling word similarity
    – 5–9.4 points gap
  • Patterns can capture relations that word embeddings cannot
    – Antonymy
  • SPs can be combined with state-of-the-art embeddings to create an even more accurate representation
    – 10.1 points higher than state-of-the-art

slide-94
SLIDE 94

Current Work: Asymmetry of Symmetric Patterns

  • Symmetric patterns are not really symmetric
    – good or bad >> bad or good, more or less >> less or more
    – Order of binomials (Bunin Benor and Levy, 2006)
  • In a large majority of the cases, the positive word comes before the negative one
  • Application: polarity induction

slide-95
SLIDE 95

roys02@cs.huji.ac.il http://www.cs.huji.ac.il/~roys02/
