SLIDE 1

10/28/2014 EMNLP 2014 in Doha, Qatar

Jointly Learning Word Representations and Composition Functions Using Predicate-Argument Structures

Kazuma Hashimoto (UT) Pontus Stenetorp (UT) Makoto Miwa (TTI) Yoshimasa Tsuruoka (UT) University of Tokyo (UT) Toyota Technological Institute (TTI)

SLIDE 4

Neural Word Vector Representations

  • Neural networks + large unlabeled corpora

– Learn word (i.e. single token) representations

  • e.g.) word2vec (Mikolov+ 2013; Mnih and Kavukcuoglu 2013; inter alia)

– Learn composed vector representations

  • e.g.) compositional neural language models for verb-object vectors (Tsubaki+ 2013)

SLIDE 5

Relation to Previous Work

                                           word2vec   Compositional neural LMs   Our model
  single token representations                ✓                  ✓                   ✓
  recursive structures of syntactic relations x                  x                   ✓
  pre-training                                ✓                  x                   ✓
  composition                                 x                  ✓                   ✓

SLIDE 14

A Joint Learning Model

  • Learning word and composed representations

– using syntactic structures of unlabeled corpora
– without pre-trained word vectors

(Figure: clusters of similar vectors, e.g. storm / downpour / heavy rain; make payment / pay; solve problem / achieve objective / bridge gap / solve / overcome)

State-of-the-art scores for phrase similarity tasks with transitive verbs

SLIDE 15

Overview

  • 1. Learning word representations using predicate-argument structures
  • 2. Jointly learning word representations and composition functions
  • 3. Evaluation on phrase similarity tasks
  • 4. Conclusion

SLIDE 19

Predicate-Argument Structures (PASs)

  • Standard dependency structures

– Relations between heads and modifiers

  • Predicate-Argument Structures (PASs)

– Relations between predicates and arguments

(Figure: "the heavy rain caused the car accidents" with dependency labels det, amod, nn, nsubj, dobj, root, contrasted with its PAS analysis)

SLIDE 25

Predicate-Argument Structures (PASs)

  • Each predicate in a sentence has

– a specific category
– zero or more arguments

(Enju parser (Miyao and Tsujii 2008))

(Figure: in "the heavy rain caused the car accidents", heavy is an adjective predicate with argument 1, caused is a verb predicate with arguments 1 and 2, and car is a noun predicate with argument 1)

SLIDE 33

A Word Prediction Model Using PASs

  • Given a PAS, discriminating between

– a word in the specific PAS and
– a word drawn from a noise distribution (scaled unigram distribution in (Mikolov+ 2013))

Example PAS: rain cause accident (verb, argument 1, argument 2)

– a target word: cause vs a drawn word: eat
– the remaining words of the PAS serve as context information
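The noise words are drawn from the scaled unigram distribution of Mikolov+ (2013), i.e. unigram counts raised to a power (3/4 in word2vec) and renormalized. A minimal sketch, assuming toy counts; the function names are illustrative, not the authors' code:

```python
import random

def scaled_unigram(counts, power=0.75):
    """Unigram counts raised to `power` (0.75 in word2vec), renormalized."""
    scaled = {w: c ** power for w, c in counts.items()}
    total = sum(scaled.values())
    return {w: s / total for w, s in scaled.items()}

def draw_noise_word(dist, rng=random):
    """Sample one noise word from the scaled unigram distribution."""
    r, acc = rng.random(), 0.0
    for w, p in dist.items():
        acc += p
        if r < acc:
            return w
    return w  # numerical edge case: return the last word

# toy counts: frequent words stay likely noise candidates, but less dominant
dist = scaled_unigram({"eat": 40, "cause": 10, "accident": 5})
```

Raising counts to the 3/4 power flattens the distribution, so rare words are drawn as negatives more often than under the raw unigram distribution.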

SLIDE 42

A Word Prediction Model Using PASs

The argument word vectors w(rain) and w(accident) are combined, with weight vectors specific to the PAS category (verb_arg12) and the argument position, into a context vector for the predicate (∗: element-wise product):

q(cause) = tanh( h_arg1^(verb_arg12) ∗ w(rain) + h_arg2^(verb_arg12) ∗ w(accident) )

The target word and the drawn noise word are then scored against this context:

t = w(cause) ∙ q(cause)
t′ = w(eat) ∙ q(cause)

cost: max(0, 1 − t + t′)
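The scoring and ranking objective on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: `h` stands in for the category/argument-specific weight vectors h_arg1, h_arg2, and the vectors are random placeholders rather than trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4

# word vectors w(.) and per-(category, argument-slot) weight vectors h
w = {word: rng.normal(size=dim) for word in ["rain", "accident", "cause", "eat"]}
h = {("verb_arg12", "arg1"): rng.normal(size=dim),
     ("verb_arg12", "arg2"): rng.normal(size=dim)}

def context_vector(category, args):
    """q(.) = tanh( sum_i h_{arg i}^{category} * w(arg_i) ), * element-wise."""
    return np.tanh(sum(h[(category, slot)] * w[word] for slot, word in args))

def hinge_cost(target, noise, q):
    """cost = max(0, 1 - t + t'), with t = w(target).q and t' = w(noise).q."""
    t, t_noise = w[target] @ q, w[noise] @ q
    return max(0.0, 1.0 - t + t_noise)

q = context_vector("verb_arg12", [("arg1", "rain"), ("arg2", "accident")])
cost = hinge_cost("cause", "eat", q)
```

Minimizing the hinge cost pushes the score of the observed predicate (cause) above the score of the noise word (eat) by a margin of 1.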

SLIDE 50

What We Expect from the Model

  • Learning word representations based on

– specific PAS categories
– selectional preferences

``rain'' can be

  • a subject of ``cause'' (not ``eat'')
  • a cause of ``accident''

SLIDE 54

Examples

  • eat at restaurant (preposition; argument 1: eat, argument 2: restaurant)

– w(eat) and w(at) (argument 1 + predicate) form the context; target: restaurant vs drawn word: cupboard

  • heavy rain (adjective; argument 1: rain)

– w(rain) forms the context; target: heavy vs drawn word: delicious

SLIDE 58

Adding Bag-of-Words Contexts

  • Providing additional context information

– nouns and verbs in the same sentence

(Figure: BoW context vectors w(road) and w(injure) are added to the PAS context w(rain) + w(accident) when discriminating cause from eat)

SLIDE 64

Beyond Single Word Representations

  • Learning representations composed by

– multiple words and
– specific relation categories

(Figure: the phrase "heavy rain" (adjective, argument 1) placed near storm and downpour in the vector space)

SLIDE 68

A Specific PAS as a Single Token

  • Using connections on graphs of PASs

– heavy (adjective, argument 1) and car (noun, argument 1) attach to the PAS rain cause accident (verb, argument 1, argument 2)

  • The connected phrases are parameterized as single tokens: w(heavy_rain), w(car_accident)
  • Training is the same as previously, with these PAS tokens in place of the single words

SLIDE 69

Learned PAS Representations

  • Similar tokens for each PAS representation in terms of cosine similarity

heavy_rain:      rain, thunderstorm, downpour, blizzard, much_rain
chief_executive: general_manager, vice_president, executive_director, project_manager, managing_director
world_war:       second_war, plane_crash, riot, last_war, great_war

SLIDE 70

make_payment:       make_order, carry_survey, pay_tax, pay, impose_tax
solve_problem:      achieve_objective, bridge_gap, improve_quality, deliver_information, encourage_development
meeting_take_place: hold_meeting, event_take_place, end_season, discussion_take_place, do_work
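Neighbor lists like these come from ranking all tokens by cosine similarity to a query vector. A generic sketch, with random placeholder vectors standing in for the learned ones:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest(query, vectors, k=5):
    """Return the k tokens most similar to `query` under cosine similarity."""
    sims = [(tok, cosine(vectors[query], v))
            for tok, v in vectors.items() if tok != query]
    return [tok for tok, _ in sorted(sims, key=lambda x: -x[1])[:k]]

rng = np.random.default_rng(1)
vocab = ["heavy_rain", "downpour", "thunderstorm", "world_war", "great_war"]
vectors = {tok: rng.normal(size=8) for tok in vocab}
neighbors = nearest("heavy_rain", vectors, k=2)
```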

SLIDE 71

Overview

  • 1. Learning word representations using predicate-argument structures
  • 2. Jointly learning word representations and composition functions
  • 3. Evaluation on phrase similarity tasks
  • 4. Conclusion

SLIDE 76

Why Composition?

Fully parameterized PAS representations (e.g. w(heavy_rain), w(car_accident)):

  • Very large number of combinations of words  data sparseness
  • Ignoring information from individual words

SLIDE 82

Incorporating Composed Vectors

  • Composed vectors are built from word vectors by composition functions:

w(heavy rain) = h_adj_arg1(w(heavy), w(rain))
w(car accident) = h_noun_arg1(w(car), w(accident))

  • Training with the composed vectors is the same as previously

SLIDE 86

Composition Functions in this Work

  • Simple element-wise composition functions, with and without tanh

– e.g.) w(heavy rain) = h_adj_arg1(w(heavy), w(rain))

Composition function h_adj_arg1:

  Add_l     w(heavy) + w(rain)
  Add_nl    tanh(w(heavy) + w(rain))
  WAdd_l    m_pred^(adj_arg1) ∗ w(heavy) + m_arg1^(adj_arg1) ∗ w(rain)
  WAdd_nl   tanh(m_pred^(adj_arg1) ∗ w(heavy) + m_arg1^(adj_arg1) ∗ w(rain))

(m_pred, m_arg1: weight vectors specific to the PAS category; ∗: element-wise product)
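The four composition functions in the table can be sketched as follows. `m_pred` and `m_arg1` stand for the category-specific weight vectors; all values here are random placeholders for illustration, not trained parameters.

```python
import numpy as np

dim = 4
rng = np.random.default_rng(2)
m_pred, m_arg1 = rng.normal(size=dim), rng.normal(size=dim)  # per-category weights

def add_l(wp, wa):    # Add_l: plain element-wise addition
    return wp + wa

def add_nl(wp, wa):   # Add_nl: addition followed by tanh
    return np.tanh(wp + wa)

def wadd_l(wp, wa):   # WAdd_l: element-wise weighted addition
    return m_pred * wp + m_arg1 * wa

def wadd_nl(wp, wa):  # WAdd_nl: weighted addition followed by tanh
    return np.tanh(m_pred * wp + m_arg1 * wa)

w_heavy, w_rain = rng.normal(size=dim), rng.normal(size=dim)
w_heavy_rain = wadd_nl(w_heavy, w_rain)
```

The weighted variants let the model learn, per PAS category, how much each slot (predicate vs argument) contributes to the composed vector.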

SLIDE 87

Learned Composed Vectors

  • Similar composed representations in terms of cosine similarity

make payment:  make repayment, make money, make indemnity, make saving, make sum
solve problem: solve dilemma, solve task, solve difficulty, solve trouble, solve contradiction
run company:   run firm, run industry, run corporation, run enterprise, run club

SLIDE 88

people kill animal: anyone kill animal, man kill animal, person kill animal, people kill bird, predator kill animal
animal kill people: creature kill people, effusion kill people, elephant kill people, tiger kill people, people kill people
meeting take place: briefing take place, party take place, session take place, conference take place, investiture take place

SLIDE 90

Learned Composition Weights

  • L2-norms of the weight vectors of WAdd_nl

– Clearly emphasizing head words (nouns for adj_arg1 and noun_arg1, verbs for verb_arg12)

Category     Predicate   Argument 1   Argument 2
adj_arg1        2.38        6.55
noun_arg1       3.37        5.60
verb_arg12      6.78        2.57         2.18

SLIDE 91

Overview

  • 1. Learning word representations using predicate-argument structures
  • 2. Jointly learning word representations and composition functions
  • 3. Evaluation on phrase similarity tasks
  • 4. Conclusion

SLIDE 94

Experimental Settings

  • Training data

– PASs from the BNC (~6 million sentences)

  • adjective-noun, noun-noun
  • prepositions and verbs with 2 arguments

  • Dimensionality

– 50 and 1,000

  • Optimization

– AdaGrad (Duchi+ 2011)

  • learning rate: 0.05, mini-batch size: 32
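AdaGrad keeps a running sum of squared gradients per parameter and divides the learning rate by its square root, so frequently updated parameters get smaller steps. A minimal sketch with the learning rate of 0.05 from the settings above (the class structure is illustrative, not the authors' code):

```python
import numpy as np

class AdaGrad:
    """Per-parameter learning-rate scaling (Duchi+ 2011)."""
    def __init__(self, lr=0.05, eps=1e-8):
        self.lr, self.eps = lr, eps
        self.sq_sum = None  # running sum of squared gradients

    def update(self, params, grad):
        if self.sq_sum is None:
            self.sq_sum = np.zeros_like(params)
        self.sq_sum += grad ** 2
        # scale each coordinate's step by the history of its gradients
        return params - self.lr * grad / (np.sqrt(self.sq_sum) + self.eps)

opt = AdaGrad(lr=0.05)
p = np.array([1.0, -1.0])
p = opt.update(p, np.array([0.5, -0.5]))
```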

SLIDE 101

Datasets for Evaluation

  • Measuring the semantic similarity between

– Adjective-Noun phrases (AN) (Mitchell and Lapata 2010)
– Noun-Noun phrases (NN) (Mitchell and Lapata 2010)
– Verb-Object phrases (VO) (Mitchell and Lapata 2010)
– Subject-Verb-Object phrases (SVO) (Grefenstette and Sadrzadeh 2011)

Example from the AN dataset: p1: vast amount, p2: large quantity

– human annotator: similarity score 7
– model: cos(w(p1), w(p2)) = 0.85
– evaluation: Spearman's rank correlation between human scores and model similarities
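The evaluation above compares model cosine similarities against human scores via Spearman's rank correlation. A self-contained sketch (no tie correction; the phrase scores are made-up illustrations, not dataset values):

```python
import numpy as np

def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the ranks (no tie averaging)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank
        return np.array(r, dtype=float)
    rx, ry = ranks(xs), ranks(ys)
    rx, ry = rx - rx.mean(), ry - ry.mean()
    return float(rx @ ry / np.sqrt((rx @ rx) * (ry @ ry)))

# human scores for phrase pairs vs model cosine similarities (illustrative)
human = [7, 6, 1, 1]
model = [0.85, 0.70, 0.10, 0.05]
rho = spearman(human, model)
```

Rank correlation only cares whether the model orders the phrase pairs the same way the annotators do, not about the absolute similarity values.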

SLIDE 102

Examples of Phrase Pairs (noun phrase tasks)

AN phrase pair                        score
vast amount / large quantity            7
important part / significant role       7
efficient use / little room             1
early stage / dark eye                  1

NN phrase pair                        score
wage increase / tax rate                7
education course / training programme   6
office worker / kitchen door            2
study group / news agency               1

SLIDE 103

Examples of Phrase Pairs (verb phrase tasks)

VO phrase pair                score
start work / begin career       7
pour tea / drink water          6
shut door / close eye           1
wave hand / start work          1

SVO phrase pair                                  score
student write name / student spell name            7
child show sign / child express sign               6
river meet sea / river visit sea                   1
system meet criterion / system visit criterion     1

SLIDE 107

Main Results (50 dim)

  • Strong baselines produced by word2vec
  • Nice scores for verb phrase tasks

(Bar chart: Spearman correlation on AN, NN, VO, SVO for Add_l, Add_nl, WAdd_l, WAdd_nl, word2vec, and human agreement; y-axis 0.1 to 0.7)

SLIDE 108

Main Results (1,000 dim)

  • Nice scores for verb phrase tasks
  • Consistently outperforming 50-dimensional vectors

(Bar chart: same models and tasks as the 50-dim results)

SLIDE 109

Comparison with Previous Work

  • The AN, NN, and VO tasks

– BL: element-wise multiplications (Blacoe and Lapata 2012)
– HB: recursive neural networks with CCGs (Hermann and Blunsom 2013)
– KS: tensor-based composition models (Kartsaklis and Sadrzadeh 2013)

  • The SVO task

– GS, VC: tensor-based composition models (Grefenstette and Sadrzadeh 2011; Van de Cruys+ 2013)

SLIDE 112

The AN, NN, and VO Tasks

  • 50 dim

– Comparable to state-of-the-art scores

  • 1,000 dim

– New state-of-the-art score for the VO task

(Bar chart: Add_nl and WAdd_nl vs BL, HB, KS and human agreement on AN, NN, VO)

SLIDE 114

The SVO Task

  • State-of-the-art models use large corpora

– e.g.) ukWaC corpus (~2B words)

  • Achieving the state-of-the-art score using a much smaller corpus

– BNC (~0.1B words) vs ukWaC (~2B words)

(Bar chart: WAdd_nl trained on the BNC vs GS and VC trained on ukWaC, with human agreement, on SVO)

SLIDE 115

Effects of BoW Contexts

  • BoW contexts are helpful for the verb phrase tasks

– The results might depend on how the BoW contexts are constructed

(Bar chart: WAdd_nl with and without BoW contexts on AN, NN, VO, SVO, with human agreement)

SLIDE 116

Overview

  • 1. Learning word representations using predicate-argument structures
  • 2. Jointly learning word representations and composition functions
  • 3. Evaluation on phrase similarity tasks
  • 4. Conclusion

SLIDE 117

Conclusion

  • Jointly learning word representations and composition functions

– with syntactic structures
– without any pre-trained word vectors

  • State-of-the-art scores for verb phrase similarity tasks

SLIDE 118

Future Work

  • Incorporating more sophisticated composition functions to improve verb phrase representations
  • Learning full phrase representations rather than only 2- or 3-word phrases

SLIDE 119

Thank You Very Much!

  • Any questions?