Entailment above the word level in distributional semantics - PowerPoint PPT Presentation



SLIDE 1

Entailment above the word level in distributional semantics

Marco Baroni (University of Trento), Raffaella Bernardi (University of Trento), Ngoc-Quynh Do (EM LCT, Free University of Bozen-Bolzano), Chung-chieh Shan (Cornell University, University of Tsukuba). EACL, 25 April 2012.

SLIDE 5

2/17

Summary

Entailment among composite phrases rather than nouns. (Cheap training data!)

Entailment among logical words rather than content words. (Part of Recognizing Textual Entailment?)

Different entailment relations at different semantic types. (Prediction from formal semantics.)

train: AN ⊨ N (big cat ⊨ cat)  →  test: N ⊨ N (dog ⊨ animal)
train: QN ⊨ QN (many dogs ⊨ some dogs)  →  test: QN ⊨ QN (all cats ⊨ several cats)
train: AN ⊨ N  →  test: QN ⊨ QN: × (does not transfer)

SLIDE 7

3/17

Approaches to semantics

“In order to say what a meaning is, we may first ask what a meaning does, and then find something that does that.” —David Lewis

Truth, entailment

Every person cried.

  • Every professor cried.

A person cried.

  • A professor cried.

Formal semantics:

Every person cried.   ∀x. Px → Cx
every person          λg. ∀x. Px → gx
every                 λf. λg. ∀x. fx → gx

(P = person, C = cried)

SLIDE 8

3/17

Approaches to semantics

“In order to say what a meaning is, we may first ask what a meaning does, and then find something that does that.” —David Lewis

Concepts, similarity

[Concept space: ambulance, battleship, bookstore, arranged by similarity]

Distributional semantics: co-occurrence counts with contexts (abandon, abdominal, ability, academic, accept, …):

ambulance    27  10  50  17  130  …
battleship   35  32   1  25  …
bookstore     5   6  33  13  …

SLIDE 11

5/17

Distributional semantics for entailment among words

For each word w, rank contexts c by descending Pr(c|w)/Pr(c), keeping those with Pr(c|w)/Pr(c) > 1 ("pointwise mutual information").

parent: argcountn, arglistn, arglistj, phanen, specityn, qdiscn, carthyn, parents-to-ben, non-residentj, step-parentn, tcn, ballonsn, elizan, symptonsn, adoptivej, stepparentn, nonresidentj, home-schooln, scabridn, petiolulen, …

person: anglian, first-mentionedj, unascertainedj, enurev, deposit-takingj, bonisn, iconclassj, cotswoldsn, aforesaidn, haverv, foresaidj, ghan, sub-paragraphsn, enactedj, geestj, non-medicinalj, sub-paragraphn, intimationn, arrestmentn, incumbrancen, …

professor: williamn, extraordinariusn, ordinariusn, francisn, reidn, emeritusn, emeritusj, derwentn, regiusn, laurencen, edwardn, carisoprodoln, adjunctj, winstonn, privatdozentj, edwardj, xanaxn, tenurev, cialisn, florencen, …
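As a sketch, the PMI-based context ranking on this slide can be computed from raw co-occurrence counts. The `cooc` table below is a hypothetical toy example, not the authors' corpus data.

```python
from collections import Counter

def rank_contexts(cooc, word):
    """Rank contexts c of `word` by descending Pr(c|w)/Pr(c),
    keeping only contexts where the ratio exceeds 1 (positive PMI)."""
    total = sum(sum(ctxs.values()) for ctxs in cooc.values())
    ctx_freq = Counter()
    for ctxs in cooc.values():
        ctx_freq.update(ctxs)
    w_total = sum(cooc[word].values())
    ratios = {}
    for c, n in cooc[word].items():
        pr_c_given_w = n / w_total          # Pr(c | w)
        pr_c = ctx_freq[c] / total          # Pr(c)
        r = pr_c_given_w / pr_c
        if r > 1:                           # keep only informative contexts
            ratios[c] = r
    return sorted(ratios, key=ratios.get, reverse=True)

# Toy counts: "parent" co-occurs with "adoptive" more than chance predicts,
# while the function word "the" is filtered out by the ratio > 1 cutoff.
cooc = {
    "parent": {"adoptive": 8, "the": 10},
    "person": {"tall": 3, "the": 12},
}
print(rank_contexts(cooc, "parent"))  # → ['adoptive']
```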

SLIDE 14

6/17

Distributional semantics for entailment among words

[Plot: context overlap with word 2 (0–3000) against context rank of word 1 (0–5000), for the pairs parent-person, professor-person, person-parent, person-professor, parent-professor, professor-parent; the "perfect ⊆" diagonal marks total overlap.]

Better: skew divergence (Lee), balAPinc (Kotlerman et al.), …
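balAPinc combines an average-precision-style inclusion score (APinc) with LIN similarity; the sketch below implements only a simplified APinc component, under the assumption that each word is represented by its PMI-ranked context list. It is an illustration of the idea, not the authors' implementation.

```python
def apinc(u_ranked, v_ranked):
    """Simplified APinc (after Kotlerman et al.): do the top contexts of u
    also appear, with high rank, among the contexts of v?
    `u_ranked`, `v_ranked` are context lists sorted by descending PMI."""
    v_rank = {c: i + 1 for i, c in enumerate(v_ranked)}
    hits, score = 0, 0.0
    for r, c in enumerate(u_ranked, start=1):
        if c in v_rank:
            hits += 1
            precision_at_r = hits / r                       # P(r)
            relevance = 1 - v_rank[c] / (len(v_ranked) + 1)  # rel(c) in v
            score += precision_at_r * relevance
    return score / len(u_ranked) if u_ranked else 0.0

# Inclusion is asymmetric: a hyponym's contexts sit inside the hypernym's,
# not the other way around (toy context lists, not real corpus data).
dog = ["bark", "leash", "pet"]
animal = ["pet", "wild", "bark", "zoo", "leash"]
print(apinc(dog, animal), apinc(animal, dog))  # higher in the dog→animal direction
```

The full balAPinc score balances this directional measure against symmetric LIN similarity (roughly, their geometric mean), which damps spuriously high inclusion scores for rare words.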

SLIDE 18

7/17

Above the word level

Phrases have corpus distributions too! But N ≈ AN ≈ QN

Phrase                  Syntactic category   Semantic type
N    cat                N                    e → t
AN   white cat          N                    e → t
AAN  big white cat      N                    e → t
QN   every cat          QP                   (e → t) → t
QAN  every big cat      QP                   (e → t) → t
*AQN big every cat      (ungrammatical)
*QQN some every cat     (ungrammatical)

SLIDE 21

8/17

Our questions

Entailment among composite phrases rather than nouns?

Entailment among logical words rather than content words?

Different entailment relations at different semantic types?

train: AN ⊨ N (big cat ⊨ cat)  →  test: N ⊨ N (dog ⊨ animal)
train: QN ⊨ QN (many dogs ⊨ some dogs)  →  test: QN ⊨ QN (all cats ⊨ several cats)
train: AN ⊨ N  →  test: QN ⊨ QN: ×

SLIDE 24

9/17

Our semantic space

Corpus: BNC, WackyPedia, ukWaC: 2.8G lemmatized, POS-tagged tokens (TreeTagger, Schmid).

Targets: the most frequent A, N, V (27K) plus AN and QN phrases; 48K targets in all (AN, QN, A, Q, N).

Contexts: words and phrases occurring in the same sentence.

Pipeline: co-occurrence counts #(c, w), PMI weighting log Pr(c|w)/Pr(c), then SVD to 300 dimensions (Ũ Σ).

Methods compared: frequency baseline, cosine baseline, balAPinc, SVM.
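The count → PMI → SVD pipeline can be sketched in a few lines of NumPy. The counts below are toy numbers loosely based on the earlier ambulance/battleship/bookstore example, and the clipping of negative PMI values to 0 is an assumption of this sketch, not a detail stated on the slide.

```python
import numpy as np

# Toy count matrix #(c, w): rows = target words/phrases, columns = contexts.
counts = np.array([
    [27.0, 10.0, 50.0, 17.0],   # ambulance
    [35.0, 32.0,  1.0, 25.0],   # battleship
    [ 5.0,  6.0, 33.0, 13.0],   # bookstore
])

# PMI weighting: log( Pr(c|w) / Pr(c) ), negative values clipped to 0.
pr_c_given_w = counts / counts.sum(axis=1, keepdims=True)
pr_c = counts.sum(axis=0) / counts.sum()
pmi = np.maximum(np.log(pr_c_given_w / pr_c), 0.0)

# SVD reduction: keep the top k latent dimensions; each row becomes U_k * S_k
# (the "Ũ Σ" of the slide, truncated).
k = 2
U, S, Vt = np.linalg.svd(pmi, full_matrices=False)
vectors = U[:, :k] * S[:k]
print(vectors.shape)  # → (3, 2)
```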

SLIDE 31

10/17

Our entailment classifiers

Vectors: PMI-weighted, log Pr(c|w)/Pr(c), reduced by SVD (Ũ Σ).

balAPinc (Kotlerman et al.): 0 ≤ balAPinc ≤ 1; classify a pair as entailing if the score exceeds a threshold.

SVM with cubic kernel on the SVD vectors:

  • outperformed naïve Bayes, kNN

Train → test configurations:

AN ⊨ N   →  N ⊨ N
QN ⊨ QN  →  QN ⊨ QN
AN ⊨ N   →  QN ⊨ QN
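A cubic-kernel SVM over phrase-pair vectors can be sketched as follows. The slide specifies only the cubic kernel; the pair representation (concatenating the two phrase vectors) and all other settings here are assumptions of this toy sketch, with synthetic data standing in for the SVD vectors.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Each phrase is a dense vector (300-dim in the talk; 20-dim here); a candidate
# pair (p1, p2) is represented by concatenating the two vectors.
dim, n = 20, 200
base = rng.normal(size=(n, dim))
pos = np.hstack([base, base + 0.1 * rng.normal(size=(n, dim))])  # entailing: similar halves
neg = np.hstack([base, rng.normal(size=(n, dim))])               # non-entailing: unrelated halves
X = np.vstack([pos, neg])
y = np.array([1] * n + [0] * n)

# Cubic (degree-3 polynomial) kernel, as on the slide; other hyperparameters
# are library defaults, not the authors' tuned configuration.
clf = SVC(kernel="poly", degree=3).fit(X, y)
print(clf.score(X, y))
```

A polynomial kernel of degree 3 over the concatenation gives the classifier access to interaction terms between the two phrase vectors, which a linear kernel on the concatenation would not provide.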
SLIDE 41

11/17

Our data sets

N ⊨ N (type e → t): WordNet hypernym pairs, e.g. pope ⊨ leader (via spiritual_leader), cat ⊨ carnivore (via feline), … (1385); negatives by inverting and resampling, e.g. leader ⊭ pope, cat ⊭ leader, … (1385); an expanded set (6402).

AN ⊨ N (type e → t): the most frequent adjectives, e.g. big, former, … (256, from 300 candidates), combined with BLESS nouns, e.g. apple, shirt, … (200): big apple ⊨ apple, big shirt ⊨ shirt, … (1246); negatives by resampling, e.g. big apple ⊭ shirt, big shirt ⊭ apple, … (1244).

QN ⊨ QN (type (e → t) → t): the most frequent quantifiers (all, both, each, either, every, few, many, most, much, no, several, some) yield 13 entailing quantifier pairs (all ⊨ some, many ⊨ several, …) and 17 non-entailing ones (some ⊭ every, both ⊭ many, …). With nouns: all cat ⊨ some cat, many cat ⊨ several cat, … (7537); some cat ⊭ every cat, both cat ⊭ many cat, … (8455); plus mismatched-noun negatives, e.g. all cat ⊭ every leader, both cat ⊭ many leader, …
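The QN ⊨ QN pairs above can be generated mechanically by crossing quantifier-pair lists with a noun list. The lists below are abbreviated stand-ins (the full lists on the slide have 13 entailing and 17 non-entailing pairs), and the corpus-frequency filtering the authors would apply is not shown.

```python
from itertools import product

# Abbreviated quantifier-pair lists in the spirit of the slide.
entailing = [("all", "some"), ("many", "several"), ("each", "some")]
non_entailing = [("some", "every"), ("both", "many"), ("several", "all")]
nouns = ["cat", "dog", "leader"]

# Positive QN pairs: the same noun under an entailing quantifier pair.
pos = [(f"{q1} {n}", f"{q2} {n}") for (q1, q2), n in product(entailing, nouns)]
# Negative QN pairs: the same noun under a non-entailing quantifier pair.
neg = [(f"{q1} {n}", f"{q2} {n}") for (q1, q2), n in product(non_entailing, nouns)]

print(pos[0], neg[0])  # → ('all cat', 'some cat') ('some cat', 'every cat')
```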

SLIDE 42

11/17

Our data sets

N ⊨ N (e → t), AN ⊨ N (e → t), QN ⊨ QN ((e → t) → t): each split into train and test.

SLIDE 43

12/17

Results at noun type

Method               P     R     F     Accuracy (95% C.I.)
SVM upper            88.6  88.6  88.5  88.6 (87.3–89.7)
balAPinc AN⊨N        65.2  87.5  74.7  70.4 (68.7–72.1)
balAPinc upper       64.4  90.0  75.1  70.1 (68.4–71.8)
SVM AN⊨N             69.3  69.3  69.3  69.3 (67.6–71.0)
cos(N1, N2)          57.7  57.6  57.5  57.6 (55.8–59.5)
fq(N1) < fq(N2)      52.1  52.1  51.8  53.3 (51.4–55.2)
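Accuracies here are reported with 95% confidence intervals. The slide does not say which interval the authors used; the sketch below shows the textbook normal-approximation (Wald) binomial interval for illustration, with hypothetical counts.

```python
import math

def binomial_ci(correct, total, z=1.96):
    """Normal-approximation 95% confidence interval for an accuracy
    estimated from `correct` successes out of `total` trials."""
    p = correct / total
    half = z * math.sqrt(p * (1 - p) / total)
    return p - half, p + half

# Hypothetical counts giving an accuracy of about 88.8%.
lo, hi = binomial_ci(correct=2460, total=2770)
print(f"{lo:.3f}-{hi:.3f}")  # → 0.876-0.900
```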

SLIDE 46

13/17

Holding out QN data

Quantifiers: all, both, each, either, every, few, many, most, much, no, several, some.

Two held-out evaluations:

  • pair-out: hold out one quantifier pair at a time
  • quantifier-out: hold out all pairs involving one quantifier at a time

SLIDE 47

14/17

Results at quantifier type

Method                  P     R     F     Accuracy (95% C.I.)
SVM pair-out            76.7  77.0  76.8  78.1 (77.5–78.8)
SVM quantifier-out      70.1  65.3  68.0  71.0 (70.3–71.7)
SVM-Q pair-out          67.9  69.8  68.9  70.2 (69.5–70.9)
SVM-Q quantifier-out    53.3  52.9  53.1  56.0 (55.2–56.8)
cos(QN1, QN2)           52.9  52.3  52.3  53.1 (52.3–53.9)
balAPinc AN⊨N           46.7   5.6  10.0  52.5 (51.7–53.3)
SVM AN⊨N                 2.8  42.9   5.2  52.4 (51.7–53.2)
fq(QN1) < fq(QN2)       51.0  47.4  49.1  50.2 (49.4–51.0)
balAPinc upper          47.1  100   64.1  47.2 (46.4–47.9)

SLIDE 48

15/17

Holding out each quantifier

Quantifier   ⊨ instances  correct   ⊭ instances  correct   Accuracy
each                 656      649          656      637       98%
every                460      402         1322     1293       95%
much                 248      216            0        0       87%
all                 2949     2011         2641     2494       81%
several             1731     1302         1509     1267       79%
many                3341     2349         4163     3443       77%
few                    0        0          461      311       67%
most                 928      549          832      511       60%
some                4062     1780         3145     2190       55%
no                     0        0          714      380       53%
both                 636      589         1404      303       44%
either                63        2           63       41       34%
Total              15074     9849        16910    12870       71%

SLIDE 51

16/17

Our questions answered

Entailment among composite phrases rather than nouns? Yes. (Cheap training data!) ✒ Practical import

Entailment among logical words rather than content words? Yes. (Part of Recognizing Textual Entailment?) ✒ Practical import

Different entailment relations at different semantic types? Yes. (Prediction from formal semantics.)

train: AN ⊨ N (big cat ⊨ cat)  →  test: N ⊨ N (dog ⊨ animal)
train: QN ⊨ QN (many dogs ⊨ some dogs)  →  test: QN ⊨ QN (all cats ⊨ several cats)
train: AN ⊨ N  →  test: QN ⊨ QN: ×

Ongoing work:

  • How does the SVM work?
  • Missing experiments?
  • How to compose semantic vectors?

SLIDE 52

17/17

Holding out each quantifier pair

Entailing quantifier pairs:

Quantifier pair    Instances  Correct
all ⊨ some         1054       1044 (99%)
all ⊨ several       557        550 (99%)
each ⊨ some         656        647 (99%)
all ⊨ many          873        772 (88%)
much ⊨ some         248        217 (88%)
every ⊨ many        460        400 (87%)
many ⊨ some         951        822 (86%)
all ⊨ most          465        393 (85%)
several ⊨ some      580        439 (76%)
both ⊨ some         573        322 (56%)
many ⊨ several      594        113 (19%)
most ⊨ many         463         84 (18%)
both ⊨ either        63          1 (2%)

Non-entailing quantifier pairs:

Quantifier pair    Instances  Correct
some ⊭ every        484        481 (99%)
several ⊭ all       557        553 (99%)
several ⊭ every     378        375 (99%)
some ⊭ all         1054       1043 (99%)
many ⊭ every        460        452 (98%)
some ⊭ each         656        640 (98%)
few ⊭ all           157        153 (97%)
many ⊭ all          873        843 (97%)
both ⊭ most         369        347 (94%)
several ⊭ few       143        134 (94%)
both ⊭ many         541        397 (73%)
many ⊭ most         463        300 (65%)
either ⊭ both        63         39 (62%)
many ⊭ no           714        369 (52%)
some ⊭ many         951        468 (49%)
few ⊭ many          161         33 (20%)
both ⊭ several      431         63 (15%)