Dependency-based Convolutional Neural Networks for Sentence Embedding

SLIDE 1

Dependency-based Convolutional Neural Networks for Sentence Embedding

Mingbo Ma, Liang Huang (CUNY)
Bing Xiang, Bowen Zhou (IBM T. J. Watson)

[Figure: dependency tree of "What is Hawaii 's state flower ?" with its ROOT node]

ACL 2015, Beijing

SLIDE 2

Convolutional Neural Network for NLP

Kalchbrenner et al. (2014) and Kim (2014) apply CNNs to sentence modeling:

  • word embeddings alleviate data sparsity
  • convolution follows sequential order (sentence) instead of spatial order (image)

But such models should use more linguistic and structural information!

SLIDE 3

Sequential Convolution

[Figure, animated across slides 3 to 7: a sequential convolution filter moves along the convolution direction over the word representations of "What is Hawaii 's state flower".]

SLIDE 8

Try different convolution filters and repeat the same process

SLIDE 9

Sequential Convolution

[Figure, animated across slides 9 to 11: sequential convolution over the word representations of "What is Hawaii 's state flower", followed by max pooling; the pooled features are fed into a NN for classification.]
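As a rough illustration of the convolve-then-pool pipeline, the sketch below slides a single bigram filter over toy word vectors in surface order and then max-pools the responses. The 2-dimensional embeddings, the filter weights, the tanh non-linearity, and the name `sequential_conv_max_pool` are all assumptions made for this example, not values from the talk.

```python
import math


def sequential_conv_max_pool(embeddings, conv_filter, bias=0.0):
    """Slide one filter over consecutive words, then max-pool.

    embeddings:  list of word vectors (lists of floats), in surface order
    conv_filter: list of `width` weight vectors, one per window position
    """
    width = len(conv_filter)
    feature_map = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        # Dot product between the filter and the words inside the window.
        activation = sum(
            w * x
            for weights, vector in zip(conv_filter, window)
            for w, x in zip(weights, vector)
        )
        feature_map.append(math.tanh(activation + bias))
    # Max pooling keeps the strongest response over all window positions.
    return max(feature_map), feature_map


# Toy 2-dimensional vectors (assumed) for: What, is, Hawaii, 's, state, flower
embeddings = [
    [0.1, 0.3], [0.0, -0.2], [0.5, 0.4],
    [-0.1, 0.0], [0.2, 0.6], [0.7, -0.3],
]
# One bigram filter (assumed weights): a weight vector for each of the two words.
bigram_filter = [[1.0, 0.0], [0.0, 1.0]]

pooled, feature_map = sequential_conv_max_pool(embeddings, bigram_filter)
print(len(feature_map), pooled)
```

Six words and a width-2 filter give five window positions. In a full model, many filters of several widths would each contribute one pooled value, and the resulting vector would be fed to the classifier.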

SLIDE 12

Example: Question Type Classification (TREC)

What is Hawaii 's state flower ?

Sequential convolution: Location
Gold standard: Entity

SLIDE 13

Sequential Convolution

[Figure, animated across slides 13 to 16: sequential convolution over the word representations of "What is Hawaii 's state flower". The build-up adds the labels Loc, Loc, Loc, Loc, and finally Enty.]

SLIDE 17

Convolution on Tree

[Figure: the word representations of "What is Hawaii 's state flower" shown with their dependency tree, rooted at ROOT.]

SLIDE 18

Sequential Convolution

Sequential convolution:

  • Traditional convolution operates in surface order
  • Cons: no structural information is captured; no long-distance relationships

SLIDE 19

Dependency-based Convolution

Structural convolution:

  • applies the convolution filters on the dependency tree
  • more “important” words are convolved more often
  • long-distance relationships are naturally obtained

Sequential convolution:

  • Traditional convolution operates in surface order
  • Cons: no structural information is captured; no long-distance relationships
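To make the contrast with surface-order windows concrete, here is a minimal sketch that lists the child-and-parent windows a bigram filter would see on a dependency tree. The parse in `heads` is a hypothetical one written by hand for this example (the talk's actual parser output is not in the transcript), and `ancestor_window` is an illustrative helper name.

```python
# Hypothetical parse of the example question, for illustration only.
# heads[i] is the index of word i's parent; None means the parent is ROOT.
words = ["What", "is", "Hawaii", "'s", "state", "flower"]
heads = [None, 0, 5, 2, 5, 0]


def ancestor_window(index, words, heads, size):
    """Return [word, parent, grandparent, ...] with `size` entries."""
    window = []
    current = index
    for _ in range(size):
        if current is None:
            # Assumed padding: repeat ROOT once the path passes the root.
            window.append("ROOT")
        else:
            window.append(words[current])
            current = heads[current]
    return window


# One (child, parent) window per word, instead of one per surface position.
bigram_windows = [ancestor_window(i, words, heads, 2) for i in range(len(words))]
for child, parent in bigram_windows:
    print(f"child={child!r} parent={parent!r}")
```

Under this assumed parse, "Hawaii" is paired with "flower" although the two words are not adjacent in the sentence, which is the kind of long-distance relationship a surface-order bigram cannot cover.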

SLIDE 20

Convolution on Tree

[Figure, animated across slides 20 to 26: dependency convolution over the word representations of "What is Hawaii 's state flower". Each window covers a child and its parent in the dependency tree rooted at ROOT.]

SLIDE 27

Try different Bigram convolution filters and repeat the same process

SLIDE 28

Convolution on Tree

[Figure, animated across slides 28 to 32: dependency convolution (child, parent) over the tree of "What is Hawaii 's state flower", rooted at ROOT, followed by max pooling.]

SLIDE 33

Trigram Convolution on Trees

SLIDE 34

Convolution on Tree

[Figure, animated across slides 34 to 36: trigram convolution over the tree of "What is Hawaii 's state flower". Each window covers a child, its parent, and its grandparent; the nodes ROOT* and ROOT** appear above ROOT.]

SLIDE 37

follow the same steps as before…

SLIDE 38

Convolution on Tree

[Figure: trigram convolution (child, parent, grandparent) over the tree of "What is Hawaii 's state flower", with ROOT* and ROOT** above ROOT.]

More important words are convolved more often!
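The claim that important words are convolved more often can be checked by counting how many trigram windows each word falls into. The sketch below does this under two assumptions: a hand-written, hypothetical parse of the example question, and padding above the root that simply repeats ROOT (standing in for the ROOT* and ROOT** nodes of the figure).

```python
from collections import Counter

# Hypothetical parse, for illustration only.
# heads[i] is the index of word i's parent; None means the parent is ROOT.
words = ["What", "is", "Hawaii", "'s", "state", "flower"]
heads = [None, 0, 5, 2, 5, 0]


def trigram_windows(words, heads):
    """Return one (child, parent, grandparent) window per word."""
    windows = []
    for index in range(len(words)):
        window, current = [], index
        for _ in range(3):
            if current is None:
                window.append("ROOT")  # assumed padding above the root
            else:
                window.append(words[current])
                current = heads[current]
        windows.append(tuple(window))
    return windows


windows = trigram_windows(words, heads)
# Count how many windows each real word takes part in.
counts = Counter(token for window in windows for token in window if token != "ROOT")
for word, count in counts.most_common():
    print(word, count)
```

With this parse, "What" takes part in five windows and "flower" in four, while "is", "'s", and "state" each appear in only one. In a sequential trigram convolution, by comparison, no word can appear in more than three windows.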

SLIDE 39

Convolution on Tree

[Figure: trigram convolution (child, parent, grandparent) over the tree of "What is Hawaii 's state flower", with ROOT* and ROOT** above ROOT, followed by max pooling.]

SLIDE 40

Convolution on Tree

[Figure: bigram and trigram convolutions over the tree of "What is Hawaii 's state flower" (rooted at ROOT) feed a fully connected NN with softmax output.]
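A minimal sketch of the final step: pooled bigram and trigram features are concatenated and passed through one fully connected layer with a softmax output. All numbers below (feature values, weights, biases) are toy assumptions, and only three of the six TREC top-level classes are used to keep the example short.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    shift = max(scores)
    exps = [math.exp(score - shift) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def classify(bigram_features, trigram_features, weights, biases):
    """Concatenate pooled features and apply one fully connected softmax layer."""
    features = bigram_features + trigram_features  # list concatenation
    scores = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


# Toy values (assumed): two pooled bigram features, two pooled trigram features.
labels = ["ENTY", "LOC", "NUM"]
bigram_features = [0.8, 0.1]
trigram_features = [0.6, 0.3]
weights = [
    [1.0, 0.0, 1.0, 0.0],  # ENTY
    [0.0, 1.0, 0.0, 1.0],  # LOC
    [0.0, 0.0, 0.0, 0.0],  # NUM
]
biases = [0.0, 0.0, 0.0]

probabilities = classify(bigram_features, trigram_features, weights, biases)
predicted = labels[probabilities.index(max(probabilities))]
print(predicted, probabilities)
```

In a trained model the weights would be learned jointly with the convolution filters; here they are fixed by hand only so that the forward pass can be followed.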

SLIDE 41

Convolution on Siblings

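Besides convolving along ancestor paths, we can also capture conjunction information from siblings.

[Figure: convolution patterns. Ancestor paths: a word m with its head h, grandparent g, and further ancestors g2 and g3. Siblings: m with its siblings s and t, in some patterns together with h and g; some sibling patterns contain a "_" slot.]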
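One way to picture a sibling window is to collect, for every head, the pairs of neighbouring children together with that head, which corresponds to a pattern of the form (s, m, h). The sketch below does this on a hand-written, hypothetical parse; treating only adjacent children as siblings is an assumption of this example, not something the slide specifies.

```python
# Hypothetical parse, for illustration only.
# heads[i] is the index of word i's parent; None means the parent is ROOT.
words = ["What", "is", "Hawaii", "'s", "state", "flower"]
heads = [None, 0, 5, 2, 5, 0]


def sibling_windows(words, heads):
    """Return (sibling, word, head) triples for adjacent children of one head."""
    children = {}
    for index, head in enumerate(heads):
        # Children are appended in surface order.
        children.setdefault(head, []).append(index)
    windows = []
    for head, kids in children.items():
        head_word = "ROOT" if head is None else words[head]
        for left, right in zip(kids, kids[1:]):
            windows.append((words[left], words[right], head_word))
    return windows


windows = sibling_windows(words, heads)
for window in windows:
    print(window)
```

Under this parse the windows are ("is", "flower", "What") and ("Hawaii", "state", "flower"): words that share a head are grouped even when other words stand between them in the sentence.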

SLIDE 42

Experiments

Tasks: sentiment analysis and question classification

Datasets:

Task                      Dataset   # Classes   Size    Test set
Sentiment analysis        MR        2           10662   10-CV
                          SST-1     5           11855   2210
Question classification   TREC      6           5952    500
                          TREC-2    50          5952    500

SLIDE 43

Sentiment Analysis Data Examples

Sentiment analysis on Rotten Tomatoes reviews (MR & SST-1):

  • straightforward statements: "simplistic, silly and tedious" (Negative)
  • subtle statements: "the film tunes into a grief that could lead a man across centuries" (Positive)
  • sentences with adversatives: "not for everyone, but for those with whom it will connect, it's a nice departure from standard moviegoing fare" (Positive)

SLIDE 44

Sentiment Analysis Experiment Results

Category        Model                                    MR     SST-1
This work       ancestor                                 80.4   47.7
                ancestor+sibling                         81.7   48.3
                ancestor+sibling+sequential              81.9   49.5
CNNs            CNNs-non-static (Kim ’14) (baseline)     81.5   48.0
                CNNs-multichannel (Kim ’14)              81.1   47.4
                Deep CNNs (Kalchbrenner+ ’14)            -      48.5
Recursive NNs   Recursive Autoencoder (Socher+ ’11)      77.7   43.2
                Recursive Neural Tensor (Socher+ ’13)    -      45.7
                Deep Recursive NNs (Irsoy+ ’14)          -      49.8
Recurrent NNs   LSTM on tree (Zhu+ ’15)                  81.9   48.0
Other           Paragraph-Vec (Le+ ’14)                  -      48.7

SLIDE 45

Question Classification Examples

Sentence                                               Top-level (TREC)   Fine-grained (TREC-2)
How did serfdom develop in and then leave Russia?      DESC               manner
What is Hawaii 's state flower ?                       ENTY               plant
What sprawling U.S. state boasts the most airports ?   LOC                state
When was Algeria colonized ?                           NUM                date
What person 's head is on a dime ?                     HUM                ind
What does the technical term ISDN mean ?               ABBR               exp

SLIDE 46

Question Classification Experiment Results

Category     Model                                   TREC   TREC-2
This work    ancestor                                95.4   88.4
             ancestor+sibling                        95.6   89.0
             ancestor+sibling+sequential             95.4   88.8
CNNs         CNNs-non-static (Kim ’14) (baseline)    93.6   86.4
             CNNs-multichannel (Kim ’14)             92.2   86.0
             Deep CNNs (Kalchbrenner+ ’14)           93.0   -
Hand-coded   SVMs (Silva+ ’11)*                      95.0   90.8

We achieved the highest published accuracy on TREC.

SLIDE 47

Error Analysis :-)

Cases where we do better than the baseline:

  • Gold/Ours: Enty; Baseline: Loc
  • Gold/Ours: Enty; Baseline: Desc
  • Gold/Ours: Desc; Baseline: Enty
  • Gold/Ours: Mild Neg; Baseline: Mild Pos

http://cogcomp.cs.illinois.edu/Data/QA/QC/definition.html

SLIDE 48

Error Analysis :-(

Cases where we make mistakes:

  • Gold: Num; Ours: Enty; Baseline: Num

Cases where both we and the baseline make mistakes:

  • Gold: Num; Ours: Enty; Baseline: Desc

http://cogcomp.cs.illinois.edu/Data/QA/QC/definition.html

SLIDE 49

Conclusions

Pros:

  • Dependency-based convolution captures long-distance information.
  • It outperforms the sequential CNN on all four datasets.
  • It achieves the highest published accuracy on TREC.

Cons:

  • Our model’s accuracy depends on parser quality.
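The comparison with the sequential CNN can be read directly off the result tables. The short sketch below copies the reported accuracies and computes the margin on each dataset; it assumes that "sequential CNN" refers to the CNNs-non-static (Kim ’14) row marked as the baseline, and it takes the best of the three dependency-based variants per dataset.

```python
# Accuracies copied from the sentiment and question-classification result tables.
# Best dependency-based variant per dataset (not always the same variant).
best_this_work = {"MR": 81.9, "SST-1": 49.5, "TREC": 95.6, "TREC-2": 89.0}
# Assumed baseline row: CNNs-non-static (Kim '14).
sequential_baseline = {"MR": 81.5, "SST-1": 48.0, "TREC": 93.6, "TREC-2": 86.4}

# Round to one decimal to avoid floating-point noise in the differences.
gains = {
    dataset: round(best_this_work[dataset] - sequential_baseline[dataset], 1)
    for dataset in best_this_work
}
outperforms_everywhere = all(gain > 0 for gain in gains.values())

for dataset, gain in gains.items():
    print(f"{dataset}: +{gain} accuracy points")
print("better on all four datasets:", outperforms_everywhere)
```

The margins are 0.4 points on MR, 1.5 on SST-1, 2.0 on TREC, and 2.6 on TREC-2, so the gain is larger on the question classification datasets than on the sentiment datasets.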

SLIDE 50

[Figure: dependency tree of "What is Hawaii 's state flower ?" with its ROOT node]

Deep Learning can and should be combined with linguistic intuitions.