Exploiting Domain Knowledge in Aspect Extraction



slide-1
SLIDE 1

Exploiting Domain Knowledge in Aspect Extraction

Zhiyuan (Brett) Chen Arjun Mukherjee Bing Liu Meichun Hsu Malu Castellanos Riddhiman Ghosh

slide-2
SLIDE 2

Aspect Extraction

Extracting aspect terms

slide-3
SLIDE 3

Aspect Terms

This camera takes beautiful pictures but its price is higher than $200.

slide-4
SLIDE 4

Aspect Terms

This camera takes beautiful pictures but its price is higher than $200.

Aspect terms: pictures, price

slide-5
SLIDE 5

Aspect Extraction

Extracting aspect terms
Clustering terms into categories

slide-6
SLIDE 6

Clustering

Aspect 1: Picture, Photo, Image
Aspect 2: Price, Cost, Money

slide-7
SLIDE 7

Existing Work

For extracting only:
Word frequency + syntactic dependency (e.g., Hu and Liu, 2004)
Supervised sequence labeling/classification (e.g., Liu, Hu and Cheng, 2005)

slide-8
SLIDE 8

Existing Work

For extracting only:
Word frequency + syntactic dependency (e.g., Hu and Liu, 2004)
Supervised sequence labeling/classification (e.g., Liu, Hu and Cheng, 2005)

For clustering only:
Grouping aspect terms (e.g., Zhai et al., 2010)

slide-9
SLIDE 9

Existing Work

For both extracting and clustering:
Topic models (e.g., Mukherjee and Liu, 2012; Kim et al., 2013; Lazaridou et al., 2013; Lin and He, 2009; Lu and Zhai, 2008; Moghaddam and Ester, 2011; Sauper et al., 2011; Titov and McDonald, 2008)

slide-10
SLIDE 10

Issues of Unsupervised Topic Models

Objective functions do not correlate well with human judgments (Chang et al., 2009).
Many aspects/topics are not meaningful.

slide-11
SLIDE 11

Remedy: Knowledge-based Topic Models

slide-12
SLIDE 12

Knowledge-based Topic Models

Models: Seeded models, DF-LDA
Issues: Multiple senses, Utilize “cannot” knowledge, Adverse effect

slide-13
SLIDE 13

Knowledge-based Topic Models

DF-LDA (Andrzejewski et al., 2009)

Must-Link: Picture, Photo
Cannot-Link: Picture, Price

slide-14
SLIDE 14

Knowledge-based Topic Models

Seeded models (Burns et al., 2012; Jagarlamudi et al., 2012; Lu et al., 2011; Mukherjee and Liu, 2012)
DF-LDA (Andrzejewski et al., 2009)

slide-15
SLIDE 15

Knowledge-based Topic Models

Multiple senses

Light

slide-16
SLIDE 16

Knowledge-based Topic Models

Multiple senses

Light
{Light, Bright}
{Light, Heavy}

slide-17
SLIDE 17

Knowledge-based Topic Models

Multiple senses

Light

If must-links are treated as transitive, the two senses of "Light" collapse into a single set: {Light, Bright, Heavy}
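A minimal sketch of why transitive must-links cause trouble for a word with several senses. It assumes must-links are merged with a plain union-find; the helper name `merge_transitively` and the lower-cased example pairs are illustrative only and not taken from any of the cited models.

```python
from collections import defaultdict


def merge_transitively(pairs):
    """Merge must-link pairs into groups, treating links as transitive."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for word in parent:
        groups[find(word)].add(word)
    return sorted(sorted(group) for group in groups.values())


# Two senses of "light": brightness and weight.
must_links = [("light", "bright"), ("light", "heavy")]
merged = merge_transitively(must_links)
print(merged)  # [['bright', 'heavy', 'light']]
```

Because "bright" and "heavy" both link to "light", they end up in one group even though they belong to different senses.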

slide-18
SLIDE 18

Knowledge-based Topic Models

Adverse effect of knowledge

Knowledge: {Price, Cost}

Price, Color, Cheap, Pricy, Cost, …
Color, Cheap, Price, Cost, Pricy, …
Cost, Price

slide-19
SLIDE 19

Knowledge-based Topic Models

Utilize “cannot” knowledge

Knowledge: {Amazon, Price}

Words shown: Price Expensive Money Cheap Amazon Review Shipping Order Amazon Price Expensive Review Shipping Order Money Cheap

slide-20
SLIDE 20

Knowledge-based Topic Models

Models: Seeded models, DF-LDA
Issues: Multiple senses, Utilize “cannot” knowledge, Adverse effect

slide-21
SLIDE 21

Addressing Issues

Multiple senses
Utilize “cannot” knowledge
Adverse effect

slide-22
SLIDE 22

Addressing Issues

Multiple senses: Adding variable s
Utilize “cannot” knowledge: E-GPU Model
Adverse effect: GPU Model

slide-23
SLIDE 23

M-Set and C-Set

Must-set: {Price, Cost, Money}
Cannot-set: {Price, Color, Size}
Do not enforce transitivity.
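One way to picture the two kinds of knowledge is as plain sets of words checked against a topic's top words. The sketch below is only an illustration: the function names, the lower-cased words, and the example topic are assumptions, and the check is a diagnostic, not part of any model in the talk.

```python
from itertools import combinations

# Knowledge from the slide, lower-cased for matching.
must_sets = [{"price", "cost", "money"}]
cannot_sets = [{"price", "color", "size"}]


def must_overlap(top_words, must_sets):
    """For each must-set, list its words that appear in the topic."""
    present = set(top_words)
    return [sorted(m_set & present) for m_set in must_sets]


def cannot_violations(top_words, cannot_sets):
    """Pairs of words from the same cannot-set that share the topic."""
    present = set(top_words)
    found = []
    for c_set in cannot_sets:
        shared = sorted(c_set & present)
        found.extend(combinations(shared, 2))
    return found


# Hypothetical top words of one topic.
topic = ["price", "cost", "color", "cheap"]
overlap = must_overlap(topic, must_sets)
violations = cannot_violations(topic, cannot_sets)
print(overlap)     # [['cost', 'price']]
print(violations)  # [('color', 'price')]
```

Each set is kept as given and never merged with another set, which matches the "no transitivity" note.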

slide-24
SLIDE 24

Addressing First Issue

Multiple senses: Adding variable s
Utilize “cannot” knowledge: E-GPU Model
Adverse effect: GPU Model

slide-25
SLIDE 25

MDK-LDA (Chen et al., IJCAI 2013)

slide-26
SLIDE 26

MDK-LDA (Chen et al., IJCAI 2013)

S1: {Light, Heavy, Weight}
S2: {Light, Bright, Luminance}

slide-27
SLIDE 27

Addressing Second Issue

Multiple senses: Adding variable s
Utilize “cannot” knowledge: E-GPU Model
Adverse effect: GPU Model

slide-28
SLIDE 28

Simple Pólya Urn Model (SPU)

slide-29
SLIDE 29

Simple Pólya Urn Model (SPU)

slide-30
SLIDE 30

Simple Pólya Urn Model (SPU)

slide-31
SLIDE 31

Simple Pólya Urn Model (SPU)

slide-32
SLIDE 32

Simple Pólya Urn Model (SPU)

slide-33
SLIDE 33

Simple Pólya Urn Model (SPU)

The rich get richer!
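The rich-get-richer effect can be seen in a small simulation of the simple Pólya urn: draw a ball at random, then return it together with one more ball of the same color. The function name, the two starting colors, and the seed are arbitrary choices for this sketch.

```python
import random


def simple_polya_urn(initial, draws, seed=0):
    """Simulate a simple Polya urn and return the final ball counts."""
    rng = random.Random(seed)
    urn = dict(initial)
    for _ in range(draws):
        colors = list(urn)
        weights = [urn[color] for color in colors]
        # Draw one ball with probability proportional to its color count.
        drawn = rng.choices(colors, weights=weights, k=1)[0]
        # Put it back along with one extra ball of the same color.
        urn[drawn] += 1
    return urn


urn = simple_polya_urn({"red": 1, "blue": 1}, draws=100)
print(urn)
```

A color that gets ahead early becomes more likely to be drawn again, so different seeds can end with very different proportions.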

slide-34
SLIDE 34

Interpreting LDA Under SPU

slide-35
SLIDE 35

Interpreting LDA Under SPU

Topic 0: price

slide-36
SLIDE 36

Interpreting LDA Under SPU

Topic 0: price, price

slide-37
SLIDE 37

Generalized Pólya Urn Model (GPU)

slide-38
SLIDE 38

Generalized Pólya Urn Model (GPU)

slide-39
SLIDE 39

Generalized Pólya Urn Model (GPU)

slide-40
SLIDE 40

Generalized Pólya Urn Model (GPU)

slide-41
SLIDE 41

Generalized Pólya Urn Model (GPU)

slide-42
SLIDE 42

Applying GPU

Topic 0: price

slide-43
SLIDE 43

Applying GPU

Topic 0: price, price, cost, money
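The difference between the two urn schemes can be sketched as two count updates for a topic. Under the simple urn, assigning "price" to a topic adds only "price"; under the generalized urn, words from the same must-set are promoted as well. The function names and the promotion weight of 0.3 are assumptions made for illustration, not values from the talk.

```python
from collections import Counter

must_sets = [{"price", "cost", "money"}]


def spu_update(topic_counts, word):
    """Simple urn: only the assigned word gains a ball."""
    topic_counts[word] += 1


def gpu_update(topic_counts, word, must_sets, promotion=0.3):
    """Generalized urn: must-set partners of the word gain weight too.

    `promotion` is a hypothetical fractional ball count.
    """
    topic_counts[word] += 1
    for m_set in must_sets:
        if word in m_set:
            for other in m_set - {word}:
                topic_counts[other] += promotion


topic_spu = Counter()
spu_update(topic_spu, "price")

topic_gpu = Counter()
gpu_update(topic_gpu, "price", must_sets)

print(dict(topic_spu))  # {'price': 1}
print(sorted(topic_gpu.items()))
```

With the generalized update, seeing "price" in Topic 0 also raises the weight of "cost" and "money" in that topic.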

slide-44
SLIDE 44

Addressing Third Issue

Multiple senses: Adding variable s
Utilize “cannot” knowledge: E-GPU Model
Adverse effect: GPU Model

slide-45
SLIDE 45

Topic 0 Topic 1 Topic 2

Our Proposed E-GPU Model

slide-46
SLIDE 46

Topic 0 Topic 1 Topic 2

E-GPU Model

price

slide-47
SLIDE 47

Topic 0 Topic 1 Topic 2

E-GPU Model

price

price, money, cost

slide-48
SLIDE 48

Topic 0 Topic 1 Topic 2

E-GPU Model

color

{price, color}
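A toy sketch of the idea that urns interact when a cannot-set such as {price, color} applies: after a word is drawn in one topic, a ball of a conflicting word is moved to another topic's urn. The transfer rule used here (move one ball to the urn where that word has the highest proportion, and only if that proportion is higher) is an assumption for illustration and should not be read as the E-GPU sampler itself. The counts and the function name are hypothetical, and the sketch does nothing when no other urn is a better home.

```python
def transfer_cannot_ball(urns, topic, word, cannot_sets):
    """Move one ball of each conflicting word out of `topic`.

    `urns` is a list of dicts mapping word -> ball count, one per topic.
    """
    def proportion(k, w):
        total = sum(urns[k].values())
        return urns[k].get(w, 0) / total if total else 0.0

    for c_set in cannot_sets:
        if word not in c_set:
            continue
        for other in sorted(c_set - {word}):
            if urns[topic].get(other, 0) == 0:
                continue  # nothing to move
            best = max(range(len(urns)), key=lambda k: proportion(k, other))
            if best != topic and proportion(best, other) > proportion(topic, other):
                urns[topic][other] -= 1
                urns[best][other] = urns[best].get(other, 0) + 1


# Hypothetical ball counts for three topics.
urns = [
    {"price": 8, "color": 1},
    {"color": 8, "size": 2},
    {"battery": 5},
]
cannot_sets = [{"price", "color"}]

# "price" was just drawn in Topic 0, so a "color" ball is moved away.
transfer_cannot_ball(urns, topic=0, word="price", cannot_sets=cannot_sets)
print(urns[0], urns[1])
```

After the call, Topic 0 holds no "color" balls and Topic 1 has gained one.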

slide-49
SLIDE 49

Topic 0 Topic 1 Topic 2

E-GPU Model

[Figure: balls of “color” in the topic urns, with counts 8, 1 and 1]

slide-50
SLIDE 50

Topic 0 Topic 1 Topic 2

E-GPU Model

[Figure: balls of “color” in the topic urns, with counts 8, 1 and 1]

slide-51
SLIDE 51

Topic 0 Topic 1 Topic 2

E-GPU Model

amazon

{price, amazon}

slide-52
SLIDE 52

Topic 0 Topic 1 Topic 2

E-GPU Model

[Figure: balls of “amazon” in the topic urns, with a count of 10]

slide-53
SLIDE 53

Topic 0 Topic 1 Topic 2

E-GPU Model

[Figure: balls of “amazon” in the topic urns, with a count of 10]

slide-54
SLIDE 54

Topic 0 Topic 1 Topic 2 Topic 3

E-GPU Model

[Figure: balls of “amazon” in the topic urns, with a count of 10; a fourth urn, Topic 3, is shown]

slide-55
SLIDE 55

Addressing Issues

Multiple senses: Adding variable s
Utilize “cannot” knowledge: E-GPU Model
Adverse effect: GPU Model

slide-56
SLIDE 56

Evaluation

slide-57
SLIDE 57

Evaluation

Four domains
Knowledge
Evaluation: Objective, Human

slide-58
SLIDE 58

Model Comparison

LDA (Blei et al., 2003)
DF-LDA (Andrzejewski et al., 2009)
MC-LDA
LDA-GPU (Mimno et al., 2011)

slide-59
SLIDE 59

Model Comparison

LDA, DF-LDA, MC-LDA, LDA-GPU, DF-M, DF-MC, M-LDA, MC-LDA

slide-60
SLIDE 60

Model Comparison

LDA, DF-LDA, MC-LDA, LDA-GPU, DF-M, DF-MC, M-LDA, MC-LDA

Baselines

slide-61
SLIDE 61

Objective Evaluation

Topic Coherence

slide-62
SLIDE 62

Objective Evaluation

Topic Coherence
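A sketch of how a topic coherence score can be computed, assuming the metric is the document co-occurrence coherence of Mimno et al. (2011), which the deck cites for LDA-GPU. For a topic's top words v_1..v_M it sums log((D(v_m, v_l) + 1) / D(v_l)) over all pairs with l < m, where D counts documents containing the given words. The tiny corpus and the choice to skip words that never occur are assumptions of this example.

```python
import math


def topic_coherence(top_words, documents):
    """Co-occurrence coherence of a topic's top words over a corpus."""
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every given word.
        return sum(1 for doc in doc_sets if all(w in doc for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            denom = doc_freq(top_words[l])
            if denom == 0:
                continue  # assumption: ignore words absent from the corpus
            co_freq = doc_freq(top_words[m], top_words[l])
            score += math.log((co_freq + 1) / denom)
    return score


# Hypothetical tokenized reviews.
documents = [
    ["price", "cost", "cheap"],
    ["price", "cost"],
    ["color", "size"],
]
coherent = topic_coherence(["price", "cost"], documents)
mixed = topic_coherence(["price", "color"], documents)
print(coherent, mixed)
```

Higher (less negative) values indicate that the top words tend to appear in the same documents; here the {price, cost} topic scores above the {price, color} topic.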

slide-63
SLIDE 63

Human Evaluation

Precision @ 5

slide-64
SLIDE 64

Human Evaluation

Precision @ 10
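Precision @ n is the fraction of a topic's top n words that are judged correct. The sketch below assumes the human judgments are available as a set of correct words; the ranked list and the labels are made up for illustration.

```python
def precision_at_n(ranked_words, correct_words, n):
    """Fraction of the top-n ranked words that were judged correct."""
    if n <= 0:
        raise ValueError("n must be positive")
    top = ranked_words[:n]
    return sum(1 for word in top if word in correct_words) / n


# Hypothetical top words of a "price" aspect and hypothetical judge labels.
ranked = ["price", "cost", "amazon", "money", "cheap"]
correct = {"price", "cost", "money", "cheap"}

p_at_5 = precision_at_n(ranked, correct, 5)
print(p_at_5)  # 0.8
```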

slide-65
SLIDE 65

Example Aspects

slide-66
SLIDE 66

Conclusions

Discover meaningful aspects using knowledge

slide-67
SLIDE 67

Conclusions

Discover meaningful aspects using knowledge

Multiple senses
Utilize “cannot” knowledge
Adverse effect

slide-68
SLIDE 68

Conclusions

Discover meaningful aspects using knowledge

Multiple senses: Adding variable s
Utilize “cannot” knowledge: E-GPU Model
Adverse effect: GPU Model

slide-69
SLIDE 69

Datasets: http://www.cs.uic.edu/~zchen/
