SLIDE 1
Exploiting Domain Knowledge in Aspect Extraction
Meichun Hsu, Zhiyuan (Brett) Chen, Malu Castellanos, Arjun Mukherjee, Riddhiman Ghosh, Bing Liu
SLIDE 2
SLIDE 3
Aspect Terms
This camera takes beautiful pictures but its price is higher than $200.
SLIDE 4
Aspect Terms
This camera takes beautiful pictures but its price is higher than $200.
SLIDE 5
Aspect Extraction
Extracting aspect terms
Clustering terms into categories
SLIDE 6
Clustering
Aspect 1: Picture, Photo, Image
Aspect 2: Price, Cost, Money
SLIDE 7
Existing Work
For extracting only:
Word frequency + syntactic dependency (e.g., Hu and Liu, 2004)
Supervised sequence labeling/classification (e.g., Liu, Hu and Cheng, 2005)
SLIDE 8
Existing Work
For extracting only:
Word frequency + syntactic dependency (e.g., Hu and Liu, 2004)
Supervised sequence labeling/classification (e.g., Liu, Hu and Cheng, 2005)
For clustering only:
Grouping aspect terms (e.g., Zhai et al., 2010)
SLIDE 9
Existing Work
For both extracting and clustering:
Topic models (e.g., Mukherjee and Liu, 2012; Kim et al., 2013; Lazaridou et al., 2013; Lin and He, 2009; Lu and Zhai, 2008; Moghaddam and Ester, 2011; Sauper et al., 2011; Titov and McDonald, 2008)
SLIDE 10
Issues of Unsupervised Topic Models
Objective functions do not correlate well with human judgments (Chang et al., 2009). Many aspects/topics are not meaningful.
SLIDE 11
Remedy: Knowledge-based Topic Models
SLIDE 12
Knowledge-based Topic Models
Models: Seeded Models, DF-LDA
Issues: Multiple senses; Utilize “cannot” knowledge; Adverse effect
SLIDE 13
Knowledge-based Topic Models
Must-Link: {Picture, Photo}
Cannot-Link: {Picture, Price}
DF-LDA (Andrzejewski et al., 2009)
SLIDE 14
Knowledge-based Topic Models
Seeded models (Burns et al., 2012; Jagarlamudi et al., 2012; Lu et al., 2011; Mukherjee and Liu, 2012)
DF-LDA (Andrzejewski et al., 2009)
SLIDE 15
Knowledge-based Topic Models
Multiple senses
Light
SLIDE 16
Knowledge-based Topic Models
Multiple senses
Light: {Light, Bright}, {Light, Heavy}
SLIDE 17
Knowledge-based Topic Models
Multiple senses
Light: {Light, Bright, Heavy}
SLIDE 18
Knowledge-based Topic Models
Adverse effect of knowledge
Word list 1: Price, Color, Cheap, Pricy, Cost, …
Word list 2: Color, Cheap, Price, Cost, Pricy, …
Knowledge: {Price, Cost}
SLIDE 19
Knowledge-based Topic Models
Utilize “cannot” knowledge
[Figure: topic word lists over Price, Expensive, Money, Cheap, Amazon, Review, Shipping, Order]
Cannot-set: {Amazon, Price}
SLIDE 20
Knowledge-based Topic Models
Models: Seeded Models, DF-LDA
Issues: Multiple senses; Utilize “cannot” knowledge; Adverse effect
SLIDE 21
Addressing Issues
Multiple senses
Utilize “cannot” knowledge
Adverse effect
SLIDE 22
Addressing Issues
Multiple senses: Adding variable s
Utilize “cannot” knowledge: E-GPU Model
Adverse effect: GPU Model
SLIDE 23
M-Set and C-Set
Must-set: {Price, Cost, Money}
Cannot-set: {Price, Color, Size}
Do not enforce transitivity.
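A minimal sketch of how must-sets and cannot-sets might be held in memory, and of what enforcing transitivity would do to them. The example sets are taken from the slides (the "light" sets come from the Multiple senses slides); the function names `index_by_word` and `merge_transitively` are hypothetical and not part of the presented model.

```python
from collections import defaultdict

# Knowledge as plain sets, using the examples shown on the slides.
must_sets = [
    frozenset({"light", "bright"}),
    frozenset({"light", "heavy"}),
    frozenset({"price", "cost", "money"}),
]
cannot_sets = [frozenset({"price", "color", "size"})]


def index_by_word(sets):
    """Map each word to every set that contains it (hypothetical helper)."""
    index = defaultdict(list)
    for s in sets:
        for word in s:
            index[word].append(s)
    return index


def merge_transitively(sets):
    """Show what enforcing transitivity would do: merge sets sharing a word."""
    merged = []
    for s in sets:
        s = set(s)
        overlapping = [m for m in merged if m & s]
        for m in overlapping:
            s |= m
            merged.remove(m)
        merged.append(s)
    return merged


must_index = index_by_word(must_sets)
cannot_index = index_by_word(cannot_sets)

# Without transitivity, "light" keeps two separate senses.
print(len(must_index["light"]))
# With transitivity, "bright" and "heavy" end up in the same set.
print(merge_transitively(must_sets))
```

Keeping the sets separate preserves both senses of "light"; merging them would put "bright" and "heavy" in one set, which is the collapsed {Light, Bright, Heavy} case shown earlier.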
SLIDE 24
Addressing First Issue
Multiple senses: Adding variable s
Utilize “cannot” knowledge: E-GPU Model
Adverse effect: GPU Model
SLIDE 25
MDK-LDA (Chen et al., IJCAI 2013)
SLIDE 26
MDK-LDA (Chen et al., IJCAI 2013)
S1: {Light, Heavy, Weight}
S2: {Light, Bright, Luminance}
SLIDE 27
Addressing Second Issue
Multiple senses: Adding variable s
Utilize “cannot” knowledge: E-GPU Model
Adverse effect: GPU Model
SLIDE 28
Simple Pólya Urn Model (SPU)
SLIDE 29
Simple Pólya Urn Model (SPU)
SLIDE 30
Simple Pólya Urn Model (SPU)
SLIDE 31
Simple Pólya Urn Model (SPU)
SLIDE 32
Simple Pólya Urn Model (SPU)
SLIDE 33
Simple Pólya Urn Model (SPU)
The rich get richer!
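The urn figures did not survive extraction, so here is a small simulation of the standard simple Pólya urn: draw a ball at random, then return it together with one extra ball of the same color. The ball colors, the starting counts, and the function name `simple_polya_urn` are illustrative assumptions.

```python
import random
from collections import Counter


def simple_polya_urn(initial, draws, rng):
    """Simulate a simple Pólya urn and return the final color counts."""
    urn = Counter(initial)
    for _ in range(draws):
        colors = list(urn)
        weights = [urn[c] for c in colors]
        # Draw one ball with probability proportional to its color's count.
        drawn = rng.choices(colors, weights=weights, k=1)[0]
        # Returning the ball plus one extra of the same color nets +1.
        urn[drawn] += 1
    return urn


urn = simple_polya_urn({"red": 1, "blue": 1}, draws=100, rng=random.Random(0))
print(urn)
```

Each draw makes the drawn color more likely to be drawn again, which is the self-reinforcing behavior the slide title refers to.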
SLIDE 34
Interpreting LDA Under SPU
SLIDE 35
Interpreting LDA Under SPU
Topic 0: price
SLIDE 36
Interpreting LDA Under SPU
Topic 0: price, price
SLIDE 37
Generalized Pólya Urn Model (GPU)
SLIDE 38
Generalized Pólya Urn Model (GPU)
SLIDE 39
Generalized Pólya Urn Model (GPU)
SLIDE 40
Generalized Pólya Urn Model (GPU)
SLIDE 41
Generalized Pólya Urn Model (GPU)
SLIDE 42
Applying GPU
Topic 0: price
SLIDE 43
Applying GPU
Topic 0: price, price, cost, money
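A rough sketch of the generalized Pólya urn step suggested by this slide: when "price" is assigned to Topic 0, the topic's urn gains a "price" ball and also a fractional amount of each word that shares a must-set with it. The promotion weight of 0.3 and the function name `gpu_add` are assumptions made for illustration; the slides do not give the actual values.

```python
from collections import Counter

# Must-set from the slides; the promotion weight below is an assumed value.
must_sets = [{"price", "cost", "money"}]


def gpu_add(topic_urn, word, must_sets, promotion=0.3):
    """Add `word` to a topic urn and promote words sharing a must-set with it."""
    topic_urn[word] += 1.0
    for s in must_sets:
        if word in s:
            for other in s - {word}:
                # Related words receive a smaller, fractional increment.
                topic_urn[other] += promotion
    return topic_urn


topic0 = Counter()
gpu_add(topic0, "price", must_sets)
print(topic0)
```

Under the simple urn only the count of "price" would grow; here "cost" and "money" grow with it, so words in the same must-set tend to gain probability under the same topic.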
SLIDE 44
Addressing Third Issue
Multiple senses: Adding variable s
Utilize “cannot” knowledge: E-GPU Model
Adverse effect: GPU Model
SLIDE 45
Our Proposed E-GPU Model
Topic 0, Topic 1, Topic 2
SLIDE 46
E-GPU Model
Topic 0, Topic 1, Topic 2
price
SLIDE 47
E-GPU Model
Topic 0, Topic 1, Topic 2
price
price, money, cost
SLIDE 48
E-GPU Model
Topic 0, Topic 1, Topic 2
color
Cannot-set: {price, color}
SLIDE 49
E-GPU Model
Topic 0, Topic 1, Topic 2
color
[Figure: “color” counts of 8, 1, and 1 across the topics]
SLIDE 50
E-GPU Model
Topic 0, Topic 1, Topic 2
color
[Figure: “color” counts of 8, 1, and 1 across the topics]
SLIDE 51
E-GPU Model
Topic 0, Topic 1, Topic 2
amazon
Cannot-set: {price, amazon}
SLIDE 52
E-GPU Model
Topic 0, Topic 1, Topic 2
amazon
[Figure: “amazon” shown across the topics, with a count of 10]
SLIDE 53
E-GPU Model
Topic 0, Topic 1, Topic 2
amazon
[Figure: “amazon” shown across the topics, with a count of 10]
SLIDE 54
E-GPU Model
Topic 0, Topic 1, Topic 2, Topic 3
amazon
[Figure: “amazon” shown across the topics, with a count of 10]
SLIDE 55
Addressing Issues
Multiple senses: Adding variable s
Utilize “cannot” knowledge: E-GPU Model
Adverse effect: GPU Model
SLIDE 56
Evaluation
SLIDE 57
Evaluation
Four domains
Knowledge
Objective evaluation
Human evaluation
SLIDE 58
Model Comparison
LDA (Blei et al., 2003)
DF-LDA (Andrzejewski et al., 2009)
MC-LDA
LDA-GPU (Mimno et al., 2011)
SLIDE 59
Model Comparison
LDA, DF-LDA, MC-LDA, LDA-GPU
DF-M, DF-MC, M-LDA, MC-LDA
SLIDE 60
Model Comparison
LDA, DF-LDA, MC-LDA, LDA-GPU
DF-M, DF-MC, M-LDA, MC-LDA
Baselines
SLIDE 61
Objective Evaluation
Topic Coherence
SLIDE 62
Objective Evaluation
Topic Coherence
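The slides do not define the metric. Assuming it is the Topic Coherence score of Mimno et al. (2011), whom the deck cites for LDA-GPU, it sums log((D(v_m, v_l) + 1) / D(v_l)) over ordered pairs of a topic's top words, where D counts the documents containing the given word or words. The sketch below follows that definition; the toy corpus and the function name `topic_coherence` are hypothetical.

```python
import math


def topic_coherence(top_words, documents):
    """Topic Coherence in the style of Mimno et al. (2011); higher is better.

    Assumes every word in `top_words` occurs in at least one document,
    otherwise the denominator D(v_l) would be zero.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents containing all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            co_freq = doc_freq(top_words[m], top_words[l])
            score += math.log((co_freq + 1) / doc_freq(top_words[l]))
    return score


# Hypothetical toy corpus, one list of tokens per review.
docs = [
    ["price", "cost"],
    ["price", "cost", "money"],
    ["color", "size"],
    ["price", "color"],
]
coherent = topic_coherence(["price", "cost"], docs)
mixed = topic_coherence(["price", "size"], docs)
print(coherent > mixed)
```

Words that co-occur in many documents push the score toward zero from below, while words that never co-occur pull it down, so a topic mixing unrelated terms scores lower.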
SLIDE 63
Human Evaluation
Precision @ 5
SLIDE 64
Human Evaluation
Precision @ 10
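Precision @ k is the fraction of the top k words of an aspect that are judged correct. A minimal sketch follows; the ranked word list and the set of words marked correct are invented for illustration and are not results from the talk.

```python
def precision_at_k(ranked_words, correct_words, k):
    """Fraction of the top-k ranked words that were judged correct."""
    top = ranked_words[:k]
    return sum(1 for w in top if w in correct_words) / k


# Hypothetical top words of a "price" aspect and hypothetical human labels.
ranked = ["price", "cost", "amazon", "money", "cheap",
          "shipping", "expensive", "order", "pricy", "review"]
judged_correct = {"price", "cost", "money", "cheap", "expensive", "pricy"}

p_at_5 = precision_at_k(ranked, judged_correct, 5)
p_at_10 = precision_at_k(ranked, judged_correct, 10)
print(p_at_5, p_at_10)
```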
SLIDE 65
Example Aspects
SLIDE 66
Conclusions
Discover meaningful aspects using knowledge
SLIDE 67
Conclusions
Multiple senses
Utilize “cannot” knowledge
Adverse effect
Discover meaningful aspects using knowledge
SLIDE 68
Conclusions
Multiple senses: Adding variable s
Utilize “cannot” knowledge: E-GPU Model
Adverse effect: GPU Model
Discover meaningful aspects using knowledge
SLIDE 69
Datasets: http://www.cs.uic.edu/~zchen/
SLIDE 70