SLIDE 1 CER: Complementary Entity Recognition
via Knowledge Expansion
on Large Unlabeled Product Reviews
Hu Xu, University of Illinois at Chicago; Sihong Xie, Lehigh University; Lei Shu, University of Illinois at Chicago; Philip S. Yu, University of Illinois at Chicago and Tsinghua University
BigData ’16
SLIDE 2
My Black Friday Experience
SLIDE 3
I have a case and want to add a new GPU.
SLIDE 4
This is what a compatible GPU should look like.
SLIDE 5
I found a 1070 GPU at a good price on Newegg.
SLIDE 6
Unfortunately, the GPU is too long to fit in… and non-refundable.
SLIDE 7
In the end, I damaged my case a little…
SLIDE 8
What’s a better way to avoid this?
SLIDE 9
So we need to identify the fact that the GPU does not like some cases…
SLIDE 10 Roadmaps
- Preliminary: Complementary Entity Recognition
- Method Overview
- Basic Recognition via Dependency Paths
- Knowledge Expansion on a Large Amount of Reviews
- Experiments
- Conclusions
SLIDE 11 Sentiment Analysis on Reviews (Liu, 2012)
- Product reviews contain a huge amount of information
about first-hand user experiences in a sadly unstructured text format.
- Aspect-level sentiment analysis on product reviews is a
key task to understand customers’ opinions on opinion targets: products and aspects (features) of products.
- We focus on complementary entities in reviews.
SLIDE 12 What’s an Entity?
- “Something that has separate and distinct existence and
objective or conceptual reality.” (Merriam-Webster)
- We are interested in entities related to products.
- Named Entity
- e.g., Samsung Galaxy S6, Microsoft Surface
- General Entity
- tablet, cellphone, computer, etc.
SLIDE 13 What’s a Complementary Entity?
- Customers also express their opinions on a relation between a reviewed
product and another product.
- One relation type is the complementary relation: two products (entities)
should work together.
- Definition:
- target entity: the reviewed product;
- complementary entity: the related product in a complementary
relation.
- Example:
- This card works with my phone.
SLIDE 14
A Few Examples
SLIDE 15
A Few Examples with Opinions
SLIDE 16 Complementary Entity Recognition (CER)
- Extract complementary entities from sentences of reviews. (The target
entity can be obtained from the title of the reviewed product.)
- e.g., extract phone from “It works with my phone.”
- Differences from Named Entity Recognition (NER):
- Including general entities: e.g., case, phone.
- Context dependent
- e.g., “It works with my iPhone 7” vs “I like my iPhone 7.”
- We only focus on CER in this paper since sentiment classification is an
independent task and requires different techniques.
SLIDE 17 Roadmaps
- Preliminary: Complementary Entity Recognition
- Method Overview
- Basic CER via Dependency Paths
- Knowledge Expansion on a Large Amount of Reviews
- Experiments
- Conclusions
SLIDE 18 Method Overview
- We propose an unsupervised method with two components:
- Basic CER via Dependency Paths
- extract complementary entities from review sentences.
- Knowledge Expansion on a Large Amount of Reviews
- improve the precision by using the knowledge expanded on a
large amount of reviews
- the contexts of complementary entities can be noisy
- high quality knowledge can reduce such noise via high
precision paths.
SLIDE 19 Method Overview
Test Review
Dependency Paths for CER
Domain Reviews
Knowledge Expansion
Complementary Entity
SLIDE 20 Noisy Context of Complementary Entity
- Taking a micro SD card as the target entity for example:
- Similar context can be used for other purposes:
- It works in my phone.
- It works in practice.
- It works in airplane mode.
- The verbs used in the context of complementary entities are open-ended and
domain-specific:
- It works with my phone.
- I use it with my phone.
- I insert this card into my phone.
- This card likes my phone.
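The noise problem above can be illustrated with a small sketch. It assumes a deliberately naive surface pattern (a preposition followed by an optional "my" and a noun phrase) and a tiny hand-written set of candidate entities; both `PATTERN` and `CANDIDATE_ENTITIES` are hypothetical and are not the paths or knowledge used in the paper.

```python
import re

# Hypothetical surface pattern: "<preposition> (my) <noun phrase>." at the
# end of a sentence. It is intentionally naive to show the noise problem.
PATTERN = re.compile(r"\b(?:in|with|into)\s+(?:my\s+)?([a-z ]+?)\.$", re.IGNORECASE)


def naive_candidate(sentence):
    """Return the phrase after the preposition, or None if nothing matches."""
    match = PATTERN.search(sentence)
    return match.group(1) if match else None


sentences = [
    "It works in my phone.",
    "It works in practice.",
    "It works in airplane mode.",
]

# Illustrative knowledge for a micro SD card (assumed, hand-written).
CANDIDATE_ENTITIES = {"phone", "tablet"}

# The same context yields one true entity and two false positives.
naive = [naive_candidate(s) for s in sentences]

# Keeping only known candidate entities removes the false positives.
filtered = [c for c in naive if c in CANDIDATE_ENTITIES]

print(naive)
print(filtered)
```

The pattern alone cannot tell "phone" from "practice"; only the extra knowledge about which phrases are plausible entities separates them.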
SLIDE 21 Knowledge Expansion
- We introduce two kinds of knowledge to help filter out noise:
- Candidate Complementary Entities:
- e.g., Samsung Galaxy S6, MS Surface, phone, tablet, etc. for
micro SD card
- Domain-Specific Verbs:
- e.g., use, work, fit, insert for micro SD card
- We observe that products under the same category share similar context
knowledge.
- We group reviews under the same category together in case the number
of reviews for a specific product is limited.
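Pooling reviews by category can be sketched as below. The record layout (dictionaries with `category` and `text` keys) and the sample reviews are assumptions made for illustration only.

```python
from collections import defaultdict


def group_reviews_by_category(reviews):
    """Pool review texts per product category.

    `reviews` is assumed to be an iterable of dicts with hypothetical
    keys "category" and "text".
    """
    grouped = defaultdict(list)
    for review in reviews:
        grouped[review["category"]].append(review["text"])
    return dict(grouped)


# Made-up sample data: two products share the "micro SD card" category.
sample_reviews = [
    {"product": "card A", "category": "micro SD card", "text": "It works with my phone."},
    {"product": "card B", "category": "micro SD card", "text": "I insert this card into my tablet."},
    {"product": "stylus C", "category": "stylus", "text": "It works with my tablet."},
]

pooled = group_reviews_by_category(sample_reviews)
print({category: len(texts) for category, texts in pooled.items()})
```

A product with few reviews of its own then benefits from the context shared by other products in the same category.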
SLIDE 22 Roadmaps
- Preliminary: Complementary Entity Recognition
- Method Overview
- Basic CER via Dependency Paths
- Knowledge Expansion on a Large Amount of Reviews
- Experiments
- Conclusions
SLIDE 23 Dependency Parsing
(De Marneffe and Manning, 2008)
- A sentence can be parsed into a tree structure with
words as nodes and typed grammar relations as edges.
- e.g., It works with my phone.
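As a rough illustration, the example sentence can be stored as (relation, head, dependent) triples. The parse below is hand-written in the style of Stanford typed dependencies and is an assumption; a real parser may produce slightly different labels.

```python
from collections import defaultdict

# Hand-written parse of "It works with my phone." (assumed, not parser output).
# Each triple is (relation, head word, dependent word).
PARSE = [
    ("nsubj", "works", "It"),
    ("prep", "works", "with"),
    ("pobj", "with", "phone"),
    ("poss", "phone", "my"),
]


def build_tree(triples):
    """Map each head word to its outgoing (relation, dependent) edges."""
    tree = defaultdict(list)
    for relation, head, dependent in triples:
        tree[head].append((relation, dependent))
    return tree


def render(tree, node, depth=0):
    """Return an indented outline of the subtree rooted at `node`."""
    lines = []
    for relation, dependent in tree.get(node, []):
        lines.append("  " * depth + f"{relation} -> {dependent}")
        lines.extend(render(tree, dependent, depth + 1))
    return lines


tree = build_tree(PARSE)
outline = render(tree, "works")
print("\n".join(outline))
```

Keying nodes by word is a simplification that only works when no word repeats in the sentence; token indices would be needed in general.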
SLIDE 24 Dependency Path
- Paths passing through the nodes that are
complementary entities can be used to extract complementary entities.
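A minimal sketch of path-based extraction follows. The path used here (governing word, then `prep`, then `pobj`) is a hypothetical example and not necessarily one of the paths defined in the paper; both parses are hand-written and report only the head word of the entity.

```python
def extract_by_path(triples, prepositions=("with", "in", "into")):
    """Return words reached by the hypothetical path: head -prep-> P -pobj-> X.

    `triples` are (relation, head, dependent) tuples. Words are used as node
    ids, which is a simplification that assumes no repeated words.
    """
    # Prepositions that are attached to some governing word via "prep".
    attached_preps = {dep for rel, head, dep in triples if rel == "prep"}
    found = []
    for rel, head, dep in triples:
        if rel == "pobj" and head in attached_preps and head in prepositions:
            found.append(dep)
    return found


# "It works with my iPhone 7." (hand-written parse, assumed)
works_with = [
    ("nsubj", "works", "It"),
    ("prep", "works", "with"),
    ("pobj", "with", "iPhone"),
    ("poss", "iPhone", "my"),
    ("num", "iPhone", "7"),
]

# "I like my iPhone 7." (hand-written parse, assumed)
like_my = [
    ("nsubj", "like", "I"),
    ("dobj", "like", "iPhone"),
    ("poss", "iPhone", "my"),
    ("num", "iPhone", "7"),
]

print(extract_by_path(works_with))
print(extract_by_path(like_my))
```

The same noun is extracted in the first sentence and ignored in the second, which is the context dependence that separates CER from NER.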
SLIDE 25 Dependency Paths
- These paths (e.g., Path 6) may have low precision due to
noisy contexts; domain knowledge can help reduce such noise.
SLIDE 26 Roadmaps
- Preliminary: Complementary Entity Recognition
- Method Overview
- Basic CER via Dependency Paths
- Knowledge Expansion on a Large Amount of
Reviews
SLIDE 27 Knowledge Expansion
- To improve CER’s precision, we use domain
knowledge to filter out noise.
- Similarly, we use dependency paths to extract
high-quality knowledge from a large amount of reviews.
- The dependency paths must be of high precision
to ensure the quality of knowledge.
- We use only two general seed verbs: work and fit.
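The expansion from seed verbs can be sketched as a two-step bootstrap. The sketch assumes that (verb, entity) pairs have already been extracted from domain reviews by high-precision dependency paths; the pairs below are made up, and the function `expand_knowledge` is a hypothetical simplification rather than the paper's procedure.

```python
SEED_VERBS = {"work", "fit"}


def expand_knowledge(pairs, seed_verbs=SEED_VERBS):
    """Expand seed verbs into candidate entities, then into domain verbs.

    `pairs` is assumed to hold (verb lemma, entity) tuples already extracted
    from unlabeled domain reviews.
    """
    # Step 1: entities that co-occur with a seed verb become candidates.
    entities = {entity for verb, entity in pairs if verb in seed_verbs}
    # Step 2: verbs that co-occur with a candidate entity become domain verbs.
    verbs = {verb for verb, entity in pairs if entity in entities}
    return entities, verbs | set(seed_verbs)


# Made-up pairs for the micro SD card category.
pairs = [
    ("work", "phone"),
    ("fit", "tablet"),
    ("work", "Samsung Galaxy S6"),
    ("use", "phone"),
    ("insert", "tablet"),
    ("buy", "store"),
]

entities, verbs = expand_knowledge(pairs)
print(sorted(entities))
print(sorted(verbs))
```

The pair ("buy", "store") never touches a seed verb or a candidate entity, so it contributes nothing to either kind of knowledge.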
SLIDE 28 Knowledge Expansion
Seed verbs: work, fit
SLIDE 29 Knowledge Expansion
Seed verbs (work, fit) -> candidate complementary entities: tablet, phone, Samsung Galaxy S6, Microsoft Surface Pro 4
SLIDE 30 Knowledge Expansion
Seed verbs (work, fit) -> candidate complementary entities (tablet, phone, Samsung Galaxy S6, Microsoft Surface Pro 4) -> domain-specific verbs: work, use, insert, fit, like
SLIDE 31
Dependency Paths for Knowledge Expansion
SLIDE 32 Roadmaps
- Preliminary: Complementary Entity Recognition
- Method Overview
- Basic CER via Dependency Paths
- Knowledge Expansion on a Large Amount of Reviews
- Experiments
- Conclusions
SLIDE 33 Experiment Setting
- Dataset:
- We annotated 7 products for testing;
- We collected 6000 reviews for each category of
products, used for knowledge expansion.
- We compare 10 methods:
- 2 Noun Phrase Chunkers, NER, CRF, Sceptre, “My”
Entity Path, CER, CER1K+, CER3K+, CER6K+.
SLIDE 34 Experiment Results
- CER6K+ performs best:
- F1-score is more than 70%.
- CER3K+ (with 3000 domain reviews) is already
good enough.
- Other baselines are not designed for this task.
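For reference, the F1-score combines precision and recall as their harmonic mean. The sketch below computes all three over sets of extracted and annotated entities; the example sets are made up and are not results from the paper, and this is not the authors' evaluation script.

```python
def precision_recall_f1(predicted, gold):
    """Compute precision, recall and F1 over two collections of entities."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: 3 extracted entities, 4 annotated ones, 2 in common.
predicted = {"phone", "tablet", "practice"}
gold = {"phone", "tablet", "camera", "laptop"}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```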
SLIDE 35
Experiment Results
SLIDE 36
Domain Knowledge
SLIDE 37 Roadmaps
- Preliminary: Complementary Entity Recognition (CER)
- Method Overview
- Basic Recognition via Dependency Paths
- Knowledge Expansion on a Large Amount of Reviews
- Experiments
- Conclusions
SLIDE 38 Conclusions
- We introduce a novel task, Complementary Entity
Recognition (CER), and an unsupervised method for recognition.
- We utilize big data to expand domain knowledge and use
the domain knowledge to improve the performance of recognition.
- Future work includes
- sentiment classification for complementary entities;
- automatic knowledge accumulation from data.
SLIDE 39 Q&A
- The annotated dataset can be found at:
- https://www.cs.uic.edu/~hxu/CER_dataset.html
- For details, please see the original paper:
- Hu Xu, Sihong Xie, Lei Shu, Philip S. Yu, CER:
Complementary Entity Recognition via Knowledge Expansion on Large Unlabeled Product Reviews, IEEE International Conference on Big Data 2016, Washington D.C., Dec 5-8, 2016.