CER: Complementary Entity Recognition via Knowledge Expansion on Large Unlabeled Product Reviews


slide-1
SLIDE 1

CER: Complementary Entity Recognition via Knowledge Expansion on Large Unlabeled Product Reviews

Hu Xu, University of Illinois at Chicago
Sihong Xie, Lehigh University
Lei Shu, University of Illinois at Chicago
Philip S. Yu, University of Illinois at Chicago, Tsinghua University

BigData ’16

slide-2
SLIDE 2

My Black Friday Experience

slide-3
SLIDE 3

I have a case and want to add a new GPU.

slide-4
SLIDE 4

This is what a compatible GPU should look like.

slide-5
SLIDE 5

I found a 1070 GPU at a good price on Newegg.

slide-6
SLIDE 6

Unfortunately, the GPU is too long to fit in… and non-refundable.

slide-7
SLIDE 7

In the end, I damaged my case a little…

slide-8
SLIDE 8

What’s a better way to avoid this?

slide-9
SLIDE 9

So we need to identify the fact that the GPU does not like some cases…

slide-10
SLIDE 10

Roadmap

  • Preliminary: Complementary Entity Recognition
  • Method Overview
  • Basic Recognition via Dependency Paths
  • Knowledge Expansion on a Large Amount of Reviews
  • Experiments
  • Conclusions
slide-11
SLIDE 11

Sentiment Analysis on Reviews (Liu, 2012)

  • Product reviews contain a huge amount of information about first-hand user experiences in a sadly unstructured text format.
  • Aspect-level sentiment analysis on product reviews is a key task to understand customers’ opinions on opinion targets: products and aspects (features) of products.
  • We focus on complementary entities in reviews.
slide-12
SLIDE 12

What’s an Entity?

  • Something that has separate and distinct existence and objective or conceptual reality. (Merriam-Webster)
  • We are interested in entities related to products.
  • Named Entity
  • e.g., Samsung Galaxy S6, Microsoft Surface
  • General Entity
  • tablet, cellphone, computer, etc.
slide-13
SLIDE 13

What’s a Complementary Entity?

  • Customers also express their opinions on a relation between a reviewed product and another product.
  • One relation type is the complementary relation: two products (entities) should work together.
  • Definition:
  • target entity: the reviewed product;
  • complementary entity: the related product in a complementary relation.

  • Example:
  • This card works with my phone.
slide-14
SLIDE 14

A Few Examples

slide-15
SLIDE 15

A Few Examples with Opinions

slide-16
SLIDE 16

Complementary Entity Recognition (CER)

  • Extract complementary entities from sentences of reviews. (The target entities can be obtained from the product titles of the reviewed products.)
  • e.g., extract phone from “It works with my phone.”
  • Differences from Named Entity Recognition (NER):
  • Including general entities: e.g., case, phone.
  • Context dependent
  • e.g., “It works with my iPhone 7” vs “I like my iPhone 7.”
  • We only focus on CER in this paper since sentiment classification is an independent task and requires different techniques.

slide-17
SLIDE 17

Roadmap

  • Preliminary: Complementary Entity Recognition
  • Method Overview
  • Basic CER via Dependency Paths
  • Knowledge Expansion on a Large Amount of Reviews
  • Experiments
  • Conclusions
slide-18
SLIDE 18

Method Overview

  • We propose an unsupervised method with two components:
  • Basic CER via Dependency Paths
  • extract complementary entities from review sentences.
  • Knowledge Expansion on a Large Amount of Reviews
  • improve the precision by using the knowledge expanded on a large amount of reviews
  • the contexts of complementary entities can be noisy
  • high quality knowledge can reduce such noise via high precision paths.

slide-19
SLIDE 19

Method Overview

[Diagram with the components: Test Review, Dependency Paths for CER, Domain Reviews, Knowledge Expansion, Complementary Entity]

slide-20
SLIDE 20

Noisy Context of Complementary Entity

  • Taking a micro SD card as the target entity for example:
  • Similar context can be used for other purposes:
  • It works in my phone.
  • It works in practice.
  • It works in airplane mode.
  • The verbs used in the context of complementary entities are unlimited and domain-related:

  • It works with my phone.
  • I use it with my phone.
  • I insert this card into my phone.
  • This card likes my phone.
slide-21
SLIDE 21

Knowledge Expansion

  • We introduce two kinds of knowledge to help filter out noise:
  • Candidate Complementary Entities:
  • e.g., Samsung Galaxy S6, MS Surface, phone, tablet, etc. for micro SD card
  • Domain-Specific Verbs:
  • e.g., use, work, fit, insert for micro SD card
  • We observe that products under the same category share similar context knowledge.
  • We group reviews under the same category together in case the number of reviews for a specific product is limited.
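
As a rough illustration of how these two kinds of knowledge could filter noisy extractions, here is a minimal sketch. The slide does not say how the two knowledge sets are combined, so requiring both a known verb and a known entity is an assumption, as are the names `CANDIDATE_ENTITIES`, `DOMAIN_VERBS`, and `keep_extraction`. The entries come from the micro SD card examples above.

```python
# Toy knowledge for the micro SD card category, copied from the slide's examples.
CANDIDATE_ENTITIES = {"samsung galaxy s6", "ms surface", "phone", "tablet"}
DOMAIN_VERBS = {"use", "work", "fit", "insert"}


def keep_extraction(verb_lemma: str, candidate: str) -> bool:
    """Keep a candidate only if both its verb and the entity are known.

    Assumption: both checks must pass; the paper may combine them differently.
    """
    return verb_lemma in DOMAIN_VERBS and candidate.lower() in CANDIDATE_ENTITIES


# (verb lemma, extracted noun) pairs; lemmatization is assumed to happen upstream.
extractions = [("work", "phone"), ("work", "practice"), ("work", "airplane mode")]
kept = [noun for verb, noun in extractions if keep_extraction(verb, noun)]
print(kept)  # ['phone']
```

With this filter, "It works in practice." and "It works in airplane mode." no longer yield complementary entities, while "It works in my phone." still does.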
slide-22
SLIDE 22

Roadmap

  • Preliminary: Complementary Entity Recognition
  • Method Overview
  • Basic CER via Dependency Paths
  • Knowledge Expansion on a Large Amount of Reviews
  • Experiments
  • Conclusions
slide-23
SLIDE 23

Dependency Parsing

(De Marneffe and Manning, 2008)

  • A sentence can be parsed into a tree structure with words as nodes and typed grammar relations as edges.

  • e.g., It works with my phone.
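
A minimal sketch of how such a parse might be held in code. The edge list is written by hand in the collapsed Stanford typed-dependency style; it is an assumption about what a parser would return, not output copied from the slide, and `Edge` and `children` are illustrative names.

```python
from typing import List, NamedTuple, Tuple


class Edge(NamedTuple):
    head: str       # governor word
    relation: str   # typed grammatical relation
    dependent: str  # dependent word


# Hand-written parse of "It works with my phone." (assumed, collapsed style).
PARSE = [
    Edge("works", "nsubj", "It"),
    Edge("works", "prep_with", "phone"),
    Edge("phone", "poss", "my"),
]


def children(parse: List[Edge], head: str) -> List[Tuple[str, str]]:
    """Return the (relation, dependent) pairs attached to `head`."""
    return [(e.relation, e.dependent) for e in parse if e.head == head]


print(children(PARSE, "works"))
```

The root verb "works" has two children here: its subject "It" and, through the collapsed preposition, "phone".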
slide-24
SLIDE 24

Dependency Path

  • Paths passing through the nodes that are complementary entities can be used to extract complementary entities.
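
The actual paths are shown on the slide image and are not recoverable here, so the sketch below uses one hypothetical path: a verb that has a subject and a `prep_with` dependent, where the dependent is taken as the complementary entity. The parses are hand-written assumptions and `extract_by_path` is an illustrative name.

```python
# Each edge is (head, relation, dependent); both parses are hand-written assumptions.
WORKS_WITH = [
    ("works", "nsubj", "It"),
    ("works", "prep_with", "phone"),
    ("phone", "poss", "my"),
]
LIKE = [
    ("like", "nsubj", "I"),
    ("like", "dobj", "phone"),
    ("phone", "poss", "my"),
]


def extract_by_path(parse):
    """Apply the hypothetical path  subject <-nsubj- verb -prep_with-> entity."""
    verbs_with_subject = {head for head, rel, _ in parse if rel == "nsubj"}
    return [
        dep
        for head, rel, dep in parse
        if rel == "prep_with" and head in verbs_with_subject
    ]


print(extract_by_path(WORKS_WITH))  # ['phone']
print(extract_by_path(LIKE))        # []
```

The second parse, for "I like my phone.", mentions the same noun but does not match the path, which mirrors the context dependence of CER.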

slide-25
SLIDE 25

Dependency Paths

  • These paths (e.g., Path 6) may have low precision due to context noise; domain knowledge can help reduce that noise.

slide-26
SLIDE 26

Roadmap

  • Preliminary: Complementary Entity Recognition
  • Method Overview
  • Basic CER via Dependency Paths
  • Knowledge Expansion on a Large Amount of Reviews

  • Experiments
  • Conclusions
slide-27
SLIDE 27

Knowledge Expansion

  • To improve CER’s precision, we use domain knowledge to filter out noise.
  • Similarly, we use dependency paths to extract high quality knowledge from a large amount of reviews.
  • The dependency paths must be of high precision to ensure the quality of knowledge.
  • We use only two general seed verbs: work and fit.
slide-28
SLIDE 28

Knowledge Expansion

Seed verbs: work, fit

slide-29
SLIDE 29

Knowledge Expansion

Seed verbs: work, fit
Candidate complementary entities: tablet, phone, Samsung Galaxy S6, Microsoft Surface Pro 4

slide-30
SLIDE 30

Knowledge Expansion

Seed verbs: work, fit
Candidate complementary entities: tablet, phone, Samsung Galaxy S6, Microsoft Surface Pro 4
Domain-specific verbs: work, use, insert, fit, like
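
The expansion from seed verbs to entities and then to domain-specific verbs can be sketched as a single bootstrapping round. This is a simplification under stated assumptions: the (verb, noun) pairs are toy data standing in for what high-precision dependency paths would extract from domain reviews, there is no frequency threshold, and `expand` is an illustrative name rather than the paper's procedure.

```python
# Toy (verb lemma, noun) pairs, assumed to come from high-precision paths
# applied to unlabeled domain reviews.
PAIRS = [
    ("work", "phone"),
    ("fit", "tablet"),
    ("work", "Samsung Galaxy S6"),
    ("insert", "phone"),
    ("use", "tablet"),
    ("like", "phone"),
    ("buy", "gift"),
]


def expand(pairs, seed_verbs):
    """One expansion round: seed verbs -> entities -> domain-specific verbs."""
    # Step 1: nouns seen with a seed verb become candidate complementary entities.
    entities = {noun for verb, noun in pairs if verb in seed_verbs}
    # Step 2: verbs seen with those entities become domain-specific verbs.
    verbs = {verb for verb, noun in pairs if noun in entities}
    return entities, verbs


entities, verbs = expand(PAIRS, {"work", "fit"})
print(sorted(entities))
print(sorted(verbs))
```

In this toy run the pair ("buy", "gift") is never reached from the seeds, so neither word enters the knowledge.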

slide-31
SLIDE 31

Dependency Paths for Knowledge Expansion

slide-32
SLIDE 32

Roadmap

  • Preliminary: Complementary Entity Recognition
  • Method Overview
  • Basic CER via Dependency Paths
  • Knowledge Expansion on a Large Amount of Reviews
  • Experiments
  • Conclusions
slide-33
SLIDE 33

Experiment Setting

  • Dataset:
  • We annotated 7 products for testing purposes;
  • We collected 6000 reviews for each category of products, used for knowledge expansion.
  • We compare 10 methods:
  • 2 Noun Phrase Chunkers, NER, CRF, Sceptre, “My” Entity Path, CER, CER1K+, CER3K+, CER6K+.

slide-34
SLIDE 34

Experiment Results

  • CER6K+ performs best:
  • F1-score is more than 70%.
  • CER3K+ (with 3000 domain reviews) is already good enough.
  • Other baselines are not designed for this task.
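
For reference, the F1-score is the harmonic mean of precision and recall. The sketch below computes all three for a set of extracted entities against a gold set. The slides do not say how matches are scored, so exact string matching is an assumption, and the entity sets are toy values, not results from the paper.

```python
def precision_recall_f1(predicted, gold):
    """Score extracted entities against gold entities by exact match (assumed)."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: 2 of 3 predictions are correct, 2 of 4 gold entities are found.
p, r, f1 = precision_recall_f1(
    {"phone", "tablet", "practice"},
    {"phone", "tablet", "camera", "laptop"},
)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```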
slide-35
SLIDE 35

Experiment Results

slide-36
SLIDE 36

Domain Knowledge

slide-37
SLIDE 37

Roadmap

  • Preliminary: Complementary Entity Recognition (CER)
  • Method Overview
  • Basic Recognition via Dependency Paths
  • Knowledge Expansion on a Large Amount of Reviews
  • Experiments
  • Conclusions
slide-38
SLIDE 38

Conclusions

  • We introduce a novel task, Complementary Entity Recognition (CER), and an unsupervised method for recognition.
  • We utilize big data to expand domain knowledge and use the domain knowledge to improve the performance of recognition.
  • Future work may include:
  • sentiment classification for complementary entities;
  • automatic knowledge accumulation from data.
slide-39
SLIDE 39

Q&A

  • The annotated dataset can be found at:
  • https://www.cs.uic.edu/~hxu/CER_dataset.html
  • For details, please see the original paper:
  • Hu Xu, Sihong Xie, Lei Shu, Philip S. Yu, CER: Complementary Entity Recognition via Knowledge Expansion on Large Unlabeled Product Reviews, IEEE International Conference on Big Data 2016, Washington D.C., Dec 5-8, 2016.