Mining and Summarizing Customer Reviews ---- Mansuo Shen, Yonghua - - PowerPoint PPT Presentation



SLIDE 1

Mining and Summarizing Customer Reviews

  • --- Mansuo Shen, Yonghua Yu, Weicheng Chao
SLIDE 2

Introduction

  • Problems

    ○ Too Many Reviews
        ■ Customers: Difficult to Read and Make Decisions
        ■ Manufacturers: Difficult to Track and Manage Products

  • Goal

    ○ Mine and summarize all the customer reviews of a product
        ■ Mine the Features of the Product
        ■ Positive and Negative Opinions

SLIDE 3

Introduction

  • Tasks

    ○ Mine product features
        ■ Data mining and natural language processing techniques
    ○ Identify opinion sentences
        ■ Find opinion words
    ○ Decide whether each opinion sentence is positive or negative
        ■ Semantic orientation
    ○ Summarize the results

SLIDE 4

Related Work

  • Subjective Genre Classification
  • Sentiment Classification
  • Text Summarization
  • Terminology Finding
SLIDE 5

The Proposed Techniques

  • -- Part-of-Speech (POS) Tagging
  • Product Features: Usually Nouns
  • Tool

    ○ NLProcessor Linguistic Parser (XML outputs)
        ■ Split Texts into Sentences
        ■ Produce the Part-of-Speech Tag for Each Word
        ■ Identify Noun and Verb Groups
        ■ e.g. <W C='NN'>: Noun; <NG>: Noun Group / Phrase

  • Saved in Database
  • Transaction File Generated
  • Preprocessing: Stopword Removal, Stemming, etc.
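A minimal sketch of how tagged output of this kind could be read to collect candidate nouns. Only the `<W C='NN'>` and `<NG>` elements come from the slide; the `<S>` and `<VG>` wrappers, the sample sentence, the stopword list, and the helper name `noun_groups` are assumptions for illustration, since the full NLProcessor schema is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical tagged sentence. Only <W C='..'> and <NG> appear in the slides;
# the <S> and <VG> wrappers are assumed.
SAMPLE = (
    "<S>"
    "<NG><W C='DT'>The</W><W C='NN'>picture</W><W C='NN'>quality</W></NG>"
    "<VG><W C='VBZ'>is</W></VG>"
    "<W C='JJ'>great</W>"
    "</S>"
)

STOPWORDS = {"the", "a", "an", "is", "are"}  # tiny illustrative list


def noun_groups(xml_text):
    """Return each noun group as a list of lowercased nouns, stopwords removed."""
    root = ET.fromstring(xml_text)
    groups = []
    for ng in root.iter("NG"):
        words = []
        for w in ng.iter("W"):
            text = (w.text or "").lower()
            # Keep only words whose tag starts with NN (NN, NNS, ...).
            if w.get("C", "").startswith("NN") and text not in STOPWORDS:
                words.append(text)
        if words:
            groups.append(words)
    return groups


print(noun_groups(SAMPLE))  # [['picture', 'quality']]
```

Each returned list would correspond to one entry in the transaction file for that sentence. Stemming is left out here.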

SLIDE 6

The Proposed Techniques

  • -- Frequent Feature Identification
  • Problem

    ○ Difficulty of natural language understanding
    ○ e.g.
        ■ The pictures are very clear. ---- Picture Quality
        ■ While light, it will not easily fit in my pocket. ---- Camera Size

  • Frequent Features

    ○ Association Mining
        ■ Word Coverage
        ■ Frequent: More than 1% of the Review Sentences
    ○ Not All Frequent Features Are Genuine Features
        ■ Compactness Pruning
            • Prune candidate feature phrases whose words do not appear together in a specific order
        ■ Redundancy Pruning
            • p-support
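The frequency threshold and p-support can be sketched as follows. This is a brute-force count over small word sets, not an association-rule miner. The 1% default comes from the slide; the function names, the `max_size` limit, and the reading of p-support as "sentences that contain the feature but no larger candidate that includes it" are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support=0.01, max_size=2):
    """Keep word sets that occur in more than min_support of the sentences."""
    counts = Counter()
    for words in transactions:
        unique = sorted(set(words))
        for size in range(1, max_size + 1):
            for itemset in combinations(unique, size):
                counts[frozenset(itemset)] += 1
    threshold = min_support * len(transactions)
    return {items: n for items, n in counts.items() if n > threshold}


def p_support(feature, transactions, candidates):
    """Count sentences with `feature` but without any larger candidate containing it."""
    supersets = [c for c in candidates if feature < c]
    total = 0
    for words in transactions:
        present = set(words)
        if feature <= present and not any(s <= present for s in supersets):
            total += 1
    return total


# Toy transaction file: one list of nouns per review sentence.
sentences = [["battery", "life"]] * 3 + [["life"], ["picture"]]
frequent = frequent_itemsets(sentences, min_support=0.5)
print(p_support(frozenset({"life"}), sentences, frequent))     # 1
print(p_support(frozenset({"battery"}), sentences, frequent))  # 0
```

In this toy data "battery" never occurs outside "battery life", so its p-support is 0 and it would be a candidate for redundancy pruning.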
SLIDE 7

The Proposed Techniques

  • Opinion Words Extraction
  • Opinion Sentence

    ○ Definition
        ■ Contains one or more product features

SLIDE 8

The Proposed Techniques

  • -- Orientation Identification for Opinion Words

In WordNet, adjectives are organized into bipolar clusters. Satellite synsets have a sense similar to that of the head synset.

Procedure:
1. Set up a seed set of common adjectives with known orientations.
2. If a given adjective has a synonym or antonym in the seed set, its orientation is known (the same as a synonym's, the opposite of an antonym's); add it to the seed set.
3. If not, keep the adjective and look it up again whenever the seed set is updated.
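The seed-set procedure above can be sketched as a loop that keeps retrying unresolved adjectives until a pass adds nothing new. To stay self-contained, the sketch uses small hand-written synonym and antonym dictionaries in place of WordNet lookups; those dictionaries and the function name `propagate_orientation` are assumptions for illustration.

```python
def propagate_orientation(adjectives, seeds, synonyms, antonyms):
    """Grow the seed set: +1 positive, -1 negative. Returns (known, unresolved)."""
    known = dict(seeds)
    pending = list(adjectives)
    changed = True
    while pending and changed:
        changed = False
        still_pending = []
        for adj in pending:
            orientation = known.get(adj)
            if orientation is None:
                # A synonym in the seed set gives the same orientation.
                for syn in synonyms.get(adj, ()):
                    if syn in known:
                        orientation = known[syn]
                        break
            if orientation is None:
                # An antonym in the seed set gives the opposite orientation.
                for ant in antonyms.get(adj, ()):
                    if ant in known:
                        orientation = -known[ant]
                        break
            if orientation is None:
                still_pending.append(adj)  # retry after the seed set grows
            else:
                known[adj] = orientation
                changed = True
        pending = still_pending
    return known, pending


seeds = {"good": 1, "bad": -1}
synonyms = {"great": ["good"], "superb": ["great"]}  # toy stand-in for WordNet
antonyms = {"poor": ["good"]}
known, unresolved = propagate_orientation(
    ["superb", "great", "poor", "blue"], seeds, synonyms, antonyms
)
print(known["superb"], known["poor"], unresolved)  # 1 -1 ['blue']
```

"superb" is only resolved on the second pass, after "great" has joined the seed set, which is the point of step 3.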

SLIDE 9

The Proposed Techniques

  • -- Infrequent Feature Identification

Infrequent Features: Only a small number of people talk about them, but they can still be useful to customers and manufacturers.

Procedure: For each sentence that does not contain a frequent feature but has one or more opinion words, find the nearest noun/noun phrase around the opinion word and store it in the feature set as an infrequent feature.

Irrelevant nouns: not a serious problem

  • Infrequent features account for a small part of the whole feature set.
  • Infrequent features have lower p-support, so they are less important when ranked.
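A minimal sketch of the nearest-noun rule, assuming each sentence arrives as a list of `(word, POS tag)` pairs and that features are single words. Noun phrases are not handled, and the function name and sample data are hypothetical.

```python
def infrequent_features(tagged_sentence, frequent_features, opinion_words):
    """Return nouns nearest to opinion words in a sentence with no frequent feature."""
    words = [w.lower() for w, _ in tagged_sentence]
    if any(w in frequent_features for w in words):
        return []  # sentence already covered by a frequent feature
    noun_positions = [
        i for i, (_, tag) in enumerate(tagged_sentence) if tag.startswith("NN")
    ]
    found = []
    for i, w in enumerate(words):
        if w in opinion_words and noun_positions:
            # Nearest noun by token distance; ties go to the earlier noun.
            nearest = min(noun_positions, key=lambda j: abs(j - i))
            if words[nearest] not in found:
                found.append(words[nearest])
    return found


sentence = [("The", "DT"), ("strap", "NN"), ("is", "VBZ"), ("horrible", "JJ")]
print(infrequent_features(sentence, {"picture"}, {"horrible"}))  # ['strap']
```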
SLIDE 10

The Proposed Techniques

  • -- Predicting the Orientations of Opinion Sentences

Procedure:
1. Count the positive and negative opinion words in the sentence; if one side wins, that is the sentence’s orientation.
2. If there is a tie, count the effective opinions for each feature.
3. If it still cannot be decided, take the orientation of the previous sentence.

Examples:
1. “Overall this is a good camera with a really good picture clarity & an exceptional close-up shooting capability.”
2. “The auto and manual along with movie modes are very easy to use, but the software is not intuitive”

If there is a negation word close to an opinion word, use its opposite orientation.
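The three steps and the negation rule can be sketched as below. Several details are assumptions, since the slide does not spell them out: "close to" is read as "within the two preceding tokens", an "effective opinion" is read as the opinion word nearest to each feature, and the lexicon, negation list, and function names are illustrative.

```python
NEGATIONS = {"not", "no", "never"}  # illustrative list


def word_orientations(tokens, lexicon, window=2):
    """Return (position, orientation) per opinion word, flipped if negated nearby."""
    result = []
    for i, tok in enumerate(tokens):
        if tok in lexicon:
            score = lexicon[tok]
            if any(t in NEGATIONS for t in tokens[max(0, i - window):i]):
                score = -score
            result.append((i, score))
    return result


def sentence_orientation(tokens, lexicon, features, previous=0):
    """Return +1, -1, or the previous sentence's orientation as a fallback."""
    opinions = word_orientations(tokens, lexicon)
    total = sum(score for _, score in opinions)  # step 1: majority of opinion words
    if total == 0 and opinions:
        # Step 2 (assumed reading): nearest opinion word to each feature.
        for i, tok in enumerate(tokens):
            if tok in features:
                _, score = min(opinions, key=lambda p: abs(p[0] - i))
                total += score
    if total == 0:
        return previous  # step 3: inherit from the previous sentence
    return 1 if total > 0 else -1


lexicon = {"good": 1, "easy": 1, "intuitive": 1}  # toy lexicon, +1 = positive
print(sentence_orientation("this is a good camera".split(), lexicon, {"camera"}))
tied = "the modes are very easy to use but the software is not intuitive".split()
print(sentence_orientation(tied, lexicon, {"modes", "software"}, previous=1))
```

In the second call the positive "easy" and the negated "intuitive" cancel out in steps 1 and 2, so the result falls back to the value passed as `previous`.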

SLIDE 11

The Proposed Techniques

  • -- Summary Generation

For each feature, list related opinion sentences grouped by positive/negative and show both counts. All features are ranked by frequency. Feature phrases have a higher rank.
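A sketch of the summary layout described above, assuming each opinion sentence has already been labelled with a feature and an orientation (+1 or -1). The output format is hypothetical, and "feature phrases have a higher rank" is read here as "multi-word features are listed before single-word ones".

```python
from collections import defaultdict


def build_summary(tagged_opinions):
    """Group sentences per feature and orientation, then rank and format them."""
    grouped = defaultdict(lambda: {1: [], -1: []})
    for feature, orientation, sentence in tagged_opinions:
        grouped[feature][orientation].append(sentence)

    def rank_key(feature):
        total = len(grouped[feature][1]) + len(grouped[feature][-1])
        is_phrase = " " in feature
        # Phrases first, then higher frequency, then alphabetical order.
        return (not is_phrase, -total, feature)

    lines = []
    for feature in sorted(grouped, key=rank_key):
        lines.append(f"Feature: {feature}")
        for label, key in (("Positive", 1), ("Negative", -1)):
            sentences = grouped[feature][key]
            lines.append(f"  {label}: {len(sentences)}")
            lines.extend(f"    - {s}" for s in sentences)
    return "\n".join(lines)


opinions = [
    ("size", 1, "Fits in a pocket."),
    ("picture quality", 1, "The pictures are very clear."),
    ("size", -1, "Too bulky."),
    ("size", 1, "Nice and small."),
]
print(build_summary(opinions))
```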

SLIDE 12

Evaluation

  • Reviews are from:

    ○ 2 digital cameras, 1 DVD player, 1 MP3 player, 1 cellular phone
    ○ Amazon.com and CNet.com

  • Reviews contain:

○ a text review and a title

  • The first 100 reviews of each product were crawled and downloaded
  • Using NLProcessor for generating POS tags
  • Manually tagging
SLIDE 13

Evaluation

Evaluate FBS (Feature-Based Summarization) from the following perspectives:

  • Effectiveness of feature extraction
  • Effectiveness of opinion sentence extraction
  • Accuracy of orientation prediction of opinion sentences
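The slide does not name the metrics. One common way to measure extraction effectiveness against manual tags is precision and recall, sketched here as an assumption; the function name and the feature sets are made up for illustration and are not results from the evaluation.

```python
def precision_recall(extracted, manual):
    """Compare extracted items against manually tagged ones."""
    extracted, manual = set(extracted), set(manual)
    correct = len(extracted & manual)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(manual) if manual else 0.0
    return precision, recall


# Made-up example sets, not data from the slides.
found = {"picture", "size", "battery", "strap"}
tagged = {"picture", "size", "battery", "zoom", "flash"}
print(precision_recall(found, tagged))  # (0.75, 0.6)
```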
SLIDE 14

Evaluation

Issue 1:

  • --“The taste of this burger is quite amazing”
  • --“This burger is coming from heaven”
  • --“The burger’s taste makes me say NO to any other burgers”

Issue 2:

  • --“The taste makes me wander”
SLIDE 15

Evaluation

SLIDE 16

Evaluation

SLIDE 17

Evaluation

Two Reasons:

  • FASTR generates a large number of terms

○ Not features at all

  • FASTR does not find one-word terms
SLIDE 18

Evaluation

SLIDE 19

Limitations

  • Pronoun resolution is hard

○ E.g. “It’s quite light”

  • Only adjectives are considered as indicators of opinion orientations

○ E.g. “I love its resolution”

  • Strength of opinions is not considered

○ E.g. “The color of it is astonishing!!!!!! But the screen is not that good.”

SLIDE 20

Conclusion

  • Provide a feature-based summary of a large number of customer reviews of a product sold online.
  • The experimental results indicate that the proposed techniques are very promising.
  • The problem will become increasingly important as more people buy products online.
SLIDE 21

Questions?