Mining the Peanut Gallery: Opinion Extraction and Semantic Classification of Product Reviews


SLIDE 1

Mining the Peanut Gallery

Opinion Extraction and Semantic Classification of Product Reviews

A paper by Kushal Dave, Steve Lawrence, and David M. Pennock

Presented by Ledao Chen and David Zhao

SLIDE 2

Problem

  • Product reviews are everywhere!
  • How can you possibly read them all?

SLIDE 3

Relevant background

  • Objectivity classification

○ Separating reviews from other content

  • Word classification

○ How similar two words are

  • Sentiment classification

○ What emotion a word is associated with

SLIDE 4

Data

  • CNET

○ 7 categories, all electronics
○ Reviews with binary good/bad ratings

  • Amazon

○ 7 categories, varied
○ Reviews with 5-star ratings

SLIDE 5

Evaluation

[Figure: Test 1 setup. Reviews from seven categories, each labeled positive or negative; one category (e.g., Cat. 4) is held out as the test fold while the others form the train fold.]

SLIDE 6

Evaluation

[Figure: Test 2 setup. The same seven categories of positive and negative reviews, split into 10 randomized train/test sets.]

SLIDE 7

Tokenization

  • Strip HTML
  • Tokenize document into sentences
  • Tokenize sentences into words

[ [“Peace” “cannot” “be” “kept” “by” “force” “;” “it” “can” “only” ...], [“Darkness” “cannot” “drive” “out” “darkness” “;” “only” “light”...], [“Hate” “cannot” “drive” “out” “hate” “;” “only” “love” “can” “do”...] ]
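A minimal sketch of this pipeline in Python; the slide names no specific tool, so NLTK's tokenizers are an assumption here:

```python
import re
import nltk  # assumed tool; requires nltk.download("punkt") once

def tokenize_document(html: str) -> list[list[str]]:
    # 1. Strip HTML tags (a crude regex; a real pipeline might use an HTML parser).
    text = re.sub(r"<[^>]+>", " ", html)
    # 2. Tokenize the document into sentences, 3. then each sentence into words.
    return [nltk.word_tokenize(sent) for sent in nltk.sent_tokenize(text)]

doc = "<p>Peace cannot be kept by force; it can only be achieved by understanding.</p>"
print(tokenize_document(doc))
# [['Peace', 'cannot', 'be', 'kept', 'by', 'force', ';', 'it', 'can', 'only', ...]]
```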

SLIDE 8

Metadata and statistical substitution

  • Numerical tokens

‒ “I have 35” → “I have number”

  • Product names

‒ “I like Nikon” & “I like Kodak” → “I like productname”

  • Low-frequency terms

‒ “Peach fuzz” and “Pollen fuzz” → “unique fuzz”

  • Product-specific terms

‒ “Lens is bad” and “RAM is bad” → “producttypeword is bad”
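A sketch of these substitutions in Python; the placeholder tokens come from the slide, but the rule order, the product-word sets, and the low-frequency cutoff are assumptions:

```python
from collections import Counter

def substitute(reviews, product_names, product_type_words, min_count=2):
    # Corpus-wide token counts, used to spot low-frequency terms.
    counts = Counter(tok for review in reviews for tok in review)

    def map_token(tok):
        if tok.isdigit():
            return "number"            # "I have 35" -> "I have number"
        if tok in product_names:
            return "productname"       # "Nikon", "Kodak" -> "productname"
        if tok in product_type_words:
            return "producttypeword"   # "Lens", "RAM" -> "producttypeword"
        if counts[tok] < min_count:
            return "unique"            # rare terms -> "unique"
        return tok

    return [[map_token(tok) for tok in review] for review in reviews]

reviews = [["I", "like", "Nikon"], ["I", "like", "Kodak"]]
print(substitute(reviews, {"Nikon", "Kodak"}, {"Lens", "RAM"}))
# [['I', 'like', 'productname'], ['I', 'like', 'productname']]
```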

SLIDE 9

Linguistic substitution

  • WordNet lookups using tokens with part-of-speech tags
  • Collocation of nouns and modifying adjectives
  • Stemming of tokens
  • Negation propagation

○ “not good or useful”→“not NOTgood NOTor NOTuseful”
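A minimal sketch of negation propagation; the trigger-word set and the rule that punctuation stops propagation are assumptions, but the output matches the slide's example:

```python
NEGATIONS = {"not", "no", "never"}       # trigger words (assumed set)
PUNCTUATION = {".", ",", ";", "!", "?"}  # propagation stops here (assumed rule)

def propagate_negation(tokens):
    out, negating = [], False
    for tok in tokens:
        if tok in PUNCTUATION:
            negating = False
            out.append(tok)
        elif negating:
            out.append("NOT" + tok)
        else:
            out.append(tok)
            if tok in NEGATIONS:
                negating = True
    return out

print(propagate_negation(["not", "good", "or", "useful"]))
# ['not', 'NOTgood', 'NOTor', 'NOTuseful']   (matches the slide's example)
```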

SLIDE 10

N-gram and proximity

  • For Test 1, trigrams performed best
  • For Test 2, bigrams performed best
  • Mixing n-grams with lower-order features

○ e.g. bigrams mixed with unigrams

  • Smoothing using a lower-order reference model
  • Proximity features

[Peace cannot be kept by force it can only be achieved by understanding] → nearby words combined into an “achieved-understanding” feature
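A sketch of mixed n-gram features (bigrams together with unigrams, as in the example above); representing features as tuples in a Counter is an assumption:

```python
from collections import Counter

def ngrams(tokens, n):
    # All contiguous n-token sequences.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def mixed_features(tokens, orders=(1, 2)):
    # Mix higher-order n-grams with lower-order features (here bigrams + unigrams).
    feats = Counter()
    for n in orders:
        feats.update(ngrams(tokens, n))
    return feats

print(mixed_features(["peace", "cannot", "be", "kept"]))
# Counter({('peace',): 1, ('cannot',): 1, ..., ('peace', 'cannot'): 1, ...})
```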

SLIDE 11

Substrings

SLIDE 12

Substring Trade-off

As substrings become longer, they generally become more discriminatory, but their frequency decreases, leaving less evidence for considering them relevant.

SLIDE 13

Thresholding

  • 1. Count the frequency of features
  • 2. Normalize (optional)
  • 3. Apply a threshold

Results across different threshold values are not significantly different.
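A sketch of these three steps; the cutoff value and the normalize-by-total convention are assumptions:

```python
from collections import Counter

def threshold_features(feature_counts: Counter, cutoff: float, normalize: bool = False):
    total = sum(feature_counts.values())
    kept = {}
    for feat, count in feature_counts.items():
        value = count / total if normalize else count  # step 2: optional normalization
        if value >= cutoff:                            # step 3: apply the threshold
            kept[feat] = value
    return kept

counts = Counter({"good": 40, "bad": 25, "fuzz": 1})   # step 1: frequencies
print(threshold_features(counts, cutoff=2))            # drops the rare "fuzz"
```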

SLIDE 14

Smoothing

  • Add-one Smoothing
  • Witten-Bell Smoothing
  • Good-Turing Smoothing
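For reference, a standard statement of the first method, add-one (Laplace) smoothing, with $c(f_i)$ the count of feature $f_i$, $N$ the total token count, and $V$ the vocabulary size (this notation is an assumption, not the slide's own):

$$P_{\text{add-one}}(f_i) = \frac{c(f_i) + 1}{N + V}$$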

SLIDE 15

Scoring

P(f_i | C) is the normalized term frequency, obtained by taking the number of times a feature f_i occurs in class C and dividing it by the total number of tokens in C. A term’s score is thus a measure of bias ranging from −1 to 1.

Baseline: score(f_i) = (P(f_i | C) − P(f_i | C′)) / (P(f_i | C) + P(f_i | C′))
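A minimal Python sketch of this baseline score; the per-class count dictionaries are assumed inputs, and counts are assumed to be smoothed so the denominator is never zero:

```python
def score(feature, counts_pos, counts_neg):
    # Normalized term frequencies P(f|C) and P(f|C'); assumes smoothed,
    # nonzero counts (see the Smoothing slide) so the denominator is positive.
    p_pos = counts_pos[feature] / sum(counts_pos.values())
    p_neg = counts_neg[feature] / sum(counts_neg.values())
    # Bias in [-1, 1]: +1 means the feature appears only in C, -1 only in C'.
    return (p_pos - p_neg) / (p_pos + p_neg)

pos = {"good": 8, "bad": 2}
neg = {"good": 2, "bad": 8}
print(score("good", pos, neg))  # 0.6
```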

SLIDE 16

Scoring

SLIDE 17

Alternatives: odds ratio

  • Performs on par with an SVM
  • Sensitive to different class sizes, and thus performs poorly on Test 1
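For reference, the standard two-class odds ratio for a feature $f_i$ (the slide's exact variant is assumed to be this form):

$$\mathrm{OR}(f_i) = \frac{P(f_i \mid C)\,\bigl(1 - P(f_i \mid C')\bigr)}{P(f_i \mid C')\,\bigl(1 - P(f_i \mid C)\bigr)}$$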

SLIDE 18

Alternatives: Fisher Discriminant

  • Performs poorly on both tests
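For reference, the usual two-class Fisher discriminant score, with per-class means $\mu_C, \mu_{C'}$ and variances $\sigma_C^2, \sigma_{C'}^2$ of a feature's frequency (notation assumed):

$$F(f_i) = \frac{(\mu_C - \mu_{C'})^2}{\sigma_C^2 + \sigma_{C'}^2}$$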

SLIDE 19
Reweighting

  • Multiplying by document frequency, dampened by a logarithm, gave better results on Test 1
  • A Gaussian weighting scheme on term frequency also gave better results on Test 1
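One plausible form of the first reweighting, with df(f_i) the number of documents containing f_i; the exact dampening used is an assumption:

$$\mathrm{score}'(f_i) = \mathrm{score}(f_i) \cdot \log\bigl(1 + \mathrm{df}(f_i)\bigr)$$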

SLIDE 20

Classifying

Basic idea: sum the scores of the words in an unknown document and use the sign of the total to determine its class.
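A minimal sketch of this classifier in Python, reusing per-feature scores like those from the Scoring slides; the label strings are assumptions:

```python
def classify(tokens, scores):
    # Sum per-feature scores; tokens without a score contribute nothing.
    total = sum(scores.get(tok, 0.0) for tok in tokens)
    return "positive" if total > 0 else "negative"

scores = {"good": 0.8, "NOTgood": -0.8, "blurry": -0.5}
print(classify(["this", "camera", "is", "good"], scores))  # positive
```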

SLIDE 21

Mining

Basic idea: crawl search engine results for a given product’s name and attempt to identify and analyze product reviews within this set.

The model is built from the review data described earlier. Certain pages, paragraphs, and sentences are discarded, such as pages without “review” in their title, paragraphs not containing the name of the product, and excessively long or short sentences.

SLIDE 22

Mining Evaluation

Randomly selected 600 sentences (200 for each of 3 products) from search-engine results, as parsed and thresholded by the mining tool. Each was manually tagged as positive (P), negative (N), or ambiguous (I). “Ambiguous” means the sentence was ambiguous when taken out of context, did not express an opinion at all, or was not describing the product. The tagging was very subjective.

P: 173   N: 71   I: 356

SLIDE 23

Mining Evaluation

Worse than tossing a coin.

SLIDE 24

Summary and Conclusions

  • Obtained fairly good results for the review classification task through the choice of appropriate features and metrics

  • Identified a number of issues that make this problem difficult

SLIDE 25

The Issues

  • Rating inconsistency

○ Reviewers interpret the 1-5 star scale differently

  • Ambivalence and comparison

○ Some reviewers use terms that have negative connotations, but then write an equivocating final sentence explaining that overall they were satisfied.

SLIDE 26
The Issues

  • Sparse data

○ Many of the reviews are very short
○ Amazon is OK, but most terms in the C|net reviews occur in three or fewer documents

  • Skewed distribution

○ Positive reviews predominate
○ Certain products and product types have more reviews
■ “Camera” itself becomes a positive feature
■ Negative reviews are longer, and their language is more varied

slide-27
SLIDE 27

Questions
