Mining and Summarizing Customer Reviews
Mansuo Shen, Yonghua Yu, Weicheng Chao
Introduction
Problems
○ Too Many Reviews
  ■ Customers: Difficult to Read and Make Decisions
  ■ Manufacturers: Difficult to Track and Manage Products
○ Mine and summarize all the customer reviews of a product
  ■ Mine the Features of the Product
  ■ Positive and Negative Opinions
○ Mine product features
  ■ Data mining and natural language processing techniques
○ Identify opinion sentences
  ■ Find opinion words
○ Decide whether each opinion sentence is positive or negative
  ■ Semantic orientation
○ Summarize the results
○ NLProcessor Linguistic Parser(XML
  ■ Split Texts into Sentences
  ■ Produce the Part-of-Speech Tag for Each Word
  ■ Identify Noun and Verb Groups
Stemming, etc.
e.g. <W C='NN'>: Noun; <NG>: Noun Group / Phrase
○ Difficulty of natural language understanding
○ e.g.
  ■ "The pictures are very clear." -> Picture Quality
  ■ "While light, it will not easily fit in my pocket." -> Camera Size
○ Association Mining
  ■ Word Coverage
  ■ Frequent: More than 1% of the Review Sentences
○ Not All Features are Genuine Features
  ■ Compactness Pruning
  ■ Redundancy Pruning
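The frequency threshold above can be illustrated with a minimal sketch. It assumes each review sentence has already been reduced to its list of nouns, and it simply counts itemsets of one or two nouns instead of running a full association miner. The function name `frequent_features` and the `max_size` parameter are hypothetical, and neither compactness nor redundancy pruning is implemented here.

```python
from collections import Counter
from itertools import combinations


def frequent_features(sentences, min_support=0.01, max_size=2):
    """Return noun itemsets found in more than `min_support` of the sentences.

    sentences: list of lists of noun tokens, one inner list per review sentence.
    max_size:  largest itemset size to count (an assumption of this sketch).
    """
    counts = Counter()
    for nouns in sentences:
        # Count each itemset at most once per sentence.
        items = sorted(set(nouns))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    threshold = min_support * len(sentences)
    # "Frequent" means strictly more than the support threshold.
    return {itemset: c for itemset, c in counts.items() if c > threshold}


if __name__ == "__main__":
    demo = [
        ["picture", "quality"],
        ["picture"],
        ["battery", "life"],
        ["picture", "quality"],
    ]
    print(frequent_features(demo, min_support=0.4))
```

The candidates returned this way would still need the two pruning steps before being treated as genuine features.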
○ Definition
  ■ Contains one or more product features
Adjectives are organized into bipolar clusters. in
head synset.
Procedure:
1. Start with a seed set of common adjectives and their orientations.
2. If a given adjective has a synonym or antonym in the seed set, its orientation is known (the same as a synonym's, the opposite of an antonym's); add it to the seed set.
3. If not, keep the adjective and look it up again whenever the seed list is updated.
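The seed-set procedure can be sketched as a loop that keeps revisiting unresolved adjectives until no more can be decided. This is only an illustration: the `synonyms` and `antonyms` dictionaries are hypothetical stand-ins for a WordNet lookup, and orientations are encoded as +1 (positive) and -1 (negative) by assumption.

```python
def predict_orientations(adjectives, seeds, synonyms, antonyms):
    """Grow a seed set of adjective orientations (+1 / -1).

    seeds:    dict of adjective -> orientation for the initial seed set.
    synonyms: dict of adjective -> list of synonyms (hypothetical lookup table).
    antonyms: dict of adjective -> list of antonyms (hypothetical lookup table).
    Returns (known orientations, adjectives that could not be resolved).
    """
    known = dict(seeds)
    pending = [a for a in adjectives if a not in known]
    progress = True
    while pending and progress:
        progress = False
        still_pending = []
        for adj in pending:
            orientation = None
            for syn in synonyms.get(adj, ()):
                if syn in known:
                    orientation = known[syn]      # same as its synonym
                    break
            if orientation is None:
                for ant in antonyms.get(adj, ()):
                    if ant in known:
                        orientation = -known[ant]  # opposite of its antonym
                        break
            if orientation is None:
                still_pending.append(adj)          # retry after the seed set grows
            else:
                known[adj] = orientation
                progress = True
        pending = still_pending
    return known, pending


if __name__ == "__main__":
    seeds = {"good": 1, "bad": -1}
    synonyms = {"great": ["good"], "superb": ["great"]}
    antonyms = {"poor": ["good"]}
    print(predict_orientations(["superb", "great", "poor", "blurry"],
                               seeds, synonyms, antonyms))
```

In the demo, "superb" is only resolved on the second pass, after "great" has joined the seed set, which is the behaviour step 3 describes.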
Infrequent Features: Only a small number of people talk about them, but they can still be useful for customers and manufacturers.
Procedure: For each sentence that contains no frequent feature but has one or more opinion words, find the nearest noun/noun phrase to the opinion word and store it in the feature set as an infrequent feature.
Irrelevant nouns may be picked up this way, but this is not a serious problem.
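A minimal sketch of the nearest-noun rule follows. It assumes sentences arrive as lists of `(word, POS tag)` pairs with Penn Treebank style tags, that frequent features are single words, and that distance is measured in tokens. All three are simplifications made for this example, and noun phrases are not handled.

```python
def infrequent_features(tagged_sentences, frequent, opinion_words):
    """Collect the noun nearest to each opinion word in sentences
    that mention no frequent feature.

    tagged_sentences: list of sentences, each a list of (word, pos_tag) pairs.
    frequent:         set of frequent feature words (lowercase).
    opinion_words:    set of opinion words (lowercase).
    """
    found = set()
    for sentence in tagged_sentences:
        words = [w.lower() for w, _ in sentence]
        if any(w in frequent for w in words):
            continue  # sentence already covered by a frequent feature
        noun_positions = [i for i, (_, tag) in enumerate(sentence)
                          if tag.startswith("NN")]
        if not noun_positions:
            continue
        for i, w in enumerate(words):
            if w in opinion_words:
                # Nearest noun by token distance; ties go to the earlier noun.
                nearest = min(noun_positions, key=lambda j: abs(j - i))
                found.add(words[nearest])
    return found


if __name__ == "__main__":
    demo = [
        [("The", "DT"), ("software", "NN"), ("is", "VBZ"), ("amazing", "JJ")],
        [("The", "DT"), ("picture", "NN"), ("is", "VBZ"), ("amazing", "JJ")],
    ]
    print(infrequent_features(demo, {"picture"}, {"amazing"}))
```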
Procedure:
1. Count the positive and negative opinion words in the sentence; if one side wins, that is the sentence's orientation.
2. If there is a tie, count the effective opinions for each feature.
3. If the orientation still cannot be decided, take the orientation of the previous sentence.
Examples:
1. “Overall this is a good camera with a really good picture clarity & an exceptional close-up shooting capability.”
2. “The auto and manual along with movie modes are very easy to use, but the software is not intuitive”
If there is a negation word close to an opinion word, use the opposite of the opinion word's orientation.
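The three-step decision can be sketched as below. Several details are assumptions made for illustration only: the negation list, the two-token window that defines "close to", and the reading of a feature's "effective opinion" as the opinion word nearest to that feature. The lexicon maps opinion words to +1 or -1.

```python
NEGATIONS = {"not", "no", "never", "n't"}  # assumed list, not from the slides


def word_orientation(tokens, i, lexicon, window=2):
    """Orientation of the opinion word at position i, flipped when a
    negation word appears within `window` tokens before it (assumed rule)."""
    score = lexicon[tokens[i]]
    if any(t in NEGATIONS for t in tokens[max(0, i - window):i]):
        score = -score
    return score


def sentence_orientation(tokens, lexicon, features=(), previous=0):
    """Return +1, -1, or `previous` following the three-step procedure."""
    opinion_idx = [i for i, t in enumerate(tokens) if t in lexicon]

    # Step 1: majority of positive vs. negative opinion words.
    total = sum(word_orientation(tokens, i, lexicon) for i in opinion_idx)
    if total != 0:
        return 1 if total > 0 else -1

    # Step 2: on a tie, use each feature's effective (nearest) opinion word.
    feature_idx = [i for i, t in enumerate(tokens) if t in features]
    if opinion_idx and feature_idx:
        effective = 0
        for f in feature_idx:
            nearest = min(opinion_idx, key=lambda i: abs(i - f))
            effective += word_orientation(tokens, nearest, lexicon)
        if effective != 0:
            return 1 if effective > 0 else -1

    # Step 3: fall back to the previous sentence's orientation.
    return previous


if __name__ == "__main__":
    lexicon = {"good": 1, "easy": 1, "intuitive": 1}
    s = "the modes are very easy to use but the software is not intuitive".split()
    print(sentence_orientation(s, lexicon, features={"software"}))
```

For the shortened second example, "easy" and the negated "intuitive" cancel out in step 1, so the result depends on which features are considered in step 2.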
For each feature, list the related opinion sentences grouped by positive/negative and show both counts. All features are ranked by frequency. Feature phrases rank higher than single-word features.
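A minimal sketch of the summary step is shown below. It assumes the earlier stages produce `(feature, orientation, sentence)` triples, and it reads the ranking rule as "phrases first, then by frequency", which is one possible interpretation. The function names are hypothetical.

```python
from collections import defaultdict


def build_summary(opinion_sentences):
    """Group opinion sentences by feature and orientation, then rank features.

    opinion_sentences: iterable of (feature, orientation, sentence) triples,
                       with orientation +1 or -1.
    Returns a list of (feature, {"positive": [...], "negative": [...]}).
    """
    grouped = defaultdict(lambda: {"positive": [], "negative": []})
    for feature, orientation, sentence in opinion_sentences:
        key = "positive" if orientation > 0 else "negative"
        grouped[feature][key].append(sentence)

    def rank(item):
        feature, groups = item
        frequency = len(groups["positive"]) + len(groups["negative"])
        is_phrase = " " in feature
        # Phrases first, then higher frequency, then alphabetical order.
        return (not is_phrase, -frequency, feature)

    return sorted(grouped.items(), key=rank)


def format_summary(summary):
    """Render the summary as plain text with counts per orientation."""
    lines = []
    for feature, groups in summary:
        lines.append(f"Feature: {feature}")
        for key in ("positive", "negative"):
            lines.append(f"  {key.capitalize()}: {len(groups[key])}")
            lines.extend(f"    - {s}" for s in groups[key])
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [
        ("picture", 1, "The pictures are very clear."),
        ("picture", -1, "Pictures come out hazy in low light."),
        ("battery life", 1, "The battery life is great."),
    ]
    print(format_summary(build_summary(demo)))
```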
○ 2 digital cameras, 1 DVD player, 1 MP3 player, 1 cellular phone
○ Amazon.com and CNet.com
○ a text review and a title
Evaluate FBS (Feature-Based Summarization) from the following perspectives:
Issue 1:
Issue 2:
Two Reasons:
○ Not features at all
○ E.g. “It’s quite light”
○ E.g. “I love its resolution”
○ E.g. “The color of it is astonishing!!!!!! But the screen is not that good.”