Ashequl Qadir
University of Wolverhampton, UK
ashequl.qadir@wlv.ac.uk
Outline
- Introduction
- Related Works
- Methodology
- Results
- Improvement Challenges
- Conclusion and Future works
Introduction
Review Examples
Introduction
Customers write reviews concerning their satisfaction and
criticisms.
Products have their functional features.
(Ex. Performance, Portability etc.)
Customers study reviews of product features prior to buying.
Reviews contain:
- Opinions (Ex. It is very easy and simple to use)
- Factual information (Ex. It has a pink metal case)
Opinions can be:
- General (Ex. I am very happy with this product)
- Product feature specific (Ex. It works nicely – Performance feature)
Introduction
Explicit product feature:
“The design looks nice” – Explicitly mentioned design feature.
Implicit Product feature:
“I found this product really useful for transport” – Implicitly present
portability feature.
Web contains unstructured opinion texts.
Summarizing feature specific opinions by reading is time consuming.
Opinion Sources: Review sites (epinions.com), e-commerce sites
(amazon.com), blogs etc.
Goals:
- Opinion Detection
- Product feature assignment
Related works
Opinion identification:
- Specific orientation of POS tags (Turney, 2002)
- Seed list of lexicons (Godbole et al., 2007; Kim et al., 2005)
- N-gram subjectivity clue (Riloff et al., 2003)
- Word co-locations (Wiebe et al., 2001)
- Dependency relations of words (Wilson et al., 2004; Fei et al., 2006)
Explicit product feature identification:
- By extracting noun phrases (Yi et al., 2003, 2005; Hu et al., 2004)
- By parts and properties of a product (Popescu et al., 2005)
- By approaching as a classification problem (Ghani et al., 2006)
- By using frequent word associations (Qadir, 2009)
- By using dependency grammar graph (Zhuang et al., 2006)
Approach in this paper differs by:
- Utilizing typed dependency relations to detect opinion
- Using statistical methods for identifying associated words.
- Utilizing tf.idf weight score to assign a product feature to a review
line.
- Not discriminating between explicit and implicit product features.
Typed Dependency Relations
- Words in a sentence have certain grammatical dependency
relations.
- Example: It was incredibly easy to set up and use.
- nsubj (nominal subject): syntactic subject of a clause
- cop (copula): relation between the complement of a copular verb and the copular verb
- advmod (adverbial modifier): adverb that modifies the meaning of a word
- xcomp (open clausal complement): clausal complement without its own subject
- aux (auxiliary): non-main verb of the clause
- prt (phrasal verb particle): relation between a verb and its particle in a phrasal verb
- cc (coordination): relation between an element of a conjunct and the coordinating conjunction
- conj (conjunct): relation between two elements connected by a coordinating conjunction
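[Figure: dependency graph of the example sentence, rooted at "easy". Edge labels by word: It (nsubj), was (cop), incredibly (advmod), set (xcomp), to (aux), up (prt), and (cc), use (conj).]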
Typed Dependency Relations
Features of Stanford typed dependencies:
- Offers more fine-grained distinctions (de Marneffe, 2008) than the PARC representations (King, 2003).
- Ex. Breaking down adjunct into amod, xcomp, prep_of (de Marneffe, 2008)
- Uses Penn Treebank part-of-speech tags.
- Defines 55 binary grammatical relations.
- Relation consists of a governor and a dependent.
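Since each relation pairs a governor with a dependent, it can be held as a small record. The sketch below is only an illustration: the `Dependency` type and the `parse_dependency` helper are hypothetical names, and the sample strings are the relations of the example sentence "It was incredibly easy to set up and use."

```python
import re
from typing import NamedTuple


class Dependency(NamedTuple):
    relation: str   # relation name, e.g. "xcomp"
    governor: str   # head word of the relation
    dependent: str  # word that depends on the governor


# Matches strings of the form "relation(governor, dependent)".
_PATTERN = re.compile(r"^\s*(\w+)\s*\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*\)\s*$")


def parse_dependency(text: str) -> Dependency:
    """Parse 'relation(governor, dependent)' into a Dependency record."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a typed dependency: {text!r}")
    return Dependency(*match.groups())


example = [
    parse_dependency(s)
    for s in (
        "xcomp(easy, set)",
        "conj(set, use)",
        "prt(set, up)",
        "advmod(easy, incredibly)",
    )
]
```

Keeping the governor and dependent as separate fields makes it straightforward to test the part of speech of each side independently.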
Typed Dependency Relations
Example: It was incredibly easy to set up and use.
- Words representing functional feature of product: set up , use
(Usability)
- Word representing opinion: easy
- Word representing degree of subjectivity: incredibly
Dependency relations:
- xcomp(easy, set)
- conj(set, use)
- prt(set, up)
- advmod(easy, incredibly)
- POS that can represent a functional feature: VB
- Ex. uses, works, looks, fits, costs
- POS that can represent a feature: Adj/NN
- Ex. reliable, portable (Adj)
- Ex. design, price (NN)
- Representations that can establish a meaningful relation between
these parts-of-speech are chosen
Selected Typed Dependencies
acomp – Adjectival Complement
Description:
An adjectival complement (acomp) of a VP is an adjectival phrase that functions as the complement (like an object of the verb).
- Example:

Dependency Relation   Component Example         Indication
acomp                 worked/VBD fine/JJ        Possible Opinion
acomp                 proved/VBN reliable/JJ    Possible Opinion
acomp                 works/VBZ well/JJ         Possible Opinion

Tag   POS
VBD   Verb, past tense
VBN   Verb, past participle
VBZ   Verb, 3rd person singular present
JJ    Adjective
Selected Typed Dependencies
xcomp – Open Clausal Complement
Definition:
An open clausal complement (xcomp) of a VP or an ADJP is a clausal complement without its own subject, whose reference is determined by an external subject.
- Example:

Dependency Relation   Component Example                 Indication
xcomp                 easy/JJ use/VB                    Possible Opinion
xcomp                 rendering/VBG impossible/JJ       Possible Opinion
xcomp                 found/VBD difficult/JJ            Possible Opinion
xcomp                 makes/VBZ ideal/JJ                Possible Opinion
xcomp                 find/VBP convenient/JJ            Possible Opinion
xcomp                 experienced/VBN similar/JJ        Not Opinion

Tag   POS
JJ    Adjective
VB    Verb, base form
VBG   Verb, gerund or present participle
VBD   Verb, past tense
VBZ   Verb, 3rd person singular present
VBP   Verb, non-3rd person singular present
VBN   Verb, past participle
Selected Typed Dependencies
advmod – Adverbial Modifier
Definition:
An adverbial modifier (advmod) of a word is a (nonclausal) RB or ADVP that serves to modify the meaning of the word.
- Example:

Dependency Relation   Component Example               Indication
advmod                well/JJ amazingly/RB            Possible Opinion
advmod                easily/RB very/RB               Possible Opinion
advmod                loads/VBD fast/RB               Possible Opinion
advmod                looks/VBZ especially/RB         Not Opinion
advmod                fits/VBZ perfectly/RB           Possible Opinion
advmod                recognized/VBN straight/RB      Not Opinion
advmod                satisfied/VBN very/RB           Possible Opinion
advmod                priced/VBN reasonably/RB        Possible Opinion

Tag   POS
JJ    Adjective
RB    Adverb
VBD   Verb, past tense
VBZ   Verb, 3rd person singular present
VBN   Verb, past participle
Selected Typed Dependencies
amod – Adjectival Modifier
Definition:
An adjectival modifier of an NP is any adjectival phrase that serves to modify the meaning of the NP.
Examples:
- It has a nice design.
- amod(design, nice) – Opinion (design)
- I bought this new camera last year.
- amod(camera, new) – Not Opinion
- It has a pink cover.
- amod(cover, pink) – Not Opinion
- It looks nice.
- acomp(looks, nice) – Opinion (design)
Opinion Detection Algorithm
for each sentence in review text
    set Opinion_Flag = False
    check acomp_presence
    if present
        if governor is any form of verb
            if dependent is any form of adjective
                set Opinion_Flag = True
    else if check xcomp_presence
        if present
            if governor is any form of adjective
                if dependent is any form of verb
                    set Opinion_Flag = True
            else if governor is any form of verb
                if dependent is any form of adjective
                    set Opinion_Flag = True
    else if check advmod_presence
        if present
            if dependent is any form of adverb
                if governor is any form of verb
                    set Opinion_Flag = True
                else if governor is any form of adverb
                    set Opinion_Flag = True
                else if governor is any form of adjective
                    set Opinion_Flag = True
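The detection rules can be sketched in Python as follows. This is a minimal illustration, not the original implementation: it assumes each relation arrives with the Penn Treebank tags of its governor and dependent, it treats "any form of" as a tag-prefix test (VB*, JJ*, RB*), and it flags a sentence as soon as any single relation matches its pattern, which is one possible reading of the else-if chain. `TaggedDependency` and the function names are hypothetical.

```python
from typing import Iterable, NamedTuple


class TaggedDependency(NamedTuple):
    relation: str       # "acomp", "xcomp", "advmod", ...
    governor_pos: str   # Penn Treebank tag of the governor, e.g. "VBZ"
    dependent_pos: str  # Penn Treebank tag of the dependent, e.g. "JJ"


def is_verb(tag: str) -> bool:
    return tag.startswith("VB")


def is_adjective(tag: str) -> bool:
    return tag.startswith("JJ")


def is_adverb(tag: str) -> bool:
    return tag.startswith("RB")


def matches_opinion_pattern(dep: TaggedDependency) -> bool:
    """Apply the POS pattern that belongs to the relation type."""
    gov, dep_pos = dep.governor_pos, dep.dependent_pos
    if dep.relation == "acomp":
        return is_verb(gov) and is_adjective(dep_pos)
    if dep.relation == "xcomp":
        return (is_adjective(gov) and is_verb(dep_pos)) or (
            is_verb(gov) and is_adjective(dep_pos)
        )
    if dep.relation == "advmod":
        return is_adverb(dep_pos) and (
            is_verb(gov) or is_adverb(gov) or is_adjective(gov)
        )
    return False  # other relations (e.g. amod) are not used for detection


def is_opinion_sentence(dependencies: Iterable[TaggedDependency]) -> bool:
    """Return the Opinion_Flag for one sentence."""
    return any(matches_opinion_pattern(d) for d in dependencies)
```

For the sentence "It was incredibly easy to set up and use.", the relation xcomp(easy/JJ, set/VB) alone is enough to set the flag under these rules.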
Pre-processed Data
- Manually annotated 50 reviews for training, 50 reviews for testing.
- Product Domain: Electronics, Product Type: Hard disk.
- Example:
Product Feature   Opinion Sentence
Usability         ‘It was incredibly easy to set up and use.’
Design            ‘I like its design and the fact that I only need one cable.’
Performance       ‘Works perfectly and is completely reliable, no problem at all.’
Portability       ‘I found this product really useful for transport as it is that small.’
Speed             ‘The speed and capacity of the Passport drive are impressive.’
General           ‘A satisfying product.’
Product Feature Assignment
Counting Frequent Words:
- Only words in the components of the typed dependency relations are counted.
- Function words are ignored.
- Lemmatization is used to ensure counting of only the base form of words.
- Word counts are done within product feature scopes.
Product Feature Assignment
Let:
- Total number of review lines: $N$
- Set of product features: $P = \{p_1, p_2, \ldots, p_j\}$
- Frequency of the word $w$ at review line $i$ associated with product feature $p_j$: $w_{i,j}$
The word frequency count $WC_j$ for word $w$ within the $p_j$ product feature scope can be denoted by the following equation:
$$WC_j = \sum_{i=1}^{N} w_{i,j}$$
For different values of $j$, the word frequency of the same word $w$ will be different because the associated product feature $p_j$ will be different.
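As a rough illustration of counting within product feature scopes, the sketch below sums word frequencies over all review lines annotated with the same feature. It assumes the words passed in are already lemmatized and stripped of function words; the function name and the toy lines are hypothetical, not the annotated data set.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def word_counts_by_feature(
    review_lines: Iterable[Tuple[str, List[str]]]
) -> Dict[str, Counter]:
    """Sum word frequencies over all review lines sharing a product feature.

    Each item is (product_feature, words), where words are the lemmatized
    content words taken from the typed dependency relations of one line.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for feature, words in review_lines:
        counts[feature].update(words)  # adds this line's frequencies to WC_j
    return counts


# Toy annotated lines (hypothetical, for illustration only).
toy_lines = [
    ("Usability", ["easy", "set", "use"]),
    ("Usability", ["easy", "install"]),
    ("Speed", ["fast", "use"]),
]
toy_counts = word_counts_by_feature(toy_lines)
```

The same word ("use" here) ends up with a separate count in each feature scope, which is what lets the later weighting step tell features apart.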
Product Feature Assignment
Word synonyms are taken from WordNet’s synsets.
Factors:
- Words in synsets are not originally present in the review line.
- Context might be different.
- Polysemous synonyms exist.
Let, for word $w$:
- Number of synsets: $k$
- Number of synonyms in the $i$th synset: $n_i$
The probability of each synonym being the appropriate synonym of the original word $w$ is given by the following probability function:
$$P(w) = \frac{1}{\sum_{i=1}^{k} n_i}$$
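The probability function amounts to spreading one unit of weight evenly over every synonym in every synset of the word. A minimal sketch, using hand-written synonym lists rather than a real WordNet lookup (the lists and the function name are hypothetical):

```python
from typing import List, Sequence


def synonym_probability(synsets: Sequence[Sequence[str]]) -> float:
    """Return 1 / (n_1 + ... + n_k) for the k synsets of a word.

    Each inner sequence holds the synonyms found in one synset.
    """
    total_synonyms = sum(len(synset) for synset in synsets)
    if total_synonyms == 0:
        raise ValueError("the word has no synonyms")
    return 1.0 / total_synonyms


# Hypothetical synsets for one word: k = 2, n_1 = 1, n_2 = 3.
toy_synsets: List[List[str]] = [["simple"], ["comfortable", "soft", "gentle"]]
toy_probability = synonym_probability(toy_synsets)  # 1 / (1 + 3)
```

A word with many synsets or many synonyms therefore contributes a smaller weight per synonym, which dampens the effect of polysemous words.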
Product Feature Assignment
Normalizing within product feature scope:
- tf.idf metric is used.
- Term Frequency:
- Let, frequency of word $w_i$ in a product feature scope $p_j$: $WC_{i,j}$
- Number of unique words in product feature scope $p_j$: $k$ (the sum below runs over these words)
- Then, term frequency $tf_{i,j}$ can be denoted by:
$$tf_{i,j} = \frac{WC_{i,j}}{\sum_{k} WC_{k,j}}$$
- Inverse Document Frequency:
- Let, total number of product features assigned: $|P|$
- Number of product features with which the word $w_i$ appears: $|\{p : w_i \in p\}|$
- Inverse document frequency $idf_i$ can be calculated by the following:
$$idf_i = \log \frac{|P|}{|\{p : w_i \in p\}|}$$
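The two quantities can be computed directly from per-feature word counts. The sketch below is an illustration under assumptions: the slides do not state the base of the logarithm, so the natural logarithm is used, and the count table and function names are hypothetical.

```python
import math
from typing import Dict, Mapping


def term_frequency(word: str, feature: str,
                   counts: Mapping[str, Mapping[str, int]]) -> float:
    """Count of the word in one feature scope, divided by all counts in it."""
    scope = counts[feature]
    total = sum(scope.values())
    return scope.get(word, 0) / total if total else 0.0


def inverse_document_frequency(word: str,
                               counts: Mapping[str, Mapping[str, int]]) -> float:
    """log(|P| / number of feature scopes in which the word appears)."""
    containing = sum(1 for scope in counts.values() if scope.get(word, 0) > 0)
    if containing == 0:
        return 0.0  # unseen word: treated as carrying no weight (assumption)
    return math.log(len(counts) / containing)


def tf_idf(word: str, feature: str,
           counts: Mapping[str, Mapping[str, int]]) -> float:
    return term_frequency(word, feature, counts) * inverse_document_frequency(word, counts)


# Hypothetical counts for two feature scopes.
toy_counts: Dict[str, Dict[str, int]] = {
    "Usability": {"easy": 3, "use": 1},
    "Speed": {"fast": 2, "use": 2},
}
```

With these toy counts, "use" occurs in both scopes, so its idf is log(2/2) = 0 and it carries no weight for either feature, while "easy" keeps a positive weight for Usability only.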
Product Feature Assignment
Product feature score:
- For each product feature, a product feature score (PFS) is calculated using the following formula:
$$PFS = f(\text{acomp}) + f(\text{xcomp}) + f(\text{advmod})$$
- $f(\text{relation})$ is a function that calculates the tf.idf weight score for each of the components of a typed dependency relation considering a specific product feature.
- PFS represents the contribution of a set of words in a sentence towards different product features.
- PFS will have different values for each of the product features.
- A higher PFS for a product feature means the words in the typed dependency relations are more indicative of that product feature.
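A possible reading of the score, sketched in Python: the score for one feature adds up the tf.idf weights of the governor and dependent of every acomp, xcomp and advmod relation in the sentence. Summing the two components inside f, and summing across relations, are assumptions made for this sketch; the weight table and all names are hypothetical.

```python
from typing import Iterable, Mapping, Tuple

# (relation, governor, dependent), with words already lemmatized.
Relation = Tuple[str, str, str]

SELECTED_RELATIONS = ("acomp", "xcomp", "advmod")


def relation_score(relation: Relation, feature: str,
                   weights: Mapping[str, Mapping[str, float]]) -> float:
    """f(relation): tf.idf weight of both components for one feature."""
    _, governor, dependent = relation
    feature_weights = weights.get(feature, {})
    return feature_weights.get(governor, 0.0) + feature_weights.get(dependent, 0.0)


def product_feature_score(relations: Iterable[Relation], feature: str,
                          weights: Mapping[str, Mapping[str, float]]) -> float:
    """PFS of one sentence for one product feature."""
    return sum(
        relation_score(r, feature, weights)
        for r in relations
        if r[0] in SELECTED_RELATIONS
    )


# Hypothetical tf.idf weights and the relations of the example sentence.
toy_weights = {
    "Usability": {"easy": 0.5, "set": 0.25},
    "Speed": {"fast": 0.5},
}
toy_relations = [
    ("xcomp", "easy", "set"),
    ("advmod", "easy", "incredibly"),
    ("prt", "set", "up"),  # ignored: not one of the selected relations
]
```

Words missing from a feature's weight table simply contribute zero, so a sentence with no feature-indicative words scores zero for that feature.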
Product Feature Assignment
Product feature selection:
Let $c$ be the product feature class for which the product feature score $PFS_c$ is calculated.
Then each opinion sentence is assigned to a product feature class $c^*$ where
$$c^* = \arg\max_{c} PFS_c$$
- 1% of the highest PFS score is set as a threshold.
- Below the threshold, PFS is not strong enough to indicate a product feature, and ‘No Opinion’ is considered.
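The selection step is an argmax with a cut-off. In the sketch below, the threshold is passed in as a number; computing it as 1% of the highest PFS seen on the training data is an assumption about how "the highest PFS score" is obtained, and the scores shown are hypothetical.

```python
from typing import Mapping

NO_OPINION = "No Opinion"


def assign_feature(scores: Mapping[str, float], threshold: float) -> str:
    """Pick the feature class with the highest PFS, or 'No Opinion'.

    scores maps each product feature class to the PFS of one sentence.
    """
    if not scores:
        return NO_OPINION
    best = max(scores, key=lambda feature: scores[feature])
    if scores[best] < threshold:
        return NO_OPINION  # PFS too weak to indicate any product feature
    return best


# Assumed: threshold = 1% of the highest PFS observed in training.
highest_training_pfs = 2.0  # hypothetical value
threshold = 0.01 * highest_training_pfs

label = assign_feature({"Usability": 1.25, "Speed": 0.0}, threshold)
```

The threshold keeps sentences whose best score is near zero from being forced into a feature class.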
Test Result
- 50 test reviews having 220 sentences.
- Opinion Sentences: 113, No Opinion: 107
- Product Domain: Electronics, Product type: hard disk

Precision   Recall   F-measure
0.7231      0.4159   0.5281

Product Feature   Precision   Recall   F-measure
General           0.7778      0.1228   0.2121
Usability         0.9231      0.7500   0.8276
Design            0.6364      0.4667   0.5385
Performance       0.5833      0.7778   0.6667
Portability       0.7143      0.5556   0.6250
Speed             0.3077      0.5714   0.4000
No Opinion        0.5742      0.8318   0.6794
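The F-measure column is consistent with the balanced F-measure, the harmonic mean of precision and recall. The short check below assumes that definition (the slides do not spell it out) and recomputes a few of the reported values from the tabulated precision and recall.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) taken from the result tables.
reported = {
    "Overall": (0.7231, 0.4159, 0.5281),
    "Usability": (0.9231, 0.7500, 0.8276),
    "General": (0.7778, 0.1228, 0.2121),
}

for name, (p, r, f) in reported.items():
    # Agreement is expected only up to rounding of the 4-digit inputs.
    print(f"{name}: recomputed {f_measure(p, r):.4f}, reported {f:.4f}")
```

The low recall for the General class (0.1228) dominates its F-measure despite the fairly high precision, which is the behaviour expected of a harmonic mean.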
Improvement challenges:
- Sentence segmentation: a single sentence can contain more than one product feature.
  Example 1: I love its sleekness and elegance, I love how lightweight it is, I love how quiet it is and it’s surprisingly quite speedy too.
  Example 2: I would definitely recommend the Western Digital Passport to anyone looking for a compact, portable, easy to use and affordable!
- Polysemous synonyms.
- Incorporating the amod typed dependency relation.
Conclusion and Future works
- This paper discusses a process to detect opinion sentences and assign a product feature to each opinion sentence.
- Typed dependency relations and frequent word
associations have been utilized to achieve the desired goal.
- Future works will involve:
- identifying appropriate segmentation methodology to aid
the system.
- implementing the process in a number of varied domains
with more test data.
- exploring left and right context of the dependencies for
more supporting information.