Ashequl Qadir University of Wolverhampton, UK - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Ashequl Qadir

University of Wolverhampton, UK ashequl.qadir@wlv.ac.uk

slide-2
SLIDE 2

Outline

  • Introduction
  • Related Works
  • Methodology
  • Results
  • Improvement Challenges
  • Conclusion and Future works
slide-3
SLIDE 3

Introduction

slide-4
SLIDE 4

Review Examples

slide-5
SLIDE 5

Introduction

  • Customers write reviews expressing their satisfaction and criticisms.
  • Products have functional features (Ex. Performance, Portability etc.).
  • Customers study reviews of product features prior to buying.
  • Reviews contain:
      • Opinions (Ex. It is very easy and simple to use)
      • Factual information (Ex. It has a pink metal case)
  • Opinions can be:
      • General (Ex. I am very happy with this product)
      • Product feature specific (Ex. It works nicely: Performance feature)

slide-6
SLIDE 6

Introduction

  • Explicit product feature:
      • “The design looks nice”: the design feature is mentioned explicitly.
  • Implicit product feature:
      • “I found this product really useful for transport”: the portability feature is implicitly present.
  • The Web contains unstructured opinion texts.
  • Summarizing feature-specific opinions by reading is time consuming.
  • Opinion sources: review sites (epinions.com), e-commerce sites (amazon.com), blogs etc.

Goals:

  • Opinion Detection
  • Product feature assignment
slide-7
SLIDE 7

Related works

Opinion identification:

  • Specific orientation of POS tags (Turney, 2002)
  • Seed list of lexicons (Godbole et al., 2007; Kim et al., 2005)
  • N-gram subjectivity clues (Riloff et al., 2003)
  • Word co-locations (Wiebe et al., 2001)
  • Dependency relations of words (Wilson et al., 2004; Fei et al., 2006)

Explicit product feature identification:

  • By extracting noun phrases (Yi et al., 2003, 2005; Hu et al., 2004)
  • By parts and properties of a product (Popescu et al., 2005)
  • By approaching it as a classification problem (Ghani et al., 2006)
  • By using frequent word associations (Qadir, 2009)
  • By using a dependency grammar graph (Zhuang et al., 2006)

The approach in this paper differs by:

  • Utilizing typed dependency relations to detect opinions.
  • Using statistical methods to identify associated words.
  • Utilizing a tf.idf weight score to assign a product feature to a review line.
  • Not discriminating between explicit and implicit product features.
slide-8
SLIDE 8

Typed Dependency Relations

  • Words in a sentence have certain grammatical dependency relations.

  • Example: It was incredibly easy to set up and use.
  • nsubj(nominal subject): Syntactic subject of a clause
  • cop(copula): Complement of a copular verb and the copular verb
  • advmod(adverbial modifier): Modifying adverb
  • xcomp(open clausal complement): Clausal complement without its own subject
  • aux(auxiliary): non-main verb of the clause
  • prt(phrasal verb particle): phrasal verb
  • cc(coordination): element of a conjunct and the coordinating conjunction
  • conj(conjunct): connected by a coordinating conjunction

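[Figure: dependency graph of the example sentence, rooted at "easy": nsubj(easy, It), cop(easy, was), advmod(easy, incredibly), xcomp(easy, set), aux(set, to), prt(set, up), conj(set, use), cc(set, and)]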

slide-9
SLIDE 9

Typed Dependency Relations

Features of Stanford typed dependencies:

  • Offers more fine-grained distinctions (de Marneffe, 2008) than the PARC representations (King, 2003).
  • Ex. Breaking down adjunct into amod, xcomp, prep_of (de Marneffe, 2008)

  • Uses Penn Treebank part-of-speech tags.
  • Defines 55 binary grammatical relations.
  • Relation consists of a governor and a dependent.
slide-10
SLIDE 10

Typed Dependency Relations

Example: It was incredibly easy to set up and use.

  • Words representing a functional feature of the product: set up, use (Usability)

  • Word representing opinion: easy
  • Word representing degree of subjectivity: incredibly

Dependency relations:

  • xcomp(easy, set)
  • conj (set, use)
  • prt(set, up)
  • advmod(easy, incredibly)
  • POS that can represent a functional feature: VB
  • Ex. uses, works, looks, fits, costs
  • POS that can represent a feature: Adj/NN
  • Ex. reliable, portable (Adj)
  • Ex. design, price (NN)
  • Dependency relations that can establish a meaningful link between these parts of speech are chosen.
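
The governor/dependent structure described above can be sketched as a small data model. This is only an illustration: the class names, the `coarse_pos` helper, and the POS tags given to "set", "up" and "incredibly" are assumptions, and the parse is written by hand rather than produced by a parser.

```python
from typing import NamedTuple


class Token(NamedTuple):
    word: str
    pos: str  # Penn Treebank tag (tags below are partly assumed)


class Dependency(NamedTuple):
    relation: str
    governor: Token
    dependent: Token


def coarse_pos(tag: str) -> str:
    """Collapse a Penn Treebank tag into a coarse class ("any form of ...")."""
    if tag.startswith("VB"):
        return "verb"
    if tag.startswith("JJ"):
        return "adjective"
    if tag.startswith("RB"):
        return "adverb"
    if tag.startswith("NN"):
        return "noun"
    return "other"


# Hand-written relations for "It was incredibly easy to set up and use."
easy = Token("easy", "JJ")
set_ = Token("set", "VB")
EXAMPLE = [
    Dependency("xcomp", easy, set_),
    Dependency("conj", set_, Token("use", "VB")),
    Dependency("prt", set_, Token("up", "RP")),
    Dependency("advmod", easy, Token("incredibly", "RB")),
]

# Relations linking verbs, adjectives and adverbs (hypothetical selection set).
SELECTED = {"acomp", "xcomp", "advmod"}
selected = [d for d in EXAMPLE if d.relation in SELECTED]

for d in selected:
    print(f"{d.relation}({d.governor.word}, {d.dependent.word})")
```

With this hand-written parse, the filter keeps the xcomp and advmod relations and drops conj and prt.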

slide-11
SLIDE 11

Selected Typed Dependencies

acomp - Adjectival Complement

Description:

  • An adjectival complement (acomp) of a VP is an adjectival phrase.
  • The adjectival phrase functions as the complement (like an object of the verb).
  • Example:

| Dependency Relation | Component Example | Indication |
|---|---|---|
| acomp | worked/VBD fine/JJ | Possible Opinion |
| acomp | proved/VBN reliable/JJ | Possible Opinion |
| acomp | works/VBZ well/JJ | Possible Opinion |

| Tag | POS |
|---|---|
| VBD | Verb, past tense |
| VBN | Verb, past participle |
| VBZ | Verb, 3rd person singular present |
| JJ | Adjective |

slide-12
SLIDE 12

Selected Typed Dependencies

xcomp - Open Clausal Complement

Definition:

  • An open clausal complement (xcomp) of a VP or an ADJP is a clausal complement without its own subject, whose reference is determined by an external subject.
  • Example:

| Dependency Relation | Component Example | Indication |
|---|---|---|
| xcomp | easy/JJ use/VB | Possible Opinion |
| xcomp | rendering/VBG impossible/JJ | Possible Opinion |
| xcomp | found/VBD difficult/JJ | Possible Opinion |
| xcomp | makes/VBZ ideal/JJ | Possible Opinion |
| xcomp | find/VBP convenient/JJ | Possible Opinion |
| xcomp | experienced/VBN similar/JJ | Not Opinion |

| Tag | POS |
|---|---|
| JJ | Adjective |
| VB | Verb, base form |
| VBG | Verb, gerund or present participle |
| VBD | Verb, past tense |
| VBZ | Verb, 3rd person singular present |
| VBP | Verb, non-3rd person singular present |
| VBN | Verb, past participle |

slide-13
SLIDE 13

Selected Typed Dependencies

advmod - Adverbial Modifier

Definition:

  • An adverbial modifier (advmod) of a word is a (nonclausal) RB or ADVP that serves to modify the meaning of the word.
  • Example:

| Dependency Relation | Component Example | Indication |
|---|---|---|
| advmod | well/JJ amazingly/RB | Possible Opinion |
| advmod | easily/RB very/RB | Possible Opinion |
| advmod | loads/VBD fast/RB | Possible Opinion |
| advmod | looks/VBZ especially/RB | Not Opinion |
| advmod | fits/VBZ perfectly/RB | Possible Opinion |
| advmod | recognized/VBN straight/RB | Not Opinion |
| advmod | satisfied/VBN very/RB | Possible Opinion |
| advmod | priced/VBN reasonably/RB | Possible Opinion |

| Tag | POS |
|---|---|
| JJ | Adjective |
| RB | Adverb |
| VBD | Verb, past tense |
| VBZ | Verb, 3rd person singular present |
| VBN | Verb, past participle |

slide-14
SLIDE 14

Selected Typed Dependencies

amod - Adjectival Modifier

Definition:

  • An adjectival modifier (amod) of an NP is any adjectival phrase that serves to modify the meaning of the NP.

Examples:

  • It has a nice design.
  • amod(design, nice) – Opinion (design)
  • I bought this new camera last year.
  • amod(camera, new) – Not Opinion
  • It has a pink cover.
  • amod(cover, pink) – Not Opinion
  • It looks nice.
  • acomp(looks, nice) – Opinion (design)
slide-15
SLIDE 15

Opinion Detection Algorithm

for each sentence in review text
    set Opinion_Flag = False
    if an acomp relation is present
        if governor is any form of verb
            if dependent is any form of adjective
                set Opinion_Flag = True
    else if an xcomp relation is present
        if governor is any form of adjective
            if dependent is any form of verb
                set Opinion_Flag = True
        else if governor is any form of verb
            if dependent is any form of adjective
                set Opinion_Flag = True
    else if an advmod relation is present
        if dependent is any form of adverb
            if governor is any form of verb
                set Opinion_Flag = True
            else if governor is any form of adverb
                set Opinion_Flag = True
            else if governor is any form of adjective
                set Opinion_Flag = True
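
One reading of the algorithm on this slide can be sketched in Python as follows. It assumes each sentence has already been parsed into (relation, governor POS tag, dependent POS tag) triples with Penn Treebank tags. The function and helper names are illustrative, and treating the three relation checks as an else-if chain on presence is an interpretation of the slide, not a confirmed detail.

```python
from typing import List, Tuple

# (relation name, governor POS tag, dependent POS tag), Penn Treebank tags.
Dep = Tuple[str, str, str]


def is_verb(tag: str) -> bool:
    return tag.startswith("VB")


def is_adj(tag: str) -> bool:
    return tag.startswith("JJ")


def is_adv(tag: str) -> bool:
    return tag.startswith("RB")


def is_opinion(deps: List[Dep]) -> bool:
    """Return the Opinion_Flag for one sentence (assumed else-if reading)."""

    def pairs(relation: str) -> List[Tuple[str, str]]:
        return [(gov, dep) for rel, gov, dep in deps if rel == relation]

    acomp, xcomp, advmod = pairs("acomp"), pairs("xcomp"), pairs("advmod")
    if acomp:
        # verb governor with adjective dependent
        return any(is_verb(g) and is_adj(d) for g, d in acomp)
    if xcomp:
        # adjective -> verb, or verb -> adjective
        return any(
            (is_adj(g) and is_verb(d)) or (is_verb(g) and is_adj(d))
            for g, d in xcomp
        )
    if advmod:
        # adverb dependent modifying a verb, adverb or adjective
        return any(
            is_adv(d) and (is_verb(g) or is_adv(g) or is_adj(g))
            for g, d in advmod
        )
    return False


# "It was incredibly easy to set up and use." (tags partly assumed)
example = [("xcomp", "JJ", "VB"), ("advmod", "JJ", "RB"), ("prt", "VB", "RP")]
print(is_opinion(example))
```

Because the rules look only at POS classes, they mark a sentence as a possible opinion; they do not judge whether the words themselves are subjective.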

slide-16
SLIDE 16

Pre-processed Data

  • Manually annotated 50 reviews for training, 50 reviews for testing.
  • Product Domain: Electronics, Product Type: Hard disk.
  • Example:

| Product Feature | Opinion Sentence |
|---|---|
| Usability | ‘It was incredibly easy to set up and use.’ |
| Design | ‘I like its design and the fact that I only need one cable.’ |
| Performance | ‘Works perfectly and is completely reliable, no problem at all.’ |
| Portability | ‘I found this product really useful for transport as it is that small.’ |
| Speed | ‘The speed and capacity of the Passport drive are impressive.’ |
| General | ‘A satisfying product.’ |

slide-17
SLIDE 17

Product Feature Assignment

  • Counting frequent words:
      • Only words in the components of the typed dependency relations are counted.
      • Function words are ignored.
      • Lemmatization is used so that only the base form of each word is counted.
      • Words are counted within product feature scopes.

slide-18
SLIDE 18

Product Feature Assignment

  • Let,
      • Total number of review lines: N
      • Set of product features: $P = \{p_1, p_2, \ldots, p_j\}$
      • Frequency of the word w at review line i associated with product feature $p_j$: $w_{i,j}$
  • The word frequency count $WC_j$ for word w within the $p_j$ product feature scope can be denoted by the following equation:

$$WC_j = \sum_{i=1}^{N} w_{i,j}$$

  • For different values of j, the word frequency of the same word w will be different because the associated product feature $p_j$ will be different.
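
The per-feature word count can be illustrated with a short sketch. The training lines below are invented for illustration, and the words are assumed to be already lemmatized, stripped of function words, and taken from the typed dependency components.

```python
from collections import Counter, defaultdict

# Hypothetical annotated lines: (product feature, lemmatized content words).
training_lines = [
    ("Usability", ["easy", "set", "use"]),
    ("Usability", ["easy", "install"]),
    ("Performance", ["work", "perfectly", "reliable"]),
    ("Portability", ["useful", "small", "easy"]),
]


def word_counts_by_feature(lines):
    """Sum the frequency of each word over all lines of each feature scope."""
    counts = defaultdict(Counter)
    for feature, words in lines:
        counts[feature].update(words)
    return counts


WC = word_counts_by_feature(training_lines)
print(WC["Usability"]["easy"])    # count within the Usability scope
print(WC["Portability"]["easy"])  # same word, different scope
```

The same word ("easy") gets a separate count in each product feature scope, which is what lets the later weighting tell features apart.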

slide-19
SLIDE 19

Product Feature Assignment

  • Word synonyms are taken from WordNet’s synsets. Factors:
      • Words in synsets are not originally present in the review line.
      • Context might be different.
      • Polysemous synonyms exist.
  • Let, for word w,
      • Number of synsets: k
      • Number of synonyms in the ith synset: $n_i$
  • The probability of each synonym being the appropriate synonym of the original word w is given by the following probability function:

$$P(w) = \frac{1}{\sum_{i=1}^{k} n_i}$$
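
As a worked instance of the probability function, the sketch below represents synsets as plain lists of synonyms. The synsets are invented; a real system would read them from WordNet.

```python
def synonym_probability(synsets):
    """P(w) = 1 / (total number of synonyms across all k synsets of w)."""
    total = sum(len(synset) for synset in synsets)
    if total == 0:
        raise ValueError("word has no synonyms")
    return 1.0 / total


# Hypothetical word with k = 2 synsets holding n_1 = 2 and n_2 = 3 synonyms.
synsets_for_word = [["quick", "speedy"], ["firm", "fixed", "secure"]]
p = synonym_probability(synsets_for_word)
print(p)  # 1 / (2 + 3)
```

Every synonym receives the same probability, so a word with many senses spreads its weight thinly across its synonyms.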

slide-20
SLIDE 20

Product Feature Assignment

  • Normalizing within product feature scope:
      • The tf.idf metric is used.
  • Term frequency:
      • Let the frequency of word $w_i$ in a product feature scope $p_j$ be $WC_{i,j}$.
      • Number of unique words in product feature scope $p_j$: k
      • Then the term frequency $tf_{i,j}$ can be denoted by:

$$tf_{i,j} = \frac{WC_{i,j}}{\sum_{k} WC_{k,j}}$$

  • Inverse document frequency:
      • Let the total number of product features assigned be $|P|$.
      • Number of product features with which the word $w_i$ appears: $|\{p : w_i \in p\}|$
      • The inverse document frequency $idf_i$ can be calculated by the following:

$$idf_i = \log \frac{|P|}{|\{p : w_i \in p\}|}$$

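The normalization can be sketched as below, treating each product feature scope as a "document". The counts are invented, the natural logarithm is an assumption (the slide only says "log"), and the tf.idf weight is taken as the usual product tf × idf.

```python
import math
from collections import Counter

# Hypothetical word counts per product feature scope.
WC = {
    "Usability": Counter({"easy": 3, "set": 1}),
    "Performance": Counter({"work": 2, "easy": 1, "reliable": 1}),
    "Speed": Counter({"fast": 4}),
}


def tf(word, feature, wc):
    """Count of the word divided by the total count of all words in the scope."""
    total = sum(wc[feature].values())
    return wc[feature][word] / total if total else 0.0


def idf(word, wc):
    """log(|P| / number of feature scopes in which the word appears)."""
    containing = sum(1 for counts in wc.values() if counts[word] > 0)
    return math.log(len(wc) / containing) if containing else 0.0


def tf_idf(word, feature, wc):
    return tf(word, feature, wc) * idf(word, wc)


print(tf("easy", "Usability", WC))      # 3 of 4 word occurrences
print(idf("easy", WC))                  # appears in 2 of 3 scopes
print(tf_idf("easy", "Usability", WC))
```

A word that appears in every feature scope gets an idf of zero, so it contributes nothing to distinguishing one product feature from another.
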
slide-21
SLIDE 21

Product Feature Assignment

Product feature score:

  • For each product feature, a product feature score (PFS) is calculated using the following formula:

$$PFS = \sum f(acomp) + \sum f(xcomp) + \sum f(advmod)$$

  • f(relation) is a function that calculates the tf.idf weight score for each of the components of a typed dependency relation, considering a specific product feature.
  • PFS represents the contribution of a set of words in a sentence towards different product features.
  • PFS will have different values for each of the product features.
  • A higher PFS for a product feature means the words in the typed dependency relations are more indicative of that product feature.

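A minimal sketch of the score follows. The slide does not say how f combines the weights of a relation's two components, so summing the governor and dependent weights is an assumption; the weight table and the relation triples are also invented for illustration.

```python
SELECTED = ("acomp", "xcomp", "advmod")

# Hypothetical tf.idf weights per product feature (binary-exact values).
weights = {
    "Usability": {"easy": 0.5, "set": 0.25, "use": 0.25},
    "Speed": {"fast": 1.0},
}


def f(relation, feature, table):
    """Assumed form of f: governor weight + dependent weight for the feature."""
    _, governor, dependent = relation
    feature_weights = table.get(feature, {})
    return feature_weights.get(governor, 0.0) + feature_weights.get(dependent, 0.0)


def product_feature_score(deps, feature, table):
    """Sum f over the acomp, xcomp and advmod relations of one sentence."""
    return sum(f(d, feature, table) for d in deps if d[0] in SELECTED)


# (relation, governor lemma, dependent lemma) for one sentence.
deps = [
    ("xcomp", "easy", "set"),
    ("advmod", "easy", "incredibly"),
    ("conj", "set", "use"),  # not a selected relation, so it is ignored
]

for feature in weights:
    print(feature, product_feature_score(deps, feature, weights))
```

In this toy example the sentence scores higher for Usability than for Speed because its dependency components carry weight only in the Usability scope.
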
slide-22
SLIDE 22

Product Feature Assignment

  • Product feature selection:
      • Let c be the product feature class for which the product feature score, PFS, is calculated.
      • Then each opinion sentence is assigned to a product feature class c*, where

$$c^{*} = \arg\max_{c} PFS$$

      • 1% of the highest PFS score is set as a threshold.
      • Below the threshold, PFS is not strong enough to indicate a product feature; ‘No Opinion’ is assigned.
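
The selection step can be sketched as an arg max with a cut-off. The slide does not specify which set of scores "the highest PFS score" is taken over, so this sketch simply accepts it as a parameter (`highest_pfs`, a hypothetical name); the scores are invented.

```python
def assign_feature(scores, highest_pfs, ratio=0.01):
    """Pick the feature with the largest PFS, or 'No Opinion' below threshold.

    scores: mapping of product feature -> PFS for one sentence.
    highest_pfs: reference maximum PFS (assumed to be supplied by the caller).
    ratio: fraction of highest_pfs used as threshold (1% on the slide).
    """
    if not scores:
        return "No Opinion"
    threshold = ratio * highest_pfs
    best = max(scores, key=scores.get)  # arg max over feature classes
    if scores[best] < threshold:
        return "No Opinion"
    return best


print(assign_feature({"Usability": 1.25, "Speed": 0.0}, highest_pfs=2.0))
print(assign_feature({"Usability": 0.01, "Speed": 0.0}, highest_pfs=2.0))
```

The first call clears the threshold of 0.02 and is assigned to Usability; the second falls below it and is treated as 'No Opinion'.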

slide-23
SLIDE 23

Test Result

  • 50 test reviews containing 220 sentences.
  • Opinion sentences: 113, No Opinion: 107
  • Product domain: Electronics, product type: hard disk

| Precision | Recall | F-measure |
|---|---|---|
| 0.7231 | 0.4159 | 0.5281 |

| Product Feature | Precision | Recall | F-measure |
|---|---|---|---|
| General | 0.7778 | 0.1228 | 0.2121 |
| Usability | 0.9231 | 0.7500 | 0.8276 |
| Design | 0.6364 | 0.4667 | 0.5385 |
| Performance | 0.5833 | 0.7778 | 0.6667 |
| Portability | 0.7143 | 0.5556 | 0.6250 |
| Speed | 0.3077 | 0.5714 | 0.4000 |
| No Opinion | 0.5742 | 0.8318 | 0.6794 |
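
The F-measure column can be checked against the precision and recall columns. The sketch below assumes the balanced F-measure (F1, the harmonic mean of precision and recall), which the slide does not state; the dictionary keys are labels chosen here, not names from the slide.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) copied from the result tables.
reported = {
    "summary row": (0.7231, 0.4159, 0.5281),
    "Usability": (0.9231, 0.7500, 0.8276),
    "Speed": (0.3077, 0.5714, 0.4000),
}

for name, (p, r, f_reported) in reported.items():
    print(name, round(f_measure(p, r), 4), f_reported)
```

For these rows the recomputed values agree with the reported F-measures to within rounding.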

slide-24
SLIDE 24

Improvement challenges:

  • Sentence segmentation:
      • A single sentence can mention more than one product feature.
      • Example 1: I love its sleekness and elegance, I love how lightweight it is, I love how quiet it is and it’s surprisingly quite speedy too.
      • Example 2: I would definitely recommend the Western Digital Passport to anyone looking for a compact, portable, easy to use and affordable!
  • Polysemous synonyms.
  • Incorporating the amod typed dependency relation.

slide-25
SLIDE 25

Conclusion and Future works

  • This paper discusses a process that detects opinion sentences and assigns a product feature to each opinion sentence.
  • Typed dependency relations and frequent word associations have been utilized to achieve this goal.
  • Future work will involve:
      • identifying an appropriate segmentation methodology to aid the system.
      • implementing the process in a number of varied domains with more test data.
      • exploring the left and right context of the dependencies for more supporting information.

slide-26
SLIDE 26

Questions and Discussion