NLP Course Term Project: Aspect Extraction and Opinion Mining of Product Reviews



SLIDE 1

NLP Course Term Project: Aspect Extraction and Opinion Mining of Product Reviews

Submitted by:

  • Karanam Sai Ravi Teja
  • Chandini Baratam
  • K.L.S. Koutilya Varma
  • Lokesh Dokara
  • D. Anudeep
  • M. Akshay

Supervised by: Asst. Prof. Pawan Goyal

Mentored by: Mr. Abishek

SLIDE 2

Abstract:

Our project aims at mining reviews in order to build a model of important product features, their evaluation by reviewers, and their relative quality across products. The sub-tasks are: identify product features and the opinions expressed about them, determine opinion polarity, and rank opinions based on their strength. The necessary data is scraped from Flipkart and Amazon.

SLIDE 3

Three Parts of the Project

  • Section 01: Sentiment Analysis
  • Section 02: Aspect Extraction
  • Section 03: Evaluation of results

SLIDE 4

Approach: Building blocks of the solution

  • Data Sets: Data is extracted from a live website, from the URL given by the user.
  • Preprocessing: Sentence segmentation, tokenization, lemmatization, POS tagging.
  • Applying NLP: A rule-based aspect extraction model is used.
  • Sentiment Analysis: Sentence-level sentiment analysis is applied to couple aspects with their sentiment.

SLIDE 5

Data Extraction and Cleaning:

  • The review data for a product has to be extracted live from the e-commerce websites.
  • A scraping program is written to extract the reviews of the product.
  • The extracted reviews are stored in lists to facilitate the subsequent steps.
  • Different data sets were used for testing and evaluation.
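
The scraping step can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual scraper: the CSS class name `review-text` and the helper names are assumptions, since real Flipkart and Amazon pages use their own markup, and downloading the page from the user's URL is left out.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class ReviewParser(HTMLParser):
    """Collects the text of every element carrying a given CSS class."""

    def __init__(self, review_class="review-text"):  # hypothetical class name
        super().__init__()
        self.review_class = review_class
        self.reviews = []      # one string per review, as on the slide
        self._depth = 0        # > 0 while inside a review element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1
        elif self.review_class in (dict(attrs).get("class") or "").split():
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            # Normalise whitespace and store the finished review.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.reviews.append(text)

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def extract_reviews(html):
    """Return the list of review texts found in an HTML page."""
    parser = ReviewParser()
    parser.feed(html)
    parser.close()
    return parser.reviews


sample_page = (
    '<div><p class="review-text">Great <b>lens</b> and zoom.</p>'
    '<p class="other">Sponsored</p>'
    '<p class="review-text big">Battery is weak.</p></div>'
)
print(extract_reviews(sample_page))
```

Keeping the reviews in a plain list matches the slide's description and lets the later preprocessing steps iterate over them directly.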

SLIDE 6

POS Tagging, Lemmatization and Dependency Tree

  • The reviews are first segmented into sentences.
  • The segmented sentences are then POS tagged and their dependency trees are generated.
  • The words in the dependency tree are lemmatized.
  • The rules are then applied to the lemmatized dependency words to extract aspects.

SLIDE 7

Rule based Aspect Extraction:

  • A set of 11 rules is applied to the segmented sentences.
  • The approach assumes that the sentences are grammatically correct.
  • The advantage of this approach is that it relies only on the standard structure of English sentences.
  • Aspects can therefore be extracted independent of the product category.

SLIDE 8

A Rule based approach Implementation

I like the lens of the camera.

nsubj(like-2, I-1)
root(ROOT-0, like-2)
det(lens-4, the-3)
dobj(like-2, lens-4)
case(camera-7, of-5)
det(camera-7, the-6)
nmod(lens-4, camera-7)

  • I: active token (h)
  • like: t
  • lens: n, in an object relation with like

SLIDE 9

A Rule based approach Implementation

I like the lens of the camera.

If an active token h is in a subject-noun relationship with a word t, then:

If t has a direct object relation with a token n, the POS of n is noun, and n is not in SentiWordNet, then n is extracted as an aspect. In the example sentence, like is in a direct object relation with lens, so the aspect lens is extracted.
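
The rule can be sketched in Python as follows. This is an illustrative reading of the rule, not the project's Java/Python code: the dependency string is the parser output shown for the example sentence, while the POS tags are written by hand and the small `SENTIMENT_WORDS` set is only a stand-in for a SentiWordNet lookup. Tokens are matched by word form rather than by position, which is a simplification.

```python
import re

# Matches typed dependencies written as rel(head-i, dependent-j).
DEP_PATTERN = re.compile(r"([\w:]+)\(([^\s,()]+)-(\d+), ([^\s,()]+)-(\d+)\)")


def parse_dependencies(text):
    """Turn 'rel(head-i, dep-j) ...' into a list of (rel, head, dep) triples."""
    return [(rel, head, dep) for rel, head, _, dep, _ in DEP_PATTERN.findall(text)]


def extract_aspects(deps, pos_tags, sentiment_words):
    """Apply the subject/direct-object rule to one parsed sentence."""
    # Words t that have some active token h as their subject.
    subject_heads = {head for rel, head, _ in deps if rel == "nsubj"}
    aspects = []
    for rel, head, dep in deps:
        if (
            rel == "dobj"
            and head in subject_heads
            and pos_tags.get(dep, "").startswith("NN")   # n must be a noun
            and dep.lower() not in sentiment_words        # n is not a sentiment word
        ):
            aspects.append(dep)
    return aspects


parse_output = (
    "nsubj(like-2, I-1) root(ROOT-0, like-2) det(lens-4, the-3) "
    "dobj(like-2, lens-4) case(camera-7, of-5) det(camera-7, the-6) "
    "nmod(lens-4, camera-7)"
)
# Hand-written Penn Treebank tags for the example sentence (assumption).
pos_tags = {"I": "PRP", "like": "VBP", "the": "DT", "lens": "NN",
            "of": "IN", "camera": "NN"}
# Stand-in for a SentiWordNet membership test (assumption).
SENTIMENT_WORDS = {"like", "good", "great"}

deps = parse_dependencies(parse_output)
print(extract_aspects(deps, pos_tags, SENTIMENT_WORDS))
```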

SLIDE 10

Input-Output

Input: This camera has lots of great and easy to access settings, takes great pictures, and is small enough to travel with comfortably. the advanced features and physical controls also make it a great starter camera for amateur photographers.

Output: small enough to travel | settings | lots | pictures | camera | small | has | make | features | controls | make it a great starter camera |

SLIDE 11

Results of the Rule based Aspect Extraction:

DATA SET                                  ASPECT RECALL    ASPECT PRECISION
Selected sentences from the 300 corpus    45.8%            50%
300 review corpus                         42%              48%
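
For reference, aspect precision and recall can be computed as in the sketch below. The slides do not state how extracted aspects were matched against annotations, so exact string matching is an assumption here, and the `gold` set is a hypothetical annotation invented for illustration; only the `extracted` list comes from the input-output example.

```python
def precision_recall(extracted, gold):
    """Precision and recall of extracted aspects against gold annotations."""
    extracted, gold = set(extracted), set(gold)
    correct = extracted & gold
    precision = len(correct) / len(extracted) if extracted else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return precision, recall


# Output of the rule-based extractor for the example review.
extracted = [
    "small enough to travel", "settings", "lots", "pictures", "camera",
    "small", "has", "make", "features", "controls",
    "make it a great starter camera",
]
# Hypothetical gold aspects for the same review (not from the project).
gold = {"settings", "pictures", "features", "controls", "camera", "size"}

precision, recall = precision_recall(extracted, gold)
print(f"precision={precision:.3f} recall={recall:.3f}")
```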

SLIDE 12

Rule based Sentiment Extraction (Approach 1):

  • A rule-based opinion extraction approach is used to extract the sentiment words in a sentence.
  • The rules are applied to each sentence and the sentiment words are extracted.
  • These sentiment words are attached to the aspects found earlier.
  • This approach performed poorly, with low recall and accuracy.

SLIDE 13

Naïve Sentence level Sentiment Extraction (Approach 2):

  • Each review is segmented into sentences and the sentiment of each sentence is calculated using a multiplication rule.
  • The sentiment of each word in the sentence that is an adverb, adjective or verb is looked up in SentiWordNet, and these values are multiplied together.
  • Words with any other part of speech are assigned a sentiment of 1.
  • Assumption: each sentence has only one aspect.
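
The multiplication rule can be sketched as follows. The lexicon values are made up for illustration and are not SentiWordNet scores, the coarse POS labels are hand-assigned, and treating a scored-POS word that is missing from the lexicon as 1 is an assumption the slides do not spell out.

```python
# Parts of speech whose lexicon score contributes to the product.
SCORED_POS = {"ADJ", "ADV", "VERB"}


def sentence_sentiment(tagged_tokens, lexicon):
    """Multiply the lexicon scores of adjectives, adverbs and verbs.

    Every other word, and any word missing from the lexicon (assumption),
    contributes a neutral factor of 1.
    """
    score = 1.0
    for word, pos in tagged_tokens:
        if pos in SCORED_POS:
            score *= lexicon.get((word.lower(), pos), 1.0)
    return score


# Toy lexicon with invented scores (stand-in for SentiWordNet).
toy_lexicon = {("very", "ADV"): 0.5, ("good", "ADJ"): 0.75}

# Hand-tagged tokens for "the photo quality is very good".
sentence = [
    ("the", "DET"), ("photo", "NOUN"), ("quality", "NOUN"),
    ("is", "VERB"), ("very", "ADV"), ("good", "ADJ"),
]
print(sentence_sentiment(sentence, toy_lexicon))
```

Because the sentence is assumed to contain a single aspect, the resulting score is attached to that aspect as a whole.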

SLIDE 14

Input-Output

Input: the photo quality is very good. not dslr good, but that is to be expected. i feel that the camera takes pretty usable pictures up to iso 400, but if you plan on making very large prints (above an 8x10) it may be better to stay below iso 200.i wish the camera had fewer megapixels and better iso performance, but no camera is perfect. i highly recommend this to anyone looking for a portable high quality point shoot camera.

Output: photo_quality | good | quality | . . camera | it | . i | recommend | .

Sentiment score: 0.875__0.875__0.112890625__0.25__

SLIDE 15

Results of the Sentiment Analysis part:

DATA SET             APPROACH      SENTIMENT RECALL
300 review corpus    Approach 1    23.5%
300 review corpus    Approach 2    56.1%

SLIDE 16

Graphical User Interface:

  • The graphical user interface is developed using the Django framework.
  • It takes the URL of the product as input.
  • It returns the top ten aspect words, ranked by frequency of occurrence, along with their sentiment.
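
The ranking step behind the interface can be sketched as below. The function name and the choice to report the mean sentiment per aspect are assumptions; the slides only say that the top ten aspects by frequency are shown with their sentiment, and the Django view itself is omitted.

```python
from collections import Counter, defaultdict


def top_aspects(aspect_sentiments, n=10):
    """Rank aspects by frequency and report their mean sentiment.

    aspect_sentiments is a list of (aspect, sentence_score) pairs, one per
    sentence in which the aspect was found.
    """
    counts = Counter(aspect for aspect, _ in aspect_sentiments)
    totals = defaultdict(float)
    for aspect, score in aspect_sentiments:
        totals[aspect] += score
    return [
        (aspect, count, totals[aspect] / count)
        for aspect, count in counts.most_common(n)
    ]


# Invented example pairs, for illustration only.
pairs = [("lens", 0.5), ("battery", 0.25), ("lens", 1.0)]
for aspect, count, mean_score in top_aspects(pairs):
    print(aspect, count, mean_score)
```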

SLIDE 17

Contribution of Team Members:

NAME: WORK CONTRIBUTION

  • K. Sai Ravi Teja: Web scraping code, integration and testing, sentiment analysis (Approach 2).
  • B. Chandini: Rules implementation, aspect-category lexicon, sentiment analysis (Approach 2).
  • M. Akshay: Additional rules implementation and Java code for the Stanford parser, sentiment analysis (improvement of Approach 1).
  • Koutilya Varma: Java code and code connecting Java and Python, sentiment analysis (Approach 1).
  • Lokesh Dokara: Web application development.
  • D. Anudeep: GUI development, rule modification, sentiment analysis rules (Approach 1).

SLIDE 18

References:

Aspect Extraction: Soujanya Poria, Erik Cambria, Lun-Wei Ku, Chen Gui, Alexander Gelbukh. A Rule-Based Approach to Aspect Extraction from Product Reviews.
