SENTIMENT ANALYSIS | CS 498 | Mar 6


slide-1
SLIDE 1

CS 498 | Mar 6

SENTIMENT ANALYSIS

slide-2
SLIDE 2

Macbeth, Scene 1, Act 2 from Wordle

slide-3
SLIDE 3

my Citeulike page

slide-4
SLIDE 4

Brad Paley’s TextArc.

slide-5
SLIDE 5

Fernanda Viégas’s Themail.

slide-6
SLIDE 6

Martin Wattenberg’s recent Word Tree visualization, showing Alberto Gonzales’s testimony.

slide-7
SLIDE 7
slide-8
SLIDE 8
slide-9
SLIDE 9

PNNL’s ThemeRiver.

slide-10
SLIDE 10

PNNL’s IN-SPIRE.

slide-11
SLIDE 11

tools

you know...practical stuff

slide-12
SLIDE 12

Stanford’s list

http://nlp.stanford.edu/links/statnlp.html

LIWC

http://www.liwc.net

SentiWordNet

http://sentiwordnet.isti.cnr.it

Pang & Lee’s data at Cornell

http://www.cs.cornell.edu/People/pabo/movie-review-data
http://www.cs.cornell.edu/home/llee/data/convote.html

slide-13
SLIDE 13

analysis & design

how we might use it and why

slide-14
SLIDE 14

If I could figure out a way to determine whether people are more fearful or changing to more euphoric … I can forecast the economy better than any way I know. The trouble is, we can't figure that out.

— Alan Greenspan, Jan 2008

slide-15
SLIDE 15

1841

slide-16
SLIDE 16

Nasdaq vs. LiveJournal “anxious” moods, Jan 3 – Oct 26, 2007.

slide-17
SLIDE 17

BOOSTED DECISION TREE CLASSIFIER

  1. nerv*
  2. wor*
  3. anx*
  4. hop*
  5. you*
  6. scar*
  7. tomorrow
  8. fun
  9. war
  10. your*
  11. going
  12. be*
  13. interview

Other notables: 16. lov*, 21. hospital, 36. awesome, 51. yay, 89. exam*

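
The starred entries in the ranking above look like prefix patterns, although the slide does not define the notation. A minimal sketch of turning such patterns into per-post counts, assuming a trailing * means "any word starting with this prefix"; `PATTERNS`, `matches`, and `prefix_features` are hypothetical names, and this is not necessarily how the classifier's features were built:

```python
import re

# A few patterns copied from the slide. Assumption: a trailing "*" means
# "any token that starts with this prefix"; other entries match exactly.
PATTERNS = ["nerv*", "wor*", "anx*", "hop*", "scar*", "tomorrow", "fun", "war"]

def matches(pattern, token):
    """Return True if the token matches a prefix pattern or an exact word."""
    if pattern.endswith("*"):
        return token.startswith(pattern[:-1])
    return token == pattern

def prefix_features(text):
    """Count how many tokens in the text match each pattern."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {p: sum(matches(p, t) for t in tokens) for p in PATTERNS}

if __name__ == "__main__":
    print(prefix_features("I'm so nervous and worried about tomorrow"))
```

Counts like these could then be fed to any classifier as a feature vector.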
slide-18
SLIDE 18

Bagged Naive Bayes classifier

Anxious true positive rate: 28%
Anxious false positive rate: 3.4%

Boosted Decision Tree classifier

Anxious true positive rate: ~30%
Anxious false positive rate: ~6%

All LiveJournal blog posts

posts per minute: ~107

[Figure: percentage of anxious posts per 10-min period, with adapted Wald adjustment (lower bound on 95% CI) and 60-min moving average.]
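
The slide does not spell out the "adapted Wald adjustment". A minimal sketch, assuming it refers to the adjusted Wald (Agresti-Coull) interval for a proportion, with a trailing moving average for smoothing; the function names and the example counts are hypothetical:

```python
import math

def adjusted_wald_lower(successes, n, z=1.96):
    """Lower bound of the Agresti-Coull (adjusted Wald) interval.

    z=1.96 corresponds to a two-sided 95% confidence interval.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    n_adj = n + z * z
    p_adj = (successes + z * z / 2) / n_adj
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - half_width)

def moving_average(values, window):
    """Trailing moving average; early entries average over what is available."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

if __name__ == "__main__":
    # Hypothetical 10-min period: 60 posts flagged anxious out of 1070.
    print(adjusted_wald_lower(60, 1070))
    # Six 10-min periods make up a 60-min window.
    print(moving_average([0.05, 0.06, 0.04, 0.07, 0.05, 0.06, 0.09], 6))
```

Using the lower bound instead of the raw proportion keeps periods with few posts from producing spurious spikes.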

slide-19
SLIDE 19

[Figure: percentage of anxious blog posts (5% to 15%) and Dow Jones daily close (11.6K to 13.2K), Jan to Jun 2008, with a 7-day exponential moving average and a μ + 6σ line.]
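
The chart's legend mentions a 7-day exponential moving average and a μ + 6σ line. A minimal sketch of both, assuming the common smoothing factor alpha = 2 / (span + 1) and a mean and standard deviation taken over the whole series (the slide states neither); `ema` and `spikes` are hypothetical names:

```python
import statistics

def ema(values, span):
    """Exponential moving average with alpha = 2 / (span + 1) (an assumption)."""
    alpha = 2 / (span + 1)
    out = []
    for v in values:
        out.append(v if not out else alpha * v + (1 - alpha) * out[-1])
    return out

def spikes(values, k=6.0):
    """Indices of values above mean + k * (population) standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    threshold = mu + k * sigma
    return [i for i, v in enumerate(values) if v > threshold]

if __name__ == "__main__":
    # Hypothetical series: a flat 5% anxiety rate with one large jump at the end.
    series = [0.05] * 99 + [0.5]
    print(spikes(series))
    print(ema([0, 8], 7))
```

A threshold this strict only fires on long series: with n points and a population standard deviation, no value can sit more than sqrt(n - 1) deviations above the mean.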

Jan 26: Of three predictive spikes, this is the furthest from a local maximum: it appears 5 trading days later on Feb 1. The SC primary happens on this date. The Fed lowers rates 4 days before and 4 days after this date.

Feb 24: This spike comes three days before the second most critical maximum over this six-month period. The Dow takes nearly two months to recover. After searching newspapers near this date, it is not clear what event may have caused this spike. Consumer confidence and poor business/housing reports follow in the next two days.

Mar 25: This spike is probably noise, although it does preface a steep decline. Detecting important blogs and topics may eliminate spikes like these. Conference Board’s consumer confidence came out this day and could be responsible for the spike.

May 16: This spike appears three days before the most important local maximum. As of June 24, the Dow has still not recovered from May 19, dropping nearly 10% to date. Michigan’s consumer sentiment index came out this day, along with unexpectedly poor housing numbers. May 19 followed with many poor business reports (2.5 s.d. anxiety spikes on May 19).

11.5M posts

slide-20
SLIDE 20

OUR BLOG COMMENT DATASET

5 blog genres
33 top blogs
1,094 blog comments

slide-21
SLIDE 21

Great post and I really like the video. This is extremely similar to the approach I use in writing almost anything …

Just wait until hackers exploit the print layer to this mesh stuff enough to grab root and start injecting python code …

ProBlogger, Scobleizer

slide-22
SLIDE 22

slide-23
SLIDE 23

Proportions of agreement

agree: 39.2%
disagree: 11.1%
neither: 49.4%

Wald method, p < 0.05

slide-24
SLIDE 24

AGREE/DISAGREE/NEITHER

LEXICAL

uni/bi/trigrams, TF-IDF

POS

raw tags, combo lexical

SENTIMENT

congressional floor, Rotten Tomatoes, LIWC

SEMANTIC

sim to post, ESA

NAMED ENTITY

organizations, people
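
The lexical and semantic groups above mention TF-IDF and similarity to the post. A minimal sketch of a TF-IDF cosine similarity between a post and its comments, assuming a simple regex tokenizer and a smoothed idf; the weighting actually used in the study is not given on the slide, and all names and example texts here are hypothetical:

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed idf (an assumption) so terms found in every document keep weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()}
            for toks in tokenized]

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

if __name__ == "__main__":
    post = "how to write a great blog post"
    comments = ["great post, I like how you write", "hackers will grab root"]
    vecs = tfidf_vectors([post] + comments)
    for comment, vec in zip(comments, vecs[1:]):
        print(round(cosine(vecs[0], vec), 3), comment)
```

The resulting score is a single numeric feature per comment, which can sit alongside the n-gram and part-of-speech features.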

slide-25
SLIDE 25

Features + Info Gain

Feature                          Class       Info gain
LIWC pos. emotion words          agree       0.079
LIWC affect words                agree       0.049
exclamations                     agree       0.043
adjectives                       agree       0.041
@                                neither     0.041
ellipsis                         !disagree   0.038
great                            agree       0.035
is tech blog                     neither     0.034
cosine similarity to post        !disagree   0.034
great [noun]                     agree       0.03
personal pronouns                !disagree   0.028
present tense verbs              neither     0.026
[prepos] [poss pronoun]          agree       0.026
tf-idf dot product with post     !neither    0.026
coordinating conjunctions        agree       0.026
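
Information gain measures how much knowing a feature reduces uncertainty about the class label. A minimal sketch for a binary (present/absent) feature, in bits; whether the study binarized its features is an assumption, and `entropy` and `info_gain` are hypothetical names:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(feature_present, labels):
    """Entropy of the labels minus their entropy after splitting on the feature."""
    n = len(labels)
    gain = entropy(labels)
    for value in (True, False):
        subset = [y for x, y in zip(feature_present, labels) if x == value]
        gain -= (len(subset) / n) * entropy(subset)
    return gain

if __name__ == "__main__":
    # Hypothetical toy data: does the comment contain an exclamation mark?
    has_exclamation = [True, True, True, False, False, False]
    labels = ["agree", "agree", "neither", "neither", "disagree", "neither"]
    print(info_gain(has_exclamation, labels))
```

Ranking features by this score gives a table of the same shape as the one on the slide.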