Visualizing Text Sentiment, VisSoft 2016, October 4, 2016


Visualizing Text Sentiment. VisSoft 2016, October 4, 2016. Christopher G. Healey, Department of Computer Science, Institute for Advanced Analytics, North Carolina State University. healey@ncsu.edu, http://www.csc.ncsu.edu/faculty/healey


slide-1
SLIDE 1

Visualizing Text Sentiment

VisSoft 2016

October 4, 2016

Christopher G. Healey
Department of Computer Science
Institute for Advanced Analytics
North Carolina State University
healey@ncsu.edu
http://www.csc.ncsu.edu/faculty/healey

NC STATE UNIVERSITY

sic parvis magna

slide-2
SLIDE 2

Visualization

  • Harness viewer’s strengths
  • human visual system
  • pattern recognition capabilities
  • domain expertise
  • understanding context
  • ability to manage ambiguity
  • Collaboration between viewer and computer
  • Enhance each participant’s individual strengths
  • Share initiative to offset their weaknesses

Tweet affinity graph:

tweet property → hue, frequency → size, edges → affinity, proximity → similarity
Data courtesy Twitter, Inc.

go.ncsu.edu/tweet-viz

slide-3
SLIDE 3

Nightingale’s Rose Chart

Rose or Coxcomb chart of causes of death during the Crimean War (1854–1855):

month → wedge; number of deaths → area; type of mortality → hue (blue: preventable; pink: wounds; black: other)
Data courtesy understandinguncertainty.org/node/213

slide-4
SLIDE 4

Dot Map

John Snow’s cholera map (1854):

cholera patient → dot
Data courtesy www.ncgia.ucsb.edu/pubs/snow/snow.html

slide-5
SLIDE 5

Painterly Visualization

Painterly visualization of a simulated supernova collapse:

pressure → luminance, velocity → hue, flow direction → orientation
Data courtesy Dr. Jon Blondin, Astrophysics, NCSU
Tateosian et al. “Engaging Viewers Through Nonphotorealistic Visualizations,” NPAR 2007, pp. 93–102.

slide-6
SLIDE 6

Tag Cloud

Wordle tag cloud:

term → text, term frequency → size
www.wordle.net

slide-7
SLIDE 7

Phrase Nets

Phrase nets:

term frequency → size, links → neighbour relationship
www-958.ibm.com/software/analytics/manyeyes/

slide-8
SLIDE 8

“Preattentive” Features

  • Basic visual features are detected by our low-level visual system
  • detection is rapid, usually in one “glance” of 100–250 msec
  • can determine presence, absence, amount
  • unique features capture our focus of attention
  • Initially proposed as an automatic, bottom-up phenomenon
  • Treisman’s feature map theory
  • Revised to include bottom-up and top-down influence
  • Wolfe’s guided search

Orientation Hue

slide-9
SLIDE 9

Hue Target

Absent Present

www.csc.ncsu.edu/faculty/healey/PP

slide-10
SLIDE 10

Hue Target

Present Absent

www.csc.ncsu.edu/faculty/healey/PP

slide-11
SLIDE 11

Curvature Target

Absent Present

www.csc.ncsu.edu/faculty/healey/PP

slide-12
SLIDE 12

Conjunction Target

Present Absent

www.csc.ncsu.edu/faculty/healey/PP

slide-13
SLIDE 13

Conjunction Target

Present Absent

www.csc.ncsu.edu/faculty/healey/PP

slide-14
SLIDE 14

Ensemble Coding

All Green Circles > All Blue Circles More Large Blue Circles

Identify Which Colour has Larger Average Size

slide-15
SLIDE 15

Perceptual Guidelines

  • Choice of data-feature mapping guided by knowledge of human visual perception

  • Color: hue, saturation, luminance, and/or chromaticity (hue + saturation)
  • Texture: size, orientation, density, regularity of placement
  • Motion: flicker, phase, direction, and velocity
  • Feature “hierarchies” control order of data-feature mapping
  • Luminance dominates hue, color dominates texture, regularity is perceptually weak, so:

  • most important data attributes are assigned to luminance,
  • then hue or chroma,
  • then size, orientation, or density,
  • then regularity
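
The ordering above can be sketched in a few lines of code. This is only an illustration, not the author's implementation: the function name `assign_features` and the attribute names in the usage line are hypothetical, and the order chosen within each tier (hue before chroma; size, then orientation, then density) is an arbitrary assumption, since the slide treats those as alternatives.

```python
# Visual features ordered from perceptually strongest to weakest,
# following the hierarchy on the slide. The order inside each tier
# (e.g. size before orientation) is an assumption for illustration.
FEATURE_ORDER = ["luminance", "hue", "size", "orientation", "density", "regularity"]


def assign_features(attributes_by_importance):
    """Pair data attributes (most important first) with visual features."""
    if len(attributes_by_importance) > len(FEATURE_ORDER):
        raise ValueError("more attributes than available visual features")
    # zip stops at the shorter sequence, so unused weak features are dropped.
    return dict(zip(attributes_by_importance, FEATURE_ORDER))


# Hypothetical attribute names, most important first.
mapping = assign_features(["pressure", "velocity", "temperature"])
print(mapping)
```

The most important attribute lands on luminance and the least important on the weakest feature still in use.
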
slide-16
SLIDE 16

Postattentive Amnesia

  • If viewers are allowed to preview a scene, will they be faster to answer questions about the details of the scene?
  • Intuition suggests they will
  • Implies viewers have the ability to extract detail throughout a scene, access it rapidly on demand
  • Various experiments have shown that human vision does not work in this manner
  • Vision is not a camera that can “snapshot” a full-detail representation of a scene
  • Results suggest that detail is only available at the most recent focus of attention

Priming Image

slide-17
SLIDE 17

Postattentive Amnesia

  • If viewers are allowed to preview a scene, will they be faster to answer questions about the details of the scene?
  • Intuition suggests they will
  • Implies viewers have the ability to extract detail throughout a scene, access it rapidly on demand
  • Various experiments have shown that human vision does not work in this manner
  • Vision is not a camera that can “snapshot” a full-detail representation of a scene
  • Results suggest that detail is only available at the most recent focus of attention

Search Image

slide-18
SLIDE 18

Search With no Priming

slide-19
SLIDE 19

Search With no Priming

Present

slide-20
SLIDE 20

Primed Search

slide-21
SLIDE 21

Primed Search

Absent

slide-22
SLIDE 22

Change Blindness

  • Visual system has limited memory for detail, often restricted to focus of attention
  • Visual disruption (e.g., eye saccade) can render us “blind” to changes in a scene
  • Example: find differences between two images
  • Original research conducted at Nissan’s Cambridge Basic Research Centre
  • studying why accidents occur
  • significant visual evidence of a potential accident
  • sufficient time to avoid accident
slide-23
SLIDE 23

Find Five Differences

slide-24
SLIDE 24

Find Five Differences

eyes tilted up; bee’s stripe colours reversed; extra leaf; patch on knee; extra flower

slide-25
SLIDE 25

Change Blindness

Data courtesy Ron Rensink, Department of Psychology, UBC

slide-26
SLIDE 26

Change Blindness

Data courtesy Ron Rensink, Department of Psychology, UBC

slide-27
SLIDE 27

Change Blindness Models

  • Overwriting
  • current image overwritten by new one
  • First impression
  • initial view abstracted
  • Nothing is stored
  • scene abstracted with no details
  • Feature combination
  • previous and new views combined
  • Everything is stored, nothing is compared
  • details cannot be accessed without external stimulus

Main actor changes across movie cut

slide-28
SLIDE 28

http://www.icarus.ca/icarus/?p=1024

Okanagan Mountain Park Fire (Kelowna, BC, 2003)

64,000 acres, $33.8 million, 239 homes destroyed

slide-29
SLIDE 29

National Cohesive Strategy

To safely and effectively extinguish fire, when needed; use fire where allowable; manage our natural resources; and as a Nation, live with wildland fire.

National Cohesive Wildland Fire Management Strategy April, 2014

http://www.forestsandrangelands.gov/strategy

slide-30
SLIDE 30

National Interagency Fire Center


http://www.nifc.gov/fireInfo/fireInfo_stats_histSigFires.html

Twenty-Five Largest Fires By Acres (bar chart; x-axis: year; y-axis: acres, in thousands)

Year  Fire                Acres
1825  Miramichi           3M
1845  Great Fire          1.5M
1853  Yaquina             450K
1868  Coos                300K
1881  Lower Michigan      2.5M
1898  South Carolina      3M
1903  Adirondack          637K
1910  Great Idaho         3M
1918  Cloquet-Moose Lake  1.2M
1933  Tillamook           311K
1987  Siege of '87        640K
1988  Yellowstone         1.6M
1997  Inowak              610K
1999  Dunn-Glen           288K
2002  Rodeo-Chediski      462K
2003  Cedar               275K
2004  Taylor              1.3M
2006  East Amarillo       907K
2007  Big Turnaround      388K
2007  Murphy              652K
2010  Long Butte          300K
2011  Wallow              538K
2012  Whitewater-Baldy    298K
2012  Long Draw           558K
2013  Rim                 257K

slide-31
SLIDE 31

National Interagency Fire Center


http://www.nifc.gov/fireInfo/fireInfo_stats_totalFires.html

Number of Fires and Acres Burned (chart; x-axis: year, 1960–2014; y-axes: fires, in thousands, and acres, in thousands; series: Fires, Acres)

slide-32
SLIDE 32

National Interagency Fire Center


http://www.nifc.gov/fireInfo/SuppCosts.pdf

USFS / DOI Wildfire Costs (chart; x-axis: year, 1985–2013; y-axis: dollars, in millions; series: US Forest Service, Department of Interior)

slide-33
SLIDE 33

Project Objectives

  • What are dominant wildfire and risk narratives communicated through social media?
  • How are narratives shaped by ecological, social, and political characteristics?
  • Community Engagement: Can fire officials communicate risk mitigation strategies via Twitter?
  • Communication: Can Joint Fire Science monitor and communicate with a community during a wildfire event via Twitter?

slide-34
SLIDE 34

Colorado Springs Fire Department Wildfire Mitigation


http://www.springsgov.com/Page.aspx?NavID=101

slide-35
SLIDE 35

Project Plan

  1. Capture, index, store wildfire incident Twitter communication
  2. Perform thematic and sentiment analysis of tweets
  3. Analyze and visualize information flow within social media networks

  • Community engagement: for risk mitigation prior to a wildfire
  • Communication: between emergency management and communities during wildfire events

slide-36
SLIDE 36

Data Capture

  • Capturing tweets with keywords “wildfire” and “forest service” since May 14, 2013
  • 5.3 million tweets stored in MySQL database
  • Extracting relevant tweet properties
  • date and time
  • author
  • body
  • geolocation
  • DenverCP | -104.994593,39.746012 | Wildfire burning SW of Beulah closes Hwy. 165: BEULAH, Colo. A small wildfire burned in Pueblo County just... http://t.co/FfPeLrpIS6 | Sun Jun 01 02:21:12 +0000 2014
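
The sample record on this slide can be split into the four listed properties with a short parser. This is a sketch under assumptions: the pipe-delimited layout is taken from how the slide displays the record and may not match the MySQL schema, the coordinate order is assumed to be longitude then latitude (which fits Denver), and the names `Tweet` and `parse_record` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Tweet:
    author: str
    lon: float
    lat: float
    body: str
    created: datetime


# Record as displayed on the slide: author | geolocation | body | date.
SAMPLE = (
    "DenverCP | -104.994593,39.746012 | Wildfire burning SW of Beulah closes "
    "Hwy. 165: BEULAH, Colo. A small wildfire burned in Pueblo County just... "
    "http://t.co/FfPeLrpIS6 | Sun Jun 01 02:21:12 +0000 2014"
)


def parse_record(line: str) -> Tweet:
    """Parse one 'author | lon,lat | body | date' record."""
    # Take author and geolocation from the left and the date from the
    # right, so a "|" inside the tweet body does not break the split.
    author, geo, rest = (part.strip() for part in line.split("|", 2))
    body, stamp = (part.strip() for part in rest.rsplit("|", 1))
    lon, lat = (float(value) for value in geo.split(","))
    # Timestamp layout as shown on the slide, e.g. "Sun Jun 01 02:21:12 +0000 2014".
    created = datetime.strptime(stamp, "%a %b %d %H:%M:%S %z %Y")
    return Tweet(author, lon, lat, body, created)


tweet = parse_record(SAMPLE)
print(tweet.author, tweet.lat, tweet.lon, tweet.created.isoformat())
```

Day and month names are parsed with the process locale, so the sketch assumes an English or default C locale.
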

slide-37
SLIDE 37

Sentiment

  • “An attitude, thought, or judgment prompted by feeling”
  • Natural language processing (NLP) approaches
  • Subjectivity classification, machine learning, semantic orientation
  • Sentiment dictionaries
  • Profile of mood states (POMS): tension–anxiety, depression–dejection, anger–hostility, fatigue–inertia, vigor–activity, confusion–bewilderment
  • Affective Norms for English Words (ANEW): valence, arousal, dominance
  • SentiStrength: 298 positive terms, 465 negative terms, support for social network text

  • SentiWordNet: Sentiment scores for WordNet synsets
slide-38
SLIDE 38

Emotional Models

  • Psychological models of emotion
  • Russell’s emotional circumplex: orthogonal valence and arousal axes
  • Emotional scatterplot
  • 2D scatterplot, valence and arousal as horizontal and vertical axes
  • Intermediate regions: upset, stressed, nervous, tense
  • Similar alternative models
  • Plutchik’s eight bipolar dimensions
  • Thayer’s tense–calm, tired–energy model

Russell’s circumplex Plutchik’s emotion wheel
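
To make the valence and arousal axes concrete, the sketch below places a point on the valence-arousal plane and reports its angle and quadrant. It is an illustration only: the 1 to 9 scale is assumed because ANEW ratings use it, the function name `circumplex_position` is hypothetical, and the quadrant labels reuse the deck's own terms (pleasant, unpleasant, active, sedate).

```python
import math


def circumplex_position(valence, arousal, lo=1.0, hi=9.0):
    """Return (angle in degrees, quadrant label) on the valence-arousal plane."""
    mid = (lo + hi) / 2.0
    x = valence - mid  # > 0 pleasant, < 0 unpleasant (horizontal axis)
    y = arousal - mid  # > 0 active,   < 0 sedate     (vertical axis)
    # Angle measured counter-clockwise from the pleasant end of the valence axis.
    angle = math.degrees(math.atan2(y, x)) % 360.0
    horizontal = "pleasant" if x >= 0 else "unpleasant"
    vertical = "active" if y >= 0 else "sedate"
    return angle, f"{horizontal}-{vertical}"


print(circumplex_position(2.0, 8.0))
```

A low-valence, high-arousal point such as (2, 8) falls in the unpleasant-active quadrant, the region where terms like upset, stressed, nervous, and tense sit.
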

slide-39
SLIDE 39

Recent Tweet Visualizer

  • Visualize “recent” tweets that match user-chosen keywords
  • Twitter API supports keyword searches in the recent tweet pool

1. Allow users to enter keyword search string
2. Query Twitter for recent tweets matching keywords
3. Identify tweets with at least n=2 dictionary terms
4. Estimate the sentiment of each tweet
5. Visualize tweets on an emotional scatterplot

  • Map valence and arousal to hue and luminance
  • Map two measures of confidence to size and opacity
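
Steps 3 and 4 can be sketched as a dictionary lookup followed by an average. This is a hedged illustration, not the visualizer's actual estimator: the slides do not say how term scores are combined, so a plain arithmetic mean is assumed here; the dictionary entries and their numbers are made up rather than taken from ANEW; and `estimate_sentiment` is a hypothetical name. The Twitter query of step 2 is left out.

```python
import re
from statistics import mean

# Hypothetical dictionary: term -> (valence, arousal) on a 1-9 scale.
# These numbers are invented for illustration, not real ANEW ratings.
DICTIONARY = {
    "fire": (3.0, 7.0),
    "safe": (7.5, 3.5),
    "home": (7.8, 4.0),
    "destroyed": (2.0, 6.5),
}


def estimate_sentiment(text, dictionary=DICTIONARY, min_terms=2):
    """Return (valence, arousal, n_terms), or None if too few terms match."""
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = [dictionary[token] for token in tokens if token in dictionary]
    # Step 3: keep only tweets with at least min_terms dictionary terms.
    if len(hits) < min_terms:
        return None
    # Step 4 (assumed): average the per-term scores.
    valence = mean(v for v, _ in hits)
    arousal = mean(a for _, a in hits)
    return valence, arousal, len(hits)


print(estimate_sentiment("Fire destroyed the barn"))
print(estimate_sentiment("Stay safe"))
```

The second call returns `None` because only one dictionary term matches, which is how the n=2 threshold filters out tweets with too little evidence.
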
slide-40
SLIDE 40

Emotional Scatterplot

slide-41
SLIDE 41

Tweet Glyphs

valence ⟶ hue (unpleasant to pleasant)
arousal ⟶ luminance (sedate to active)
response frequency confidence ⟶ size (low to high)
standard deviation confidence ⟶ opacity (low to high)
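
The glyph mapping can be sketched with the standard-library `colorsys` module. Everything numeric here is an assumption made for illustration: the slides do not give the hue endpoints (blue for unpleasant and green for pleasant are placeholders), the lightness, radius, and opacity ranges are arbitrary, valence and arousal are assumed to lie on a 1 to 9 scale, and both confidences are assumed to be normalised to [0, 1]. The names `tweet_glyph`, `unit`, and `lerp` are hypothetical.

```python
import colorsys


def lerp(a, b, t):
    """Linear interpolation from a (t=0) to b (t=1)."""
    return a + (b - a) * t


def unit(x, lo=1.0, hi=9.0):
    """Clamp x to [lo, hi] and rescale it to [0, 1]."""
    return (min(max(x, lo), hi) - lo) / (hi - lo)


def tweet_glyph(valence, arousal, freq_conf, sd_conf):
    """Map sentiment and two confidences in [0, 1] to glyph properties."""
    hue_deg = lerp(240.0, 120.0, unit(valence))  # assumed: blue -> green
    lightness = lerp(0.3, 0.8, unit(arousal))    # assumed: sedate dark -> active bright
    r, g, b = colorsys.hls_to_rgb(hue_deg / 360.0, lightness, 1.0)
    return {
        "rgb": (round(r * 255), round(g * 255), round(b * 255)),
        "radius": lerp(4.0, 16.0, freq_conf),    # assumed size range
        "opacity": lerp(0.2, 1.0, sd_conf),      # assumed opacity range
    }


print(tweet_glyph(valence=7.5, arousal=6.0, freq_conf=0.8, sd_conf=0.5))
```

Assigning arousal to lightness and valence to hue follows the slide's mapping; the two confidence measures only change how prominent a glyph is, not its colour.
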

slide-42
SLIDE 42

Sentiment Scatterplot

King Fire (El Dorado County, CA, 2014)


97,717 acres, $91 million, 80 residences destroyed

slide-43
SLIDE 43

Sentiment Values

King Fire (El Dorado County, CA, 2014)


97,717 acres, $91 million, 80 residences destroyed

slide-44
SLIDE 44

Affinity Graph

King Fire (El Dorado County, CA, 2014)


97,717 acres, $91 million, 80 residences destroyed

slide-45
SLIDE 45

Conclusions

  • Text visualization has matured
  • Numerous techniques tailored to specific text properties
  • Focus on analytics for analyzing massive data collections prior to visualization

  • Focus on deriving useful properties from text
  • Sentiment analysis continues as an active research area
  • Multiple approaches, depending on text type
  • Unresolved challenges: negation, context, subject identification, sarcasm
  • Text analytics coupled with visualization can provide useful insights

  • Environmental risk management and tracking
  • Political trend analysis
slide-46
SLIDE 46

Tweet Visualizer Demonstration

slide-47
SLIDE 47

Contact Information

Christopher G. Healey

healey@ncsu.edu
www.csc.ncsu.edu/faculty/healey

Fire chasers Project

research.cnr.ncsu.edu/blogs/firechasers

Tweet Visualizer

www.csc.ncsu.edu/faculty/healey/tweet_viz/tweet_app

NC STATE UNIVERSITY
