INFORMATION VISUALIZATION Alvitta Ottley Washington University in - - PowerPoint PPT Presentation



slide-1
SLIDE 1

INFORMATION VISUALIZATION

Alvitta Ottley Washington University in St. Louis CSE 557A | Sep 07, 2016

slide-2
SLIDE 2

Recap…

slide-3
SLIDE 3

GRAPHICAL INTEGRITY

Clear, detailed, and thorough labeling should be used to defeat graphical distortion and ambiguity.

“Above all else show the data”

slide-4
SLIDE 4

THE LIE FACTOR

  • Tufte coined the term “the lie factor”, which is defined as:

Lie_factor = (size of effect shown in graphic) / (size of effect in data)

  • “High” lie factor (LF) leads to:
  • Exaggeration of differences or similarities
  • Deception
  • Misinterpretation
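
As a rough illustration of the lie factor, the sketch below uses Tufte's definition (size of the effect shown in the graphic divided by size of the effect in the data) and measures each effect as a relative change between two values. The function names and the numbers are hypothetical and are not taken from the slides.

```python
def relative_change(first, second):
    """Size of an effect, measured as the relative change from first to second."""
    return abs(second - first) / abs(first)


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Effect shown in the graphic divided by the effect present in the data."""
    graphic_effect = relative_change(graphic_first, graphic_second)
    data_effect = relative_change(data_first, data_second)
    return graphic_effect / data_effect


# Hypothetical example: the data rises from 10 to 15 (a 50% increase),
# but the bars are drawn 2 cm and 8 cm tall (a 300% increase).
lf = lie_factor(2.0, 8.0, 10.0, 15.0)
print(round(lf, 2))  # 6.0
```

A value close to 1 means the graphic represents the data faithfully; the further above 1, the more the graphic exaggerates the change.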
slide-5
SLIDE 5

HOW TO NOT LIE

“Maximize the Data-Ink Ratio”

slide-6
SLIDE 6

DATA-INK RATIO

  • The goal is to aim for high data-ink ratio
  • Ink used for the data should be relatively large compared to the ink in the entire graphic
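
The ratio can be sketched as a simple fraction: ink spent on data divided by all ink in the graphic. The snippet below is only an illustration; the "ink" amounts are hypothetical pixel counts, and how one measures ink in a real chart is left open.

```python
def data_ink_ratio(data_ink, total_ink):
    """Share of the graphic's ink that is used to show data (0 to 1)."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must be between 0 and total_ink")
    return data_ink / total_ink


# Hypothetical pixel counts for one chart before and after removing
# heavy grid lines, borders, and decoration.
before = data_ink_ratio(1200, 4800)
after = data_ink_ratio(1200, 1600)
print(before, after)  # 0.25 0.75
```

The data ink is unchanged in both cases; the ratio improves only because non-data ink was erased.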

slide-7
SLIDE 7

GRAPHICAL EXCELLENCE

1. Graphical excellence is the well-designed presentation of interesting data – a matter of substance, of statistics, and of design.
2. Complex ideas communicated with clarity, precision, and efficiency.
3. Gives the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space.
4. Nearly always multivariate.
5. Requires telling the truth about the data.

slide-8
SLIDE 8
slide-9
SLIDE 9
slide-10
SLIDE 10
slide-11
SLIDE 11

Today…

Design Critiques

slide-12
SLIDE 12

Andrew

slide-13
SLIDE 13

GOOD

  • Simple yet effective.
  • The axes and grid lines make the chart easily digestible.
  • The colors and “frontier” line make the message of the chart clear.
  • The four teams that went the farthest in the playoffs are all in color and are all on that rightmost “frontier”.
  • Other teams’ stats are present for context but are greyed out to reduce distraction.

slide-14
SLIDE 14

BAD

slide-15
SLIDE 15

BAD

  • Having both visualizations is redundant; combining them into one visualization would be more effective.
  • Data labels are on their sides, so it is difficult to actually read the numbers themselves.
  • For the first chart, there is a column for total viewers, which I believe makes ESPN’s columns seem relatively smaller: at first glance, ESPN’s columns do not appear to be the biggest.

slide-16
SLIDE 16

BETTER

[Figure: 100% stacked bar chart, “Sports Network Prime-Time Viewership (in Millions), August-October (Total)”. Series: ESPN, NFL Network, ESPN2, Fox Sports 1, MLB Network, Golf Channel, NBCSN, Other. Years: 2011, 2012, 2013. Axis: 0% to 100%.]

slide-17
SLIDE 17

Arivan

slide-18
SLIDE 18

GOOD

  • Extremely clear: the reader can see right away which areas in which parts of the country have the most voting influence and how they voted.
  • Good color choices: basic, but that is all that was needed; any other embellishments might have made the visualization more confusing.

slide-19
SLIDE 19

BAD

  • The title of this visualization is very misleading.
  • Data is represented for each state as a part of a whole shown by the ratio of the marijuana leaf that is shaded.

slide-20
SLIDE 20

BETTER

slide-21
SLIDE 21

Claire

slide-22
SLIDE 22

GOOD

  • Clearly labeled axes and years
  • Easy to see variations at different time points along the graph and to distinguish between years/colors
  • Conveys a good amount of information without being too cluttered or trying to squeeze too much into one graph

slide-23
SLIDE 23

BAD

  • No y-axis scale
  • Focus is more on the bars than on the data/numbers; the numbers are very large and do not accurately represent a comparison
  • Unclear:
    • If “Deficit” is the difference between the “Normal” and “Year to Date”,
    • If “Year to Date” starts on January 1, 2016 or another time point,
    • What “Normal” is referencing - normal rain amount for the year until this day? What is “Normal”?
  • Overall vague and lacking information
slide-24
SLIDE 24

BETTER

  • Merged the “Deficit” with the “Year to Date” to more clearly portray that the “Deficit” is the difference between what the expected rainfall is and what the rainfall has been
  • Added y-axis and years to clarify measurements and in what terms these measurements are being compared
  • Not as appealing for viewers watching the news who want a quick, appealing visual, but less vague and more accurate

[Figure: “Drought Update for Dayton, OH (as of 8/15/16)”. x-axis: year (2015, 2016). y-axis: inches (24 to 27). Annotated difference: 1.76 in.]

slide-25
SLIDE 25

Clayton

slide-26
SLIDE 26

GOOD

  • Visualization of scale done properly
  • Horizontally split up into scales of code in a descending fashion
  • Good use of colors
  • Before making the jump from the hundred-thousand scale to the 1-million scale, the chart shows a million lines of code on both scales before continuing on

slide-27
SLIDE 27

BAD

  • Although the end product is aesthetically pleasing as a picture, it fails to show a clear picture
  • It is not stated what the heatmap to the right corresponds to; it just shows numerical values in relation to the heatmap
  • The number 200 could be great or terrible; it’s just a number with an associated color.
  • Graph does a poor job of showing relations between the different reported states

slide-28
SLIDE 28

BETTER

  • The axis is shifted so that midnight would be in the middle
  • Change the heatmap to a 0 to 100 scale indicating quality of sleep, with 100 being the highest quality of sleep.
  • Compute an average across the different days but have each reported state contribute a different value

slide-29
SLIDE 29

Eric

slide-30
SLIDE 30

BAD

  • Too comprehensive with all the transactions that occur at a fire incident
  • The tabular data uses terms that are fit for the audience (the general public)
  • Regarding the "False" item, there is no clear visual indicator of what false implies
  • Accurate interpretation of meaning and context of the tabular data is virtually impossible
  • The data reported for 2015 only reflects 2 months of information while the data for 2013 and 2014 reflect a full complement of data.

slide-31
SLIDE 31

BETTER

slide-32
SLIDE 32

Evan

slide-33
SLIDE 33

GOOD

  • Visually interesting, and succinctly shows how each algorithm works
  • Takes you through the steps of sorting an unordered list of 15 elements
  • Coloring of each line indicates its order in the list, and by the nature of the sorting, the visualizations draw the eye from left to right
  • Gives a lot of information on the sorting algorithm: how it works, how an element moves throughout the algorithm, how quickly sections of the list become sorted, etc.
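
To make concrete what such a visualization encodes, here is a minimal sketch that records the state of a 15-element list after every step of a sort. Insertion sort and the input list are assumptions chosen for illustration; the slide does not say which algorithms or data the original visualization used.

```python
def insertion_sort_trace(values):
    """Sort a copy of values and return a snapshot of the list after each step."""
    items = list(values)
    snapshots = [list(items)]  # initial, unordered state
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        # Shift larger elements one slot to the right to make room for current.
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current
        snapshots.append(list(items))
    return snapshots


# Hypothetical unordered list of 15 elements.
unordered = [7, 3, 12, 1, 15, 9, 4, 11, 6, 14, 2, 10, 5, 13, 8]
trace = insertion_sort_trace(unordered)
for step, state in enumerate(trace):
    print(step, state)
```

Drawing each snapshot as a column, with one colored line per element, would show how an element moves through the algorithm and how quickly the left part of the list becomes sorted.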

slide-34
SLIDE 34

BAD

  • Seems to mislead
  • One would assume that the higher points in the graph represent a higher number of gun deaths

slide-35
SLIDE 35

BETTER

  • Flipped the vertical axis to fit with the natural assumption of higher points representing more gun deaths.
  • Spread the points out more on the horizontal axis.

  • Added more labels
  • Added more recent data
slide-36
SLIDE 36

Jarett

slide-37
SLIDE 37

GOOD

  • http://polygraph.cool/nba/
  • Very easy to use (scrolling), interactive
  • Easy to read text
  • Meaningful colors
  • Plenty of information
  • Can be used at viewer’s discretion; viewer won’t be overwhelmed with information
  • Data is very clear, immediately know what you are looking at based on the title
slide-38
SLIDE 38

BAD

slide-39
SLIDE 39
  • Found on Forbes
  • http://www.forbes.com/sites/markfidelman/2012/06/05/heres-the-real-reason-there-are-not-more-women-in-technology/2/#4bfe90433c67
  • Funnels serve absolutely no purpose other than to probably induce bias
  • ‘Interest’ is a confusing metric
    • Most likely means that, of the people who are interested, 35% are females and 65% are males
  • There are increases between levels of the funnel (e.g. 65% → 82%) yet the funnel gets smaller
  • Funnel changes size at a constant rate for both male and female, regardless of the percentages
  • Funnel makes data difficult to understand and does not help with comprehension
  • Labels are inconsistently placed
    • Interest is not between percentages
    • Tech degrees has a meaningless arrow
slide-40
SLIDE 40

BETTER

  • Separated funnel into four bars
  • Bars show percentage of males and females in a particular category

  • Females are red (as indicated)
  • Males are blue (as indicated)
slide-41
SLIDE 41

WHY IT’S BETTER AND FIXED

  • Clearly shows distribution between males and females for each category
  • More meaningful title
  • Better explained ‘interest’
  • Consistent labeling, easy to read and understand
  • Percent changes are actually visually displayed, unlike the funnel
  • Clear, clean data; nothing to induce bias
  • Can quickly and easily see the difference in percentages between males and females

slide-42
SLIDE 42

John

slide-43
SLIDE 43

GOOD

  • Clear title and data labels
  • Color spectrum to show range
    • Good = blue
    • Bad = purple
  • Scale on right graph is consistent
  • Interactive
    • Hover over state and see data on graph
    • Can drill down into the data
    • Highlight smaller states on legend to see corresponding data

http://stateofobesity.org/adult-obesity/

slide-44
SLIDE 44

BAD

  • Two sets of non-correlating data
  • Scales are off by a couple of factors
  • 3D chart is inappropriate for this set of data
  • Hard to see the larger data relative to the y-axis

https://www.reddit.com/r/dataisugly/

slide-45
SLIDE 45

BETTER

  • Break single chart into two
    • Helps see data trends between the two sets of data
    • Avoids any confusion about the correlation between the two
  • Fixes the scaling of axes
    • Can pick proper intervals for both
  • Proper alignment does a better job of supporting data comparison

[Figure: “Class II (SWD, ER, HC) Annual Well Inventory 2009-2015”. y-axis: Number of Wells (50,000 to 55,000). x-axis: 2009 to 2015.]

[Figure: “Class III (Brine Mining) Annual Well Inventory 2009-2015”. y-axis: Number of Wells (50 to 130). x-axis: 2009 to 2015.]

slide-46
SLIDE 46

Jordan

slide-47
SLIDE 47

http://graphics.wsj.com/infectious-diseases-and-vaccines/

GOOD

  • Easy to understand the message the designer is trying to get across.
  • Interacting adds another layer of reassurance and precision to the data.
  • Clearly states what data is being represented and how it is portrayed graphically.
  • Very little wasted ink.
slide-48
SLIDE 48

BAD

  • Arrow representing 20 ft. is out of proportion to the arrow representing 10 ft.
  • The cell towers in the background seem to be designed to mislead the reader into thinking that the spacecraft is the size of a cellphone tower.
  • Weight is displayed in words but has no visual counterpart.
  • The length actually represents the wingspan; however, the wings are not fully extended in the picture example.
  • The title is “Instruments and Mission”; however, the graph shows little to none of these explicitly.

slide-49
SLIDE 49

OSIRIS-Rex’s Specifications

Wingspan with Solar Panels: 20 ft. 3 in.
Height: 10 ft.
Dimension: 8 ft. × 8 ft.
Tagsam Sample Arm Length: 11 ft.
Shaq’s Size: 344 lb, 7 ft. 1 in.

OSIRIS-Rex Weighs About 13.5 Shaqs (4650 lbs)
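
The “13.5 Shaqs” comparison can be checked with a couple of lines of arithmetic. The sketch below uses only the figures stated on the slide; the variable names are made up for illustration.

```python
# Figures as stated on the slide.
SPACECRAFT_WEIGHT_LB = 4650
SHAQ_WEIGHT_LB = 344

WINGSPAN_IN = 20 * 12 + 3    # 20 ft. 3 in. expressed in inches
SHAQ_HEIGHT_IN = 7 * 12 + 1  # 7 ft. 1 in. expressed in inches

weight_in_shaqs = SPACECRAFT_WEIGHT_LB / SHAQ_WEIGHT_LB
wingspan_in_shaqs = WINGSPAN_IN / SHAQ_HEIGHT_IN

print(round(weight_in_shaqs, 1))    # 13.5
print(round(wingspan_in_shaqs, 1))  # 2.9
```

Expressing both weight and length in the same familiar unit gives the weight a visual counterpart, which the original graphic lacked.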

slide-50
SLIDE 50

Joshua

slide-51
SLIDE 51

GOOD

  • Incredibly easy to follow
  • Minimal chart junk while still retaining memorability about important points
  • Key allows the reader to interpret information easily
  • Each arrow can encode multiple forms of information while still allowing the reader to process the information efficiently
  • Captions include additional information not immediately available from data

Source: “A Wired Analysis of Tech Company Tax Schemes”

  • Article by Lee Simmons
  • Visualization by Valerio Pellegrini

slide-52
SLIDE 52

BAD – PART 1

  • Not that easy to compare values
  • Values are encoded by diameter/radius but the eye is first drawn to an area comparison
  • A lot of extra information
  • Aesthetically messy
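
The diameter-versus-area problem can be shown numerically: when a circle's diameter is proportional to the value, its area grows with the square of the value. The sketch below uses hypothetical values, not the Olympic cost data.

```python
import math


def circle_area(diameter):
    """Area of a circle with the given diameter."""
    return math.pi * (diameter / 2) ** 2


# Hypothetical values: the second is twice the first.
value_a, value_b = 10.0, 20.0

# Encoding by diameter: diameter is proportional to the value.
area_ratio = circle_area(value_b) / circle_area(value_a)
print(round(area_ratio, 2))  # 4.0 (a 2x value looks 4x as large by area)

# Encoding by area instead: diameter is proportional to sqrt(value).
fair_ratio = circle_area(math.sqrt(value_b)) / circle_area(math.sqrt(value_a))
print(round(fair_ratio, 2))  # 2.0
```

A reader who compares areas would therefore overestimate differences in a diameter-encoded chart.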

“The Olympic Games Always Go Over Budget, in One Chart (1968-2016)” Source: http://howmuch.net/articles/olympiccosts Dataset: http://howmuch.net/sources/olympiccosts

slide-53
SLIDE 53

BAD – PART 2

  • This visualization has the same issues as its companion
  • Additionally, percent of cost overrun is a misleading statistic
    • For a small number, a large percent overrun can still be a very small number
  • No sense of original budget or cost unless combined with previous graph
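
A short calculation shows why percent overrun alone can mislead. The budgets and costs below are hypothetical (units could be billions of dollars); they are not taken from the howmuch.net dataset.

```python
def overrun(budget, actual):
    """Return (absolute overrun, percent overrun) for a budget and actual cost."""
    absolute = actual - budget
    percent = 100.0 * absolute / budget
    return absolute, percent


# Hypothetical games: a small budget that triples, and a large budget
# that grows by half.
small_abs, small_pct = overrun(1.0, 3.0)
large_abs, large_pct = overrun(10.0, 15.0)

print(small_abs, small_pct)  # 2.0 200.0
print(large_abs, large_pct)  # 5.0 50.0
```

The small project has four times the percent overrun but less than half the absolute overrun, so a chart of percentages alone hides which project cost more.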

“The Olympic Games Always Go Over Budget, in One Chart (1968-2016)” Source: http://howmuch.net/articles/olympiccosts Dataset: http://howmuch.net/sources/olympiccosts

slide-54
SLIDE 54

BETTER

  • Original budgets are calculated and included in this visualization
  • Simpler, more easily understandable encoding of cost data
  • Comparison of original budget vs. actual cost is much clearer
  • Not as immediately appealing in terms of visuals, but the data comes first

slide-55
SLIDE 55

Kelly

slide-56
SLIDE 56

GOOD

  • To scale
  • Shows total spending and how it is broken up in the same image
  • Clear labeling
  • Color coding
  • Context is limited (only 2 years), but not needed to convey message (distribution)

slide-57
SLIDE 57

BAD

slide-58
SLIDE 58

BAD

  • PRO: to scale!
    • Black circle: 1 cm² : 16 people = * 16
    • Blue circle: 3.24 cm² : 50 people = * 15.4
    • Dotted circle: 4.62 cm² : 70 people = * 15.2
  • CON: optical illusion
    • Which is larger: the center circle or the ring between the dotted line and the blue circle?
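
Both points can be checked from the areas and head counts listed on the slide. The sketch below computes people per cm² for each circle and then compares the center circle with the outer ring; the dictionary layout and names are illustrative only.

```python
# (area in cm^2, people represented), as listed on the slide.
circles = {
    "black": (1.0, 16),
    "blue": (3.24, 50),
    "dotted": (4.62, 70),
}

# A consistent ratio across circles means the areas are drawn to scale.
people_per_cm2 = {name: people / area for name, (area, people) in circles.items()}
for name, ratio in people_per_cm2.items():
    print(name, round(ratio, 1))  # black 16.0, blue 15.4, dotted 15.2

# The optical illusion: compare the center circle with the outer ring.
center_area = circles["black"][0]
ring_area = circles["dotted"][0] - circles["blue"][0]
print(round(center_area, 2), round(ring_area, 2))  # 1.0 1.38
```

The ring between the blue circle and the dotted line covers more area than the center circle, even though a thin ring tends to look smaller.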

slide-59
SLIDE 59

BETTER

  • Shows each of the 70 companies and where they are in the crowdfunding process
  • No optical illusions
  • Further improvements beyond my GIMP abilities:
    • Nesting boxes like the original circle image
    • Clearer labeling
  • Is there a better way to communicate steps in a process?

slide-60
SLIDE 60

Nathan

slide-61
SLIDE 61
slide-62
SLIDE 62

POSITIVE QUALITIES

  • Area and color show the relative weight of each state.
  • Allows for a high-level overview and detailed analysis.
  • Accompanied by sufficient explanation.

slide-63
SLIDE 63
slide-64
SLIDE 64

NEGATIVE QUALITIES

  • Vertical positioning does not correspond to price; it is only relative.
  • Color and position (left/right) both encode the same information (approval status).
  • Positive: all the information is available to the reader.

slide-65
SLIDE 65
slide-66
SLIDE 66

RE-DESIGN BENEFITS

  • Scale is correct, easy to see relative price values.
  • Category coloring gives a sense of popular themes.
  • Negative: lose some information

slide-67
SLIDE 67

Shengmin

slide-68
SLIDE 68

GOOD

  • Low data-ink ratio and high lie factor BUT “chart junk” can help to improve the recognizability of specific buildings and landmarks
  • The lie factor of this map is no longer a critical problem
  • A good data visualization answers readers’ questions, and this map answers many questions about where one should go.

slide-69
SLIDE 69

BAD

  • Low data-ink ratio and high lie factor AND these problems do decrease the accuracy of the chart
  • Visual embellishments should be considered as “chart junk” because they are irrelevant to what the data shows
  • Does the size of human figure represent the popularity of a certain language? 10 languages shown but only 3 human figures
  • The bars in this chart are out of proportion

slide-70
SLIDE 70

BETTER

  • Removed the visual embellishments like the human figures and the dialog box logos in the bars, and turned it into a plain chart.
  • Put the numbers on the right of the bars and deleted the numerical unit part so that the left side was less messy with words

slide-71
SLIDE 71

NEXT TIME…

PROCESSING