CSE 557A | Sep 07, 2016
INFORMATION VISUALIZATION
Alvitta Ottley
Washington University in St. Louis
Recap…
GRAPHICAL INTEGRITY
Clear, detailed, and thorough labeling should be used to defeat graphical distortion and ambiguity. “Above all else show the data”
THE LIE FACTOR
- Tufte coined the term “the lie factor”, which is defined as:
Lie factor = (size of effect shown in graphic) / (size of effect in data)
- “High” lie factor (LF) leads to:
- Exaggeration of differences or similarities
- Deception
- Misinterpretation
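As a minimal sketch of the lie factor, the snippet below measures each "size of effect" as a relative change between two values. The function names and the chart numbers are hypothetical, chosen only for illustration.

```python
def effect_size(first: float, second: float) -> float:
    """Relative change from the first value to the second."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return abs(second - first) / abs(first)


def lie_factor(shown_first: float, shown_second: float,
               data_first: float, data_second: float) -> float:
    """Effect shown in the graphic divided by the effect in the data."""
    data_effect = effect_size(data_first, data_second)
    if data_effect == 0:
        raise ValueError("the data show no change, so the lie factor is undefined")
    return effect_size(shown_first, shown_second) / data_effect


# Hypothetical chart: the data grow from 10 to 15 (a 50% increase),
# but the bars drawn for them grow from 1 cm to 4 cm (a 300% increase).
print(lie_factor(1.0, 4.0, 10.0, 15.0))  # 6.0
```

A lie factor of 1 means the graphic shows the effect at its true size; the hypothetical chart above exaggerates it sixfold.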
HOW TO NOT LIE
“Maximize the Data-Ink Ratio”
DATA-INK RATIO
- The goal is a high data-ink ratio
- Ink used for the data should be relatively large compared to the ink in the entire graphic
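The ratio can be written as data-ink divided by the total ink used in the graphic. The sketch below assumes ink can be measured in some common unit (for example, pixels drawn); the function name and the numbers are hypothetical.

```python
def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Share of a graphic's ink that is spent on the data itself."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must lie between 0 and total_ink")
    return data_ink / total_ink


# Hypothetical chart: 30 units of ink draw the data, 120 units in total.
before = data_ink_ratio(30, 120)   # 0.25

# Same data after removing 80 units of decoration and heavy grid lines.
after = data_ink_ratio(30, 40)     # 0.75

print(before, after)
```

Removing non-data ink raises the ratio without touching the data.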
GRAPHICAL EXCELLENCE
1. Graphical excellence is the well-designed presentation of interesting data – a matter of substance, of statistics, and of design.
2. Complex ideas communicated with clarity, precision, and efficiency.
3. Gives the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space.
4. Nearly always multivariate.
5. Requires telling the truth about the data.
Today…
Design Critiques
Andrew
GOOD
- Simple yet effective.
- The axes and grid lines make the chart easily digestible.
- The colors and “frontier” line make the message of the chart clear.
- The four teams that went the farthest in the playoffs are all in color and are all on that rightmost “frontier”.
- Other teams’ stats are present for context but are greyed out to reduce distraction.
BAD
- Having both visualizations is redundant; combining them into one visualization would be more effective.
- Data labels are on their sides, so it is difficult to read the numbers themselves.
- The first chart has a column for total viewers, which I believe makes ESPN’s columns seem smaller: at first glance, ESPN’s columns do not appear to be the biggest.
BETTER
[Chart: Sports Network Prime-Time Viewership (in Millions), August-October (Total). 100% stacked bars (axis 0% to 100%) for 2011, 2012, and 2013. Series: ESPN, NFL Network, ESPN2, Fox Sports 1, MLB Network, Golf Channel, NBCSN, Other.]
Arivan
GOOD
- Extremely clear: the reader can see right away which areas in which parts of the country have the most voting influence and how they voted.
- Good color choices: basic, but that is all that was needed; any other embellishments might have made the visualization more confusing.
BAD
- The title of this visualization is very misleading.
- Data is represented for each state as a part of a whole, shown by the proportion of the marijuana leaf that is shaded.
BETTER
Claire
GOOD
- Clearly labeled axes and years
- Easy to see variations at different time points along the graph and to distinguish between years/colors
- Conveying a good amount of information without being too cluttered or trying to squeeze too much into one graph
BAD
- No y-axis scale
- Focus is more on the bars than the data/numbers; the numbers are very large and do not accurately represent a comparison
- Unclear:
  - If “Deficit” is the difference between the “Normal” and “Year to Date”,
  - If “Year to Date” starts on January 1, 2016 or another time point,
  - What “Normal” is referencing: normal rain amount for the year until this day? What is “Normal”?
- Overall vague and lacking information
BETTER
- Merged the “Deficit” with the “Year to Date” to show more clearly that the “Deficit” is the difference between the expected rainfall and the actual rainfall
- Added y-axis and years to clarify measurements and in what terms these measurements are being compared
- Not as appealing for viewers watching the news who want a quick, appealing visual, but less vague and more accurate
[Chart: Drought Update for Dayton, OH (as of 8/15/16). x-axis: year (2015, 2016); y-axis: inches (24 to 27); labeled difference: 1.76 in]
Clayton
GOOD
- Visualization of scale done properly
  - Horizontally split up into scales of code in a descending fashion
- Good use of colors
- Before making the jump from a hundred thousand to 1 million lines of code, the chart shows a million lines of code on both the hundred thousand scale and the 1 million scale before continuing on
BAD
- Although the end product is aesthetically pleasing as a picture, it fails to show a clear picture
- It is not stated what the heatmap to the right corresponds to; it just shows numerical values in relation to the heatmap
- The number 200 could be great or terrible; it’s just a number with an associated color.
- Graph does a poor job of showing relations between the different reported states
BETTER
- The axis is shifted so that midnight would be in the middle
- Change the heatmap to a 0 to 100 scale indicating quality of sleep, with 100 being the highest quality of sleep.
- Compute an average across the different days but have each reported state contribute a different value
Eric
BAD
- Too comprehensive with all the transactions that occur at a fire incident
- The tabular data uses terms that are fit for the audience (the general public)
- Regarding the "False" item, there is no clear visual indicator of what false implies
- Accurate interpretation of meaning and context of the tabular data is virtually impossible
- The data reported for 2015 only reflects 2 months of information while the data for 2013 and 2014 reflect a full complement of data.
BETTER
Evan
GOOD
- Visually interesting, and succinctly shows how each algorithm works
- Takes you through the steps of sorting an unordered list of 15 elements
- Coloring of each line indicates its order in the list, and by the nature of the sorting, the visualizations draw the eye from left to right
- Gives a lot of information on the sorting algorithm: how it works, how an element moves throughout the algorithm, how quickly sections of the list become sorted, etc.
BAD
- Seems to mislead
- One would assume that the higher points in the graph represent a higher number of gun deaths
BETTER
- Flipped the vertical axis to fit with the natural assumption of higher points representing more gun deaths.
- Spread the points out more on the horizontal axis.
- Added more labels
- Added more recent data
Jarett
GOOD
- http://polygraph.cool/nba/
- Very easy to use (scrolling), interactive
- Easy to read text
- Meaningful colors
- Plenty of information
- Can be used at viewer’s discretion; viewer won’t be overwhelmed with information
- Data is very clear; you immediately know what you are looking at based on the title
BAD
- Found on Forbes
- http://www.forbes.com/sites/markfidelman/2012/06/05/heres-the-real-reason-there-are-not-more-women-in-technology/2/#4bfe90433c67
- Funnels serve absolutely no purpose other than to probably induce bias
- ‘Interest’ is a confusing metric
  - Most likely means that, of the people who are interested, 35% are female and 65% are male
- There are increases between levels of the funnel (e.g. 65% → 82%), yet the funnel gets smaller
- Funnel changes size at a constant rate for both male and female, regardless of the percentages
- Funnel makes data difficult to understand and does not help with comprehension
- Labels are inconsistently placed
- Interest is not between percentages
- Tech degrees has a meaningless arrow
BETTER
- Separated funnel into four bars
- Bars show percentage of males and females in a particular category
- Females are red (as indicated)
- Males are blue (as indicated)
WHY IT’S BETTER AND FIXED
- Clearly shows distribution between males and females for each category
- More meaningful title
- Better explained ‘interest’
- Consistent labeling, easy to read and understand
- Percent changes are actually visually displayed, unlike the funnel
- Clear, clean data; nothing to induce bias
- Can quickly and easily see the difference in percentages between males and females
John
GOOD
- Clear title and data labels
- Color spectrum to show range
  - Good = blue
  - Bad = purple
- Scale on right graph is consistent
- Interactive
  - Hover over state and see data on graph
  - Can drill down into the data
  - Highlight smaller states on legend to see corresponding data
http://stateofobesity.org/adult-obesity/
BAD
- Two sets of non-correlating data
- Scales are off by a couple of factors
- 3D chart is inappropriate for this set of data
- Hard to see the larger data relative to the y-axis
https://www.reddit.com/r/dataisugly/
BETTER
- Break single chart into two
  - Helps see data trends between the two sets of data
  - Avoids any confusion about the correlation between the two
- Fixes the scaling of axes
  - Can pick proper intervals for both
  - Proper alignment does a better job for data comparison
[Chart 1: Class II (SWD, ER, HC) Annual Well Inventory 2009-2015. x-axis: year (2009 to 2015); y-axis: Number of Wells (50,000 to 55,000)]
[Chart 2: Class III (Brine Mining) Annual Well Inventory 2009-2015. x-axis: year (2009 to 2015); y-axis: Number of Wells (50 to 130)]
Jordan
http://graphics.wsj.com/infectious-diseases-and-vaccines/
GOOD
- Easy to understand the message the designer is trying to get across.
- Interaction adds another layer of reassurance and precision to the data.
- Clearly states what data is being represented and how it is portrayed graphically.
- Very little wasted ink.
BAD
- Arrow representing 20 ft. is out of proportion to the arrow representing 10 ft.
- The cell towers in the background seem to be designed to mislead the reader into thinking that the spacecraft is the size of a cellphone tower.
- Weight is displayed in words but has no visual counterpart.
- The length actually represents the wingspan; however, the wings are not fully extended in the picture.
- The title is “Instruments and Mission”; however, the graph shows little to none of these explicitly.
OSIRIS-REx’s Specifications
- Wingspan with Solar Panels: 20 ft. 3 in.
- Height: 10 ft.
- Dimension: 8 ft. x 8 ft.
- TAGSAM Sample Arm Length: 11 ft.
- Shaq’s Size: 344 lb, 7 ft. 1 in.
OSIRIS-REx Weighs About 13.5 Shaqs (4650 lbs)
Joshua
GOOD
- Incredibly easy to follow
- Minimal chart junk while still retaining memorability about important points
- Key allows the reader to interpret information easily
- Each arrow can encode multiple forms of information while still allowing the reader to process the information efficiently
- Captions include additional information not immediately available from data
Source: “A Wired Analysis of Tech Company Tax Schemes”
- Article by Lee Simmons
- Visualization by Valerio Pellegrini
BAD – PART 1
- Not that easy to compare values
- Values are encoded by diameter/radius, but the eye is first drawn to an area comparison
- A lot of extra information
- Aesthetically messy
“The Olympic Games Always Go Over Budget, in One Chart (1968-2016)” Source: http://howmuch.net/articles/olympiccosts Dataset: http://howmuch.net/sources/olympiccosts
BAD – PART 2
- This visualization has the same issues as its companion
- Additionally, percent of cost overrun is a misleading statistic
  - For a small budget, a large percent overrun can still be a very small amount
- No sense of original budget or cost unless combined with previous graph
“The Olympic Games Always Go Over Budget, in One Chart (1968-2016)” Source: http://howmuch.net/articles/olympiccosts Dataset: http://howmuch.net/sources/olympiccosts
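To illustrate why percent overrun alone can mislead, the sketch below compares two projects. The budgets and costs are hypothetical and are not taken from the Olympic dataset; the function name is likewise an assumption.

```python
def cost_overrun(budget: float, actual: float) -> tuple[float, float]:
    """Return the overrun as (absolute amount, percent of budget)."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    absolute = actual - budget
    percent = 100.0 * absolute / budget
    return absolute, percent


# Hypothetical projects, costs in millions of dollars.
small = cost_overrun(budget=10, actual=30)          # (20, 200.0)
large = cost_overrun(budget=10_000, actual=12_000)  # (2000, 20.0)

print(small, large)
```

The small project has ten times the percent overrun, yet the large project overspends by a hundred times more money, which is why the original budget needs to be visible alongside the percentage.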
BETTER
- Original budgets are calculated and included in this visualization
- Simpler, more easily understandable encoding of cost data
- Comparison of original budget vs. actual cost is much clearer
- Not as immediately appealing in terms of visuals, but the data comes first
Kelly
GOOD
- To scale
- Shows total spending and how it is broken up in the same image
- Clear labeling
- Color coding
- Context is limited (only 2 years), but not needed to convey message (distribution)
BAD
- PRO: to scale!
  - Black circle: 1 cm² for 16 people (16 people per cm²)
  - Blue circle: 3.24 cm² for 50 people (15.4 people per cm²)
  - Dotted circle: 4.62 cm² for 70 people (15.2 people per cm²)
- CON: optical illusion
  - Which is larger: the center circle or the ring between the dotted line and the blue circle?
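The scale check above can be reproduced with a few lines of arithmetic. This sketch assumes the three circles are nested, so that the ring is the dotted circle's area minus the blue circle's area; the variable names are illustrative.

```python
# Measured area (cm^2) and people represented, as listed on the slide.
circles = {
    "black": (1.00, 16),
    "blue": (3.24, 50),
    "dotted": (4.62, 70),
}

# People per cm^2 for each circle; similar values mean the areas are to scale.
density = {name: people / area for name, (area, people) in circles.items()}
for name, value in density.items():
    print(f"{name}: {value:.1f} people per cm^2")

# Assumption: the circles are nested, so the ring between the dotted line
# and the blue circle is the difference of their areas.
center_area = circles["black"][0]
ring_area = circles["dotted"][0] - circles["blue"][0]
ring_people = circles["dotted"][1] - circles["blue"][1]

print(f"center: {center_area:.2f} cm^2, ring: {ring_area:.2f} cm^2")
print(f"ring represents {ring_people} people")
```

Under that assumption the ring (about 1.38 cm², 20 people) is larger than the center circle (1 cm², 16 people).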
BETTER
- Shows each of the 70 companies and where they are in the crowdfunding process
- No optical illusions
- Further improvements beyond my GIMP abilities:
  - Nesting boxes like the original circle image
  - Clearer labeling
- Is there a better way to communicate steps in a process?
Nathan
POSITIVE QUALITIES
- Area and color show the relative weight of each state.
- Allows for a high-level overview and detailed analysis.
- Accompanied by sufficient explanation.
NEGATIVE QUALITIES
- Vertical positioning does not correspond to actual price, only relative price.
- Color and position (left/right) both encode the same information (approval status).
- Positive: all the information is available to the reader.
RE-DESIGN BENEFITS
- Scale is correct; easy to see relative price values.
- Category coloring gives a sense of popular themes.
- Negative: some information is lost
Shengmin
GOOD
- Low data-ink ratio and high lie factor, BUT “chart junk” can help improve the recognizability of specific buildings and landmarks
- The lie factor of this map is no longer a critical problem
- A good data visualization answers readers’ questions, and this map answers many questions about where one should go.
BAD
- Low data-ink ratio and high lie factor, AND these problems do decrease the accuracy of the chart
- Visual embellishments should be considered “chart junk” because they are irrelevant to what the data shows
- Does the size of the human figure represent the popularity of a certain language? 10 languages are shown but only 3 human figures
- The bars in this chart are out of proportion
BETTER
- Removed the visual embellishments like the human figures and the dialog box logos in the bars, and turned it into a plain chart.
- Put the numbers on the right of the bars and delete the