Data Visualization: Visualization Taxonomy & Statistical Graphs

slide-1
SLIDE 1

Data Visualization

Visualization Taxonomy & Statistical Graphs

Azalea Vo (avo@college.harvard.edu)
Alexander Lex (alex@seas.harvard.edu)

xkcd

slide-2
SLIDE 2

Administrivia

  • Thursday: guest lecture on Tableau

Jock D. Mackinlay

  • Friday: Tableau and ManyEyes lab

Blake and Shirley

  • Readings for this week

Chapter 1

slide-3
SLIDE 3

Homework 2

  • Topic:
  • Project 1 proposal!
  • Advanced web scraping
  • Working with APIs
  • Look at lecture 3 (Ian) and Friday's lab

Lab video: http://tinyurl.com/cs171lab1video
Sources: http://tinyurl.com/cs171lab1

slide-4
SLIDE 4

Project I

  • Has to use data you scraped yourself!
  • Update your Process Books continually
  • Your choice of visualization tools
  • We support ManyEyes and Tableau
slide-5
SLIDE 5

Design Critique

http://mariandoerk.de/edgemaps/demo/

Learning Catalytics Session ID 118785

slide-6
SLIDE 6
slide-7
SLIDE 7

Comparisons

slide-8
SLIDE 8

Bar Chart

How Much Does Beer Consumption Vary by Country?

Bottles per person per week

slide-9
SLIDE 9

Length

VizWiz Blog

slide-10
SLIDE 10

Direction

Nicolas Rapp

slide-11
SLIDE 11

Negative Values

US Department of the Treasury "The Financial Crisis Response In Charts"

slide-12
SLIDE 12
slide-13
SLIDE 13

Naveen Sinha, 2009

slide-14
SLIDE 14

Waterfall Chart

VizWiz Blog

slide-15
SLIDE 15

3D

Few, “Show Me the Numbers”

slide-16
SLIDE 16

Excel Charts Blog

slide-17
SLIDE 17
slide-18
SLIDE 18

Employment Law HQ Blog

slide-19
SLIDE 19

Bullet Graphs

http://en.wikipedia.org/wiki/Bullet_graph

slide-20
SLIDE 20

Bars vs. Lines

Line implies trend. Do not use for categorical data.

Zacks 1999

slide-21
SLIDE 21

Relationships

slide-22
SLIDE 22

Dot Plots

slide-23
SLIDE 23
slide-24
SLIDE 24

De novo mutations revealed by whole-exome sequencing are strongly associated with autism

Sanders et al.

Figure: Identification of multiple de novo mutations in the same gene reliably distinguishes risk-associated mutations.

Nature, May 2012

slide-25
SLIDE 25
slide-26
SLIDE 26

Proportions

slide-27
SLIDE 27

WILLIAM PLAYFAIR

1759-1823

slide-28
SLIDE 28

Pie Charts

slide-29
SLIDE 29

VizWiz Blog

Total: 104%

slide-30
SLIDE 30

vs.

Showing Change

VizWiz Blog

slide-31
SLIDE 31

Showing Change

vs.

VizWiz Blog

slide-32
SLIDE 32

The Perfect Pour Coffee Drinks Illustrated

slide-33
SLIDE 33

3D Pie Charts

Few, “Show Me the Numbers”

slide-34
SLIDE 34

Why 3D pie charts are bad

Kevin Fox

slide-35
SLIDE 35

Green vs. Purple

slide-36
SLIDE 36

Fox News, 2009

wikipedia.org

slide-37
SLIDE 37
slide-38
SLIDE 38

No Comments...

slide-39
SLIDE 39

Donut Chart

The Economist Daily Chart

slide-40
SLIDE 40

Total % ?

WSJ Graphics Blog

slide-41
SLIDE 41

Stacked Bar Chart

Few, “Show Me the Numbers”

slide-42
SLIDE 42

Storytelling with Data Blog

slide-43
SLIDE 43

Small Multiples

Peltier Tech Blog

slide-44
SLIDE 44

Stacked Area Chart

Asymco

slide-45
SLIDE 45

JOSEPH MINARD

1781-1870

slide-46
SLIDE 46

Flowingdata, Changes in Consumer Spending

Stacked 100% Area Chart

slide-47
SLIDE 47

vs.

leancrew.com & Practically Efficient

slide-48
SLIDE 48

Area as Intersection

WSJ Graphics Blog

slide-49
SLIDE 49

WSJ Graphics Blog

slide-50
SLIDE 50

http://maryandmatt.net/

slide-51
SLIDE 51

Summary:

Comparison, Proportions

slide-52
SLIDE 52
slide-53
SLIDE 53

Ivan Cash

slide-54
SLIDE 54

Visualization Taxonomy

slide-55
SLIDE 55

Motivation

Useful Junk? The Chartjunk Debate

slide-56
SLIDE 56

Visual Literacy Periodic Table

slide-57
SLIDE 57

Melanie Tory and Torsten Möller, InfoVis 2004
Jeffrey Heer and Ben Shneiderman, ACM

slide-58
SLIDE 58
slide-59
SLIDE 59
slide-60
SLIDE 60
slide-61
SLIDE 61
slide-62
SLIDE 62

Distributions

slide-63
SLIDE 63

Histogram

ggplot2

slide-64
SLIDE 64

Population Pyramid – A Histogram

[Wikipedia]

slide-65
SLIDE 65

Bin Width

binwidth = 0.1 vs. binwidth = 0.01

ggplot2
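
The effect of the bin width can be reproduced without a plotting library. The following is a minimal Python sketch, not the ggplot2 code behind the slide; the function name `histogram_counts` and the sample values are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def histogram_counts(values, binwidth):
    """Count values per bin, where bin k covers [k*binwidth, (k+1)*binwidth)."""
    counts = Counter(math.floor(v / binwidth) for v in values)
    # Return (bin start, count) pairs ordered by bin position.
    return [(k * binwidth, counts[k]) for k in sorted(counts)]

# Hypothetical sample values (illustration only, not data from the slide).
sample = [0.031, 0.047, 0.122, 0.155, 0.183, 0.314, 0.326, 0.338, 0.582]

coarse = histogram_counts(sample, 0.1)   # few wide bins: overall shape
fine = histogram_counts(sample, 0.01)    # many narrow bins: more detail, more noise
```

With the wider bins the nine sample values collapse into four bars; with the narrower bins every value ends up in its own bar, so the overall shape is harder to see.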

slide-66
SLIDE 66

Density Plots

slide-67
SLIDE 67

takeasweater.com

slide-68
SLIDE 68

Box & Whisker Plots

Few, “Show Me the Numbers”

slide-69
SLIDE 69

ggplot2

slide-70
SLIDE 70

Correlations

slide-71
SLIDE 71

Scatterplots

http://xkcd.com/388/

slide-72
SLIDE 72

Scatterplots

ggplot2

slide-73
SLIDE 73

Don't

[Spotfire]

slide-74
SLIDE 74

Box Office Quant Blog

slide-75
SLIDE 75

Overplotting

ggplot2

slide-76
SLIDE 76

Overplotting

alpha = 1/10 vs. alpha = 1/100

ggplot2
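
Why a smaller alpha helps with overplotting can be seen from a short calculation. The sketch below assumes standard source-over blending of identically colored points, which is an assumption about the renderer and not something the slide states; the function names are hypothetical.

```python
def stacked_opacity(alpha, n_points):
    """Combined opacity where n_points same-colored marks overlap,
    assuming each mark is blended over the previous ones (source-over)."""
    return 1.0 - (1.0 - alpha) ** n_points

def points_needed(alpha, target=0.95):
    """Smallest number of overlapping points whose combined opacity reaches target."""
    n = 1
    while stacked_opacity(alpha, n) < target:
        n += 1
    return n

# Compare the two settings named on the slide.
for alpha in (1 / 10, 1 / 100):
    print(alpha, points_needed(alpha))
```

Under this model, ten overlapping points at alpha = 1/10 reach an opacity of about 0.65 rather than full opacity, and a lower alpha needs many more overlapping points before a region looks solid, which keeps dense regions distinguishable from very dense ones.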

slide-77
SLIDE 77

Binning

http://vis.stanford.edu/projects/datavore/splom

slide-78
SLIDE 78

Trend Lines

ggplot2

slide-79
SLIDE 79

Residual Graph

  • Plot vertical distance from trend line

[Cleveland 85]
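
The residuals can be computed in a few lines. This is a minimal sketch that assumes the trend line is an ordinary least-squares fit, which the slide does not specify; `fit_line`, `residuals`, and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residuals(xs, ys):
    """Vertical distance of each point from the fitted trend line."""
    slope, intercept = fit_line(xs, ys)
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
res = residuals(xs, ys)  # plot these against xs to obtain the residual graph
```

Plotting `res` against `xs` removes the trend itself from the picture, so any remaining structure (curvature, changing spread) becomes easier to see.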

slide-80
SLIDE 80

NY Times, March 2009

Trend Lines

slide-81
SLIDE 81

Quadrants

http://www.asymco.com/2010/10/05/the-symmetry-of-share-shifts-in-mobile-phones/

slide-82
SLIDE 82

Relative Changes

Junk Charts

slide-83
SLIDE 83

Path Plots

New York Times

slide-84
SLIDE 84

Trends

slide-85
SLIDE 85

WILLIAM PLAYFAIR

1759-1823

slide-86
SLIDE 86

JOSEPH MINARD

1781-1870

slide-87
SLIDE 87

Yahoo! Finance

slide-88
SLIDE 88
slide-89
SLIDE 89
slide-90
SLIDE 90

Data & Trend Plot

The Daily Dish

slide-91
SLIDE 91

Trend Histories

New York Times, Porcupine Graphics

slide-92
SLIDE 92

Trend Projections

Washington Post

slide-93
SLIDE 93

Russell Investments

slide-94
SLIDE 94

Design?

slide-95
SLIDE 95

Fox News

flowingdata.com

slide-96
SLIDE 96

Aspect Ratio & Scales

slide-97
SLIDE 97

Aspect Ratio

Yearly CO2 concentrations [Cleveland 95]

Slide based on J. Heer / M. Agrawala

slide-98
SLIDE 98

Banking to 45° [Cleveland]

  • Optimize the aspect ratio to bank to 45°

Two line segments are maximally discriminable when their average absolute angle is 45°

Slide based on J. Heer / M. Agrawala
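
One simple reading of this criterion can be sketched in Python: choose the plot's height/width ratio so that the mean absolute angle of the line segments is 45°. This is an illustrative sketch only, not necessarily the exact procedure Cleveland uses; the function name `bank_to_45`, the bisection search, and the definition of aspect ratio as height divided by width are assumptions, and the x values are assumed to be strictly increasing.

```python
import math

def bank_to_45(xs, ys, tol=1e-9):
    """Height/width ratio at which the mean absolute segment angle is 45 degrees.

    Assumes xs is strictly increasing and the data varies in y.
    """
    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    if x_range == 0 or y_range == 0:
        raise ValueError("data must vary in both x and y")
    points = list(zip(xs, ys))
    # Absolute slope of each segment after scaling both axes to unit length.
    slopes = [
        abs((y1 - y0) / y_range) / ((x1 - x0) / x_range)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]

    def mean_angle(aspect):
        return sum(math.atan(s * aspect) for s in slopes) / len(slopes)

    target = math.pi / 4
    lo, hi = 0.0, 1.0
    # Grow the upper bound until the mean angle reaches 45 degrees.
    while mean_angle(hi) < target:
        hi *= 2
        if hi > 1e12:
            raise ValueError("too many flat segments to reach a 45 degree average")
    # The mean angle grows monotonically with the aspect ratio, so bisect.
    while hi - lo > tol * hi:
        mid = (lo + hi) / 2
        if mean_angle(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

aspect = bank_to_45([0, 1, 2], [0, 1, 0])  # a single symmetric peak
```

For the symmetric peak in the example, the result is a plot half as tall as it is wide, which puts both segments at 45° to the horizontal.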

slide-99
SLIDE 99

CO2 Measurements (William S. Cleveland, Visualizing Data)

Aspect Ratio = 1.17 vs. Aspect Ratio = 7.87

Slide based on J. Heer / M. Agrawala

Banking to 45 Degrees

slide-100
SLIDE 100

Non-Zero Based Scales

Few, “Show Me the Numbers”

slide-101
SLIDE 101

Non-Zero Based Scales

vs.

slide-102
SLIDE 102

Fox News

slide-103
SLIDE 103

Linear scale vs. Log scale

  • Linear scale: absolute change
  • Log scale: small fluctuations, percent change, d(10,20) = d(30,60)

[Figure: MSFT plotted twice, once on a linear scale and once on a log scale]
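
The identity d(10,20) = d(30,60) on this slide says that equal ratios span equal distances on a log scale. A short Python check makes this concrete; the function names are hypothetical, and base 10 is an arbitrary choice since any base gives the same comparison.

```python
import math

def linear_distance(a, b):
    """Distance between two values on a linear axis (absolute change)."""
    return abs(b - a)

def log_distance(a, b):
    """Distance between two positive values on a log axis (relative change)."""
    return abs(math.log10(b) - math.log10(a))

# Both pairs double in value, but their absolute changes differ.
print(linear_distance(10, 20), linear_distance(30, 60))
print(log_distance(10, 20), log_distance(30, 60))
```

On the linear axis the second step is three times as long as the first; on the log axis both steps have the same length because each is a doubling.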

slide-104
SLIDE 104

Clearly mark scale breaks

Well marked scale break [Cleveland 85]

Poor scale break [Cleveland 85]

Slide based on J. Heer / M. Agrawala

slide-105
SLIDE 105

Scale break vs. Log scale

[Cleveland 85]

  • Both increase visual resolution
  • Log scale: easy comparisons of all data
  • Scale break: more difficult to compare across break

Slide based on J. Heer / M. Agrawala