Data Visualization, Steve Marschner, Cornell CS 3220

SLIDE 1

Data Visualization Cornell CS 3220

Data Visualization

Steve Marschner Cornell CS 3220

unless noted, images are from Tufte, The Visual Display of Quantitative Information (these slides also indebted to Pat Hanrahan’s slides for CS448B at Stanford)

SLIDE 2

Data

A lot of 3220 is about data

input to fitting problems
output of simulations

Understanding all but the simplest is not easy

tables of numbers give little insight
appropriate pictures are invaluable!

SLIDE 3

SLIDE 4

SLIDE 5

Purposes of visualization

Organize and display data (for yourself)

provide data in a form our brains & visual systems are able to use
making pictures of data helps you understand it
designing visualizations forces you to organize the data
a key part of the intellectual and creative process

Present data (for others)

data in support of arguments (scientific, policy, …)
data for making decisions (funding, operational, …)
good presentation of data is key to any good presentation of complex technical material
a part of informative & persuasive communication

SLIDE 6

John Snow (1854)

SLIDE 8

data presented by rocket’s manufacturer to argue for canceling the launch.

[from Tufte, Visual Explanations]

SLIDE 9

[from Tufte, Visual Explanations]

data presented by rocket’s manufacturer to argue for canceling the launch.

SLIDE 10

Space Shuttle mission STS-51-L, about 75 sec. after liftoff. 1986

[NASA]

SLIDE 11

Tufte’s more convincing re-presentation of the same data. 1997

[from Tufte, Visual Explanations]

SLIDE 12

Data Mappings

SLIDE 13

Mapping data into a visual display

Datatypes

programming: char, int, float, double, String, …
scientific data has types too

Graphical information channels

there are many ways to put the data into pictures
good datatype-to-channel matches are important!

SLIDE 14

Datatypes

Nominal: select from unorganized set (enumerated type, in C)

apples, oranges, tomatoes, …
Toyota, Ford, Subaru, …

Ordinal: ordered set of values (< operator available)

January, February, March, …
Trial 1, Trial 2, Trial 3, …
12 Oak St., 125 Oak St., 129 Oak St., …

S. S. Stevens, On the theory of scales of measurement (1946)
SLIDE 15

Datatypes (quantitative)

Interval: values are meaningful, but zero is arbitrary (+, – avail.)

degrees Celsius
position
potential energy

Ratio: values are meaningful, meaningful zero (×, ÷ avail.)

kelvins (absolute temperature)
length
mass

S. S. Stevens, On the theory of scales of measurement (1946)
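
The difference between interval and ratio scales can be checked numerically. The following is a minimal sketch, assuming two made-up temperatures and a hypothetical helper `celsius_to_kelvin`; it shows that differences agree on both scales while ratios only make sense when zero is meaningful.

```python
# Interval vs. ratio scales: differences are meaningful on both,
# but ratios are meaningful only when zero is not arbitrary.

def celsius_to_kelvin(c):
    """Convert degrees Celsius to kelvins."""
    return c + 273.15

a_c, b_c = 10.0, 20.0                      # two made-up temperatures in Celsius
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences agree on both scales (interval property).
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios disagree: "twice as hot" in Celsius is an artifact of the zero point.
ratio_c = b_c / a_c
ratio_k = b_k / a_k

print(f"difference: {diff_c:.2f} C vs {diff_k:.2f} K")
print(f"ratio:      {ratio_c:.3f} (Celsius) vs {ratio_k:.3f} (kelvins)")
```

The Celsius ratio of 2 has no physical meaning; the ratio in kelvins (about 1.035) does.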
SLIDE 16

Graphical information channels

Spatial

length
position
size (area, volume?)

Color

value (lightness, black to white)
saturation (colorfulness, gray to vivid)
hue (color)
texture (fill pattern)

Details

shape
orientation
SLIDE 17

Datatypes and channels

[Table: datatypes (N, O, I, R) against channels (length, position, size, value, saturation, hue, texture, shape, orientation; grouped as spatial, color, detail); individual cells marked Y or ~]

Pay attention to data semantics
Choose channel that carries the semantics well

SLIDE 18

Common types of visualizations

data maps
time series
relational plots
histograms
bar charts
polar plots
color maps

SLIDE 19

Data Maps

Position: position
Symbols, colors: various variables (N, O, or Q)
very old form of data visualization
readily interpreted with little training or effort

SLIDE 20

E. Halley. Map illustrating trade winds. 1686

SLIDE 21

C. J. Minard. Map illustrating exports of French wine. 1864

SLIDE 22

C. J. Minard. Depiction of losses during French Army march to (and retreat from) Moscow, 1812–1813.

SLIDE 23

Time series

Horizontal axis: time (Interval → Position)
Vertical axis: some quantitative value (often money)
very old form of data visualization
readily interpreted with little training or effort

SLIDE 24

SLIDE 25

J.H. Lambert. Soil temperature over time at various depths. 1779

SLIDE 26

E.J. Marey. Train schedule for Paris–Lyon line. 1885

SLIDE 27

Relational plots

Horizontal axis: alleged “cause”
Vertical axis: alleged “effect”
very powerful tool to investigate relationships
scatter plot for unordered set of points
connected line for ordered sequence of points or to emphasize functional “law”
SLIDE 28

ABC: temperature over time
DEF: height of water over time
evaporation rate vs. temperature

J.H. Lambert: influence of temperature on evaporation. 1769

SLIDE 29

C.Y. Ho et al. Review of thermal conductivity data. 1974

SLIDE 30

P. McCracken et al. Phillips curves. 1977

SLIDE 31

Logarithmic plots

For one or both axes, replace direct (linear) data–position mapping with logarithmic mapping
Useful for data with high dynamic range
Useful for exponential and power-law relationships
Caution: converts type from ratio to interval
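
Why logarithmic axes suit power laws can be seen from the algebra: if y = c·x^p, then log y = log c + p·log x, a straight line with slope p. The sketch below assumes made-up values for `c` and `p` and a hypothetical helper `slope_intercept`; it maps data to log positions and recovers the exponent as the slope.

```python
import math

# Hypothetical power-law data y = c * x**p spanning a high dynamic range.
c, p = 3.0, -2.0
xs = [10.0 ** k for k in range(0, 6)]          # 1 ... 100000
ys = [c * x ** p for x in xs]

# Logarithmic mapping: plot position is log10 of the data value.
lx = [math.log10(x) for x in xs]
ly = [math.log10(y) for y in ys]

def slope_intercept(u, v):
    """Least-squares line v = m*u + b."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    m = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / sum((a - mu) ** 2 for a in u)
    return m, mv - m * mu

m, b = slope_intercept(lx, ly)
print(f"slope on log-log axes: {m:.3f} (exponent p = {p})")
print(f"intercept: {b:.3f} (log10 c = {math.log10(c):.3f})")
```

On the log axis equal distances correspond to equal ratios and zero cannot be placed at all, which is one way to read the caution about ratio data becoming interval data.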

SLIDE 32

AKG Acoustics. Performance data for C451B microphone. 1973

SLIDE 33

Histograms

First axis (oft. horiz.): Nominal or Ordinal variable
Second axis: count of something (ratio)
often convert Quantitative to Ordinal by binning (danger!)
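
One reading of the binning danger is that the picture depends on where the bin edges fall. A minimal sketch, assuming a made-up sample and a hypothetical `histogram` helper, bins the same data with two different bin origins:

```python
from collections import Counter

def histogram(values, width, origin=0.0):
    """Count values in bins [origin + k*width, origin + (k+1)*width)."""
    counts = Counter(int((v - origin) // width) for v in values)
    return {origin + k * width: counts[k] for k in sorted(counts)}

# Made-up sample with two clusters, near 1.0 and near 2.0.
data = [0.9, 1.0, 1.1, 1.9, 2.0, 2.1]

split = histogram(data, width=1.0, origin=0.0)    # edges cut through both clusters
clean = histogram(data, width=1.0, origin=0.5)    # edges fall between the clusters

print(split)
print(clean)
```

With the origin at 0.0 the counts come out as 1, 3, 2; with the origin at 0.5 they come out as 3, 3. The data are identical, only the bin edges moved.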
SLIDE 34

J. Hjort. Age composition of herring catches. 1914

SLIDE 35

H.S. Shryock & J.S. Siegel. Rendering of French government population data. 1973

SLIDE 36

SLIDE 37

Bar charts

First axis (oft. horiz.): Nominal or Ordinal variable
Second axis: ratio quantity (ratio → length)
less appropriate for non-ratio quantities (implied meaningful zero)
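
Because a bar encodes its value as a length, the bar implies a zero at its base. A small sketch with made-up values and a hypothetical `bar_lengths` helper shows how the length ratio changes when the axis does not start at zero:

```python
def bar_lengths(values, baseline=0.0):
    """Length of each bar when the value axis starts at `baseline`."""
    return [v - baseline for v in values]

values = [100.0, 110.0]                          # made-up quantities, 10% apart

from_zero = bar_lengths(values)                  # axis starts at zero
truncated = bar_lengths(values, baseline=90.0)   # axis starts at 90

ratio_from_zero = from_zero[1] / from_zero[0]
ratio_truncated = truncated[1] / truncated[0]

print(f"length ratio, zero baseline:      {ratio_from_zero:.2f}")
print(f"length ratio, baseline at 90:     {ratio_truncated:.2f}")
```

A 10% difference in the data is drawn as a 2:1 difference in bar length once the baseline moves to 90.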

SLIDE 38

SLIDE 39

Polar plots

Angle: some relevant angle
Radius: ratio quantity (ratio → length)
not appropriate for non-angular quantities
less appropriate for non-ratio quantities
beware of area exaggeration
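
The area exaggeration follows from geometry: a wedge's area grows with the square of its radius. A minimal sketch, assuming made-up values and a hypothetical `sector_area` helper:

```python
import math

def sector_area(radius, angle_deg):
    """Area of a circular sector with the given radius and opening angle."""
    return 0.5 * radius ** 2 * math.radians(angle_deg)

small, large = 1.0, 2.0      # made-up ratio quantities mapped to radius
angle = 30.0                 # each value drawn as a 30-degree wedge

radius_ratio = large / small
area_ratio = sector_area(large, angle) / sector_area(small, angle)

print(f"radius ratio: {radius_ratio:.1f}")
print(f"area ratio:   {area_ratio:.1f}")
```

A value twice as large covers four times the area, so a reader who judges by area sees a bigger difference than the data contain.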

SLIDE 40

AKG Acoustics. Performance data for C451B microphone. 1973

SLIDE 41

Danger of polar plots with interval scales

[Figure: three polar plots of the same data; angle ticks from 0° to 180° in 30° steps; radial ticks at –5 and –10, at –10 and –20, and at –20 and –40]

Same data, 3 choices of logarithmic scale: leads to very different shapes
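
The effect can be reproduced with a few numbers. The sketch below is an illustration under stated assumptions: the levels in `response_db` are made up, the outer ring is taken to be 0 dB, and `radii` is a hypothetical helper that places a chosen floor level at the center of the plot.

```python
# Made-up directional response in decibels (a logarithmic, interval-like scale).
response_db = {0: 0.0, 90: -3.0, 180: -9.0}      # angle in degrees -> level in dB

def radii(levels_db, floor_db):
    """Map each level to a radius in [0, 1]; 0 dB is the rim, floor_db the center."""
    return {ang: max(0.0, (db - floor_db) / (0.0 - floor_db))
            for ang, db in levels_db.items()}

for floor_db in (-10.0, -20.0, -40.0):
    r = radii(response_db, floor_db)
    print(f"center at {floor_db:6.1f} dB: r(90)={r[90]:.3f}  r(180)={r[180]:.3f}")
```

With the center at –10 dB the rear lobe nearly vanishes (radius 0.1); with the center at –40 dB the same data look almost circular (radius 0.775). Since the zero of the radial axis is arbitrary, so is the plotted shape.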

SLIDE 42

Ratio quantity in polar plot: set shape

[Figure: polar plot; radial ticks at 0.2, 0.4, 0.6, 0.8; angular ticks at –60˚, –30˚, 0˚, 30˚; legend entries for 0˚, 30˚, 60˚]

S.R. Marschner. Light scattering data for paper. 1998

SLIDE 43

Color maps

Position: position, direction, or more abstract mapping
Color: interval, ratio, or nominal quantity
be careful to map color attributes appropriately!

SLIDE 44

Color mappings

lightness (brightness, value)

strongly ordered, high resolution
quantitative variables

hue (what kind of color)

circular, weakly ordered, identifiable
nominal variables, or as secondary feature

saturation (colorfulness, vividness)

ordered, low resolution
minor quantitative variables, or combined with hue for nominal
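
Whether a colormap is "strongly ordered" in lightness can be tested directly. The sketch below is a rough check under stated assumptions: `rainbow` is a simple hue sweep built with `colorsys` (a stand-in, not any particular tool's default colormap), and `luminance` uses Rec. 709 weights while ignoring gamma.

```python
import colorsys

def luminance(rgb):
    """Approximate relative luminance (Rec. 709 weights, gamma ignored)."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def is_monotonic(seq):
    """True if the sequence never decreases."""
    return all(a <= b for a, b in zip(seq, seq[1:]))

steps = [i / 10 for i in range(11)]

# Hue sweep from blue (h = 2/3) to red (h = 0) at full saturation and value.
rainbow = [colorsys.hsv_to_rgb((2 / 3) * (1 - t), 1.0, 1.0) for t in steps]
# Lightness ramp from black to white.
gray = [(t, t, t) for t in steps]

rainbow_ordered = is_monotonic([luminance(c) for c in rainbow])
gray_ordered = is_monotonic([luminance(c) for c in gray])

print("hue sweep ordered in lightness:", rainbow_ordered)
print("gray ramp ordered in lightness:", gray_ordered)
```

The hue sweep brightens and darkens several times along its length, so larger data values do not reliably look lighter or darker; the gray ramp keeps a single ordering.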

SLIDE 45

International Hydrographic Organization, 1984 (as deliberately corrupted by Tufte)

[from Tufte, Visual Explanations]

SLIDE 46

International Hydrographic Organization, 1984

[from Tufte, Visual Explanations]

SLIDE 47

P. Irawan & S. Marschner. Scattering data for polyester cloth. 2007 (Matlab default colormap)

SLIDE 48

P. Irawan & S. Marschner. Scattering data for polyester cloth. 2007 (increasing value colormap)

SLIDE 49

Vector fields

Vectors are 2 (or more)-D ratio quantities
Often mapped to a textural representation

SLIDE 50

Vector fields as repeated oriented glyphs

Magnitude maps to size; direction maps to direction (note arrows are centered at grid points)

[Jim Belk]
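
The glyph mapping can be written down in a few lines. This is a minimal sketch, not the method used for the figure: `arrow_segment` and the rotational example `field` are hypothetical, and `scale` is an assumed factor from vector magnitude to drawn length.

```python
import math

def arrow_segment(x, y, vx, vy, scale=1.0):
    """Return (tail, head) of an arrow centered at (x, y).

    The drawn length is scale * |v| and the direction follows v.
    """
    hx, hy = 0.5 * scale * vx, 0.5 * scale * vy
    return (x - hx, y - hy), (x + hx, y + hy)

def field(x, y):
    """Hypothetical example field: rigid rotation about the origin."""
    return -y, x

grid = [(float(i), float(j)) for i in range(-1, 2) for j in range(-1, 2)]
for gx, gy in grid:
    tail, head = arrow_segment(gx, gy, *field(gx, gy), scale=0.4)
    length = math.dist(tail, head)
    print(f"grid point ({gx:+.0f}, {gy:+.0f}): arrow length {length:.3f}")
```

Centering each arrow on its grid point keeps the glyph attached to the location it describes; anchoring the tail there instead would shift long arrows toward their heads.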

SLIDE 51

Natural visualization of magnetic field

Black & Davis, Practical Physics. 1922

SLIDE 52

Line Integral Convolution for vector fields

Cabral and Leedom, SIGGRAPH 1993

SLIDE 53

Treemaps

Martin Wattenberg (SmartMoney), Map of the Market, 1998

SLIDE 54

Small Multiples

A set of small figures following a common design that can be readily compared

SLIDE 55

Los Angeles Times / G.J. McRae. 1979

SLIDE 56

Consumer Reports. Display of historical automobile reliability data. 1982

SLIDE 57

SLIDE 58

E. Tufte “sparklines”

SLIDE 59

Visualization for medical records

S.M. Powsner & E.R. Tufte, The Lancet 344:6 1994

SLIDE 60

S.M. Powsner & E.R. Tufte, The Lancet 344:6 1994

SLIDE 61

Graphical integrity

SLIDE 62

To emphasize growth, use tall scale and don’t adjust for inflation

W. Playfair, 1786

SLIDE 63

To emphasize growth, use tall scale and don’t adjust for inflation

W. Playfair, 1786
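
What leaving out the inflation adjustment does to a growth figure can be shown with arithmetic. The numbers below are made up for illustration (they are not Playfair's data), and the price index is assumed to be relative to the first year.

```python
# Made-up nominal spending (current dollars) and a made-up price index.
years   = [1970, 1972, 1974, 1976]
nominal = [100.0, 120.0, 150.0, 180.0]
index   = [1.00, 1.08, 1.27, 1.47]     # price level relative to the first year

# Real (constant-dollar) values: divide each amount by the price index.
real = [n / p for n, p in zip(nominal, index)]

nominal_growth = nominal[-1] / nominal[0] - 1.0
real_growth = real[-1] / real[0] - 1.0

print(f"nominal growth: {nominal_growth:.0%}")
print(f"real growth:    {real_growth:.0%}")
```

In this example most of the apparent 80% growth is price change; in constant dollars the increase is closer to 22%.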
SLIDE 64

New York Times. 1976

SLIDE 65

E. R. Tufte. Fair presentation of the same data. 1983

SLIDE 66

Day Mines, Inc. 1974

SLIDE 67

Day Mines, Inc. 1974

–$4.2e6

SLIDE 68

Washington Post, 1978

SLIDE 69

Graphical makeovers

SLIDE 70

Maximizing data:ink ratio

“A sentence should contain no unnecessary words, a paragraph no unnecessary sentences, for the same reason that a drawing should have no unnecessary lines and a machine no unnecessary parts.” —William Strunk, Jr.

SLIDE 71

“Chart-junk”

[Figure: example chart; value axis ticks at 25.0, 50.0, 75.0, 100.0; category axis 2007, 2008, 2009, 2010]

SLIDE 72

R. Hayward. From L. Pauling, General Chemistry. 1947

SLIDE 73

as modified by Tufte

SLIDE 74

SLIDE 75

SLIDE 76

SLIDE 77

SLIDE 78

S.R. Marschner. Presentation of fiber scattering data using default MATLAB plots. 2002

SLIDE 79

S.R. Marschner. Re-presentation using polar coordinates and small multiples. 2003 (thanks to François Guimbretière)

Marschner, Jensen, Cammarano, Worley, and Hanrahan. “Light Scattering from Human Hair Fibers,” SIGGRAPH 2003.

SLIDE 80

SLIDE 81

SLIDE 82