CS 171: Visualization Process & Visual Variables Hanspeter - - PowerPoint PPT Presentation



slide-1
SLIDE 1

CS 171: Visualization

Process & Visual Variables

Hanspeter Pfister pfister@seas.harvard.edu

slide-2
SLIDE 2

This Week

  • Friday lab 10:30-11 am in MD G115
  • HW1 due today, group reflection due Monday
  • Readings for next week

Chapter 1

slide-3
SLIDE 3

Group Reflection

  • Optional - due on Monday
  • Not if you are taking late days
  • Work in groups to improve your HW
  • Must write a self-reflection about improvements

  • Grade will take both parts into account
  • Need serious effort on individual part
slide-4
SLIDE 4

Learning Catalytics

  • Go to https://learningcatalytics.com/
  • Go to Courses -> CS 171
  • Enter session ID: 697815
slide-5
SLIDE 5

Survey Results

  • 213 responses, 31% female, 65% male, 3% N/A
  • 202 registered, 115 College, 82 DCE, 5 Other
slide-6
SLIDE 6

Survey Results

slide-7
SLIDE 7

Survey Results

slide-8
SLIDE 8

Survey Results

slide-9
SLIDE 9

Survey Results

slide-10
SLIDE 10

Survey Results

slide-11
SLIDE 11

Last Week

slide-12
SLIDE 12

Design Excellence

“Well-designed presentations of interesting data are a matter of substance, of statistics, and of design.”

  • E. Tufte
slide-13
SLIDE 13

Graphical Integrity

  • Missing scales
  • Distortions
  • Lie factor
slide-14
SLIDE 14

Washington Post, 2012

slide-15
SLIDE 15

Design Principles

  • Maximize Data-Ink Ratio
  • Avoid Chartjunk
  • Increase Data Density
  • Subjective Dimensions
slide-16
SLIDE 16

Graphic Design

  • Contrast
  • Repetition
  • Alignment
  • Proximity
slide-17
SLIDE 17

Design Critique

slide-18
SLIDE 18

Outline

  • Process
  • Data Model
  • Image Model
  • Psychophysics
  • Graphical Perception
slide-19
SLIDE 19

Process

slide-20
SLIDE 20

Reading

slide-21
SLIDE 21

Tamara Munzner

  • Associate Professor at UBC, Canada
  • Ph.D. Stanford 2000
  • Worked at Geometry Center, Compaq Research
  • Widely published in InfoVis

slide-22
SLIDE 22

target translate design implement validate

evaluate
user-centered design
usability engineering
participatory design

slide-23
SLIDE 23

target translate design implement validate

evaluate
user-centered design
usability engineering
participatory design

slide-24
SLIDE 24

Miriah Meyer

  • NSF CI Postdoctoral Fellow at Harvard
  • Ph.D. Utah 2008
  • Works with genomics and molecular biology data

slide-25
SLIDE 25

target translate design implement validate

  • choose a specific domain
  • define research question(s)
  • find & clean the data

slide-26
SLIDE 26

Pathline - A Tool for Comparative Functional Genomics Data

slide-27
SLIDE 27

target translate design implement validate

  • formulate data analysis tasks
  • exploratory data analysis
  • transform & summarize data

slide-28
SLIDE 28

“The greatest value of a picture is when it forces us to notice what we never expected to see.” John Tukey

Exploratory Data Analysis

slide-29
SLIDE 29

Anscombe’s Quartet

Same mean, variance, correlation coefficient, and linear regression line

http://upload.wikimedia.org/wikipedia/commons/b/b6/Anscombe.svg
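The claim on this slide can be checked numerically. The sketch below hard-codes the four datasets as published by Anscombe (1973) and computes the summary statistics with plain Python; the helper name `summarize` and the dictionary layout are illustrative choices, not something from the lecture.

```python
from math import sqrt

# Anscombe's quartet. Sets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return means, sample variances, Pearson r, and the least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": sxx / (n - 1),
        "var_y": syy / (n - 1),
        "r": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}
for name, stats in SUMMARIES.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

The printed rows agree to about two decimal places, although the four scatterplots look nothing alike, which is the argument for plotting data before trusting summary statistics.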

slide-30
SLIDE 30

Interactive Exploration

  • Construct visualization to address questions
  • Inspect “answers” and pose new questions
  • Transform the data appropriately
  • Repeat!
slide-31
SLIDE 31

gene expression

[Figure: gene expression table built up step by step; rows g1 to g8 (with metabolites m1 to m3 in the last step), columns t1 to t6, values between 0.0 and 1.0; rows labeled by pathway (tca cycle, glycolysis)]

  • 6000 genes and 140 metabolites
  • 6 time points
  • 14 species of yeast
  • 3D table

similarity scores

[Figure: time series of g1 over t1 to t6 for species s1, s2, s3, ... aggregated into a single score (= 0.83)]

  • aggregate time series for a gene/metabolite over species
  • similarity of expression across species
  • aggregate: Pearson, Spearman, others
  • quantitative value

metabolic pathways

  • 10 to 50 pathways of interest
  • inputs/outputs called metabolites
  • directed graph

phylogeny

  • evolutionary relationship
  • binary tree

Species: Y. lip, S. cer, S. mik, S. bay, S. bayuv, C. gla, S. cas, K. pol, K. wal, K. lac, S. klu, D. han, C. alb, S. jap, S. pom
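The similarity scores on this slide aggregate one gene's time series across species into a single number, with Pearson correlation named as one option. A minimal sketch follows, assuming the aggregate is the mean of all pairwise Pearson correlations; that choice, the function names, and the time-series values are hypothetical and are not taken from Pathline.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (norm_a * norm_b)


def similarity_score(series_by_species):
    """Assumed aggregate: mean Pearson correlation over all species pairs."""
    pairs = list(combinations(series_by_species.values(), 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical expression of one gene at six time points in three species.
gene_series = {
    "s1": [0.2, 0.4, 1.0, 1.0, 1.0, 1.0],
    "s2": [0.1, 0.5, 0.9, 1.0, 0.8, 1.0],
    "s3": [0.3, 0.3, 0.8, 0.9, 1.0, 0.9],
}
score = similarity_score(gene_series)
print(round(score, 2))
```

Swapping `pearson` for a rank-based measure such as Spearman would change only the inner function, which is why the slide lists the aggregate as a design choice.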

slide-32
SLIDE 32

target translate design implement validate

  • design visual encodings
  • design interactions
  • sketch many ideas!

slide-33
SLIDE 33

Blake Walsh, Gabriel Trevino, Antony Bett

slide-34
SLIDE 34

Bang Wong

slide-35
SLIDE 35

target translate design implement validate

  • use code “sketches”
  • define data structures
  • find efficient algorithms

slide-36
SLIDE 36
slide-37
SLIDE 37

target translate design implement validate what? how? 80% 20%

slide-38
SLIDE 38

target translate design implement validate

  • is the abstraction right?
  • does it support the tasks?
  • does it provide new insights?

slide-39
SLIDE 39

Nested Validation

  • T. Munzner, A Nested Model for Visualization Design and Validation

target translate design implement

slide-40
SLIDE 40

Process Books

slide-41
SLIDE 41

“A methodological approach to visualization development makes effective design decisions salient.”

  • Miriah Meyer
slide-42
SLIDE 42

Data Model

slide-43
SLIDE 43
slide-44
SLIDE 44

Nominal

Categorical Qualitative

Ordinal Interval Ratio

On the theory of scales and measurements [S. Stevens, 46]

slide-45
SLIDE 45

Data Types

  • Nominal (categorical) (N)

Are = or ≠ to other values
Apples, Oranges, Bananas, ...

  • Ordinal (ordered) (O)

Obey a < relationship
Small, medium, large

  • Quantitative (Q)

Can do arithmetic on them
10 inches, 23 inches, etc.
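The three data types differ in which operations are meaningful. The sketch below uses the slide's own examples; the helper `is_smaller` and the variable names are illustrative assumptions.

```python
# Nominal (N): only equality tests are meaningful.
fruit_a, fruit_b = "apple", "orange"
fruits_equal = fruit_a == fruit_b

# Ordinal (O): values also obey an ordering, but distances are undefined.
SIZE_ORDER = ["small", "medium", "large"]


def is_smaller(a, b):
    """True if size a comes before size b in the declared order."""
    return SIZE_ORDER.index(a) < SIZE_ORDER.index(b)


sorted_sizes = sorted(["large", "small", "medium"], key=SIZE_ORDER.index)

# Quantitative (Q): arithmetic is meaningful.
lengths_in_inches = [10, 23]
difference = lengths_in_inches[1] - lengths_in_inches[0]

print(fruits_equal, is_smaller("small", "large"), sorted_sizes, difference)
```

Note that sorting the size strings alphabetically would give the wrong order, so an ordinal attribute needs its order declared explicitly.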

slide-46
SLIDE 46

Quantitative

  • Q - Interval (location of zero arbitrary)

Dates: Jan 19; Location: (Lat, Long)
Only differences (i.e., intervals) can be compared

  • Q - Ratio (zero fixed)

Measurements: Length, Mass, Temp, ...
Origin is meaningful, can measure ratios & proportions

On the theory of scales and measurements [S. Stevens, 46]
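Python's standard `datetime.date` happens to behave like an interval quantity, which makes the distinction concrete: subtracting two dates is defined, dividing them is not. In this sketch the year 2013 and the second date are hypothetical; only "Jan 19" and the inch values come from the slides.

```python
from datetime import date

# Interval: differences are meaningful, ratios are not.
first = date(2013, 1, 19)   # hypothetical year
second = date(2013, 2, 2)   # hypothetical second date
gap_days = (second - first).days

try:
    second / first          # dates have no meaningful zero
    date_ratio_defined = True
except TypeError:
    date_ratio_defined = False

# Ratio: zero is fixed, so ratios and proportions are meaningful.
short_inches, long_inches = 10.0, 23.0
length_ratio = long_inches / short_inches

print(gap_days, date_ratio_defined, length_ratio)
```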

slide-47
SLIDE 47

Item

slide-48
SLIDE 48

Attribute

slide-49
SLIDE 49

1 = Quantitative 2 = Nominal 3 = Ordinal

slide-50
SLIDE 50

1 = Quantitative 2 = Nominal 3 = Ordinal

slide-51
SLIDE 51

Nominal / Ordinal = Dimensions

Describe the data, independent variables

Quantitative = Measures

Numbers to be analyzed, dependent variables
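One way to read the split: dimensions are what you group by, measures are what you aggregate. A minimal sketch with hypothetical records and a hypothetical helper `total_by`:

```python
from collections import defaultdict

# Hypothetical records: "region" (N) and "size" (O) are dimensions,
# "sales" (Q) is a measure.
records = [
    {"region": "East", "size": "small", "sales": 10.0},
    {"region": "East", "size": "large", "sales": 30.0},
    {"region": "West", "size": "small", "sales": 20.0},
]


def total_by(dimension, measure, rows):
    """Sum a quantitative measure within each value of a dimension."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)


sales_by_region = total_by("region", "sales", records)
sales_by_size = total_by("size", "sales", records)
print(sales_by_region, sales_by_size)
```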

slide-52
SLIDE 52

Data vs. Conceptual Models

  • Data Model: Low-level description of the data

Set with operations, e.g., floats with +, -, /, *

  • Conceptual Model: Mental construction

Includes semantics, supports reasoning

Data                  Conceptual
1D floats             temperature
3D vector of floats   space

slide-53
SLIDE 53

Data vs. Conceptual Model

  • From data model...

32.5, 54.0, -17.3, … (floats)

  • using conceptual model...

Temperature

  • to data type

Continuous to 4 significant figures (Q)
Hot, warm, cold (O)
Burned vs. Not burned (N)

Based on slide from Munzner
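The same floats can end up as Q, O, or N depending on the conceptual model applied. A small sketch using the slide's values; the cut-offs (30, 10, and 50) and the function names are hypothetical and chosen only to make the example run.

```python
readings = [32.5, 54.0, -17.3]  # raw floats from the slide


def as_quantitative(value):
    """Q: keep the number, rounded to 4 significant figures."""
    return float(f"{value:.4g}")


def as_ordinal(value, warm_from=10.0, hot_from=30.0):
    """O: bin into cold < warm < hot (thresholds are assumptions)."""
    if value >= hot_from:
        return "hot"
    if value >= warm_from:
        return "warm"
    return "cold"


def as_nominal(value, burn_from=50.0):
    """N: burned vs. not burned (threshold is an assumption)."""
    return "burned" if value >= burn_from else "not burned"


quantitative = [as_quantitative(v) for v in readings]
ordinal = [as_ordinal(v) for v in readings]
nominal = [as_nominal(v) for v in readings]
print(quantitative, ordinal, nominal)
```

Each step discards information: the ordinal labels cannot be turned back into temperatures, and the nominal labels no longer carry an order.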

slide-54
SLIDE 54

Image Model

slide-55
SLIDE 55

Jacques Bertin

  • French cartographer [1918-2010]
  • Semiology of Graphics [1967]
  • Theoretical principles for visual encodings

slide-56
SLIDE 56

Bertin’s Visual Variables

Semiology of Graphics [J. Bertin, 67]

Marks: Points, Lines, Areas

Channels: Position, Size, (Grey)Value, Texture, Color, Orientation, Shape

slide-57
SLIDE 57

Mapping to Data Types

              Nominal   Ordinal   Quantitative
Position         ✔         ✔          ✔
Size             ✔         ✔          ~
(Grey)Value      ✔         ✔          ~
Texture          ✔         ~          ✖
Color            ✔         ✖          ✖
Orientation      ✔         ✖          ✖
Shape            ✔         ✖          ✖

✔ = Good   ~ = OK   ✖ = Bad
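The table above can be treated as a lookup when choosing an encoding. The sketch below transcribes the slide's ratings into a dictionary; the names `SUITABILITY` and `channels_rated` are illustrative.

```python
# Ratings transcribed from the slide, per channel for (N, O, Q).
SUITABILITY = {
    "Position":    {"N": "good", "O": "good", "Q": "good"},
    "Size":        {"N": "good", "O": "good", "Q": "ok"},
    "(Grey)Value": {"N": "good", "O": "good", "Q": "ok"},
    "Texture":     {"N": "good", "O": "ok",   "Q": "bad"},
    "Color":       {"N": "good", "O": "bad",  "Q": "bad"},
    "Orientation": {"N": "good", "O": "bad",  "Q": "bad"},
    "Shape":       {"N": "good", "O": "bad",  "Q": "bad"},
}


def channels_rated(data_type, rating):
    """List channels with the given rating for a data type (N, O, or Q)."""
    return [name for name, row in SUITABILITY.items() if row[data_type] == rating]


good_for_quantitative = channels_rated("Q", "good")
bad_for_ordinal = channels_rated("O", "bad")
print(good_for_quantitative, bad_for_ordinal)
```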

slide-58
SLIDE 58

[Mackinlay, Automating the Design of Graphical Presentations of Relational Information, 1986]

Jock Mackinlay, 1986

Decreasing

slide-59
SLIDE 59

Stolte & Hanrahan, 2002

[“Polaris: A System for Query, Analysis and Visualization of Multi-dimensional Relational Databases” Chris Stolte, Diane Tang, and Pat Hanrahan, 2002]

slide-60
SLIDE 60

Psychophysics

slide-61
SLIDE 61

Weber’s Law (1795–1878)

  • Sensitivity to changes in stimulus decreases when stimulus magnitude increases
  • True for intensity, length, weight, sound, time, etc.

∆I / I = k

∆I = just-noticeable difference, I = base intensity, k = Weber fraction (constant!)
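Rearranged, Weber's law gives the just-noticeable difference as ∆I = k · I, so the same absolute change is easier to notice on a weak stimulus than on a strong one. A minimal sketch, assuming a Weber fraction of 0.1 purely for illustration:

```python
def just_noticeable_difference(base_intensity, weber_fraction):
    """Delta-I predicted by Weber's law: delta_I = k * I."""
    return weber_fraction * base_intensity


def is_noticeable(base_intensity, new_intensity, weber_fraction):
    """True if the change reaches the just-noticeable difference."""
    change = abs(new_intensity - base_intensity)
    return change >= just_noticeable_difference(base_intensity, weber_fraction)


K = 0.1  # hypothetical Weber fraction

# The same absolute change of 5 units at two base intensities.
noticed_at_low = is_noticeable(20.0, 25.0, K)     # JND is 2 units
noticed_at_high = is_noticeable(200.0, 205.0, K)  # JND is 20 units
print(noticed_at_low, noticed_at_high)
```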

slide-62
SLIDE 62

∆I / I = k    ∆I = k

  • J. H. Krantz
slide-63
SLIDE 63

∆I / I = k    ∆I = k

  • J. H. Krantz
slide-64
SLIDE 64

∆I / I = k    ∆I = k

  • J. H. Krantz
slide-65
SLIDE 65

Fechner’s Law (1801–1887)

  • The relationship between stimulus and perception is logarithmic
  • I.e., we perceive brightness on a logarithmic scale

S = k log(I)

S = sensation, I = intensity
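A consequence of S = k log(I) is that each doubling of intensity adds the same amount of sensation, k log(2), regardless of the starting level. The sketch below checks this with the natural logarithm and k = 1, both of which are arbitrary choices for illustration.

```python
from math import log


def fechner_sensation(intensity, k=1.0):
    """Perceived sensation under Fechner's law: S = k * log(I)."""
    return k * log(intensity)


# Equal intensity ratios produce equal sensation steps.
step_low = fechner_sensation(20.0) - fechner_sensation(10.0)
step_high = fechner_sensation(2000.0) - fechner_sensation(1000.0)
print(round(step_low, 4), round(step_high, 4))
```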

slide-66
SLIDE 66

Based on slide from Mazur

slide-67
SLIDE 67

Stevens’ Power Law, 1961

From Wilkinson 99, based on Stevens 61

S = k I^p

[Figure: power-law curves in two regions labeled Underestimating and Overestimating; exponents shown: 0.5, 0.6, 0.7, 1.0, 1.3, 1.5, 3.5; one curve labeled Electric]
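Under the power law, multiplying the stimulus by a factor r multiplies the sensation by r^p, so exponents below 1 compress differences (underestimation) and exponents above 1 exaggerate them (overestimation). The sketch uses the smallest and largest exponents from the figure without tying them to a particular sensory modality; the function names are illustrative.

```python
def stevens_sensation(intensity, exponent, k=1.0):
    """Perceived magnitude under Stevens' power law: S = k * I**p."""
    return k * intensity ** exponent


def perceived_ratio(stimulus_ratio, exponent):
    """Factor by which sensation grows when the stimulus grows by stimulus_ratio."""
    return stevens_sensation(stimulus_ratio, exponent) / stevens_sensation(1.0, exponent)


doubling_compressed = perceived_ratio(2.0, 0.5)  # p < 1: seems less than double
doubling_linear = perceived_ratio(2.0, 1.0)      # p = 1: seems exactly double
doubling_expanded = perceived_ratio(2.0, 3.5)    # p > 1: seems more than double
print(round(doubling_compressed, 2), doubling_linear, round(doubling_expanded, 2))
```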

slide-68
SLIDE 68

Graphical Perception

slide-69
SLIDE 69

Cleveland / McGill, 1984

William S. Cleveland; Robert McGill, “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods.” 1984
slide-70
SLIDE 70

Cleveland / McGill, 1984

William S. Cleveland; Robert McGill, “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods.” 1984
  • Position judgements > length >> angle
slide-71
SLIDE 71

Heer & Bostock, 2010

slide-72
SLIDE 72

Most Efficient Least Efficient

  • C. Mulbrandon

VisualizingEconomics.com

Quantitative Ordinal Nominal


slide-73
SLIDE 73

Most Efficient

  • C. Mulbrandon

VisualizingEconomics.com

slide-74
SLIDE 74

Pie & Donut Charts

  • C. Mulbrandon

VisualizingEconomics.com

slide-75
SLIDE 75

Rainbow Colormap

Rogowitz and Treinish, Why should engineers and scientists be worried about color?

slide-76
SLIDE 76

Rainbow Colormap

Rogowitz and Treinish, Why should engineers and scientists be worried about color?, 1996

Southeastern United States and Gulf of Mexico zero crossing not explicit

slide-77
SLIDE 77

Rainbow Colormap

[Figure labels: hard to order; easy to order; creates artifacts; lower resolution]

Borland 2007

slide-78
SLIDE 78
slide-79
SLIDE 79

Sunday Star Times, 2012

slide-80
SLIDE 80
  • R. Cunliffe, Stats Chat
slide-81
SLIDE 81
  • R. Cunliffe, Stats Chat
slide-82
SLIDE 82

Peter and Maria Hoey (Source: Tommy McCall/Environmental Law Institute)

slide-83
SLIDE 83

Further Reading

slide-84
SLIDE 84

woodgears.ca

Position, Length & Angle