Week 2: Arrange Tables Tamara Munzner Department of Computer Science University of British Columbia JRNL 520H, Special Topics in Contemporary Journalism: Data Visualization Week 2: 20 September 2016


slide-1
SLIDE 1

http://www.cs.ubc.ca/~tmm/courses/journ16

Week 2: Arrange Tables

Tamara Munzner Department of Computer Science University of British Columbia

JRNL 520H, Special Topics in Contemporary Journalism: Data Visualization Week 2: 20 September 2016

slide-2
SLIDE 2

Finding us

  • office hours in Sing Tao bldg
    – 1-ish to 3-ish pm Tuesdays in Room 313: Tamara and/or Caitlin
    – by appointment: Tamara in ICICS/CS bldg Room X661
  • email other times
    – tmm@cs.ubc.ca, caitlin@discoursemedia.org
  • course page is font of all information
    – don’t forget to refresh, frequent updates
    – http://www.cs.ubc.ca/~tmm/courses/journ16

slide-3
SLIDE 3

Last Time


slide-4
SLIDE 4

Demo 1: Basic Visual Encoding & Dashboarding

  • Tableau Lessons
    – Dimensions (categorical) and Measures (quantitative)
    – drag and drop to create visual encodings
    – combining multiple charts side by side into dashboards
  • Big Ideas
    – see different patterns with different visual encodings

slide-5
SLIDE 5

Demo 2: Vancouver Election Results

  • Tableau Lessons
    – sorting along axis
    – disaggregate into multiple charts
  • Big Ideas
    – absolute numbers can sometimes mislead
    – check hunches with relative percentages!
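A minimal Python sketch of the second big idea, using made-up vote counts (the ward names and numbers are hypothetical, not taken from the Vancouver election data). It shows how the ward with the most votes in absolute terms need not be the ward with the highest relative share.

```python
# Hypothetical data: (votes for one candidate, total ballots cast) per ward.
wards = {
    "Ward X": (4000, 10000),
    "Ward Y": (1500, 2000),
}

def share(votes, total):
    """Return votes as a percentage of the total."""
    return 100.0 * votes / total

# Rank once by absolute count and once by relative share.
most_absolute = max(wards, key=lambda w: wards[w][0])
most_relative = max(wards, key=lambda w: share(*wards[w]))

for name, (votes, total) in wards.items():
    print(f"{name}: {votes} votes, {share(votes, total):.1f}% of ballots")
```

Here the absolute ranking favours Ward X while the percentage ranking favours Ward Y, which is the kind of hunch worth checking before drawing a conclusion.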

slide-6
SLIDE 6

Demo 3: Vancouver Crime

  • Tableau Lessons
    – multiple pills on a shelf, pill ordering
    – show filters
    – undo
    – duplicate & rename tabs
  • Big Ideas
    – underlying causes can be tricky to understand

slide-7
SLIDE 7

Arrange Tables


slide-8
SLIDE 8

[Figure: How? design choices
  Encode
    Arrange: Express, Separate, Order, Align, Use
    Map (from categorical and ordered attributes): Color (Hue, Saturation, Luminance); Size, Angle, Curvature, ...; Shape; Motion (Direction, Rate, Frequency, ...)
  Manipulate: Change, Select, Navigate
  Facet: Juxtapose, Partition, Superimpose
  Reduce: Filter, Aggregate, Embed]

slide-9
SLIDE 9

[Figure: How? Encode, Manipulate, Facet; highlighted: Encode > Arrange: Express, Separate, Order, Align]

slide-10
SLIDE 10

Encode tables: Arrange space

[Figure: Encode > Arrange: Express, Separate, Order, Align]

slide-11
SLIDE 11


Keys and values

  • key
    – independent attribute
    – used as unique index to look up items
    – simple tables: 1 key
    – multidimensional tables: multiple keys
  • value
    – dependent attribute, value of cell
  • classify arrangements by key count
    – 0, 1, 2, many...

[Figure: Tables: items (rows), attributes (columns), cell containing value; Multidimensional Table: value in cell]

[Figure: Express Values; 1 Key: List; 2 Keys: Matrix; 3 Keys: Volume; Many Keys: Recursive Subdivision]
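One way to make the key/value distinction concrete is with Python dictionaries. This is only a sketch: the table contents and the `key_count` helper are hypothetical illustrations, not part of the course material.

```python
# One key: a simple table is a lookup from a single key to a value.
# All names and numbers below are made up for illustration.
height_by_animal = {"cat": 25, "dog": 60, "horse": 160}

# Two keys: a multidimensional table is indexed by a tuple of keys.
expression = {
    ("gene1", "control"): 0.2,
    ("gene1", "treated"): 1.4,
    ("gene2", "control"): -0.7,
    ("gene2", "treated"): 0.1,
}

def key_count(table):
    """Number of key attributes: 1 for scalar keys, len(tuple) for tuple keys."""
    first_key = next(iter(table))
    return len(first_key) if isinstance(first_key, tuple) else 1

print(key_count(height_by_animal), key_count(expression))
```

The key is what you look up by; the value is what the cell holds. Classifying an arrangement by key count amounts to asking how many components each lookup needs.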

slide-12
SLIDE 12

Idiom: scatterplot

  • express values
    – quantitative attributes
  • no keys, only values
    – data
      • 2 quant attribs
    – mark: points
    – channels
      • horiz + vert position
    – tasks
      • find trends, outliers, distribution, correlation, clusters
    – scalability
      • hundreds of items

[A layered grammar of graphics. Wickham. Journ. Computational and Graphical Statistics 19:1 (2010), 3–28.]

Express Values

slide-13
SLIDE 13

Some keys: Categorical regions

  • regions: contiguous bounded areas distinct from each other
    – using space to separate (proximity)
    – following expressiveness principle for categorical attributes
  • use ordered attribute to order and align regions

[Figure: Separate, Order, Align; 1 Key: List; 2 Keys: Matrix; 3 Keys: Volume; Many Keys: Recursive Subdivision]

slide-14
SLIDE 14

Idiom: bar chart

  • one key, one value
    – data
      • 1 categ attrib, 1 quant attrib
    – mark: lines
    – channels
      • length to express quant value
      • spatial regions: one per mark
        – separated horizontally, aligned vertically
        – ordered by quant attrib
          » by label (alphabetical), by length attrib (data-driven)
    – task
      • compare, lookup values
    – scalability
      • dozens to hundreds of levels for key attrib

[Figure: two bar charts by Animal Type, value axis ticks from 25 to 100]
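The two orderings named above (by label, by length) can be sketched in a few lines of Python. The animal counts are hypothetical and the `text_bars` helper is just an illustrative text renderer, not a Tableau feature.

```python
# Hypothetical data: one categorical key (animal type), one quantitative value.
counts = {"dog": 75, "cat": 100, "bird": 25, "fish": 50}

# Ordered by label (alphabetical).
by_label = sorted(counts.items())

# Ordered by the length attribute (data-driven), longest bar first.
by_value = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)

def text_bars(rows, scale=5):
    """Render one bar per row; the bar length encodes the value."""
    return [f"{label:<5}{'#' * (value // scale)} {value}" for label, value in rows]

print("\n".join(text_bars(by_value)))
```

With the data-driven ordering, rank questions such as "what is the 4th most?" become a matter of reading a position rather than comparing every bar.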

slide-15
SLIDE 15

Separated and Aligned but not Ordered

LIMITATION: Hard to know rank. What’s the 4th most? The 7th?

[Slide courtesy of Ben Jones]

slide-16
SLIDE 16

Separated, Aligned and Ordered

[Slide courtesy of Ben Jones]

slide-17
SLIDE 17

Separated but not Ordered or Aligned

LIMITATION: Hard to make comparisons

[Slide courtesy of Ben Jones]

slide-18
SLIDE 18

Idiom: stacked bar chart

  • one more key
    – data
      • 2 categ attrib, 1 quant attrib
    – mark: vertical stack of line marks
      • glyph: composite object, internal structure from multiple marks
    – channels
      • length and color hue
      • spatial regions: one per glyph
        – aligned: full glyph, lowest bar component
        – unaligned: other bar components
    – task
      • part-to-whole relationship
    – scalability
      • several to one dozen levels for stacked attrib

[Using Visualization to Understand the Behavior of Computer Systems. Bosch. Ph.D. thesis, Stanford Computer Science, 2001.]
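To see why only the lowest component of each stacked bar is aligned, it can help to compute where each segment starts and ends. The sketch below uses hypothetical data and a hypothetical `stack` helper.

```python
# Hypothetical data: two categorical keys (bar, component), one quantitative value.
bars = {
    "A": {"red": 3, "green": 2, "blue": 5},
    "B": {"red": 1, "green": 6, "blue": 2},
}

def stack(components):
    """Return (name, bottom, top) for each component, stacked in dict order."""
    segments, bottom = [], 0
    for name, value in components.items():
        segments.append((name, bottom, bottom + value))
        bottom += value  # the next component starts where this one ends
    return segments

layout = {bar: stack(parts) for bar, parts in bars.items()}
print(layout)
```

Every "red" segment starts at 0, so red lengths share a common baseline. The "green" segments start at 3 and 1, so comparing them means comparing unaligned lengths.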

slide-19
SLIDE 19

Idiom: streamgraph

  • generalized stacked graph
    – emphasizing horizontal continuity
      • vs vertical items
    – data
      • 1 categ key attrib (artist)
      • 1 ordered key attrib (time)
      • 1 quant value attrib (counts)
    – derived data
      • geometry: layers, where height encodes counts
      • 1 quant attrib (layer ordering)
    – scalability
      • hundreds of time keys
      • dozens to hundreds of artist keys
        – more than stacked bars, since most layers don’t extend across whole chart

[Stacked Graphs Geometry & Aesthetics. Byron and Wattenberg. IEEE Trans. Visualization and Computer Graphics (Proc. InfoVis 2008) 14(6): 1245–1252, (2008).]
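The derived layer geometry can be sketched as follows. This assumes one simple choice, a baseline centered on the horizontal axis, and takes the layer order as given. Choosing a good layer order and a smoother baseline is exactly what the cited paper addresses, and this sketch does not attempt it. The artist names and counts are hypothetical.

```python
# Hypothetical counts per time step for three layers (e.g. artists).
series = {
    "artist1": [1, 3, 2, 0],
    "artist2": [2, 2, 4, 1],
    "artist3": [0, 1, 2, 3],
}

def centered_layers(series):
    """Stack layers around y = 0: at each time step the baseline is -total/2."""
    names = list(series)
    n_steps = len(series[names[0]])
    layers = {name: [] for name in names}
    for t in range(n_steps):
        bottom = -sum(series[name][t] for name in names) / 2
        for name in names:
            top = bottom + series[name][t]  # layer height encodes the count
            layers[name].append((bottom, top))
            bottom = top
    return layers

layers = centered_layers(series)
print(layers["artist2"])
```

Each layer's height at a time step equals its count, and a layer with a zero count collapses to nothing, which is why most layers need not extend across the whole chart.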

slide-20
SLIDE 20

Idiom: line chart

  • one key, one value
    – data
      • 2 quant attribs
    – mark: points
      • line connection marks between them
    – channels
      • aligned lengths to express quant value
      • separated and ordered by key attrib into horizontal regions
    – task
      • find trend
        – connection marks emphasize ordering of items along key axis by explicitly showing relationship between one item and the next

[Figure: line chart with Year as the key axis]

slide-21
SLIDE 21

Choosing bar vs line charts

  • depends on type of key attrib
    – bar charts if categorical
    – line charts if ordered
  • do not use line charts for categorical key attribs
    – violates expressiveness principle
      • implication of trend so strong that it overrides semantics!
        – “The more male a person is, the taller he/she is”

after [Bars and Lines: A Study of Graphic Communication. Zacks and Tversky. Memory and Cognition 27:6 (1999), 1073–1079.]

[Figure: bar chart and line chart for Female vs Male; bar chart and line chart for 10-year-olds vs 12-year-olds; value axes from 10 to 60]

slide-22
SLIDE 22

Idiom: heatmap

  • two keys, one value
    – data
      • 2 categ attribs (gene, experimental condition)
      • 1 quant attrib (expression levels)
    – marks: area
      • separate and align in 2D matrix
        – indexed by 2 categorical attributes
    – channels
      • color by quant attrib
        – (ordered diverging colormap)
    – task
      • find clusters, outliers
    – scalability
      • 1M items, 100s of categ levels, ~10 quant attrib levels

[Figure: 1 Key: List; 2 Keys: Matrix; Many Keys: Recursive Subdivision]
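The limit of about 10 distinguishable quantitative levels suggests binning values before coloring them. The sketch below is one hypothetical way to do that for a diverging scale. The `diverging_bin` function, the symmetric `limit`, and the choice of 9 bins (odd, so there is a neutral middle bin) are all assumptions for illustration.

```python
def diverging_bin(value, limit, levels=9):
    """Map a value in [-limit, +limit] to a bin index 0..levels-1.

    With an odd number of levels the middle bin is the neutral color;
    lower bins go toward one hue and higher bins toward the other.
    """
    clamped = max(-limit, min(limit, value))
    fraction = (clamped + limit) / (2 * limit)  # 0.0 .. 1.0
    return min(levels - 1, int(fraction * levels))

# Hypothetical two-key table: (gene, condition) -> expression level.
expression = {
    ("gene1", "control"): 0.2,
    ("gene1", "treated"): 1.4,
    ("gene2", "control"): -0.7,
    ("gene2", "treated"): 0.1,
}
bins = {key: diverging_bin(value, limit=2.0) for key, value in expression.items()}
print(bins)
```

Each cell of the matrix then gets one of nine colors, with values near zero landing in the neutral middle bin.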

slide-23
SLIDE 23

Idiom: cluster heatmap

  • in addition
    – derived data
      • 2 cluster hierarchies
    – dendrogram
      • parent-child relationships in tree with connection line marks
      • leaves aligned so interior branch heights easy to compare
    – heatmap
      • marks (re-)ordered by cluster hierarchy traversal
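Reordering by cluster hierarchy traversal can be sketched as a depth-first walk over the dendrogram. The example assumes the hierarchy has already been computed and is stored as nested tuples. The tree, the row names, and the values are hypothetical, and no clustering algorithm is run here.

```python
def leaf_order(tree):
    """Depth-first traversal of a nested-tuple dendrogram; leaves left to right."""
    if isinstance(tree, tuple):
        return [leaf for child in tree for leaf in leaf_order(child)]
    return [tree]

# Hypothetical cluster hierarchy over four heatmap rows, and the rows' values.
row_tree = (("g3", "g1"), ("g4", "g2"))
rows = {"g1": [0.1, 0.9], "g2": [0.8, 0.2], "g3": [0.2, 0.8], "g4": [0.7, 0.3]}

# Reorder the heatmap rows so that clustered rows sit next to each other.
ordered = [(name, rows[name]) for name in leaf_order(row_tree)]
print(ordered)
```

The same traversal applied to the second hierarchy would reorder the columns, so similar rows and similar columns end up adjacent in the matrix.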

slide-24
SLIDE 24

Axis Orientation: Rectilinear, Parallel, Radial

slide-25
SLIDE 25

Idioms: scatterplot matrix, parallel coordinates

  • scatterplot matrix (SPLOM)
    – rectilinear axes, point mark
    – all possible pairs of axes
    – scalability
      • one dozen attribs
      • dozens to hundreds of items
  • parallel coordinates
    – parallel axes, jagged line representing item
    – rectilinear axes, item as point
      • axis ordering is major challenge
    – scalability
      • dozens of attribs
      • hundreds of items

after [Visualization Course Figures. McGuffin, 2014. http://www.michaelmcguffin.com/courses/vis/]

[Figure: the same table shown as a Scatterplot Matrix and as Parallel Coordinates; attributes Math, Physics, Dance, Drama; axes from 10 to 100]

Table

Math  Physics  Dance  Drama
85    95       70     65
90    80       60     50
65    50       90     90
50    40       95     80
40    60       80     90
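The scalability difference between the two idioms comes down to which attribute pairs are shown. A short sketch, using the four attribute names from the figure (the `adjacent_pairs` helper is a hypothetical name):

```python
from itertools import combinations

attribs = ["Math", "Physics", "Dance", "Drama"]

# Scatterplot matrix: one panel for every unordered pair of attributes.
splom_pairs = list(combinations(attribs, 2))

def adjacent_pairs(axis_order):
    """Parallel coordinates: only neighbouring axes are directly compared."""
    return list(zip(axis_order, axis_order[1:]))

print(len(splom_pairs), adjacent_pairs(attribs))
```

With n attributes the SPLOM needs n(n-1)/2 panels while parallel coordinates need only n axes, but a pair such as Math and Drama is only directly visible in parallel coordinates if the axes are reordered to put them side by side. That is why axis ordering matters so much.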

slide-26
SLIDE 26

Task: Correlation

  • scatterplot matrix
    – positive correlation
      • diagonal low-to-high
    – negative correlation
      • diagonal high-to-low
    – uncorrelated
  • parallel coordinates
    – positive correlation
      • parallel line segments
    – negative correlation
      • all segments cross at halfway point
    – uncorrelated
      • scattered crossings

[Hyperdimensional Data Analysis Using Parallel Coordinates. Wegman. Journ. American Statistical Association 85:411 (1990), 664–675.]

[A layered grammar of graphics. Wickham. Journ. Computational and Graphical Statistics 19:1 (2010), 3–28.]
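The visual signatures above can be checked numerically. The sketch below computes the Pearson correlation and counts how many pairs of items would have crossing line segments between two parallel axes. It assumes both axes run in the same direction. The data are made up and the function names are illustrative.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def crossings(xs, ys):
    """Count item pairs whose segments cross between two parallel axes.

    Two segments cross exactly when the items' order on one axis
    is reversed on the other.
    """
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if (x1 - x2) * (y1 - y2) < 0
    )

a = [10, 20, 30, 40]
up = [15, 25, 35, 45]    # rises with a
down = [45, 35, 25, 15]  # falls as a rises

print(pearson(a, up), crossings(a, up))
print(pearson(a, down), crossings(a, down))
```

Perfect positive correlation gives no crossings (parallel segments), while perfect negative correlation makes every pair of segments cross.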

slide-27
SLIDE 27

Idioms: radial bar chart, star plot

  • radial bar chart
    – radial axes meet at central ring, line mark
  • star plot
    – radial axes, meet at central point, line mark
  • bar chart
    – rectilinear axes, aligned vertically
  • accuracy
    – length unaligned with radial
      • less accurate than aligned with rectilinear

[Vismon: Facilitating Risk Assessment and Decision Making In Fisheries Management. Booshehrian, Möller, Peterman, and Munzner. Technical Report TR 2011-04, Simon Fraser University, School of Computing Science, 2011.]

slide-28
SLIDE 28

Radial Orientation: Radar Plots

LIMITATION: Not good when categories aren’t cyclic

[Slide courtesy of Ben Jones]

slide-29
SLIDE 29

"Diagram of the causes of mortality in the army in the East" (1858)

[Slide courtesy of Ben Jones]

slide-30
SLIDE 30

“Radar graphs: Avoid them (99.9% of the time)”

http://www.thefunctionalart.com/2012/11/radar-graphs-avoid-them-999-of-time.html

[Slide courtesy of Ben Jones]

slide-31
SLIDE 31

Idioms: pie chart, polar area chart

  • pie chart
    – area marks with angle channel
    – accuracy: angle/area much less accurate than line length
  • polar area chart
    – area marks with length channel
    – more direct analog to bar charts
  • data
    – 1 categ key attrib, 1 quant value attrib
  • task
    – part-to-whole judgements

[A layered grammar of graphics. Wickham. Journ. Computational and Graphical Statistics 19:1 (2010), 3–28.]
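For the pie chart, the angle channel means each wedge's angular extent is proportional to its share of the total. A minimal sketch (the `pie_angles` name and the input values are hypothetical):

```python
def pie_angles(values):
    """Return (start, end) angles in degrees for each wedge, in input order."""
    total = sum(values)
    angles, start = [], 0.0
    for v in values:
        end = start + 360.0 * v / total  # wedge extent is proportional to share
        angles.append((start, end))
        start = end
    return angles

print(pie_angles([1, 1, 2]))
```

Only the first wedge starts at a fixed reference angle. Every later wedge starts wherever the previous one ended, so the reader is judging angles and areas rather than aligned lengths.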
slide-32
SLIDE 32

Idioms: normalized stacked bar chart

  • task
    – part-to-whole judgements
  • normalized stacked bar chart
    – stacked bar chart, normalized to full vert height
    – single stacked bar equivalent to full pie
      • high information density: requires narrow rectangle
  • pie chart
    – information density: requires large circle

http://bl.ocks.org/mbostock/3887235, http://bl.ocks.org/mbostock/3886208, http://bl.ocks.org/mbostock/3886394.

[Figures from the bl.ocks.org examples: pie chart of population by age group (<5, 5-13, 14-17, 18-24, 25-44, 45-64, ≥65); normalized stacked bar chart of age-group share (0% to 100%) by US state; stacked bar chart of population (0 to 35M) by US state and age group (Under 5 Years to 65 Years and Over)]
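Normalizing a stacked bar to full height is a per-bar rescaling to percentages. The sketch below uses hypothetical regions and counts, not the census data in the linked examples.

```python
def normalize(components):
    """Rescale one bar's components so they sum to 100 (percent of the bar's total)."""
    total = sum(components.values())
    return {name: 100.0 * value / total for name, value in components.items()}

# Hypothetical population counts by age group for two regions of very different size.
regions = {
    "Region 1": {"under 18": 200, "18 to 64": 600, "65 and over": 200},
    "Region 2": {"under 18": 10, "18 to 64": 30, "65 and over": 60},
}
normalized = {name: normalize(parts) for name, parts in regions.items()}
print(normalized)
```

After normalization every bar has the same height, so the regions' very different totals disappear and only the part-to-whole shares remain, much like one pie per region but packed into a narrow rectangle.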
slide-33
SLIDE 33

Idiom: glyphmaps

  • rectilinear good for linear vs nonlinear trends
  • radial good for cyclic patterns

[Glyph-maps for Visually Exploring Temporal Patterns in Climate Data and Models. Wickham, Hofmann, Wickham, and Cook. Environmetrics 23:5 (2012), 382–393.]

slide-34
SLIDE 34

Orientation limitations

  • rectilinear: scalability wrt #axes
    – 2 axes best
    – 3 problematic
      • more in afternoon
    – 4+ impossible
  • parallel: unfamiliarity, training time
  • radial: perceptual limits
    – asymmetry: angles lower precision than lengths
      • sometimes can be exploited!

Axis Orientation: Rectilinear, Parallel, Radial

[Uncovering Strengths and Weaknesses of Radial Visualizations - an Empirical Approach. Diehl, Beck and Burch. IEEE TVCG (Proc. InfoVis) 16(6):935--942, 2010.]

slide-35
SLIDE 35

Layout Density: Dense

[Visualization of test information to assist fault localization. Jones, Harrold, Stasko. Proc. ICSE 2002, p 467-477.]

dense software overviews

slide-36
SLIDE 36

Basic Timelines – Working with Dates

[Slide courtesy of Ben Jones]

slide-37
SLIDE 37

Column Charts

[Slide courtesy of Ben Jones]

slide-38
SLIDE 38

Inverted Column Charts

[Slide courtesy of Ben Jones]

slide-39
SLIDE 39

Gantt Charts

[Slide courtesy of Ben Jones]

slide-40
SLIDE 40

Slopegraphs

[Slide courtesy of Ben Jones]

slide-41
SLIDE 41

Change from Previous

[Slide courtesy of Ben Jones]

slide-42
SLIDE 42

Connected Scatterplots

[Slide courtesy of Ben Jones]

slide-43
SLIDE 43

Dual Axis Line Plots

[Slide courtesy of Ben Jones]

slide-44
SLIDE 44

Next

  • Break (15 min)
  • Demos (45 min)
    – Caitlin will walk through Tableau demos
    – you follow along step by step on your own laptop
    – Tamara will rove the room to help out folks who get stuck
  • Lab (30 min)
    – you’ll get started on Tableau assignment