http://www.cs.ubc.ca/~tmm/courses/547-17F
Ch 1/2/3: Intro, Data, Tasks Paper: Design Study Methodology
Tamara Munzner Department of Computer Science University of British Columbia
CPSC 547, Information Visualization Week 2: 19 September 2017
News
–one question/comment per reading required
–many of you could be more concise/compact
–few responses to others
–if you spot a typo in the book, let me know if it’s not already in the errata list
2
3
Why have a human in the loop?
– don’t know exactly what questions to ask in advance
– long-term use for end users (e.g. exploratory analysis of scientific data)
– presentation of known results
– stepping stone to better understanding of requirements before developing models
– help developers of automatic solution refine/debug, determine parameters
– help end users of automatic solutions verify, build trust
4
Computer-based visualization systems provide visual representations of datasets designed to help people carry out tasks more effectively. Visualization is suitable when there is a need to augment human capabilities rather than replace people with computational decision-making methods.
Why use an external representation?
5
Computer-based visualization systems provide visual representations of datasets designed to help people carry out tasks more effectively.
[Cerebral: Visualizing Multiple Experimental Conditions on a Graph with Biological Context. Barsky, Munzner, Gardy, and Kincaid. IEEE TVCG (Proc. InfoVis) 14(6):1253-1260, 2008.]
Why represent all the data?
–confirm expected and find unexpected patterns
–assess validity of statistical model
6
Identical statistics
– x mean: 9
– x variance: 10
– y mean: 7.5
– y variance: 3.75
– x/y correlation: 0.816
Anscombe’s Quartet
Computer-based visualization systems provide visual representations of datasets designed to help people carry out tasks more effectively.
https://www.youtube.com/watch?v=DbJyPELmhJc
Same Stats, Different Graphs
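Anscombe’s point is easy to check directly. A minimal sketch in plain Python (using population variance, which matches the slide’s x variance of 10; the data are the standard published quartet values):

```python
import math

# Anscombe's quartet: four x/y datasets with near-identical summary statistics.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = [
    (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
     [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
]

def mean(v):
    return sum(v) / len(v)

def pvar(v):
    """Population variance (divide by n), matching the slide's figures."""
    m = mean(v)
    return sum((a - m) ** 2 for a in v) / len(v)

def corr(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Each tuple: (x mean, x variance, y mean, y variance, correlation)
stats = [(mean(x), pvar(x), mean(y), pvar(y), corr(x, y)) for x, y in quartet]
```

All four rows come out to roughly (9, 10, 7.5, 3.75, 0.816), which is exactly why plotting the data, not just summarizing it, matters.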
Why focus on tasks and effectiveness?
–idioms do not serve all tasks equally!
–challenge: recast tasks from domain-specific vocabulary to abstract forms
–validation is necessary, but tricky
–increases chance of finding good solutions if you understand full space of possibilities
–novel: enable entirely new kinds of analysis
–faster: speed up existing workflows
7
Computer-based visualization systems provide visual representations of datasets designed to help people carry out tasks more effectively.
Why are there resource limitations?
–processing time, system memory
–human attention and memory
–pixels are the most constrained resource
–information density: ratio of space used to encode info vs. unused whitespace
8
Vis designers must take into account three very different kinds of resource limitations: those of computers, of humans, and of displays.
Analysis: What, why, and how
–data abstraction
–task abstraction
–idiom: visual encoding and interaction
–translation process iterative, tricky
9
Why analyze?
–imposes structure on huge design space
–scaffold to help you think systematically about choices
–analyzing existing as stepping stone to designing new
–most possibilities ineffective for particular task/data combination
10
[SpaceTree: Supporting Exploration in Large Node Link Tree, Design Evolution and Empirical Evaluation.]
SpaceTree
[TreeJuxtaposer: Scalable Tree Comparison Using Focus+Context With Guaranteed Visibility. ACM Trans. on Graphics (Proc. SIGGRAPH) 22:453–462, 2003.]
TreeJuxtaposer
What? Tree
Why? Actions: present, locate, identify. Targets: path between two nodes.
How? SpaceTree: encode, navigate, select, filter, aggregate. TreeJuxtaposer: encode, navigate, select, arrange.
11
How?
– Encode
  – Arrange: express, separate, order, align, use
  – Map (from categorical and ordered attributes): color (hue, saturation, luminance); size, angle, curvature, ...; shape; motion (direction, rate, frequency, ...)
– Manipulate: change, select, navigate
– Facet: juxtapose, partition, superimpose
– Reduce: filter, aggregate, embed
VAD Ch 2: Data Abstraction
12
[VAD Fig 2.1]
What?
– Data types: items, attributes, links, positions, grids
– Dataset types
  – Tables: items (rows), attributes (columns), cell containing value
  – Networks: nodes (items), links; trees
  – Fields (continuous): grid of positions, cell, value in cell
  – Geometry (spatial): position
  – Clusters, sets, lists
  – Multidimensional table: value in cell
– Attribute types
  – Categorical
  – Ordered: ordinal, quantitative
  – Ordering direction: sequential, diverging, cyclic
13
Three major datatypes
14
– Tables: items (rows), attributes (columns), cell containing value; multidimensional table: value in cell
– Networks: node (item), link; trees
– Spatial
  – Fields (continuous): grid of positions, cell, value in cell
  – Geometry (spatial): position
–geometry is design decision
15
Attribute types
– Categorical
– Ordered: ordinal, quantitative
– Ordering direction: sequential, diverging, cyclic
Dataset and data types
16
– Dataset availability: static, dynamic
– Data types: items, attributes, links, positions, grids
– Dataset types
  – Tables: items, attributes
  – Networks & trees: items (nodes), links, attributes
  – Fields: grids, positions, attributes
  – Geometry: items, positions
  – Clusters, sets, lists: items
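To make the abstraction concrete, here is a minimal sketch (plain Python; the field names are illustrative only) of how the main dataset types map onto data structures:

```python
# Table: items are rows, attributes are columns, each cell holds a value.
# "category" is a categorical attribute, "weight" a quantitative one.
table = [
    {"name": "a", "weight": 3.5, "category": "x"},
    {"name": "b", "weight": 1.2, "category": "y"},
]

# Network: items are nodes; links connect pairs of nodes.
network = {
    "nodes": ["a", "b", "c"],
    "links": [("a", "b"), ("b", "c")],
}

# Field: a grid of positions, each cell holding a sampled value.
field = [
    [0.0, 0.1],
    [0.2, 0.3],
]
```

The same vocabulary (items, attributes, links, positions, grids) names the parts of each structure.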
Further reading: Articles
– Rethinking Visualization: A High-Level Taxonomy. Melanie Tory and Torsten Möller. Proc. InfoVis 2004, p 151-158, 2004.
– The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations. Ben Shneiderman. Proc. 1996 IEEE Visual Languages.
– The Structure of the Information Visualization Design Space. Stuart Card and Jock Mackinlay. Proc. InfoVis 97.
– Polaris: A System for Query, Analysis and Visualization of Multi-dimensional Relational Databases. Chris Stolte, Diane Tang and Pat Hanrahan. IEEE TVCG 8(1):52-65, 2002.
17
Further reading: Books
– Visualization Analysis and Design. Tamara Munzner. CRC Press, 2014. (Chap 2: Data Abstraction)
– Readings in Information Visualization: Using Vision to Think. Stuart Card, Jock Mackinlay, and Ben Shneiderman. Morgan Kaufmann, 1999. (Chap 1)
– Data Visualization: Principles and Practice, 2nd ed. Alexandru Telea. CRC Press, 2014.
– Interactive Data Visualization: Foundations, Techniques, and Applications, 2nd ed. Matthew Ward, Georges Grinstein, and Daniel Keim. CRC Press, 2015.
– The Visualization Handbook. Charles Hansen and Chris Johnson, eds. Academic Press, 2004.
– The Visualization Toolkit. Will Schroeder, Ken Martin, and Bill Lorensen. Kitware, 2006.
– Visualization of Time-Oriented Data. Wolfgang Aigner, Silvia Miksch, Heidrun Schumann, and Christian Tominski. Springer, 2011.
18
VAD Ch 3: Task Abstraction
19
[VAD Fig 3.1]
Why?
– Actions
  – Analyze
    – Consume: discover, present, enjoy
    – Produce: annotate, record, derive (tag)
  – Search (by whether target and location are known): lookup, locate, browse, explore
  – Query: identify, compare, summarize
– Targets
  – All data: trends, outliers, features
  – Attributes: one (distribution, extremes); many (dependency, correlation, similarity)
  – Network data: topology, paths
  – Spatial data: shape
–discover distribution
–compare trends
–locate outliers
–browse topology
20
High-level actions: Analyze
– consume
  – discover vs present
  – enjoy
– produce
  – annotate, record
  – derive (tag)
Derive
–decide what the right thing to show is
–create it with a series of transformations from the original dataset
–draw that
21
Original data: exports, imports
Derived data: trade balance = exports − imports
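In code, deriving is just a per-item transformation that produces a new attribute; a tiny sketch with hypothetical values:

```python
# Original attributes: exports and imports per item (country).
trade = [
    {"country": "A", "exports": 120.0, "imports": 100.0},
    {"country": "B", "exports": 80.0, "imports": 95.0},
]

# Derived attribute: trade balance = exports - imports.
for row in trade:
    row["balance"] = row["exports"] - row["imports"]
```

The derived attribute, not the two originals, is then what gets visually encoded.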
22
Actions: Mid-level search, low-level query
– search classified by what is known in advance: target, location
  – lookup: target known, location known
  – locate: target known, location unknown
  – browse: target unknown, location known
  – explore: target unknown, location unknown
– query: how much of the data matters?
  – identify: one
  – compare: some
  – summarize: all
– mix & match: analyze, search, query
23
Targets
– all data: trends, outliers, features
– attributes: one (distribution, extremes); many (dependency, correlation, similarity)
– network data: topology, paths
– spatial data: shape
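The four search actions are fully determined by the two booleans (target known, location known); a minimal sketch:

```python
def search_action(target_known: bool, location_known: bool) -> str:
    """Classify a search task by what the user already knows."""
    if target_known and location_known:
        return "lookup"
    if target_known:
        return "locate"
    if location_known:
        return "browse"
    return "explore"
```

For example, scanning a known neighborhood of a tree for an interesting node is a browse; open-ended scanning with no known target or location is an explore.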
Analysis example: Compare idioms
24
[SpaceTree: Supporting Exploration in Large Node Link Tree, Design Evolution and Empirical Evaluation.]
SpaceTree
[TreeJuxtaposer: Scalable Tree Comparison Using Focus+Context With Guaranteed Visibility. ACM Trans. on Graphics (Proc. SIGGRAPH) 22:453–462, 2003.]
TreeJuxtaposer
What? Tree
Why? Actions: present, locate, identify. Targets: path between two nodes.
How? SpaceTree: encode, navigate, select, filter, aggregate. TreeJuxtaposer: encode, navigate, select, arrange.
Analysis example: Derive one attribute
25
[Using Strahler numbers for real time visual exploration of huge graphs. Auber.]
– centrality metric for trees/networks
– derived quantitative attribute
– draw top 5K of 500K nodes for good skeleton
Task 1
– What? In: tree. Out: quantitative attribute on nodes.
– Why? Derive.
Task 2
– What? In: tree + quantitative attribute on nodes. Out: filtered tree (unimportant parts removed).
– Why? Summarize (topology); Reduce (filter).
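The derive step can be sketched as a recursive Strahler-number computation on a rooted tree (the tree and names here are illustrative; Auber's paper extends the metric beyond simple trees):

```python
def strahler(children, node):
    """Strahler number: leaves get 1; an internal node takes its children's
    maximum, plus 1 if two or more children attain that maximum."""
    kids = children.get(node, [])
    if not kids:
        return 1
    vals = sorted((strahler(children, k) for k in kids), reverse=True)
    return vals[0] + 1 if len(vals) > 1 and vals[0] == vals[1] else vals[0]

# Example rooted tree as a parent -> children adjacency map.
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
rank = {n: strahler(tree, n) for n in ["root", "a", "b", "a1", "a2", "b1", "b2"]}
```

The filter step then keeps only the highest-ranked nodes (e.g. top 5K of 500K) to draw a good skeleton.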
Chained sequences
26
–express dependencies
–separate means from ends
joint work with: Michael Sedlmair, Miriah Meyer
Design Study Methodology: Reflections from the Trenches and from the Stacks. Sedlmair, Meyer, Munzner. IEEE Trans. Visualization and Computer Graphics 18(12):2431-2440, 2012 (Proc. InfoVis 2012).
http://www.cs.ubc.ca/labs/imager/tr/2012/dsm/
27
Design Studies: Lessons learned after 21 of them
– MizBee (genomics)
– Car-X-Ray (in-car networks)
– Cerebral (genomics)
– RelEx (in-car networks)
– AutobahnVis (in-car networks)
– QuestVis (sustainability)
– LiveRAC (server hosting)
– Pathline (genomics)
– SessionViewer (web log analysis)
– PowerSetViewer (data mining)
– MostVis (in-car networks)
– Constellation (linguistics)
– Caidants (multicast)
– Vismon (fisheries management)
– ProgSpy2010 (in-car networks)
– WiKeVis (in-car networks)
– Cardiogram (in-car networks)
– LibVis (cultural heritage)
– MulteeSum (genomics)
– LastHistory (music listening)
– VisTra (in-car networks)
28
Methodology
29
Methodology for problem-driven work
[figure: when design study methodology is suitable vs. when algorithm automation is possible; precondition phase (personal validation), core phase (inward-facing validation), analysis phase]
30
Design studies: problem-driven vis research
–real users and real data
–collaboration is (often) fundamental
–implications: requirements, multiple ideas
–validation at appropriate levels
–transferable research: improve design guidelines for vis in general
31
Design study methodology: definitions
32
[figure: 2D space of task clarity (fuzzy → crisp) × information location (head → computer); design study methodology is suitable in the middle region, between "not enough data" and "algorithm automation possible"]
9-stage framework
– precondition: learn, winnow, cast
– core: discover, design, implement, deploy
– analysis: reflect, write
33
Design study methodology: 32 pitfalls
38
PF-1 premature advance: jumping forward over stages (general)
PF-2 premature start: insufficient knowledge of vis literature (learn)
PF-3 premature commitment: collaboration with wrong people (winnow)
PF-4 no real data available (yet) (winnow)
PF-5 insufficient time available from potential collaborators (winnow)
PF-6 no need for visualization: problem can be automated (winnow)
PF-7 researcher expertise does not match domain problem (winnow)
PF-8 no need for research: engineering vs. research project (winnow)
PF-9 no need for change: existing tools are good enough (winnow)
PF-10 no real/important/recurring task (winnow)
Collaboration incentives: Bidirectional
–your win: access to more suitable tools, can do better/faster/cheaper science
  –time spent could pay off with earlier access and/or more customized tools
–our win: access to better understanding of your driving problems
  –opportunities to observe how you use the tools
  –leads us to develop guidelines on how to build better tools in general
39
Of course!!! I’m a domain expert! Wanna collaborate?
40
PREMATURE COLLABORATION COMMITMENT
41
Collaborator winnowing
– start broad with many potential collaborators, then narrow
– initial conversation → further meetings → prototyping → full collaboration (one collaborator)
Design study methodology: 32 pitfalls
47
Have data? Have time? Have need? ... Research problem for me?...
48
Design study methodology: 32 pitfalls
49
Are you a user??? ... or maybe a fellow tool builder?
50
biologist vs. bioinformatician
Examples from the trenches
– PowerSet Viewer: 2 years / 4 researchers
– WiKeVis: 0.5 years / 2 researchers
51
Design study methodology: 32 pitfalls
52
53
PREMATURE DESIGN COMMITMENT
I want a tool with that cool technique I saw the
Of course they need the cool technique I built last year!
54
PREMATURE DESIGN COMMITMENT
[figure: space of candidate solutions (+), one of them good; small scope considers only a few]
Design study methodology: 32 pitfalls
57
[figure: broad scope considers many candidate solutions, more likely to include the good one]
Design study methodology: 32 pitfalls
64
65
PREMATURE DESIGN COMMITMENT
DOMAIN EXPERTS FOCUSED ON VIS DESIGN VS DOMAIN PROBLEM
Tell me more about your current workflow problems! I want a tool with that cool technique I saw the
Design study methodology: 32 pitfalls
66
67
– algorithm innovation: "Must be first!"
– design studies: "Am I ready?"
Pitfall Example: Premature Publishing
Further reading: Design studies
– IEEE Trans. Visualization and Computer Graphics 16(6):908-917 (Proc. InfoVis 2010), 2010.
– 29(3):1043-1052, 2010.
– IEEE Trans. Visualization and Computer Graphics (Proc. InfoVis 2010), 16(6):900-907, 2010.
– ABySS-Explorer: Visualizing genome sequence assemblies. Cydney B. Nielsen, Shaun D. Jackman, Inanc Birol, Steven J.M. Jones. IEEE Trans. Visualization and Computer Graphics (Proc. InfoVis 2009), 15(6):881-888, 2009.
– Visualization of Biomechanical Motion Data. Daniel F. Keefe, Marcus Ewert, William Ribarsky, Remco Chang. IEEE Trans. Visualization and Computer Graphics (Proc. Vis 2009), 15(6):1383-1390, 2009.
– IEEE Trans. Visualization and Computer Graphics (Proc. InfoVis 09), 15(6):897-904, 2009.
– Visual Analysis of Protein Complexes Using Mass Spectrometry. Robert Kincaid and Kurt Dejgaard. IEEE Symp. Visual Analytics Science and Technology (VAST 2009), p 163-170, 2009.
– Cerebral: Visualizing Multiple Experimental Conditions on a Graph with Biological Context. Aaron Barsky, Tamara Munzner, Jennifer L. Gardy, and Robert Kincaid. IEEE Trans. Visualization and Computer Graphics (Proc. InfoVis 2008), 14(6):1253-1260, 2008.
– Visualization (Special Issue on Visual Analytics), Feb 2007.
– Session Viewer: Visual Exploratory Analysis of Web Session Logs. Heidi Lam, Daniel Russell, Diane Tang, and Tamara Munzner. Proc. IEEE Symposium on Visual Analytics Science and Technology (VAST), p 147-154, 2007.
– Information Visualization (2005) 4, 176-190.
– Views for the Visual Exploration of Microarray Time-Series Data. Paul Craig and Jessie Kennedy. Proc. InfoVis 2003, p 173-180.
– Cluster and Calendar based Visualization of Time Series Data. Jarke J. van Wijk and Edward R. van Selow. Proc. InfoVis 1999, p 4-9.
– Constellation: A Visualization Tool For Linguistic Queries from MindNet. Tamara Munzner, Francois Guimbretiere, and George Robertson. Proc. InfoVis 1999, p 132-135.
68
Next Time
–VAD Ch 4: Validation
–VAD Ch 5: Marks and Channels
–VAD Ch 6: Rules of Thumb
–paper: Artery Viz
71