best practices in data visualization

Best Practices in Data Visualization, Jodie Jenkinson

Best Practices in Data Visualization. Jodie Jenkinson, Associate Professor + Director, Biomedical Communications, University of Toronto (bmc.med.utoronto.ca). Why visualize data? Because of the inexplicability of complex information or raw data.


  1. Best Practices in Data Visualization
     Jodie Jenkinson, Associate Professor + Director, Biomedical Communications, University of Toronto
     bmc.med.utoronto.ca
     Why visualize data?
     • Because of the inexplicability of complex information or raw data
     • In order to leverage visual perception
     • To create an aid to understanding
     • To provide insight
     Best Practices in Data Visualization – ComSciComCan Jenkinson

  2. Leveraging visual perception
     Vision is high bandwidth
     • 40%+ of cortex devoted to visual perception
     • Information processing capacity of the visual system
       • High-bandwidth channel
       • 10^9 bits per second* (= 1 billion bits = ~120 megabytes per second)
       • Parallel processing
     * Information Capacity of a Single Retinal Channel, DH Kelly, IRE Transactions on Information Theory, 1962, pp. 221
     [Diagram labels: All visual stimuli, Pre-attentive, Attend to]
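The bandwidth arithmetic on this slide can be checked with a short sketch. It assumes the slide's figure reads 10^9 bits per second; the function names are illustrative and not part of the presentation.

```python
# Assumption: the slide's figure is 10**9 bits per second.
BITS_PER_SECOND = 10**9


def bits_to_megabytes(bits: int) -> float:
    """Convert bits to decimal megabytes (1 MB = 10**6 bytes)."""
    return bits / 8 / 10**6


def bits_to_mebibytes(bits: int) -> float:
    """Convert bits to binary mebibytes (1 MiB = 2**20 bytes)."""
    return bits / 8 / 2**20


if __name__ == "__main__":
    print(f"{bits_to_megabytes(BITS_PER_SECOND):.1f} MB/s (decimal)")
    print(f"{bits_to_mebibytes(BITS_PER_SECOND):.1f} MiB/s (binary)")
```

Either unit convention lands close to the slide's "~120 megabytes per second"; the binary convention is the nearer of the two.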


  4. Providing insight
     Why visualize? Anscombe’s quartet
     • Same
       • Mean (x and y)
       • Variance
       • Correlation
       • Regression
          I             II            III            IV
       x      y      x      y      x      y      x      y
      10   8.04     10   9.14     10   7.46      8   6.58
       8   6.95      8   8.14      8   6.77      8   5.76
      13   7.58     13   8.74     13  12.74      8   7.71
       9   8.81      9   8.77      9   7.11      8   8.84
      11   8.33     11   9.26     11   7.81      8   8.47
      14   9.96     14   8.10     14   8.84      8   7.04
       6   7.24      6   6.13      6   6.08      8   5.25
       4   4.26      4   3.10      4   5.39     19  12.50
      12  10.84     12   9.13     12   8.15      8   5.56
       7   4.82      7   7.26      7   6.42      8   7.91
       5   5.68      5   4.74      5   5.73      8   6.89
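The claim that the four datasets share nearly the same mean, variance, correlation, and regression line can be checked directly from the values on the slide. The sketch below uses only the standard library. The helper name `summarize` and the choice of sample variance (dividing by n - 1) are assumptions made for illustration.

```python
from math import sqrt

# Data as listed on the slide; datasets I to III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return means, sample variances, Pearson r, and the least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": sxx / (n - 1),  # sample variance (assumed convention)
        "var_y": syy / (n - 1),
        "r": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


SUMMARIES = {name: summarize(x, y) for name, (x, y) in QUARTET.items()}

if __name__ == "__main__":
    for name, stats in SUMMARIES.items():
        print(name, {key: round(value, 2) for key, value in stats.items()})
```

The summaries agree to about two decimal places across all four datasets, which is why the differences only become visible once the points are plotted.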

  5. Why visualize? Anscombe’s quartet
     [Figure: data table for datasets I, II, III, and IV with a scatter plot of each; x-axes run from 4 to 20, y-axes from 4 to 13]
     Francis J. Anscombe, Graphs in Statistical Analysis. The American Statistician, vol. 27, no. 1, pp. 17–21, 1973
     “…If a picture is only worth a thousand words, we're screwed.”
     Eric Lander, Professor of Biology, MIT. VIZBI 2011, Opening Remarks
     Data Visualization in a nutshell
     • Visual mapping (encoding data)
     • Providing adequate context
     • Balancing clarity & aesthetics
     Encoding Data

  6. Data types
     • Visual representation of data should be consistent with the numerical representation
     • Nominal
       • name, type, category
       • e.g. mammals, reptiles, birds
     • Ordinal
       • integer sequence
       • e.g. first, second, third
       • happy, very happy, ecstatic
     • Interval
       • gap in values
       • e.g. every three months
     • Ratio
       • real numbers; zero as reference
       • e.g. 45.7 out of 100
     Data features
     • Spatial
       • e.g. maps, GIS, directions
       • scalar fields
     • Narrative
       • e.g. assembly sequence, process

  7. Data features
     • 1d
     • 2d
     • 3d
     Visual encoding
     • The way in which data is mapped to visual structures
     • Every visualization can be described as a set of mappings:
       • from data items to visual marks
       • from data attributes to visual channels
     Marks
     • Data Items map to Data Marks: the basic visual units that represent data objects visually
     • Data Attributes map to Visual Channels: the visual variables we can use to represent characteristics of these objects
     From Enrico Bertini
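The idea of a visualization as a set of mappings can be sketched in a few lines: each data item becomes one mark, and each data attribute is assigned to one visual channel. The records, channel names, and the `encode` helper below are hypothetical and chosen only for illustration; they are not from the presentation.

```python
# Hypothetical data items; each dict is one item, each key one data attribute.
ITEMS = [
    {"species": "cat", "animal_class": "mammal", "mass_kg": 4.0},
    {"species": "gecko", "animal_class": "reptile", "mass_kg": 0.05},
    {"species": "robin", "animal_class": "bird", "mass_kg": 0.02},
]

# Hypothetical encoding: visual channel -> data attribute.
ENCODING = {
    "x_position": "species",
    "bar_length": "mass_kg",
    "colour_hue": "animal_class",
}


def encode(items, encoding):
    """Produce one mark per data item, with channel values taken from attributes."""
    return [
        {channel: item[attribute] for channel, attribute in encoding.items()}
        for item in items
    ]


MARKS = encode(ITEMS, ENCODING)

if __name__ == "__main__":
    for mark in MARKS:
        print(mark)
```

Changing the chart type then amounts to changing the `ENCODING` dictionary, while the data items stay the same.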

  8. Channels (from Enrico Bertini)
     Cleveland & McGill’s Perceptual Task Scale (from The Functional Art, Alberto Cairo)
     Allows more accurate judgments
     • Position along a common scale
     • Position along nonaligned scales
     • Length
     • Direction
     • Angle
     • Area
     • Volume
     • Curvature
     • Shading
     • Colour saturation
     Allows more generic judgments
     Cleveland and McGill’s Perceptual Task Scale
     • Tasks are grouped according to how well you can perceive differences in the data
     A basic example: the pie chart
     • People are not good at making visual angular distinctions
     • Pie charts are sometimes rolled out to encode 1 or 2 numbers; usually a very low data density!
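The perceptual task scale can be used as a lookup when choosing between candidate encodings. The sketch below assumes the order in which the tasks are listed on the slide, and treats it as a strict ranking, which is a simplification since the original scale groups some tasks together. The function names are illustrative.

```python
# Order as listed on the slide, from most accurate to most generic judgments.
PERCEPTUAL_TASK_SCALE = [
    "position along a common scale",
    "position along nonaligned scales",
    "length",
    "direction",
    "angle",
    "area",
    "volume",
    "curvature",
    "shading",
    "colour saturation",
]


def accuracy_rank(channel: str) -> int:
    """Return the position of a channel on the scale (0 = most accurate)."""
    return PERCEPTUAL_TASK_SCALE.index(channel)


def most_accurate(candidates):
    """Pick the candidate channel that sits highest on the scale."""
    return min(candidates, key=accuracy_rank)


if __name__ == "__main__":
    # A pie chart relies on angle and area; a bar chart on position and length.
    print(most_accurate(["angle", "area"]))
    print(most_accurate(["position along a common scale", "length"]))
```

Read this way, a bar chart outranks a pie chart for the same data because its best available channel sits higher on the scale.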

  9. Pie charts
     [Figure: pie charts with slices for the years 2002 to 2007; labelled values 7%, 8%, 10%, 11%, 29%, 35%]
     Lies, damned lies…
     • Misalignments between graphic elements and the data they are intended to represent
     “There are three kinds of lies: lies, damned lies, and statistics” Benjamin Disraeli

  10. Misleading use of area
      [Figures: four example slides]

  11. Linear vs. quadratic change
      • A common mistake in chart design is to scale an area by two sides at the same time, producing a quadratic effect for a linear change
      Squares
      • The area of the white square (a = 80) = a² = 6,400 px²
      • To double the area of the white square to 12,800 px², a = √12,800 ≈ 113
      • The area of the red square (a = 113) is twice that of the white square
      • The area of the grey square (a = 160) = 25,600 px², four times that of the white square
      Circles
      • The area of the blue circle (r = 80) is equal to πr² (20,106)
      • The area of the red circle (r = 160) is equal to πr² (80,424)
      Size encoding
      • Height
      • Area
      • Volume
      How statistics lie
      • Modifications to the X or Y axis in an attempt to make differences or change appear to be more dramatic
      • Data represented out of context does not allow for adequate comparison
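The square example on this slide can be reproduced with a small calculation: to scale an area by some ratio, each side is scaled by the square root of that ratio. The function names below are illustrative and not from the presentation.

```python
import math


def side_for_area_ratio(base_side: float, area_ratio: float) -> float:
    """Side length that scales the area of a square by area_ratio."""
    return base_side * math.sqrt(area_ratio)


def area_ratio_for_side_factor(side_factor: float) -> float:
    """Area growth produced by multiplying both sides by side_factor."""
    return side_factor ** 2


if __name__ == "__main__":
    base = 80  # side of the white square on the slide, in px
    doubled = side_for_area_ratio(base, 2)
    print(f"side for twice the area: {doubled:.0f} px")
    print(f"doubling both sides multiplies the area by {area_ratio_for_side_factor(2):.0f}")
```

The same rule applies to circles, with the radius taking the place of the side length.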

  12. The disappearing baseline
      [Figures: two example slides]
      Reversing the x-axis
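The effect of a disappearing baseline can be quantified by comparing how tall two bars appear when the axis starts at zero versus at a truncated value. This is a minimal sketch with hypothetical numbers. The function `apparent_ratio` is illustrative and not from the presentation.

```python
def apparent_ratio(smaller: float, larger: float, baseline: float = 0.0) -> float:
    """Ratio of drawn bar heights when the axis starts at `baseline`."""
    if baseline >= smaller:
        raise ValueError("baseline must lie below the smaller value")
    return (larger - baseline) / (smaller - baseline)


if __name__ == "__main__":
    # Hypothetical values: the real difference is 10%.
    low, high = 50.0, 55.0
    print(f"axis from 0:  bars differ by a factor of {apparent_ratio(low, high):.2f}")
    print(f"axis from 45: bars differ by a factor of {apparent_ratio(low, high, 45.0):.2f}")
```

With the axis starting at 45, a 10% difference is drawn as one bar twice the height of the other.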

  13. Reversing the x-axis
      No defined y-axis
      [Figures: two example slides]
      Context

  14. Context
      • Without context we are unable to see the big picture
      • Without context we are unable to make meaningful comparisons
      [Figures: three examples. Source: New York Times]

  15. Data in and out of context
      [Figures: three example slides]
      Historical context
