
slide-1
SLIDE 1

What is data visualization & how can you use it in your daily work?

Anamaria (Ana) Crisan PhD Student University of British Columbia

Evening Rounds, November 15th, 2016

slide-2
SLIDE 2

Undergrad: CompSci

Masters: Bioinformatics

PhD

EXPERIENCE (work experience): GenomeDX (Clinical), BCCDC (Public Health)

Computer Science Skills

+ Data Visualization Skills!

WHO AM I?

@amcrisan http://cs.ubc.ca/~acrisan

slide-3
SLIDE 3

WHAT DO I RESEARCH?

Genomic Contact Network Patient Data Outcomes Geography / Location time Treatment

Person Place Time

TB Nurses TB Clinicians Medical Health Officers Researchers Community Leaders

slide-4
SLIDE 4

WHAT DO I RESEARCH?

slide-5
SLIDE 5

Data Visualization is not an art or graphic design project

WHAT YOU CAN TAKE AWAY FROM THIS TALK

Deciding upon the most appropriate data visualization can be a research problem

Think about the “why, what, and how” framework (Design & Evaluation)

Think broadly, progressively find the right data visualization

Example: Communicating Patient Risk

Example: Visualization and Election

slide-6
SLIDE 6

IF VISUALIZATIONS WERE CARS

A BEAUTIFULLY IMPRACTICAL OPTION

Ferrari visualizations look super cool and take a lot of time, effort, and resources to produce, but they’re not necessarily practical for most applications. Worthwhile creating sometimes, but think it through.

slide-7
SLIDE 7

A LESS BEAUTIFUL PRACTICAL OPTION

Toyota visualizations are well engineered and fit a variety of needs, making them a more practical choice. They are also less expensive (time, effort, money) than a Ferrari. They lack the “wow” factor of a Ferrari, but can hold their own.

IF VISUALIZATIONS WERE CARS

slide-8
SLIDE 8

POSSIBLY DANGEROUS

Pinto visualizations are tempting because they are inexpensive (really low time, energy, money), but they are questionably engineered. Aspects of the visualization are not properly tested with stakeholders, and they can explode if tapped the wrong way.

IF VISUALIZATIONS WERE CARS

slide-9
SLIDE 9

DATA VISUALIZATION IS NOT AN ART PROJECT

slide-10
SLIDE 10
  • Before we talk “big data” let’s talk “artisanal small batch data”
  • With the paper in front of you, sketch out as many examples as you can to visualize the following two numbers:

EXERCISE: VISUALIZING TWO QUANTITIES

75 37

example:

slide-11
SLIDE 11

http://www.scribblelive.com/blog/2012/07/27/45-ways-to-communicate-two-quantities/
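To make the exercise concrete, here is a minimal Python sketch of three ways to encode the same two numbers (75 and 37) in plain text. It is not taken from the linked post; the helper name `text_bar` and the choice of one symbol per five units are assumptions made for illustration.

```python
def text_bar(value, scale=1, symbol="#"):
    """Encode a quantity as length: one symbol per `scale` units (rounded)."""
    return symbol * round(value / scale)

a, b = 75, 37  # the two numbers from the exercise

# Encoding 1: length, with both bars aligned on a common scale
for label, value in (("A", a), ("B", b)):
    print(f"{label} {value:>3} | {text_bar(value, scale=5)}")

# Encoding 2: a single ratio between the two quantities
ratio = a / b
print(f"A is about {ratio:.1f}x B")

# Encoding 3: part of a whole
share_a = a / (a + b)
print(f"A is {share_a:.0%} of the total")
```

Each encoding answers a slightly different question (how big, how much bigger, what share), which is one reason to sketch many alternatives before settling on one.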

EXERCISE: VISUALIZING TWO QUANTITIES

Why do this exercise?

slide-12
SLIDE 12

Borkin (2011). “Evaluation of Artery Visualizations for Heart Disease Diagnosis”

EXAMPLE : CHANGING ARTERY VISUALIZATIONS

MOTIVATION: Improve accuracy to identify blockages in heart arteries

slide-13
SLIDE 13

Borkin (2011). “Evaluation of Artery Visualizations for Heart Disease Diagnosis”

EXAMPLE : CHANGING ARTERY VISUALIZATIONS

MOTIVATION: Improve accuracy to identify blockages in heart arteries

slide-14
SLIDE 14

Borkin (2011). “Evaluation of Artery Visualizations for Heart Disease Diagnosis”

EXISTING STANDARD Accuracy: 39%

REVISED VISUALIZATION Accuracy: 91%

RESULTS: Revised visualizations had higher accuracy

EXAMPLE : CHANGING ARTERY VISUALIZATIONS

slide-15
SLIDE 15

Why?

Why do you need to visualize data?

What?

What kind of data is being visualized?

How?

How is data being visualized?

(Motivation) (Data) (Visual and Interaction Design)

DATA VISUALIZATION IN THREE QUESTIONS

slide-16
SLIDE 16

A QUICK NOTE ON “WHAT”

Munzner (2014) “Visualization Analysis and Design”

Don’t just visualize the raw data!

slide-17
SLIDE 17

Why? Does the visualization solve a relevant problem?

What? Are you using the right data, or deriving the right data?

How? Are the visual and interactive design choices appropriate?

Design & Evaluation

DATA VISUALIZATION IN THREE QUESTIONS

slide-18
SLIDE 18

HOW TO DESIGN & EVALUATE DATA VIZ

Munzner (2014) “Visualization Analysis and Design”

A visualization can be decomposed into four layers: Motivation, Data, Visual + Interaction Design Choices, Technique

Why? What? How?

slide-19
SLIDE 19

HOW TO DESIGN & EVALUATE DATA VIZ

Munzner (2014) “Visualization Analysis and Design”

A visualization can be decomposed into four layers: Motivation, Data, Visual + Interaction Design Choices, Technique

Why? What? How?

Design Process: start with the “why” (domain problem) and work your way in to the “how”

DESIGN

slide-20
SLIDE 20

HOW TO DESIGN & EVALUATE DATA VIZ

Munzner (2014) “Visualization Analysis and Design”

A visualization can be decomposed into four layers: Motivation, Data, Visual + Interaction Design Choices, Technique

Why? What? How?

Evaluation Process: start with the “how” and assess if it’s the right choice for the “why” and “what”

EVALUATION

slide-21
SLIDE 21

HOW TO DESIGN & EVALUATE DATA VIZ

Munzner (2014) “Visualization Analysis and Design”

A visualization can be decomposed into four layers: Motivation, Data, Visual + Interaction Design Choices, Technique

Why? What? How?

We’ll talk a little bit about the “how” today

slide-22
SLIDE 22

BREAKING DOWN “HOW”

Munzner (2014) “Visualization Analysis and Design”

Building up a visualization from geometric points

slide-23
SLIDE 23

Cleveland and McGill 1984; Heer and Bostock 2010

Some visualizations are more effective than others

Most errors Least Errors
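One way to use this kind of ranking when choosing an encoding is sketched below in Python. The ordering is a simplified, strict version of the elementary perceptual tasks reported by Cleveland and McGill (1984); the original paper groups some tasks into the same tier, and the figure on this slide may order things differently. The names `CHANNEL_RANKING` and `most_effective` are hypothetical.

```python
# Simplified ordering, from least to most error-prone, loosely following
# Cleveland and McGill (1984). Treat it as an assumption, not a fixed rule.
CHANNEL_RANKING = [
    "position on a common scale",
    "position on non-aligned scales",
    "length",
    "angle or slope",
    "area",
    "volume",
    "colour saturation or shading",
]

def most_effective(available):
    """Return the available channel that appears earliest in the ranking."""
    ranked = [channel for channel in CHANNEL_RANKING if channel in available]
    if not ranked:
        raise ValueError("none of the available channels is in the ranking")
    return ranked[0]

# Given a choice between area, length, and angle, the sketch prefers length.
print(most_effective({"area", "length", "angle or slope"}))
```

A lookup like this is only a starting point; the evaluation step still has to confirm that the chosen encoding works for the actual audience and data.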

BREAKING DOWN “HOW”

slide-24
SLIDE 24

Munzner (2014) “Visualization Analysis and Design”

EXAMPLE: BREAKING DOWN A VISUALIZATION

[Figure: example charts labelled with the channels they use: vertical position, horizontal position, colour, size]

slide-25
SLIDE 25

Five dimensions are plotted in 2D (4 continuous dimensions & 1 categorical dimension)

Position: HE
Position: LE
Colour = Continent
Size = Population
Transparency = Similarity

EXAMPLE: BREAKING DOWN A VISUALIZATION

slide-26
SLIDE 26

EXAMPLE: BREAKING DOWN A VISUALIZATION

*Note* not the same data

slide-27
SLIDE 27

Find more terrible visualizations here!

slide-28
SLIDE 28

Matthew Brehmer’s totally subjective ranking of vis design tools

SOFTWARE TOOLS FOR DATA VISUALIZATION

slide-29
SLIDE 29

BUT ALSO PEN & PAPER!

Dear Data Project (Lupi & Posavec)

slide-30
SLIDE 30

VISUALIZING AN ELECTION

slide-31
SLIDE 31

  • Point of example is not to discuss:
      • Correctness / relevancy of polling or forecasting
      • Politics of results
  • Very interesting data visualizations emerged from US election cycle (before & after)
  • Forecasting relied on reporting probabilities; also commonly reported in medicine

US ELECTIONS DATA VISUALIZATIONS

slide-32
SLIDE 32

http://bit.ly/1FxtT2z

PROBABILITY INCONSISTENTLY INTERPRETED

slide-33
SLIDE 33

US ELECTIONS DATA VISUALIZATIONS

WHY: Show forecasted voter intentions
WHAT: % chance of “winning” state; geography
HOW: Choropleth map

slide-34
SLIDE 34

US ELECTIONS DATA VISUALIZATIONS

WHY: Show forecasted voter intentions
WHAT: % chance of “winning” state; geography; # of EC votes
HOW: Cartogram

slide-35
SLIDE 35

US ELECTIONS DATA VISUALIZATIONS

WHY: Show forecasted voter intentions
WHAT: % chance of “winning” state; # of EC votes
HOW: Sankey diagram

slide-36
SLIDE 36

US ELECTIONS DATA VISUALIZATIONS

WHY: Support for each party by region
WHAT: Margin of win; total # votes cast; geography
HOW: Choropleth map

slide-37
SLIDE 37

US ELECTIONS DATA VISUALIZATIONS

WHY: Changes in votes cast between 2016 & 2012
WHAT: Changing support; margin (points) of lead by party
HOW: Choropleth map

slide-38
SLIDE 38

US ELECTIONS DATA VISUALIZATIONS

WHY: Changes in votes cast between 2016 & 2012
WHAT: Changing support; margin (points) of lead by party
HOW: Choropleth map

slide-39
SLIDE 39

US ELECTIONS DATA VISUALIZATIONS

WHY: Changes in votes cast between 2016 & 2012
WHAT: Changing support; margin (percentage) of lead by party; # EC votes
HOW: Stacked bar chart

slide-40
SLIDE 40

  • All visualizations are trying to solve a very similar problem
      § Show how people may vote & how this affects elections
  • Very different types of data shown in each visualization
      § Visualization only as good as underlying data
      § VERY important to understand data sources
  • Different use of visual metaphors, some simple, some complex

US ELECTIONS DATA VISUALIZATIONS
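The why / what / how summaries from the election slides can be written down as small records, which makes it easy to see that one motivation led to several different designs. This is a minimal Python sketch; the `VisSummary` class and `designs_by_motivation` function are hypothetical names, and the entries are copied from the slide summaries with the two identical slides listed once.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class VisSummary:
    why: str   # motivation
    what: str  # data being shown
    how: str   # visual design

EXAMPLES = [
    VisSummary("Show forecasted voter intentions",
               "% chance of winning state; geography", "Choropleth map"),
    VisSummary("Show forecasted voter intentions",
               "% chance of winning state; geography; # of EC votes", "Cartogram"),
    VisSummary("Show forecasted voter intentions",
               "% chance of winning state; # of EC votes", "Sankey diagram"),
    VisSummary("Support for each party by region",
               "Margin of win; total # votes cast; geography", "Choropleth map"),
    VisSummary("Changes in votes cast between 2016 & 2012",
               "Changing support; margin (points) of lead by party", "Choropleth map"),
    VisSummary("Changes in votes cast between 2016 & 2012",
               "Changing support; margin (percentage) of lead by party; # EC votes",
               "Stacked bar chart"),
]

def designs_by_motivation(examples):
    """Group the distinct 'how' choices under each 'why', keeping first-seen order."""
    grouped = defaultdict(list)
    for example in examples:
        if example.how not in grouped[example.why]:
            grouped[example.why].append(example.how)
    return dict(grouped)

for why, hows in designs_by_motivation(EXAMPLES).items():
    print(f"{why}: {', '.join(hows)}")
```

Grouping by "why" shows the point of the summary slide: similar problems, different data, and different visual metaphors.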

slide-41
SLIDE 41

COMMUNICATING PATIENT RISK

slide-42
SLIDE 42

XKCD Comic #881

HOW DO WE COMMUNICATE RISK?

slide-43
SLIDE 43

Probability (60%) < Frequency (6 in 10) < Visualization

(difficult to understand) → (easier to understand)

EVIDENCE FROM RISK COMMUNICATION

Whiting (2015) “How well do health professionals interpret diagnostic information? A systematic review”

  • Numeracy: the ability to reason with numbers
      § Individuals with low numeracy have difficulty interpreting numbers and probabilities
  • Visualizations can help people with low numeracy make sense of data
  • But: limited guidance toward vis design
      § Different visualizations are not equally effective
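The three formats compared on this slide (probability, frequency, visualization) can be produced from the same number. Below is a minimal Python sketch; the helper names `as_frequency` and `icon_array` are hypothetical, and the row of text symbols only stands in for a real icon-array graphic.

```python
def as_frequency(probability, denominator=10):
    """Express a probability in frequency wording, e.g. 0.60 -> '6 in 10'."""
    count = round(probability * denominator)
    return f"{count} in {denominator}"

def icon_array(probability, total=10, filled="#", empty="."):
    """Draw `total` icons, filling the share that matches the probability."""
    count = round(probability * total)
    return filled * count + empty * (total - count)

p = 0.60
print(f"Probability:   {p:.0%}")
print(f"Frequency:     {as_frequency(p)}")
print(f"Visualization: {icon_array(p)}")
```

The rounding step matters: with a small denominator such as 10, probabilities like 0.64 and 0.56 both become "6 in 10", so the denominator should be chosen to match the precision the audience needs.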

slide-44
SLIDE 44

Garcia-Retamero et al. (2013) “Visual representation of statistical information improves diagnostic inferences in doctors and their patients”

STUDY DESIGN

Quasi-randomized trial with four conditions: patients and doctors received the information as a probability or as a frequency, each with or without a visual aid

Outcome: correctly calculating the risk (essentially a math test)

RESULTS

Visualization improved comprehension of both doctors and patients

Visualization improved concordance between doctors and patients

EXAMPLE : SHARED DECISION MAKING

slide-45
SLIDE 45

EXAMPLE : BREAST CANCER TX CHOICE

Baseline Visualization Alternative 1 Alternative 2

slide-46
SLIDE 46

EXAMPLE: WWW.VIZHEALTH.ORG

slide-47
SLIDE 47

MAMMOGRAPHY SCREENING PROBLEM

What is the probability that a woman who participates in routine screening and receives a positive result has breast cancer?

“The probability of breast cancer is 1% for a woman who participates in routine screening. If a woman who participates in routine screening has breast cancer, the probability is 80% that she will have a positive test result. If a woman who participates in routine screening does not have breast cancer, the probability is 9.6% that she will have a positive test result.”

slide-48
SLIDE 48

MAMMOGRAPHY SCREENING PROBLEM

“The probability of breast cancer is 1% for a woman who participates in routine screening. If a woman who participates in routine screening has breast cancer, the probability is 80% that she will have a positive test result. If a woman who participates in routine screening does not have breast cancer, the probability is 9.6% that she will have a positive test result.”

                   Has Breast Cancer   No Breast Cancer   Total
Positive Result            8                  95           103
Negative Result            2                 895           897
Total                   10 (1%)           990 (99%)       1000

slide-49
SLIDE 49

MAMMOGRAPHY SCREENING PROBLEM

“The probability of breast cancer is 1% for a woman who participates in routine screening. If a woman who participates in routine screening has breast cancer, the probability is 80% that she will have a positive test result. If a woman who participates in routine screening does not have breast cancer, the probability is 9.6% that she will have a positive test result.”

                   Has Breast Cancer   No Breast Cancer   Total
Positive Result            8                  95           103
Negative Result            2                 895           897
Total                   10 (1%)           990 (99%)       1000

Probability of breast cancer given positive mammography: 7.8% (8 ÷ 103)
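The 7.8% figure can be checked two ways: directly with Bayes' rule, and by rebuilding the counts for 1000 women. The Python sketch below does both; the function and parameter names are illustrative and not from the talk, and the inputs are the three percentages quoted in the problem.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """P(cancer | positive result) via Bayes' rule."""
    true_positive = prevalence * sensitivity
    false_positive = (1 - prevalence) * false_positive_rate
    return true_positive / (true_positive + false_positive)

def natural_frequencies(prevalence, sensitivity, false_positive_rate, population=1000):
    """Rebuild the 2x2 table as whole-number counts for a given population."""
    with_cancer = round(population * prevalence)
    without_cancer = population - with_cancer
    true_positive = round(with_cancer * sensitivity)
    false_positive = round(without_cancer * false_positive_rate)
    return {
        "cancer, positive": true_positive,
        "cancer, negative": with_cancer - true_positive,
        "no cancer, positive": false_positive,
        "no cancer, negative": without_cancer - false_positive,
    }

# Values quoted in the problem statement
prevalence, sensitivity, false_positive_rate = 0.01, 0.80, 0.096

ppv = positive_predictive_value(prevalence, sensitivity, false_positive_rate)
counts = natural_frequencies(prevalence, sensitivity, false_positive_rate)
positives = counts["cancer, positive"] + counts["no cancer, positive"]

print(f"Bayes' rule:      {ppv:.1%}")
print(f"From the counts:  {counts['cancer, positive']} / {positives} "
      f"= {counts['cancer, positive'] / positives:.1%}")
```

The two results differ slightly beyond the first decimal because the counts are rounded to whole women, but both round to 7.8%.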

slide-50
SLIDE 50

MAMMOGRAPHY SCREENING PROBLEM

1000 women
    10 have breast cancer
        8 positive result
        2 negative result
    990 do not have breast cancer
        895 negative result
        95 positive result

slide-51
SLIDE 51

MAMMOGRAPHY SCREENING PROBLEM

slide-52
SLIDE 52

MAMMOGRAPHY SCREENING PROBLEM

slide-53
SLIDE 53

WHAT DO YOU THINK?

Why? What? How?