Visualizing Public Health Data Anamaria Crisan, MSc PhD student - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Visualizing Public Health Data

Anamaria Crisan, MSc

PhD student with Drs. Jennifer Gardy & Tamara Munzner
UBC School of Population and Public Health
slide-2
SLIDE 2 2

Primary Research Question

To what extent and in what ways does the visualization of genomic, administrative, and contact network data support decision making for communicable disease prevention and control?

slide-3
SLIDE 3 3

Primary Research Question

To what extent and in what ways does the visualization of genomic, administrative, and contact network data support decision making for communicable disease prevention and control?

  • aka. “How is visualization of communicable disease (public health) data useful? Can I quantify how useful it is?”

slide-4
SLIDE 4 4

The Structure of this Talk

How I came to ask this question

  • Communicating with non-technical experts
  • Communicating cancer risk to patients
  • Statistics and data visualization

slide-5
SLIDE 5 5

The Structure of this Talk

How I came to ask this question

  • Communicating with non-technical experts
  • Communicating cancer risk to patients
  • Statistics and data visualization

How I plan to answer this question

  • Data Visualization Research
  • Integration with Evaluation from Public Health
  • Examples of Work

slide-6
SLIDE 6

Part 1:

How I came to ask the question

slide-7
SLIDE 7 7

Disclaimer

I’ll be talking about a project I worked on while employed at GenomeDx Biosciences. Everything I am presenting is publicly available, but this doesn’t mean that I endorse their products or the products of their competitors. Furthermore, I am relaying high-level details of my own thought process during and after this project, not the thoughts of others at the organization.

slide-8
SLIDE 8 8

I’m not an artist. I’m a data analyst.

http://blog.framed.io/ Computer Science Skills + Data Visualization Skills!
slide-9
SLIDE 9 9

Eventually I Had to Explain my Work to Experts with Different Backgrounds

I often used data visualization to explain the results of data mining and statistical techniques. But one day I got tasked with a rather challenging problem…

slide-10
SLIDE 10 10

The Question:

The task: We had developed a genomic biomarker panel to assess a man’s risk of metastatic prostate cancer following prostatectomy

How do we communicate “risk”?

XKCD Comic #881
slide-11
SLIDE 11 11

I wanted to take more ownership of the question “how do we communicate risk?”

slide-12
SLIDE 12 12

I wanted to take more ownership of the question “how do we communicate risk?” There wasn’t a simple answer

slide-13
SLIDE 13 13

http://bit.ly/1Knrj19

Just show a Number …

slide-14
SLIDE 14 14

Evidence from Risk Communication Literature

Probability (60%) < Frequency (6 in 10) < Visualization
(difficult to understand) → (easier to understand)

Whiting et al. (2015) “How well do health professionals interpret diagnostic information? A systematic review”

Numeracy: the ability to reason with numbers

  • Individuals with low numeracy have difficulty interpreting numbers and probabilities
  • Visualizations can help people with low numeracy make sense of data
  • But there is some evidence that low numeracy affects reasoning with graphs as well
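The three formats compared on this slide (probability, frequency, visualization) can be illustrated with a minimal Python sketch. The function names and the text-based icon array are assumptions made for illustration; they are not taken from the talk or from any of the reports it shows.

```python
def as_percentage(probability: float) -> str:
    """Format a probability (0..1) as a percentage, e.g. 0.6 -> '60%'."""
    return f"{round(probability * 100)}%"


def as_frequency(probability: float, out_of: int = 10) -> str:
    """Format the same risk as a natural frequency, e.g. 0.6 -> '6 in 10'."""
    affected = round(probability * out_of)
    return f"{affected} in {out_of}"


def as_icon_array(probability: float, out_of: int = 10) -> str:
    """Render the frequency as a one-row text icon array ('#' = affected)."""
    affected = round(probability * out_of)
    return "#" * affected + "." * (out_of - affected)


risk = 0.6  # the 60% example from the slide
print(as_percentage(risk))
print(as_frequency(risk))
print(as_icon_array(risk))
```

All three outputs carry the same number; only the representation changes, which is the point the risk communication literature makes.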

slide-15
SLIDE 15 15

Example : Data Visualization in Shared Decision Making

Garcia-Retamero et al. (2013) “Visual representation of statistical information improves diagnostic inferences in doctors and their patients”

STUDY DESIGN

  • Quasi-randomized trial with four conditions
  • Patients + doctors randomized to a probability or frequency format, then to a visual aid or no visual aid
  • Outcome: correctly calculating the risk (essentially a math test)

RESULTS

  • Visualization improved comprehension of both doctors and patients
  • Visualization improved concordance between doctors and patients
slide-16
SLIDE 16

Yes! Data visualization was more than a “nice to have”!

slide-17
SLIDE 17 17

Example Report: OncotypeDx DCIS report

Show a Number and a Picture

slide-18
SLIDE 18 18

Example Report: Myriad Prolaris Prostate Cancer Test Report

Show a Number and a Picture

slide-19
SLIDE 19 19

Example Report: Decipher Prostate Cancer Test Report

Primary population: Men, who are susceptible to red-green colour blindness

Show a Number and a Picture

slide-20
SLIDE 20 20

Example : Deciding upon an Intervention

Helping breast cancer patients decide between multiple treatment options.

Baseline Visualization, Alternative 1, Alternative 2

Zikmund-Fisher (2013). A demonstration of “less can be more” in risk graphics.
Zikmund-Fisher (2008). Improving understanding of adjuvant therapy options by using simpler risk graphics.
slide-21
SLIDE 21 21

Data visualization is not art

Beyond Building Pretty & Cool Visualizations

slide-22
SLIDE 22

Design Art

Ideas taken from @rachelbinx’s 2016 Open Vis talk and http://featureguru.com/art-vs-design.html

Data Visualization

(I argue data visualization is much more about design)

Defining Data Visualization

Beyond Building Pretty & Cool Visualizations

slide-23
SLIDE 23

Final Data Visualization

TB incidence rates overlain on geography (BCCDC reportable disease dashboard)

Iceberg ideas borrowed from @rachelbinx’s 2016 Open Vis talk

There’s More to a Visualization than Meets the Eye
slide-24
SLIDE 24

Final Data Visualization

TB incidence rates overlain on geography (BCCDC reportable disease dashboard)

There’s More to a Visualization than Meets the Eye

slide-25
SLIDE 25

But there was a lot that went into creating that simple visualization

Data

  • We rarely visualize raw data
  • Data has issues of quality
  • We combined multiple datasets
  • We often derive new data

There’s More to a Visualization than Meets the Eye

slide-26
SLIDE 26

But there was a lot that went into creating that simple visualization

Alternative choices

Picked this choice of visualization over others

There’s More to a Visualization than Meets the Eye

slide-27
SLIDE 27

But there was a lot that went into creating that simple visualization

Visual & Interactive Design

  • Visual Design: how the visualized data looks
  • Interaction Design: how to interact with the data visualization

There’s More to a Visualization than Meets the Eye

slide-28
SLIDE 28

But there was a lot that went into creating that simple visualization

Motivations

  • Increase public awareness
  • Allocate resources
  • Monitor program progress
  • Target outreach programs

There’s More to a Visualization than Meets the Eye

slide-29
SLIDE 29

But there was a lot that went into creating that simple visualization

There’s More to a Visualization than Meets the Eye

Tasks

(Atomized components of the motivation)

  • Communicate rates of TB by Health Service Delivery Area (HSDA) region
  • Overlay descriptive statistics on geography

slide-30
SLIDE 30 30

There’s more to data visualization than simply communicating numerical data

BUT WAIT!

slide-31
SLIDE 31 31

Example : Hypothesis Generation

John Snow’s Visualization of the 1854 Cholera Outbreak

Allowed John Snow to form the hypothesis of what may be leading to the cholera outbreak
slide-32
SLIDE 32 32

Example : Hypothesis Generation

John Snow’s Visualization of the 1854 Cholera Outbreak

Allowed John Snow to form the hypothesis of what may be leading to the cholera outbreak
slide-33
SLIDE 33 33

Example : Checking Assumptions of Statistical Models

Anscombe’s quartet, four datasets that have near identical descriptive statistics but that look very different when visualized.

Anscombe, F. (1973) “Graphs in Statistical Analysis”

Data visualization has long complemented applied statistical practices. Consider Tukey’s classic “Exploratory Data Analysis”, which is rife with suggestions for how to visualize data.
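The claim about Anscombe’s quartet can be checked numerically. The sketch below is an illustration rather than part of the original talk: it hard-codes the four published datasets from Anscombe (1973) and computes the mean of x, the mean of y, and the Pearson correlation for each, using only the Python standard library. The helper names `pearson_r` and `summarize` are made up for this example.

```python
from math import sqrt
from statistics import mean

# The four datasets from Anscombe (1973); sets I to III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """Return the descriptive statistics that look alike across the quartet."""
    return {"mean_x": mean(xs), "mean_y": mean(ys), "r": pearson_r(xs, ys)}


for name, (xs, ys) in ANSCOMBE.items():
    s = summarize(xs, ys)
    print(f"{name:>3}: mean_x={s['mean_x']:.2f} mean_y={s['mean_y']:.2f} r={s['r']:.3f}")
```

The summaries agree to about two decimal places, so only a plot (for example, one scatter plot per dataset) reveals how different the four datasets are.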

slide-34
SLIDE 34 34

Example : Visualizing Public Health Data

slide-35
SLIDE 35

A Data Visualization in 3 Questions:

Why? (Motivation)

Why do you need to visualize data?

What? (Data)

What kind of data is being visualized?

How? (Visual and Interaction Design)

How is data being visualized?

slide-36
SLIDE 36

A Data Visualization in 3 Questions:

Design Evaluation

  • Why? Does the visualization solve a relevant problem?
  • What? Are you using the right data, or deriving the right data?
  • How? Are the visual and interactive design choices appropriate?

slide-37
SLIDE 37

Why What How How

Steps to Design and Evaluate a Data Visualization

DESIGN EVALUATION

Munzner (2014) “Visualization Analysis and Design”
slide-38
SLIDE 38

Steps to Design and Evaluate a Data Visualization

Methodology

  • Why: Qualitative Methods, Domain Knowledge
  • What: Qualitative & Quantitative Methods
  • How: Design & Cognitive Science
  • How: Computer Science
slide-39
SLIDE 39

Part 2:

How I plan to answer the question

slide-40
SLIDE 40 40

How Data Visualization is like Statistical Modelling

statistical model Input data (to fit the model) Parameters

Model selection is a design problem

slide-41
SLIDE 41 41

How Data Visualization is like Statistical Modelling

“Parameters” of Visual and Interaction Design!

Five dimensions are plotted in 2D (4 continuous dimensions & 1 categorical dimension)

  • Colour = Continent
  • Size = Population
  • Transparency = Similarity

slide-42
SLIDE 42

Basic Building Blocks of Data Visualization

Munzner (2014) “Visualization Analysis and Design”

“Parameters”

slide-43
SLIDE 43 43

“Parameters” of Visual and Interaction Design!

  • Colour = Continent
  • Transparency = Density
  • Reveal detail on hover

How Data Visualization is like Statistical Modelling

slide-44
SLIDE 44 44

The same parameters can be combined in different ways to yield different visualizations

How Data Visualization is like Statistical Modelling

“Parameters” of Visual and Interaction Design!
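To make the idea of recombining “parameters” concrete, here is a small Python sketch that enumerates every one-to-one assignment of three data attributes to three visual channels. The channel and attribute names are borrowed from the earlier example slide; treating the design space as simple permutations is a simplifying assumption for illustration, not the method described in the talk.

```python
from itertools import permutations

# Visual channels and data attributes named on the earlier example slide.
CHANNELS = ("colour", "size", "transparency")
ATTRIBUTES = ("continent", "population", "similarity")


def candidate_encodings(channels, attributes):
    """Return every one-to-one assignment of attributes to channels."""
    return [dict(zip(channels, order)) for order in permutations(attributes)]


designs = candidate_encodings(CHANNELS, ATTRIBUTES)
for design in designs:
    print(design)
```

Even this toy space contains six candidate designs, only some of which are sensible (mapping a categorical attribute such as continent to size, for instance, is usually a poor choice), which is why the choice among them needs guidance.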

slide-45
SLIDE 45 45

A final note on parameters

For brevity, I haven’t exhaustively described all the different components, which I’ve called parameters, that can be a part of a data visualization

How Data Visualization is like Statistical Modelling

For more in-depth details, consider: Visualization Analysis and Design (2014) by Tamara Munzner

slide-46
SLIDE 46 46

OPTIMIZATION!

Searching the parameter space for a model that yields the lowest error

How Data Visualization is like Statistical Modelling

Finding the best model
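On the statistical side of the analogy, “searching the parameter space” can be sketched as a brute-force grid search. The example below is a toy under stated assumptions: the data points, the candidate grids, and the function names are all invented for illustration, and a straight line with sum of squared errors stands in for whatever model and error measure a real analysis would use.

```python
from itertools import product


def sum_squared_error(slope, intercept, xs, ys):
    """Error of the line y = slope * x + intercept on the data."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def grid_search(xs, ys, slopes, intercepts):
    """Try every (slope, intercept) pair and return the one with lowest error."""
    candidates = (
        (s, b, sum_squared_error(s, b, xs, ys)) for s, b in product(slopes, intercepts)
    )
    return min(candidates, key=lambda candidate: candidate[2])


# Toy data generated from y = 2x + 1 (hypothetical, for illustration only).
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
slopes = [i * 0.5 for i in range(9)]       # 0.0, 0.5, ..., 4.0
intercepts = [i * 0.5 for i in range(5)]   # 0.0, 0.5, ..., 2.0

best_slope, best_intercept, best_error = grid_search(xs, ys, slopes, intercepts)
print(best_slope, best_intercept, best_error)
```

A visualization design has no single numeric error to minimize in this way, which is one reason the analogy holds only at a high level.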

slide-47
SLIDE 47 47

The “Design Space” metaphor

Sedlmair 2012 https://www.cs.ubc.ca/nest/imager/tr/2012/dsm/dsm-talk.pdf

OPTIMIZATION!

How Data Visualization is like Statistical Modelling

slide-48
SLIDE 48 48

The “Design Space” metaphor

Sedlmair 2012 https://www.cs.ubc.ca/nest/imager/tr/2012/dsm/dsm-talk.pdf

OPTIMIZATION!

How Data Visualization is like Statistical Modelling

slide-49
SLIDE 49

Sedlmair 2012 https://www.cs.ubc.ca/nest/imager/tr/2012/dsm/dsm-talk.pdf

The “Design Space” metaphor

Progressively Identify the Right Visualization

Use the “why, what, and how” framework to guide the selection of the optimal design choice
slide-50
SLIDE 50 50

The Importance of Thinking Broadly

Munzner (2014) “Visualization Analysis and Design”

Use the “why, what, and how” framework to guide the selection of the optimal design choice
slide-51
SLIDE 51

Designs for Visualizing Health Data (http://www.vizhealth.org/)
slide-52
SLIDE 52 52

A final note

How Data Visualization is like Statistical Modelling

Data visualization and statistical modelling are not identical, even though at a high level they share similar research processes.

I’ve presented one aspect of visualization research, but there are others I haven’t touched upon.

I’ve emphasized problem-driven work (finding the right visualization for a specific motivation or task), but there also exists technique- and systems-type research.

slide-53
SLIDE 53

Matthew Brehmer’s totally subjective ranking of vis design tools

How to Implement Data Visualizations

slide-54
SLIDE 54

How do we design good visualizations for public health?

BUT…..

slide-55
SLIDE 55

Motivations Underlying my Doctoral Work

Decision Support

For communicable disease prevention and control

Design Space

Characterizing and evaluating the design space of public health microbial genomics

slide-56
SLIDE 56

Motivations Underlying my Doctoral Work

Decision Support

For communicable disease prevention and control

Design Space

Characterizing and evaluating the design space of public health microbial genomics

Methodology

Designing and evaluating data visualizations through a public health lens

slide-57
SLIDE 57 57

DECISION SUPPORT

Visualizing Tuberculosis data at the British Columbia Centre for Disease Control

slide-58
SLIDE 58

WHY

slide-59
SLIDE 59

Clinical Social Lab

Combining Data will Prepare us for the Pandemics of the Future

slide-60
SLIDE 60 60

But, that’s a lot of data….

slide-61
SLIDE 61

Can Visualizing TB data help Decision Support?

We wanted to create an interactive and visual tool that allowed our public health stakeholders to analyze the different data types

We want to understand how this tool can be used by different public health stakeholders

  • TB Nurses
  • TB Clinicians
  • Medical Health Officers
  • Researchers
  • Epis / Biostats
slide-62
SLIDE 62

WHAT

slide-63
SLIDE 63

Treatment Genomic Contact Network Patient Data Outcomes Geography / Location

slide-64
SLIDE 64

Treatment Genomic Contact Network Patient Data Outcomes Geography / Location

TB whole genome Genotyping

slide-65
SLIDE 65

Treatment Genomic Contact Network Patient Data Outcomes Geography / Location time

slide-66
SLIDE 66

HOW

slide-67
SLIDE 67

An Iterative Approach to Development

An iterative approach to development allows us to get feedback before committing to ineffective design choices

slide-68
SLIDE 68

The Big Picture

  • Active TB Genomic (Whole TB Genome Sequencing Data)
  • Latent TB Patient Demographic & Treatment Data
  • Active TB Genomic (MIRU-VNTR) & Contact Data
  • Active TB Patient Demographic & Treatment Data

(Figure: effort and time for each data type, ranging from least to most)

But this takes a lot of time & effort

slide-69
SLIDE 69

Introducing EpiCOGs

DEMO

EpiCOGs is a data viewer and currently a sandbox environment for developing data visualizations

slide-70
SLIDE 70

Factors Influencing the Current Design

Task: Filter patients and identify where they are

Filter patients from the side panel, and interactively update the line list & map based upon those interactions

Task: Follow-up on selected patients

Selected patients view: a subset of the data, for outreach nurses, with a request to include driving directions.

Task: Incorporate existing statistical methods

Analysis modes allow epidemiologists and biostatisticians to integrate their R methods into EpiCOGs

Task: Provide overview of key metrics

Predefined analysis modules that in the future will be migrated to a “reports” section.
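The first task (filter patients, then update the line list and the map from the same selection) can be sketched generically. EpiCOGs itself is built in R, so the Python below is not its implementation: the `Patient` fields, the region and cluster labels, the coordinates, and every record are hypothetical placeholders chosen only to show the pattern of one filter feeding two linked views.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    """One row of a line list; all field names are hypothetical."""
    patient_id: str
    region: str
    genotype_cluster: str
    latitude: float
    longitude: float


# Made-up records for illustration only (not real patient data).
LINE_LIST = [
    Patient("P1", "Region A", "C1", 10.0, 20.0),
    Patient("P2", "Region B", "C2", 11.0, 21.0),
    Patient("P3", "Region A", "C1", 12.0, 22.0),
    Patient("P4", "Region A", "C2", 13.0, 23.0),
]


def filter_patients(line_list, **criteria):
    """Keep patients whose fields equal every given criterion."""
    return [
        p for p in line_list
        if all(getattr(p, field) == value for field, value in criteria.items())
    ]


def map_points(patients):
    """Derive the (latitude, longitude) points a map view would plot."""
    return [(p.latitude, p.longitude) for p in patients]


# One selection drives both views: the filtered line list and the map points.
selected = filter_patients(LINE_LIST, region="Region A", genotype_cluster="C1")
print([p.patient_id for p in selected])
print(map_points(selected))
```

Deriving both views from the same filtered list is what keeps the line list and the map consistent after each interaction.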

slide-71
SLIDE 71

Technology Changes

Factors Influencing the Current Design

Support for data visualization tools in R improved greatly, allowing for the creation of better data visualizations

Data Driven Interface and Analysis

Created a data driven interface that is responsive to the user’s data.

Policies and Procedures

Existing policies and procedures at the BCCDC inform the utility of such a tool and how it can integrate into existing workflows

Needs of individuals

Gathered through meetings, dialogue with individuals, and various iterations of EpiCOGs

slide-72
SLIDE 72

Initial Work & Next Directions

Much initial work was to understand the tool’s feasibility

  • Could it meet the needs of stakeholders?
  • How could it integrate (security & workflow)?
  • How could it be supported long term? (Choice of R)
  • Could we build a useful tool in R?

Next phases will explore genotypes, genomics, and contact networks

Right now, users can filter based on assigned genotype clusters (which will show patients on the map), but we’re working towards better visual and interactive design for these data

slide-73
SLIDE 73

TRY THE DEMO:

https://amcrisan.shinyapps.io/EpiCOGSDEMO/

GET THE CODE

(& contribute to the project!) :

https://github.com/amcrisan/EpiCOGS/

This is an Open Source Project

slide-74
SLIDE 74

Call for Guinea Pigs!

To make relevant tools I need feedback! If you want to be involved and get project updates let me know!

E-mail: anamaria.crisan@bccdc.ca
Twitter: @amcrisan
Web: cs.ubc.ca/~acrisan

slide-75
SLIDE 75

Design Space

Exploring the Public Health Microbial Genomics Design Space

slide-76
SLIDE 76

WHY

slide-77
SLIDE 77

Can we Define the Design Space for Microbial Genomics?

slide-78
SLIDE 78

WHAT

slide-79
SLIDE 79

Can we Define the Design Space for Microbial Genomics?

  • Research literature and public documents already contain visualizations that are commonly used in public health, and specifically for microbial genomics
  • Annotate those visualizations to develop a code set for “why, what, how”

slide-80
SLIDE 80

HOW

Example: Outbreak Narratives

slide-81
SLIDE 81 81
slide-82
SLIDE 82 82
slide-83
SLIDE 83 83
slide-84
SLIDE 84

Outbreaks

Columns: Domain Specific Terminology, How, P1 P2 P3 P4 P5 P6 P7

1.0 Genomic
    Phylogenetic Tree: X X X
    Hierarchical Clustering: X
    Matrix: X X
1.1 Clusters
1.1.1 Clades
    Annotation: X
    Map Colour: X
1.1.2 Derived
    Annotation: X
    Arrange Order: X X X
    Map Colour: X X X
1.1.3 Genotype
    Annotation: X
    Facet: X
1.2 Node Distance
    Align root: X X X X
    Annotation: X X X
    Scatter Plot: X
1.3 SNV/Ps
    Annotation: X
    Map Colour: X X X
1.4 Spatial Distribution: X X
1.5 Temporal Distribution: X X
slide-85
SLIDE 85

Part 3:

Take home messages

slide-86
SLIDE 86

Beyond Building Pretty & Cool Visualizations

Data visualization is not art. It is a research process.

slide-87
SLIDE 87 87

Data Visualization is not an art or graphic design project

Take Home Messages

Relevance (utility) and usability trump aesthetics

slide-88
SLIDE 88 88

Take Home Messages

Data Visualization is not an art or graphic design project

  • Relevance (utility) and usability trump aesthetics

Deciding upon the most appropriate data visualization can be a research problem

  • Think about the “why, what, and how” framework
  • Parallels to finding the right statistical model
  • Design & Evaluation

slide-89
SLIDE 89 89

Take Home Messages

Data Visualization is not an art or graphic design project

  • Relevance (utility) and usability trump aesthetics

Deciding upon the most appropriate data visualization can be a research problem

  • Think about the “why, what, and how” framework
  • Parallels to finding the right statistical model
  • Design & Evaluation

Think broadly, progressively find the right data visualization

  • The Design Space Concept
  • Iterative development

slide-90
SLIDE 90

Genomics is Becoming more Important

slide-91
SLIDE 91

This work would not be possible without these fine people

PHSA Reference Laboratory

  • Dr. Patrick Tang
  • Hope Lapointe
  • Clare Kong

BCCDC CDPACS

  • Ciaran Aiken
  • Laura MacDougall
  • Mike Coss
  • Sunny Mak
  • Mike Otterstatter
  • Robert Balshaw

BCCDC TB Clinical Team

Clinicians

  • Dr. Maureen Mayhew
  • Dr. James Johnston
  • Dr. Jason Wong (CPS)
  • Dr. Victoria Cook

Nurses

  • Nash Dhalla
  • Michelle Mesaros

Epidemiologists

  • Dr. David Roth

The Gardy Lab

  • Dr. Jennifer Gardy
  • Jennifer Guthrie

UBC Computer Science

  • Dr. Tamara Munzner
  • The InfoVis group

The large team of individuals from BC’s HAs and HSDAs without whom there would be no data.