Introduction to data visualization with R: Andrew Heiss, PhD, Brigham Young University



slide-1
SLIDE 1

Introduction to data visualization with R

Andrew Heiss, PhD

Brigham Young University SLC RUG • July 11, 2018 @andrewheiss

slide-2
SLIDE 2

Plan for today

Why visualize data?
Types of visualizations
Aesthetics and design
Take a sad plot and make it CRAPier

slide-3
SLIDE 3

talks.andrewheiss.com/utah-rug-dataviz/

slide-4
SLIDE 4
slide-5
SLIDE 5
slide-6
SLIDE 6

Why visualize data?

slide-7
SLIDE 7

Data alone cannot tell stories or prove theories

slide-8
SLIDE 8
slide-9
SLIDE 9
slide-10
SLIDE 10

Never trust summary statistics alone
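
One way to see the point in R is Anscombe's quartet, which ships with base R as the `anscombe` data frame. Using it here is an assumption for illustration; the example shown in the talk may be different. The sketch below computes the same summary statistics for four x/y pairs and then plots them. The name `quartet_summary` is made up for this example.

```r
# Anscombe's quartet is built into R as `anscombe`
# (columns x1..x4 and y1..y4).
data(anscombe)

# Same summary statistics for each of the four x/y pairs
quartet_summary <- data.frame(
  set    = 1:4,
  mean_x = sapply(1:4, function(i) mean(anscombe[[paste0("x", i)]])),
  mean_y = sapply(1:4, function(i) mean(anscombe[[paste0("y", i)]])),
  cor_xy = sapply(1:4, function(i) {
    cor(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]])
  })
)
print(round(quartet_summary, 2))

# Plot the four sets side by side to compare their shapes
old_par <- par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]],
       xlab = paste0("x", i), ylab = paste0("y", i),
       main = paste("Set", i))
}
par(old_par)
```

The means and correlations come out nearly identical across the four sets, while the plots look nothing alike.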

slide-11
SLIDE 11

Humans are visual creatures

@FacesPics

slide-12
SLIDE 12
slide-13
SLIDE 13

What makes a good visualization?

slide-14
SLIDE 14
slide-15
SLIDE 15

Characteristics of graphical excellence

  • 1. “... the well-designed presentation of interesting data—a matter of substance, statistics, and design.”
  • 2. Complex ideas communicated with clarity, precision, and efficiency.
  • 3. That which gives the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space.
  • 4. Nearly always multivariate.
  • 5. Requires telling the truth about the data.
slide-16
SLIDE 16

What makes Minard’s graph so great?

slide-17
SLIDE 17

We forget this!

slide-18
SLIDE 18
slide-19
SLIDE 19
slide-20
SLIDE 20

Types of visualizations

slide-21
SLIDE 21

Exploratory visualizations

Academic-ish: quick scatterplots, histograms, and other charts to help understand your data

Explanatory visualizations

Publishable: consumable by the general public (Vox, NYT, Washington Post, FiveThirtyEight, etc.)

slide-22
SLIDE 22

Exploratory data analysis

Find analytical insight in data (even causal inference!)
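
A minimal sketch of quick exploratory plots in base R, using the built-in `mtcars` data that appears elsewhere in these slides. The choice of variables and labels is an assumption for illustration, not taken from the talk.

```r
# Quick, unpolished looks at the built-in mtcars data

# Distribution of a single variable
h <- hist(mtcars$mpg, main = "Distribution of mpg",
          xlab = "Miles per gallon")

# Relationship between two continuous variables
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1,000 lbs)", ylab = "Miles per gallon")

# A continuous variable split by a grouping variable
boxplot(mpg ~ cyl, data = mtcars,
        xlab = "Cylinders", ylab = "Miles per gallon")
```

None of these need styling. They exist so the analyst can spot patterns and problems quickly.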

slide-23
SLIDE 23

Explanatory data analysis

Annotate and tell a story
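
As a hedged illustration of annotating a chart to tell a story, the sketch below highlights one point in a ggplot2 scatterplot and states the takeaway in the title. The data set (`mtcars`), the wording, and the names `cars`, `best`, and `p_story` are assumptions made for this example.

```r
library(ggplot2)

# Keep the car names as a regular column so they can be used as labels
cars <- mtcars
cars$model <- rownames(mtcars)

# The single car with the highest mileage
best <- cars[which.max(cars$mpg), ]

p_story <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point(color = "grey60") +
  # Draw the highlighted car again, larger and in a contrasting color
  geom_point(data = best, color = "firebrick", size = 3) +
  annotate("text", x = best$wt + 0.15, y = best$mpg, hjust = 0,
           label = paste(best$model, "gets the best mileage"),
           color = "firebrick") +
  # State the main message in the title instead of describing the axes
  labs(title = "Heavier cars get fewer miles per gallon",
       x = "Weight (1,000 lbs)", y = "Miles per gallon")

p_story
```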

slide-24
SLIDE 24

Exploratory Explanatory

slide-25
SLIDE 25

Which chart type do I use?

slide-26
SLIDE 26

Aesthetics and design

slide-27
SLIDE 27
slide-28
SLIDE 28
slide-29
SLIDE 29

Is he maybe a little right?

plot(mtcars$wt, mtcars$mpg)

slide-30
SLIDE 30

R can’t do everything

There’s still a place for Illustrator, InDesign, et al.

slide-31
SLIDE 31

But R can do a lot

And it can automate most of the hard work

slide-32
SLIDE 32

What constitutes good design?

slide-33
SLIDE 33

Four core design principles

Contrast Repetition Alignment Proximity

slide-34
SLIDE 34

Contrast

“If two items are not exactly the same, make them different. Really different.”

Don’t be a wimp!

slide-35
SLIDE 35

Typeface categories: Serif, Sans Serif, Slab Serif, Script, Decorative (each shown with the sample text “Lorem ipsum dolor sit amet”)

Font weights: Light, Black (same sample text)

slide-36
SLIDE 36

https://color.adobe.com/
http://colorbrewer2.org/
viridis
Scientific Colour-Maps https://github.com/thomasp85/scico
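
Two of these palette families can be used directly from ggplot2. The sketch below assumes ggplot2 3.0.0 or later, which bundles the viridis scales, and uses `mtcars` as stand-in data. The palette name "Dark2" and the object names are illustrative choices.

```r
library(ggplot2)

# Continuous variable: viridis scale bundled with ggplot2 (>= 3.0.0)
p_viridis <- ggplot(mtcars, aes(x = wt, y = mpg, color = hp)) +
  geom_point(size = 3) +
  scale_color_viridis_c()

# Discrete variable: a ColorBrewer qualitative palette
p_brewer <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  scale_color_brewer(palette = "Dark2", name = "Cylinders")

p_viridis
p_brewer
```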

slide-37
SLIDE 37
slide-38
SLIDE 38

Repetition

“Repeat some aspect of the design throughout the entire piece.”

slide-39
SLIDE 39
slide-40
SLIDE 40

Alignment

“Every item should have a visual connection with something else on the page.”

slide-41
SLIDE 41
slide-42
SLIDE 42
slide-43
SLIDE 43

Proximity

“Group related items together.”

slide-44
SLIDE 44
slide-45
SLIDE 45

Contrast Repetition Alignment Proximity

slide-46
SLIDE 46

Take a sad plot and make it CRAPier

slide-47
SLIDE 47

By default, R graphics violate CRAP

# Base R
plot(mtcars$wt, mtcars$mpg, main = "Here's a title")

# ggplot2
library(ggplot2)
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "Here's a title")

slide-48
SLIDE 48

With ggplot’s theme() and other functions, we can make beautiful CRAPy figures automatically with R

talks.andrewheiss.com/utah-rug-dataviz/

You can also do this in base R, but I find ggplot’s paradigm more intuitive
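
A minimal sketch of what that could look like, assuming `mtcars` as the data. The specific theme settings and the name `p_crap` are illustrative choices, not the code from the talk. Each `theme()` argument is annotated with the CRAP principle it is meant to serve.

```r
library(ggplot2)

p_crap <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  scale_color_brewer(palette = "Dark2", name = "Cylinders") +
  labs(title = "Heavier cars get fewer miles per gallon",
       subtitle = "Motor Trend road tests, 1974",
       x = "Weight (1,000 lbs)", y = "Miles per gallon") +
  theme_minimal(base_size = 12) +
  theme(
    # Contrast: a bold, large title against lighter supporting text
    plot.title = element_text(face = "bold", size = 16, hjust = 0),
    plot.subtitle = element_text(color = "grey40", hjust = 0),
    # Repetition: the same grey for all secondary labels
    axis.title = element_text(color = "grey40"),
    legend.title = element_text(color = "grey40"),
    # Alignment: left-justify the legend, matching the left-aligned title
    legend.justification = "left",
    # Proximity: keep the legend directly under the panel it describes
    legend.position = "bottom",
    # Remove minor gridlines to reduce clutter
    panel.grid.minor = element_blank()
  )

p_crap
```

Because the styling lives in code, the same `theme()` call can be reused across every figure in a project.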

slide-49
SLIDE 49

Moral of the story

Graphics are essential for telling stories and gaining insight
Design principles (CRAP) make graphics more understandable
R + ggplot can follow CRAP and make beautiful, insightful graphics