SLIDE 1

Best practices: bar plots

INTERMEDIATE DATA VISUALIZATION WITH GGPLOT2

Rick Scavetta

Founder, Scavetta Academy

SLIDE 2

In this chapter

  • Common pitfalls in Data Viz
  • Best way to represent data
      • For effective explanatory (communication), and
      • For effective exploratory (investigation) plots

SLIDE 3

Bar plots

Two types:
  • Absolute values
  • Distributions

SLIDE 4

Mammalian sleep

Observations: 76
Variables: 3
$ vore  <chr> "carni", "omni", "herbi", "omni", "herbi", "h
$ total <dbl> 12.1, 17.0, 14.4, 14.9, 4.0, 14.4, 8.7, 10.1,
$ rem   <dbl> NA, 1.8, 2.4, 2.3, 0.7, 2.2, 1.4, 2.9, NA, 0.

SLIDE 5

Dynamite plot

d <- ggplot(sleep, aes(vore,
# ...
d +
  stat_summary(fun.y = mean,
               geom = "bar",
               fill = "grey5
  stat_summary(fun.data = me
               fun.args = li
               geom = "error
               width = 0.2)
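The slide cuts the code off at its right edge, so here is a minimal runnable sketch of a dynamite plot rather than the course's exact code. It assumes that `sleep` is derived from `ggplot2::msleep` by dropping rows with a missing `vore` (which leaves 76 observations), that the y aesthetic is `total`, and that the fill is `"grey50"`. The helper `mean_sd()` is hypothetical and stands in for the summary function that is truncated on the slide. The sketch uses `fun =`, which replaced `fun.y =` in ggplot2 3.3.0.

```r
library(ggplot2)

# Assumption: `sleep` comes from ggplot2::msleep; the slides do not show this step.
sleep <- as.data.frame(
  msleep[!is.na(msleep$vore), c("vore", "sleep_total", "sleep_rem")]
)
names(sleep) <- c("vore", "total", "rem")

# Hypothetical helper: group mean with an interval of +/- mult standard deviations.
mean_sd <- function(x, mult = 1) {
  m <- mean(x)
  s <- sd(x)
  data.frame(y = m, ymin = m - mult * s, ymax = m + mult * s)
}

# Base layer: eating habit on x, total sleep time on y (y is an assumption).
d <- ggplot(sleep, aes(vore, total))

# Dynamite plot: a bar at each group mean with an error bar on top.
dynamite <- d +
  stat_summary(fun = mean, geom = "bar", fill = "grey50") +
  stat_summary(fun.data = mean_sd, fun.args = list(mult = 1),
               geom = "errorbar", width = 0.2)
```

Printing `dynamite` draws the plot. The bars show only four numbers (one mean per group), which is the weakness the following slides address.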

SLIDE 6

Individual data points

# position
posn_j <- position_jitter(wi
# plot
d +
  geom_point(alpha = 0.6,
             position = posn
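A possible complete version of the jittered point layer is sketched below. The jitter `width`, the `height = 0` setting and the `seed` are illustrative choices, not values recovered from the slide, and `sleep` is again assumed to come from `ggplot2::msleep`.

```r
library(ggplot2)

# Assumption: `sleep` comes from ggplot2::msleep; the slides do not show this step.
sleep <- as.data.frame(
  msleep[!is.na(msleep$vore), c("vore", "sleep_total", "sleep_rem")]
)
names(sleep) <- c("vore", "total", "rem")

d <- ggplot(sleep, aes(vore, total))

# position: jitter horizontally only, so the plotted sleep totals stay exact.
# width, height and seed are illustrative values.
posn_j <- position_jitter(width = 0.2, height = 0, seed = 136)

# plot: semi-transparent points make overlapping observations visible.
points_plot <- d +
  geom_point(alpha = 0.6, position = posn_j)
```

Setting `height = 0` matters here: when only `width` is given, `position_jitter()` also adds a default amount of vertical noise, which would shift the measured values.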

SLIDE 7

geom_errorbar()

d +
  geom_point(...) +
  stat_summary(fun.y = mean,
               geom = "point
               fill = "red")
  stat_summary(fun.data = me
               fun.args = li
               geom = "error
               width = 0.2,
               color = "red"

SLIDE 8

geom_pointrange()

d +
  geom_point(...) +
  stat_summary(fun.data = me
               mult = 1,
               width = 0.2,
               color = "red"
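The idea of overlaying a summary on the raw data can be sketched as follows. This is an illustration under assumptions, not the slide's exact code: `sleep` is taken from `ggplot2::msleep`, `mean_sd()` is a hypothetical stand-in for the truncated summary function, and the jitter settings are illustrative.

```r
library(ggplot2)

# Assumption: `sleep` comes from ggplot2::msleep; the slides do not show this step.
sleep <- as.data.frame(
  msleep[!is.na(msleep$vore), c("vore", "sleep_total", "sleep_rem")]
)
names(sleep) <- c("vore", "total", "rem")

# Hypothetical helper: group mean with an interval of +/- mult standard deviations.
mean_sd <- function(x, mult = 1) {
  m <- mean(x)
  s <- sd(x)
  data.frame(y = m, ymin = m - mult * s, ymax = m + mult * s)
}

d <- ggplot(sleep, aes(vore, total))

# Raw observations first, then one red point-and-range per group on top.
summary_plot <- d +
  geom_point(alpha = 0.6,
             position = position_jitter(width = 0.2, height = 0, seed = 136)) +
  stat_summary(fun.data = mean_sd, fun.args = list(mult = 1),
               geom = "pointrange", color = "red")
```

A point range draws the mean and its interval in a single layer, whereas the error-bar variant needs one `stat_summary()` call for the mean and another for the bars.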

SLIDE 9

Without data points

d +
  stat_summary(fun.y = mean,
               geom = "point
  stat_summary(fun.data = me
               fun.args = li
               geom = "error
               width = 0.2)

SLIDE 10

Bars are not necessary

SLIDE 11

Ready for exercises!

SLIDE 12

Heatmaps use case scenario

Rick Scavetta

Founder, Scavetta Academy

SLIDE 13

The barley dataset

head(barley, 9)
     yield   variety year            site
1 27.00000 Manchuria 1931 University Farm
2 48.86667 Manchuria 1931          Waseca
3 27.43334 Manchuria 1931          Morris
4 39.93333 Manchuria 1931       Crookston
5 32.96667 Manchuria 1931    Grand Rapids
6 28.96667 Manchuria 1931          Duluth
7 43.06666   Glabron 1931 University Farm
8 55.20000   Glabron 1931          Waseca
9 28.76667   Glabron 1931          Morris

SLIDE 14

A basic heat map

ggplot(barley, aes(year, var
                   fill = yi
  geom_tile() +
  facet_wrap(vars(site), nco
  ...
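A basic heat map along these lines might look like the sketch below. It assumes that `barley` is the data set shipped with the lattice package, that the truncated aesthetics are `variety` and `yield`, and that the facets are laid out in a single column; the slide does not confirm any of these.

```r
library(ggplot2)

# Assumption: `barley` is the data set shipped with the lattice package.
data(barley, package = "lattice")

# One tile per variety and year, colored by yield, with one panel per site.
heat <- ggplot(barley, aes(year, variety, fill = yield)) +
  geom_tile() +
  # Single column of panels (assumed layout), so sites stack vertically.
  facet_wrap(vars(site), ncol = 1)
```

Printing `heat` draws the plot. Yield is encoded only as color, which makes exact comparisons between tiles hard to read.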

SLIDE 15

A dot plot

ggplot(barley, aes(yield, va
                   color = y
  geom_point(...) +
  facet_wrap(vars(site), nco
  ...

SLIDE 16

As a time series

ggplot(barley, aes(year, yie
                   group = v
                   color = v
  geom_line() +
  facet_wrap(vars(site), nro
  ...
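The time-series view could be sketched as follows, assuming `barley` comes from the lattice package and that the truncated aesthetics are `yield` and `variety`. The `nrow = 1` layout and the explicit re-leveling of `year` are choices made for this sketch.

```r
library(ggplot2)

# Assumption: `barley` is the data set shipped with the lattice package.
data(barley, package = "lattice")

# Put the years in chronological order, whatever the stored level order is.
barley$year <- factor(as.character(barley$year), levels = c("1931", "1932"))

# One line per variety connects its 1931 and 1932 yields within each site.
series <- ggplot(barley, aes(year, yield, group = variety, color = variety)) +
  geom_line() +
  # Panels side by side (assumed layout) share a y axis for direct comparison.
  facet_wrap(vars(site), nrow = 1)
```

Because `year` is a factor, the `group` aesthetic is what tells `geom_line()` which observations to connect.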

SLIDE 17

Using dodged error bars

ggplot(barley, aes(x = year,
                   group = s
                   color = s
  stat_summary(fun.y = mean,
               geom = "line"
  stat_summary(fun.data = me
               geom = "error
  ...
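One way to complete the dodged error bar plot is sketched here. It assumes `barley` from the lattice package, grouping and coloring by `site`, and a hypothetical `mean_sd()` helper in place of the truncated summary function. The dodge and bar widths are illustrative.

```r
library(ggplot2)

# Assumption: `barley` is the data set shipped with the lattice package.
data(barley, package = "lattice")
barley$year <- factor(as.character(barley$year), levels = c("1931", "1932"))

# Hypothetical helper: group mean with an interval of +/- mult standard deviations.
mean_sd <- function(x, mult = 1) {
  m <- mean(x)
  s <- sd(x)
  data.frame(y = m, ymin = m - mult * s, ymax = m + mult * s)
}

# Shift each site slightly sideways so the error bars do not overlap.
dodge <- position_dodge(width = 0.1)

dodged <- ggplot(barley, aes(x = year, y = yield, group = site, color = site)) +
  stat_summary(fun = mean, geom = "line", position = dodge) +
  stat_summary(fun.data = mean_sd, fun.args = list(mult = 1),
               geom = "errorbar", width = 0.1, position = dodge)
```

Using the same `position` object for both layers keeps each line attached to its own error bars.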

SLIDE 18

Using ribbons for error

ggplot(barley, aes(x = year,
                   group = s
                   color = s
  stat_summary(fun.y = mean,
               geom = "line"
  stat_summary(fun.data = me
               geom = "ribbo
  ...
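The ribbon variant might be written as in the sketch below, under the same assumptions as before: `barley` from the lattice package, grouping by `site`, and a hypothetical `mean_sd()` helper. The `fill` mapping, `alpha` and `color = NA` are choices made for this sketch.

```r
library(ggplot2)

# Assumption: `barley` is the data set shipped with the lattice package.
data(barley, package = "lattice")
barley$year <- factor(as.character(barley$year), levels = c("1931", "1932"))

# Hypothetical helper: group mean with an interval of +/- mult standard deviations.
mean_sd <- function(x, mult = 1) {
  m <- mean(x)
  s <- sd(x)
  data.frame(y = m, ymin = m - mult * s, ymax = m + mult * s)
}

ribbons <- ggplot(barley, aes(x = year, y = yield, group = site,
                              color = site, fill = site)) +
  stat_summary(fun = mean, geom = "line") +
  # A translucent band per site; color = NA drops the ribbon outline.
  stat_summary(fun.data = mean_sd, fun.args = list(mult = 1),
               geom = "ribbon", alpha = 0.1, color = NA)
```

A low `alpha` keeps the overlapping bands from hiding the mean lines drawn beneath them.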

SLIDE 19

Coding Time!

SLIDE 20

When good data makes bad plots

Rick Scavetta

Founder, Scavetta Academy

SLIDE 21

Bad plots: style

Color
  • Not color-blind-friendly (e.g. primarily red and green)
  • Wrong palette for data type (remember sequential, qualitative and divergent)
  • Indistinguishable groups (i.e. colors are too similar)
  • Ugly (high saturation primary colors)

Text
  • Illegible (e.g. too small, poor resolution)
  • Non-descriptive (e.g. "length" -- of what? which units?)
  • Missing

SLIDE 22

Bad plots: structure and content

Information content
  • Too much information (TMI)
  • Too little information (TLI)
  • No clear message or purpose

Axes
  • Poor aspect ratio
  • Suppression of the origin
  • Broken x or y axes
  • Common but unaligned

Statistics
  • Visualization doesn't match actual statistics

Geometries
  • Wrong plot type
  • Wrong orientation

Non-data Ink
  • Inappropriate use

3D plots
  • Perceptual problems
  • Useless 3rd axis

SLIDE 23

Wrong orientation

SLIDE 24

SLIDE 25

SLIDE 26

Broken y-axes

SLIDE 27

Broken y-axes, replace with transformed data
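As a hypothetical illustration (the data behind this slide is not shown), a log-transformed axis can replace a broken one when values span several orders of magnitude. The sketch uses body weights from `ggplot2::msleep`, which is an assumption made only for this example.

```r
library(ggplot2)

# Hypothetical example data: body weights in msleep range from grams to tonnes,
# the kind of spread that tempts people to break the y axis.
logged <- ggplot(msleep, aes(x = sleep_total, y = bodywt)) +
  geom_point(alpha = 0.6) +
  # A log10 scale keeps one continuous axis; small and large values both stay readable.
  scale_y_log10()
```

The axis labels remain in the original units while equal distances on the axis represent equal ratios.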

SLIDE 28

Broken y-axes, use facets

SLIDE 29

3D plots, without data on the 3rd axis

SLIDE 30

3D plots, with data on the 3rd axis

SLIDE 31

Double y-axes

SLIDE 32

Double y-axis for transformations

SLIDE 33

Guidelines not rules

Use your common sense: Is there anything on my plot that obscures a clear reading of the data or the take-home message?

SLIDE 34

Let's practice!