Introduction: Introduction to Data Visualization with ggplot2

SLIDE 1

Introduction

INTRODUCTION TO DATA VISUALIZATION WITH GGPLOT2

Rick Scavetta

Founder, Scavetta Academy

SLIDE 2

Your instructor - Rick Scavetta

  • e-mail: office@scavetta.academy
  • Twitter: @Rick_Scavetta

SLIDE 3

Data visualization & data science

A core skill in Data Science.

SLIDE 4

Exploratory versus explanatory

SLIDE 5

MASS::mammals

MASS::mammals

                   body  brain
Arctic fox        3.385  44.50
Owl monkey        0.480  15.50
Mountain beaver   1.350   8.10
Cow             465.000 423.00
Grey wolf        36.330 119.50
Goat             27.660 115.00
Roe deer         14.830  98.20
...
Pig             192.000 180.00
Echidna           3.000  25.00
Brazilian tapir 160.000 169.00
Tenrec            0.900   2.60
Phalanger         1.620  11.40
Tree shrew        0.104   2.50
Red fox           4.235  50.40
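
Before plotting, it can help to look at the structure and spread of the data. The following is a minimal sketch, assuming the MASS package is installed; the name `body_range_ratio` is illustrative and not part of the slides. In `MASS::mammals`, `body` is recorded in kilograms and `brain` in grams.

```r
# Sketch: inspect MASS::mammals before plotting.
# Assumes the MASS package is installed.
data(mammals, package = "MASS")

str(mammals)      # two numeric columns: body (kg) and brain (g)
summary(mammals)  # compare mean and median to gauge skew

# Ratio of the largest to the smallest body weight.
# `body_range_ratio` is an illustrative name, not from the slides.
body_range_ratio <- max(mammals$body) / min(mammals$body)
print(body_range_ratio)
```

A large ratio here suggests the values span several orders of magnitude, which is one reason to consider log scales when plotting.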

SLIDE 6

A scatter plot

ggplot(mammals, aes(x = body, y = brain)) +
  geom_point()

SLIDE 7

Explore with a linear model

ggplot(mammals, aes(x = body, y = brain)) +
  geom_point(alpha = 0.6) +
  stat_smooth(
    method = "lm",
    color = "red",
    se = FALSE
  )

SLIDE 8

Explore: fine-tuning

ggplot(mammals, aes(x = body, y = brain)) +
  geom_point(alpha = 0.6) +
  coord_fixed() +
  scale_x_log10() +
  scale_y_log10() +
  stat_smooth(
    method = "lm",
    color = "#C42126",
    se = FALSE,
    size = 1  # ggplot2 3.4.0 and later prefer linewidth = 1 for lines
  )
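
The log scales change what the linear model is fitted to, not only how the axes look. As a rough sketch (assuming the MASS package is installed; `fit_raw`, `fit_log`, `r2_raw` and `r2_log` are illustrative names, not from the slides), the same model can be fitted on the raw and on the log10-transformed values and the two fits compared:

```r
# Sketch: compare a linear fit on raw vs. log10-transformed mammals data.
# Assumes the MASS package is installed; all object names are illustrative.
data(mammals, package = "MASS")

fit_raw <- lm(brain ~ body, data = mammals)
fit_log <- lm(log10(brain) ~ log10(body), data = mammals)

# Proportion of variance explained by each model
r2_raw <- summary(fit_raw)$r.squared
r2_log <- summary(fit_log)$r.squared
print(c(raw = r2_raw, log10 = r2_log))

# Slope of the log-log fit: the exponent in brain ~ body^slope
print(coef(fit_log)[2])
```

On the raw scale, a handful of very large animals dominate the fit. Comparing the two R-squared values is one quick check of whether the log-log relationship is closer to linear.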

SLIDE 9

Publication-ready plot

SLIDE 10

Anscombe's plots

SLIDE 11

Anscombe's plots

SLIDE 12

Anscombe's plots

SLIDE 13

Anscombe's plots

SLIDE 14

Anscombe's plots

SLIDE 15

Anscombe's plots

SLIDE 16

Anscombe's plots

SLIDE 17

Anscombe's plots

SLIDE 18

Anscombe's plots
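
The point of Anscombe's plots can be checked numerically. The sketch below uses the `anscombe` data frame that ships with base R; `summary_stats` is an illustrative name, not from the slides. It computes the same summary statistics and linear fit for each of the four x/y pairs:

```r
# Sketch: summary statistics for the four Anscombe data sets.
# `anscombe` ships with base R (datasets package); `summary_stats` is an
# illustrative name.
summary_stats <- sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  fit <- lm(y ~ x)
  c(mean_x    = mean(x),
    mean_y    = mean(y),
    cor_xy    = cor(x, y),
    intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]))
})
colnames(summary_stats) <- paste0("set", 1:4)

print(round(summary_stats, 2))
```

The four columns come out nearly identical even though the scatter plots look very different, which is why plotting the data is a necessary complement to summary statistics.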

SLIDE 19

Let's practice!

SLIDE 20

The grammar of graphics

SLIDE 21

The quick brown fox jumps over the lazy dog

SLIDE 22

The quick brown fox jumps over the lazy dog

SLIDE 23

Grammar of graphics

Plotting framework

Leland Wilkinson, Grammar of Graphics, 1999

2 principles

  • Graphics = distinct layers of grammatical elements
  • Meaningful plots through aesthetic mappings

SLIDE 24

The three essential grammatical elements

Element      Description
Data         The data-set being plotted.
Aesthetics   The scales onto which we map our data.
Geometries   The visual elements used for our data.

SLIDE 25

Course 1: core competency

Element      Description
Data         The data-set being plotted.
Aesthetics   The scales onto which we map our data.
Geometries   The visual elements used for our data.
Themes       All non-data ink.

SLIDE 26

The seven grammatical elements

Element      Description
Data         The data-set being plotted.
Aesthetics   The scales onto which we map our data.
Geometries   The visual elements used for our data.
Themes       All non-data ink.
Statistics   Representations of our data to aid understanding.
Coordinates  The space on which the data will be plotted.
Facets       Plotting small multiples.
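
To make the seven elements concrete, here is one possible sketch in which each element appears as its own layer. It assumes ggplot2 is installed and uses the built-in `iris` data set. The choice of geometry, statistic, facet variable and theme is ours for illustration, and `p` is an illustrative name.

```r
# Sketch: one plot that touches all seven grammatical elements.
# Assumes ggplot2 is installed; the specific layer choices are illustrative.
library(ggplot2)

p <- ggplot(iris,                                            # Data
            aes(x = Sepal.Length, y = Sepal.Width)) +        # Aesthetics
  geom_point(alpha = 0.6) +                                  # Geometries
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE) +  # Statistics
  facet_wrap(~ Species) +                                    # Facets
  coord_cartesian() +                                        # Coordinates
  theme_classic()                                            # Themes
p
```

Only the first three layers are required to draw a plot. The remaining four have defaults, so they can be left out until a plot needs them.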

SLIDE 27

Jargon for each element

SLIDE 28

Course 2: Tools for EDA

  • Remaining 3 layers
  • Best practices for Data Viz

SLIDE 29

Let's practice!

SLIDE 30

ggplot2 layers

SLIDE 31

ggplot2 package

The grammar of graphics implemented in R

Two key concepts:

  1. Layer grammatical elements
  2. Aesthetic mappings
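
The two concepts can be sketched in a few lines. This is an illustration only, assuming ggplot2 is installed and using the built-in `iris` data set; `base_plot`, `layered` and `mapped` are hypothetical names.

```r
# Sketch of the two key concepts; object names are illustrative.
library(ggplot2)

# Concept 1: layers are added one at a time with `+`.
base_plot <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width))
layered   <- base_plot + geom_point()

# Concept 2: an aesthetic mapping ties a variable to a visual scale.
# Here Species is mapped onto the color scale inside aes().
mapped <- base_plot + geom_point(aes(color = Species))
mapped
```

Because `Species` sits inside `aes()`, ggplot2 chooses one color per species and adds a legend. A fixed color would instead be passed outside `aes()`.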

SLIDE 32

Data

SLIDE 33

Iris dataset

  1. Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, Part II, 179–188.
  2. Anderson, Edgar (1935). The irises of the Gaspe Peninsula. Bulletin of the American Iris Society, 59, 2–5.

SLIDE 34

Iris dataset

iris

    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
1            5.1         3.5          1.4         0.2     setosa
2            4.9         3.0          1.4         0.2     setosa
3            4.7         3.2          1.3         0.2     setosa
...
50           5.0         3.3          1.4         0.2     setosa
51           7.0         3.2          4.7         1.4 versicolor
52           6.4         3.2          4.5         1.5 versicolor
53           6.9         3.1          4.9         1.5 versicolor
...
100          5.7         2.8          4.1         1.3 versicolor
101          6.3         3.3          6.0         2.5  virginica
102          5.8         2.7          5.1         1.9  virginica
103          7.1         3.0          5.9         2.1  virginica
...
150          5.9         3.0          5.1         1.8  virginica

SLIDE 35

Aesthetics

SLIDE 36

Iris aesthetics

SLIDE 37

Geometries

SLIDE 38

Iris geometries

g <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_jitter()
g

SLIDE 39

Themes

SLIDE 40

Iris themes

g <- g +
  labs(x = "Sepal Length (cm)", y = "Sepal Width (cm)") +
  theme_classic()
g

SLIDE 41

Let's practice!