Be a Hawk not a Turkey: How a Bird’s Eye View of your Data Can Streamline Data Analysis

SLIDE 1

Be a Hawk not a Turkey

How a Bird’s Eye View of your Data Can Streamline Data Analysis
Nicholas Tierney, PhD Candidate, QUT
WOMBAT, Melbourne Zoo, 19/02/2016

SLIDE 2

The Project

SLIDE 3
SLIDE 4
SLIDE 5

“Can you have a look at the data?”

What does that mean?

SLIDE 6

“Looking” at the data

SLIDE 7

“…Looking?” at the data?

ggplot(data = data, aes(x = IQ, y = income)) + geom_point()

SLIDE 8

“…Looking?” at the data?

SLIDE 9

So…

What if the data is all weird, and stuff?

SLIDE 10

Real data is generally real messy

  • Dates are not dates
  • Gender is not categorical
  • Rows are supposed to be columns
  • Missing data

SLIDE 11

Data Cleaning…janitorial work...munging...

Data Wrangling: dplyr, plyr, data.table
Testing Data: assertr, testdat
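
The data-testing packages listed here let you state expectations about a data frame and find out where they fail. A minimal base R sketch of that idea follows; `check_expectations()` and the toy columns are hypothetical, and this is not the assertr or testdat API.

```r
# Toy data: one age is impossible and one sex label is misspelt.
dat <- data.frame(
  age = c(21, 28, -3, 30),
  sex = c("Female", "Female", "Male", "Mlae"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from assertr or testdat): evaluate each named
# expectation row by row and return the row numbers that fail it.
check_expectations <- function(data, expectations) {
  lapply(expectations, function(expectation) {
    passed <- expectation(data)
    which(is.na(passed) | !passed)
  })
}

failures <- check_expectations(dat, list(
  age_in_range = function(d) d$age >= 0 & d$age <= 120,
  sex_is_known = function(d) d$sex %in% c("Female", "Male")
))

print(failures)
```

Returning the failing row numbers, rather than stopping at the first problem, makes it easier to see how widespread each issue is before cleaning.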

SLIDE 12

Data inspection: `dplyr::glimpse(dat)`

Observations: 300
Variables: 15
$ date       (date) 2015-03-15, 2015-03-...
$ name       (chr) "Bobby", "Trinidad", ...
$ age        (int) 21, 28, 31, 30, 23, 2...
$ sex        (fctr) Female, Female, Fema...
$ grade      (int) NA, 4, 3, NA, NA, NA,...
$ height     (dbl) 66, 59, 67, 71, 68, 7...
$ hair       (fctr) Brown, Red, Blonde, ...
$ eye        (fctr) Gray, Brown, Blue, H...
$ smokes     (lgl) FALSE, FALSE, FALSE, ...
$ income     (chr) NA, "36157.98", "17307.35"
$ education  (fctr) Regular High School ...
$ IQ         (fctr) 97, 115, 112, 94, 106...
$ employment (int) NA, 1, 4, NA, 1, NA, ...
$ race       (fctr) Hispanic, Black, Bla...
$ religion   (fctr) Muslim, Christian, N...
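
The `glimpse()` output reports `income` as character and `IQ` as a factor, so neither can be plotted as a number until it is converted. The sketch below uses a few toy values in the same shape (the data frame is illustrative, not the talk's data) and shows the usual pitfall with factors.

```r
# Toy columns mimicking the types reported by glimpse() (values illustrative).
dat <- data.frame(
  income = c(NA, "36157.98", "17307.35"),
  IQ = factor(c("97", "115", "112")),
  stringsAsFactors = FALSE
)

# Character to numeric is direct; NA stays NA.
dat$income <- as.numeric(dat$income)

# as.numeric() on a factor returns the internal level codes, not the labels,
# so convert to character first.
iq_codes <- as.numeric(dat$IQ)              # level codes: wrong
dat$IQ <- as.numeric(as.character(dat$IQ))  # labels as numbers: right

str(dat)
```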

SLIDE 13

Pre-exploratory Visualisations?

Visualisation methods for Checking Data?

SLIDE 14

visdat

Visualise whole data frames at once

SLIDE 15

vis_dat(data)
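
The plot on this slide shows one cell per value, coloured by the column's class, with missing values picked out separately. A rough base R sketch of the table behind such a plot is given below. It is not the visdat implementation: `cell_types()` is a hypothetical helper, and the built-in `airquality` data stands in for `data`.

```r
# Hypothetical helper (not part of visdat): label every cell with its
# column's class, or "NA" where the value is missing.
cell_types <- function(data) {
  types <- vapply(data, function(column) class(column)[1], character(1))
  labels <- matrix(types, nrow = nrow(data), ncol = ncol(data), byrow = TRUE,
                   dimnames = list(NULL, names(data)))
  labels[is.na(data)] <- "NA"
  labels
}

labels <- cell_types(airquality)

# How many cells of each kind there are, per column.
table(column = colnames(labels)[col(labels)], type = labels)
```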

SLIDE 16

vis_dat(data, sort_type = FALSE)

SLIDE 17

vis_dat … clean … vis_dat … clean

SLIDE 18

vis_dat … clean … vis_dat … clean

SLIDE 19

vis_miss

SLIDE 20

vis_miss(data, cluster = TRUE)
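
A missingness map is essentially the logical matrix `is.na(data)`, and clustering reorders its rows so that observations with similar patterns of missingness sit together. The base R sketch below illustrates this with the built-in `airquality` data; using `hclust()` on the 0/1 matrix is an assumed approach for illustration, and `vis_miss()` may order rows differently.

```r
# TRUE where a value is missing; one row per observation.
missing <- is.na(airquality)

# Percentage of missing values in each column.
round(100 * colMeans(missing), 1)

# Group rows with similar missingness patterns by clustering the 0/1 matrix.
# (An assumed approach for illustration, not necessarily what vis_miss() does.)
row_order <- hclust(dist(missing * 1))$order
clustered <- missing[row_order, ]

# Rows that share a pattern are now adjacent.
head(clustered)
```

Passing `t(clustered)` to `image()` would draw the reordered matrix as a simple heatmap.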

SLIDE 21

Slide missing

It’s probably not a big deal

SLIDE 22

ggmissing

plotting missing data with ggplot

SLIDE 23

ggmissing

ggplot(data = dat, aes(x = IQ, y = income)) +
  geom_point()

Warning message:
Removed 142 rows containing missing values (geom_point).
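
The warning counts rows in which either plotted variable is missing, and that number can be worked out before plotting. The sketch below does so in base R, with the built-in `airquality` data standing in for `dat` and `Ozone` and `Solar.R` standing in for `IQ` and `income`.

```r
# Stand-in for `dat`: the built-in airquality data, plotting Ozone vs Solar.R.
plot_vars <- c("Ozone", "Solar.R")

# A row is dropped from a scatterplot if either plotted variable is missing.
dropped <- !complete.cases(airquality[, plot_vars])
sum(dropped)

# Break the rows down by which variable is missing.
table(
  ozone_missing = is.na(airquality$Ozone),
  solar_missing = is.na(airquality$Solar.R)
)
```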

SLIDE 24

ggmissing

SLIDE 25

ggmissing: how to do it

dat %>%
  mutate(miss_cat = miss_cat(., "IQ", "income")) %>%
  ggplot(data = .,
         aes(x = shadow_shift(IQ),
             y = shadow_shift(income),
             colour = miss_cat)) +
  geom_point()
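
The idea here is to move missing values into a band just below the observed range, so they show up on the plot instead of being dropped, and to colour them by a missingness label. The sketch below illustrates that idea in base R. `shift_missing()` and `missing_category()` are hypothetical stand-ins for `shadow_shift()` and `miss_cat()`; the real functions may differ, and the 10% offset and 5% jitter are assumptions.

```r
# Hypothetical stand-in for shadow_shift(): replace each NA with a jittered
# value 10% of the range below the observed minimum (offsets are assumptions).
shift_missing <- function(x) {
  observed_range <- diff(range(x, na.rm = TRUE))
  shifted <- min(x, na.rm = TRUE) - 0.1 * observed_range
  jitter <- (runif(length(x)) - 0.5) * 0.05 * observed_range
  ifelse(is.na(x), shifted + jitter, x)
}

# Hypothetical stand-in for miss_cat(): flag rows where either value is missing.
missing_category <- function(x, y) {
  ifelse(is.na(x) | is.na(y), "Missing", "Not Missing")
}

set.seed(1)
x <- shift_missing(airquality$Ozone)
y <- shift_missing(airquality$Solar.R)
category <- missing_category(airquality$Ozone, airquality$Solar.R)

table(category)
```

Plotting `x` against `y` coloured by `category` then puts the formerly missing points in a strip along the bottom or left edge of the scatterplot.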

SLIDE 26

ggmissing: how we’d like to do it

ggplot(data = data, aes(x = IQ, y = income)) +
  geom_point() +
  geom_missing()

ggplot(data = data, aes(x = IQ, y = income)) +
  geom_point(show_missing = TRUE)

SLIDE 27

Future Work

ggmissing and visdat

SLIDE 28

Future Work: visdat

  • Colour cells intelligently
  • Guess what kind a variable is
  • Read in horrible messy data
  • Include interactivity
  • Think about ways to sensibly encode summary / value information
  • Pipe in expectations

SLIDE 29

Future Work: ggmissing

  • Early days yet
  • Create a philosophy / grammar of missingness
  • Don’t re-write ggplot
  • Include rug plot to show missing data
  • Develop clear/intuitive ways of visualising missing values

SLIDE 30

Got an idea or want to help?

Check out our GitHub:
github.com/tierneyn/visdat
github.com/tierneyn/ggmissing

SLIDE 31

Thank you

Di Cook, Miles McBain, Jenny Bryan, Kerrie Mengersen, Fiona Harden, Maurice Harden

SLIDE 32

Thank you

SLIDE 33

SLIDE 34

Questions?

I caught a glimpse of happiness,
And saw it was a bird on a branch,
Fixing to take wing

  • Richard Peck