
Binds: Joining Data in R with dplyr


  1. JOINING DATA IN R WITH DPLYR Binds

  2. Joining Data in R with dplyr

  3. ● rbind()
     ● cbind()
     ● bind_rows()
     ● bind_cols()

  4. bind_rows()

     > band1
         name   surname
     1   John    Lennon
     2   Paul McCartney
     3 George  Harrison
     4  Ringo     Starr

     > band2
          name  surname
     1    Mick   Jagger
     2   Keith Richards
     3 Charlie    Watts
     4  Ronnie     Wood

     # The arguments are the tables to combine
     > bind_rows(band1, band2)
          name   surname
     1    John    Lennon
     2    Paul McCartney
     3  George  Harrison
     4   Ringo     Starr
     5    Mick    Jagger
     6   Keith  Richards
     7 Charlie     Watts
     8  Ronnie      Wood

  5. bind_cols()

     > band1
         name   surname
     1   John    Lennon
     2   Paul McCartney
     3 George  Harrison
     4  Ringo     Starr

     > plays1
       instrument born
     1     Guitar 1940
     2       Bass 1942
     3     Guitar 1943
     4      Drums 1940

     > bind_cols(band1, plays1)
         name   surname instrument born
     1   John    Lennon     Guitar 1940
     2   Paul McCartney       Bass 1942
     3 George  Harrison     Guitar 1943
     4  Ringo     Starr      Drums 1940

  6. Benefits of bind_rows() and bind_cols()

     ● Faster
     ● Return a tibble
     ● Can handle lists of data frames
     ● .id (bind_rows() can add a column that identifies each row's source table)

  7. bind_rows()

     > band1
         name   surname
     1   John    Lennon
     2   Paul McCartney
     3 George  Harrison
     4  Ringo     Starr

     > band2
          name  surname
     1    Mick   Jagger
     2   Keith Richards
     3 Charlie    Watts
     4  Ronnie     Wood

     # Argument names (Beatles, Stones): label names for the new column
     # .id = "band": column name for the new column
     > bind_rows(Beatles = band1, Stones = band2, .id = "band")
          band    name   surname
     1 Beatles    John    Lennon
     2 Beatles    Paul McCartney
     3 Beatles  George  Harrison
     4 Beatles   Ringo     Starr
     5  Stones    Mick    Jagger
     6  Stones   Keith  Richards
     7  Stones Charlie     Watts
     8  Stones  Ronnie      Wood
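The "lists of data frames" and .id benefits can be combined. The following is a minimal sketch, assuming dplyr is installed; the shortened band1 and band2 tables and the names bands and combined are illustrative and not taken from the slides.

```r
library(dplyr)

# Shortened versions of the slide tables (illustrative data only).
band1 <- data.frame(name = c("John", "Paul"),
                    surname = c("Lennon", "McCartney"),
                    stringsAsFactors = FALSE)
band2 <- data.frame(name = c("Mick", "Keith"),
                    surname = c("Jagger", "Richards"),
                    stringsAsFactors = FALSE)

# bind_rows() accepts a single list of data frames.
# The list names become the labels stored in the .id column.
bands <- list(Beatles = band1, Stones = band2)
combined <- bind_rows(bands, .id = "band")

print(combined)
```

Passing a named list is convenient when the tables come from a loop or from lapply(), since the number of tables does not need to be known in advance.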

  8. JOINING DATA IN R WITH DPLYR Let’s practice!

  9. JOINING DATA IN R WITH DPLYR Build a better data frame

  10. ● data.frame()
      ● as.data.frame()
      ● data_frame()
      ● as_data_frame()

  11. data.frame() defaults

      ● Changes strings to factors (the default before R 4.0.0; since then stringsAsFactors defaults to FALSE)
      ● Adds row names
      ● Changes unusual column names
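The three defaults can be seen in a small base R sketch. The data and the names born, df, and strict are illustrative assumptions; stringsAsFactors = TRUE is set explicitly so the factor conversion also shows up on R 4.0.0 and later.

```r
# A named vector: data.frame() reuses the names as row names.
born <- c(John = 1940, Paul = 1942)

df <- data.frame(
  `year born` = born,                 # unusual column name (contains a space)
  instrument = c("Guitar", "Bass"),
  stringsAsFactors = TRUE             # the pre-4.0.0 default, set explicitly
)

names(df)             # check.names = TRUE rewrote the name as "year.born"
rownames(df)          # "John" "Paul"
class(df$instrument)  # "factor"

# Opting out of the name and type changes explicitly.
strict <- data.frame(
  `year born` = born,
  instrument = c("Guitar", "Bass"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
names(strict)             # "year born" "instrument"
class(strict$instrument)  # "character"
```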

  12. data_frame()

      > data_frame(
      +   Beatles = c("John", "Paul", "George", "Ringo"),
      +   Stones = c("Mick", "Keith", "Charlie", "Ronnie"),
      +   Zeppelins = c("Robert", "Jimmy", "John Paul", "John")
      + )
      # A tibble: 4 × 3
        Beatles  Stones Zeppelins
          <chr>   <chr>     <chr>
      1    John    Mick    Robert
      2    Paul   Keith     Jimmy
      3  George Charlie John Paul
      4   Ringo  Ronnie      John

  13. data_frame()

      data_frame() will not…
      ● Change the data type of vectors (e.g. strings to factors)
      ● Add row names
      ● Change column names
      ● Recycle vectors greater than length one

  14. data_frame()

      ● Evaluates arguments lazily, in order

        > data_frame(
        +   numbers = 1:5,
        +   squares = numbers ^ 2
        + )
        # A tibble: 5 × 2
          numbers squares
            <int>   <dbl>
        1       1       1
        2       2       4
        3       3       9
        4       4      16
        5       5      25

      ● Returns a tibble
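In current versions of the tibble package, data_frame() is deprecated in favor of tibble(). The sketch below, assuming the tibble package is installed, uses tibble() to illustrate the same guarantees; the object names tbl, recycled, and odd are illustrative.

```r
library(tibble)

# Arguments are evaluated in order, so a later column can use an earlier one.
tbl <- tibble(
  numbers = 1:5,
  squares = numbers ^ 2
)

# Only length-one vectors are recycled; a length-two vector against
# four rows is rejected instead of being silently repeated.
recycled <- tryCatch(
  tibble(a = 1:4, b = 1:2),
  error = function(e) "error"
)

# Column names and types are left alone.
odd <- tibble(`first name` = c("John", "Paul"))
names(odd)       # "first name"
class(odd[[1]])  # "character"
```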

  15. as_data_frame()

  16. JOINING DATA IN R WITH DPLYR Let’s practice!

  17. JOINING DATA IN R WITH DPLYR Working with data types

  18. > 1 + 1
      [1] 2
      > "one" + "one"
      Error in "one" + "one" : non-numeric argument to binary operator

  19. [Diagram: combining a Character with a Number. Is the result a Character or a Number?]

  20. Atomic data types

      Logical
        > typeof(TRUE)
        [1] "logical"
      Character (i.e. string)
        > typeof("hello")
        [1] "character"
      Double (i.e. numeric w/ decimal)
        > typeof(3.14)
        [1] "double"
      Integer (i.e. numeric w/o decimal)
        > typeof(1L)
        [1] "integer"
      Complex
        > typeof(1 + 2i)
        [1] "complex"
      Raw
        > typeof(raw(1))
        [1] "raw"

  21. Classes

      > x <- c(1L, 2L, 3L, 2L)
      > x
      [1] 1 2 3 2
      > typeof(x)
      [1] "integer"
      > class(x)
      [1] "integer"
      > attributes(x) <- list(class = "factor", levels = c("A", "B", "C", "D"))
      > x
      [1] A B C B
      Levels: A B C D
      > typeof(x)
      [1] "integer"
      > class(x)
      [1] "factor"

      Integer codes map to levels: 1L = A, 2L = B, 3L = C, 4L = D

  22. JOINING DATA IN R WITH DPLYR Let’s practice!

  23. JOINING DATA IN R WITH DPLYR dplyr's coercion rules

  24. [Diagram: combining a Character with a Number. Is the result a Character or a Number?]

  25. as.character(): Integer, Double, Logical -> Character (string)
      as.numeric():   Character, Integer, Logical (TRUE -> 1, FALSE -> 0) -> Double
      as.integer():   Double, Logical (TRUE -> 1, FALSE -> 0) -> Integer
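The conversions on this slide can be tried directly in base R. This is a small sketch with illustrative values; note that as.integer() truncates toward zero rather than rounding.

```r
# To character
chr_from_int <- as.character(1L)    # "1"
chr_from_dbl <- as.character(2.5)   # "2.5"
chr_from_lgl <- as.character(TRUE)  # "TRUE"

# To double
dbl_from_chr <- as.numeric("3.14")  # 3.14
dbl_from_int <- as.numeric(2L)      # 2
dbl_from_lgl <- as.numeric(TRUE)    # 1

# To integer
int_from_dbl <- as.integer(3.9)     # 3 (truncated, not rounded)
int_from_lgl <- as.integer(FALSE)   # 0

typeof(dbl_from_int)  # "double"
typeof(int_from_dbl)  # "integer"
```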

  26. factors

      # x is a factor
      > x
      [1] A B C B
      Levels: A B C D
      # How is x stored?
      > unclass(x)
      [1] 1 2 3 2
      attr(,"levels")
      [1] "A" "B" "C" "D"
      > as.character(x)
      [1] "A" "B" "C" "B"
      > as.numeric(x)
      [1] 1 2 3 2

  27. factors

      # y is a factor
      > y <- factor(c(5, 6, 7, 6))
      > y
      [1] 5 6 7 6
      Levels: 5 6 7
      > unclass(y)
      [1] 1 2 3 2
      attr(,"levels")
      [1] "5" "6" "7"
      > as.character(y)
      [1] "5" "6" "7" "6"
      > as.numeric(y)
      [1] 1 2 3 2
      > as.numeric(as.character(y))
      [1] 5 6 7 6
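The pitfall above (as.numeric() returning the integer codes instead of the labels) can be wrapped in a small helper. This is a sketch in base R; factor_to_numeric is a hypothetical name, not a function from dplyr or base R.

```r
# Hypothetical helper: recover the numeric values encoded in a factor's labels.
factor_to_numeric <- function(f) {
  stopifnot(is.factor(f))
  # Convert the levels once, then index them by the factor's integer codes.
  as.numeric(levels(f))[f]
}

y <- factor(c(5, 6, 7, 6))

codes  <- as.numeric(y)        # 1 2 3 2 (the internal codes)
values <- factor_to_numeric(y) # 5 6 7 6 (the labels as numbers)
```

Converting the levels rather than every element gives the same result as as.numeric(as.character(y)) while converting each distinct label only once.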

  28. dplyr's coercion behavior

      ● dplyr functions will not automatically coerce data types
        ● Returns an error
        ● Expects you to manually coerce data
      ● Exception: factors
        ● dplyr converts non-aligning factors to strings
        ● Gives warning message
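A minimal sketch of manual coercion before a bind, assuming dplyr is installed. The tables members and more_members are illustrative. Exact messages, and the handling of other type pairs, depend on the dplyr version, so only the character versus integer mismatch is shown here.

```r
library(dplyr)

# The id column is character in one table and integer in the other.
members <- data.frame(id = c("1", "2"),
                      name = c("John", "Paul"),
                      stringsAsFactors = FALSE)
more_members <- data.frame(id = c(3L, 4L),
                           name = c("George", "Ringo"),
                           stringsAsFactors = FALSE)

# Binding as-is is expected to fail; capture the message instead of stopping.
attempt <- tryCatch(
  bind_rows(members, more_members),
  error = function(e) conditionMessage(e)
)

# Coerce explicitly so both id columns share a type, then bind.
fixed <- bind_rows(
  members,
  mutate(more_members, id = as.character(id))
)
```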

  29. JOINING DATA IN R WITH DPLYR Let’s practice!
