Introduction to Tidy Data


Introduction to Tidy Data. Working with Data in the Tidyverse. Alison Hill, Professor & Data Scientist. The Great British Bake Off, Series 8.


  1. Introduction to Tidy Data. Working with Data in the Tidyverse. Alison Hill, Professor & Data Scientist


  4. The Great British Bake Off, Series 8


  6. Tame but un-tidy

          juniors_untidy

          # A tibble: 4 x 4
            baker  cinnamon_1 cardamom_2 nutmeg_3
            <chr>       <int>      <int>    <int>
          1 Emma            1          0        1
          2 Harry           1          1        1
          3 Ruby            1          0        1
          4 Zainab          0         NA        0

  7. Tidy data

          juniors_tidy

          # A tibble: 12 x 4
             baker  spice    order correct
             <chr>  <chr>    <int>   <int>
           1 Emma   cinnamon     1       1
           2 Harry  cinnamon     1       1
           3 Ruby   cinnamon     1       1
           4 Zainab cinnamon     1       0
           5 Emma   cardamom     2       0
           6 Harry  cardamom     2       1
           7 Ruby   cardamom     2       0
           8 Zainab cardamom     2      NA
           9 Emma   nutmeg       3       1
          10 Harry  nutmeg       3       1
          11 Ruby   nutmeg       3       1
          12 Zainab nutmeg       3       0

  8. Who won? Count it!

          juniors_tidy %>%
            count(baker, wt = correct)

          # A tibble: 4 x 2
            baker      n
            <chr>  <int>
          1 Emma       2
          2 Harry      3
          3 Ruby       2
          4 Zainab     0

  9. Who won? Plot it!

          ggplot(juniors_tidy, aes(baker, correct)) +
            geom_col()

  10. Which spice was the hardest to guess? Count it!

          juniors_tidy %>%
            count(spice, wt = correct)

  11. Which spice was the hardest to guess? Plot it!

          ggplot(juniors_tidy, aes(spice, correct)) +
            geom_col()
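The two counting questions above can be tried end to end with a small sketch. It assumes `juniors_tidy` is rebuilt by hand from the values printed on slide 7 (the deck does not show how the data set is loaded), and the names `by_baker` and `by_spice` are only illustrative.

```r
library(tibble)
library(dplyr)

# Assumption: values typed in from the slide 7 printout, not loaded from a file
juniors_tidy <- tribble(
  ~baker,   ~spice,     ~order, ~correct,
  "Emma",   "cinnamon", 1L,     1L,
  "Harry",  "cinnamon", 1L,     1L,
  "Ruby",   "cinnamon", 1L,     1L,
  "Zainab", "cinnamon", 1L,     0L,
  "Emma",   "cardamom", 2L,     0L,
  "Harry",  "cardamom", 2L,     1L,
  "Ruby",   "cardamom", 2L,     0L,
  "Zainab", "cardamom", 2L,     NA_integer_,
  "Emma",   "nutmeg",   3L,     1L,
  "Harry",  "nutmeg",   3L,     1L,
  "Ruby",   "nutmeg",   3L,     1L,
  "Zainab", "nutmeg",   3L,     0L
)

# Who won? Sum the `correct` column within each baker
by_baker <- juniors_tidy %>% count(baker, wt = correct)

# Which spice was hardest? Sum the `correct` column within each spice
by_spice <- juniors_tidy %>% count(spice, wt = correct)

print(by_baker)
print(by_spice)
```

Because every row is one guess, the same `count()` call answers both questions; only the grouping column changes. The spice with the smallest `n` is the one the bakers guessed correctly least often.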

  12. Insert title here ...

  13. Let's get to work!

  14. Gather

  15. The `tidyr` package: http://tidyr.tidyverse.org

  16. Gather: usage (`?gather`)

  17. Gather: arguments (`?gather`)

  18. Gathering juniors

  19. Gathering what you have into what you want

  20. The key column

  21. The key column

  22. The key column

  23. The value column

  24. The value column

  25. The value column
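The key and value columns can be seen in a minimal sketch of `gather()`. It assumes `juniors_untidy` is typed in from the slide 6 printout; `juniors_long` is an illustrative name that does not appear in the deck.

```r
library(tibble)
library(tidyr)

# Assumption: values typed in from the slide 6 printout
juniors_untidy <- tribble(
  ~baker,   ~cinnamon_1, ~cardamom_2, ~nutmeg_3,
  "Emma",   1L,          0L,          1L,
  "Harry",  1L,          1L,          1L,
  "Ruby",   1L,          0L,          1L,
  "Zainab", 0L,          NA_integer_, 0L
)

# key   = name of the new column that will hold the old column names
# value = name of the new column that will hold the old cell values
# -baker means: gather every column except `baker`
juniors_long <- juniors_untidy %>%
  gather(key = "spice", value = "correct", -baker)

print(juniors_long)
```

Four bakers times three gathered columns gives twelve rows, one per guess. The `spice` column still holds the original column names, such as `cinnamon_1`.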

  26. A little trick

  27. Let's get to work!

  28. Separate

  29. Gathering the juniors data

  30. Separate: usage (`?separate`)

  31. Separate: arguments (`?separate`)

  32. Separating what you have into what you want

  33. Separate `spice`

  34. Reminder: pre-separate

          juniors_untidy %>%
            gather(key = spice, value = correct, -baker)

          # A tibble: 12 x 3
             baker  spice      correct
             <chr>  <chr>        <int>
           1 Emma   cinnamon_1       1
           2 Harry  cinnamon_1       1
           3 Ruby   cinnamon_1       1
           4 Zainab cinnamon_1       0
           5 Emma   cardamom_2       0
           6 Harry  cardamom_2       1
           7 Ruby   cardamom_2       0
           8 Zainab cardamom_2      NA
           9 Emma   nutmeg_3         1
          10 Harry  nutmeg_3         1
          11 Ruby   nutmeg_3         1
          12 Zainab nutmeg_3         0

  35. Gather and separate

          juniors_untidy %>%
            gather(key = "spice", value = "correct", -baker) %>%
            separate(spice, into = c("spice", "order"))

          # A tibble: 12 x 4
             baker  spice    order correct
             <chr>  <chr>    <chr>   <int>
           1 Emma   cinnamon 1           1
           2 Harry  cinnamon 1           1
           3 Ruby   cinnamon 1           1
           4 Zainab cinnamon 1           0
           5 Emma   cardamom 2           0
           6 Harry  cardamom 2           1
           7 Ruby   cardamom 2           0
           8 Zainab cardamom 2          NA
           9 Emma   nutmeg   3           1
          10 Harry  nutmeg   3           1
          11 Ruby   nutmeg   3           1
          12 Zainab nutmeg   3           0

  36. Gather, separate, and convert types

          juniors_untidy %>%
            gather(key = "spice", value = "correct", -baker) %>%
            separate(spice, into = c("spice", "order"), convert = TRUE)

          # A tibble: 12 x 4
             baker  spice    order correct
             <chr>  <chr>    <int>   <int>
           1 Emma   cinnamon     1       1
           2 Harry  cinnamon     1       1
           3 Ruby   cinnamon     1       1
           4 Zainab cinnamon     1       0
           5 Emma   cardamom     2       0
           6 Harry  cardamom     2       1
           7 Ruby   cardamom     2       0
           8 Zainab cardamom     2      NA
           9 Emma   nutmeg       3       1
          10 Harry  nutmeg       3       1
          11 Ruby   nutmeg       3       1
          12 Zainab nutmeg       3       0

  37. Before and after separate

          # A tibble: 12 x 3                # A tibble: 12 x 4
             baker  spice      correct       baker  spice    order correct
             <chr>  <chr>        <int>       <chr>  <chr>    <int>   <int>
           1 Emma   cinnamon_1       1     1 Emma   cinnamon     1       1
           2 Harry  cinnamon_1       1     2 Harry  cinnamon     1       1
           3 Ruby   cinnamon_1       1     3 Ruby   cinnamon     1       1
           4 Zainab cinnamon_1       0     4 Zainab cinnamon     1       0
           5 Emma   cardamom_2       0     5 Emma   cardamom     2       0
           6 Harry  cardamom_2       1     6 Harry  cardamom     2       1
           7 Ruby   cardamom_2       0     7 Ruby   cardamom     2       0
           8 Zainab cardamom_2      NA     8 Zainab cardamom     2      NA
           9 Emma   nutmeg_3         1     9 Emma   nutmeg       3       1
          10 Harry  nutmeg_3         1    10 Harry  nutmeg       3       1
          11 Ruby   nutmeg_3         1    11 Ruby   nutmeg       3       1
          12 Zainab nutmeg_3         0    12 Zainab nutmeg       3       0

  38. The `sep` argument (`?separate`)
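The slide content for `sep` is not preserved here, so the following is only a sketch of how the argument can be used. By default `separate()` splits on any run of non-alphanumeric characters, which is why the earlier calls worked without `sep`; passing `sep = "_"` makes the split point explicit. `juniors_untidy` is again typed in from the slide 6 printout, and `juniors_sep` is an illustrative name.

```r
library(tibble)
library(tidyr)

# Assumption: values typed in from the slide 6 printout
juniors_untidy <- tribble(
  ~baker,   ~cinnamon_1, ~cardamom_2, ~nutmeg_3,
  "Emma",   1L,          0L,          1L,
  "Harry",  1L,          1L,          1L,
  "Ruby",   1L,          0L,          1L,
  "Zainab", 0L,          NA_integer_, 0L
)

juniors_sep <- juniors_untidy %>%
  gather(key = "spice", value = "correct", -baker) %>%
  # Split "cinnamon_1" into "cinnamon" and "1" at the underscore;
  # convert = TRUE turns the new `order` column into integers
  separate(spice, into = c("spice", "order"), sep = "_", convert = TRUE)

print(juniors_sep)
```

Being explicit about `sep` matters most when values contain more than one kind of punctuation, because the default would split on all of them.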

  39. Let's practice!

  40. Spread

  41. Gather

  42. Spread

  43. Spread

  44. Spread: usage (`?spread`)

  45. Spread: arguments (`?spread`)

  46. Using spread

          juniors_jumbled

          # A tibble: 12 x 3
             baker  key     value
             <chr>  <chr>   <chr>
           1 Emma   age     11
           2 Harry  age     10
           3 Ruby   age     11
           4 Zainab age     10
           5 Emma   outcome finalist
           6 Harry  outcome winner
           7 Ruby   outcome finalist
           8 Zainab outcome finalist
           9 Emma   spices  2
          10 Harry  spices  3
          11 Ruby   spices  2
          12 Zainab spices  0

          juniors_jumbled %>%
            spread(key = key, value = value)

          # A tibble: 4 x 4
            baker  age   outcome  spices
            <chr>  <chr> <chr>    <chr>
          1 Emma   11    finalist 2
          2 Harry  10    winner   3
          3 Ruby   11    finalist 2
          4 Zainab 10    finalist 0

  47. Spread and convert

          juniors_jumbled

          # A tibble: 12 x 3
             baker  key     value
             <chr>  <chr>   <chr>
           1 Emma   age     11
           2 Harry  age     10
           3 Ruby   age     11
           4 Zainab age     10
           5 Emma   outcome finalist
           6 Harry  outcome winner
           7 Ruby   outcome finalist
           8 Zainab outcome finalist
           9 Emma   spices  2
          10 Harry  spices  3
          11 Ruby   spices  2
          12 Zainab spices  0

          juniors_jumbled %>%
            spread(key = key, value = value, convert = TRUE)

          # A tibble: 4 x 4
            baker    age outcome  spices
            <chr>  <int> <chr>     <int>
          1 Emma      11 finalist      2
          2 Harry     10 winner        3
          3 Ruby      11 finalist      2
          4 Zainab    10 finalist      0
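The spread step can be reproduced with a short sketch. It assumes `juniors_jumbled` is typed in from the printout on the slides above, with every `value` stored as text; `juniors_wide` is an illustrative name.

```r
library(tibble)
library(tidyr)

# Assumption: values typed in from the slide printout; `value` is character
juniors_jumbled <- tribble(
  ~baker,   ~key,      ~value,
  "Emma",   "age",     "11",
  "Harry",  "age",     "10",
  "Ruby",   "age",     "11",
  "Zainab", "age",     "10",
  "Emma",   "outcome", "finalist",
  "Harry",  "outcome", "winner",
  "Ruby",   "outcome", "finalist",
  "Zainab", "outcome", "finalist",
  "Emma",   "spices",  "2",
  "Harry",  "spices",  "3",
  "Ruby",   "spices",  "2",
  "Zainab", "spices",  "0"
)

# Each distinct entry in `key` becomes its own column;
# convert = TRUE re-guesses the type of every new column
juniors_wide <- juniors_jumbled %>%
  spread(key = key, value = value, convert = TRUE)

print(juniors_wide)
```

Without `convert = TRUE` all three new columns stay character, because they came out of a single character column. With it, `age` and `spices` become integers while `outcome` stays character.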

  48. Spread review

  49. Let's practice!

  50. Tidy multiple sets of columns
