Confessions and Countermeasures: Some of my not ideal R habits and how the Tidyverse resolved them



SLIDE 1

Confessions and Countermeasures:

Some of my not ideal R habits and how the Tidyverse resolved them

Rachael Workman PhD student, BCMB

SLIDE 2

https://osf.io/69gub/


SLIDE 3

This year I… Made data import harder than it had to be

Excel with multiple sheets → Open, select sheet of interest → Save worksheet as CSV → Import using base R into dataframe

VS
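The code shown on the slide is not preserved in this transcript. A minimal sketch of the shorter route, assuming the readxl package (an assumption, the slide text does not name it) and using the example workbook that ships with readxl in place of a real file:

```r
library(readxl)

# Example workbook bundled with readxl; stands in for your own .xlsx file
path <- readxl_example("datasets.xlsx")

# List the sheets without opening Excel
sheet_names <- excel_sheets(path)

# Read the sheet of interest straight into a tibble, no CSV detour
cars <- read_excel(path, sheet = "mtcars")
```

`read_excel()` returns a tibble, so the result can go directly into the rest of a Tidyverse workflow.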

SLIDE 4

This year I… Made data import harder than it had to be

Benefits to tibbles over dataframes

  • 1. Tibbles print nicely, they show the data type of each column, and if you subset one, it returns another tibble.
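The subsetting difference can be seen with a small made-up table (the column names here are purely illustrative):

```r
library(tibble)

df <- data.frame(x = 1:3, y = c("a", "b", "c"))
tb <- as_tibble(df)

# Base data frame: selecting a single column with [ drops to a plain vector
df_col <- df[, "x"]

# Tibble: the same operation returns another tibble
tb_col <- tb[, "x"]

class(df_col)
class(tb_col)

# Printing a tibble shows its dimensions and each column's type
print(tb)
```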

vs

SLIDE 5

This year I… Did calculations in Excel and reimported my dataset

Excel with multiple sheets → Open, select sheet of interest → Save worksheet as CSV → Import using base R into dataframe → Realized I needed to compute the sum of two columns → Opened Excel file → Calculated sum in Excel → Resaved as CSV → Reimported into R

VS

Code annotations: column name of new column; two numerical columns to add together
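The annotated code itself is missing from the transcript. A sketch of what the annotations describe, assuming dplyr's `mutate()` and a made-up table whose columns `a`, `b`, and `total` are hypothetical names:

```r
library(dplyr)

# Hypothetical data standing in for the imported sheet
measurements <- tibble(a = c(1, 2, 3), b = c(10, 20, 30))

# total = column name of the new column
# a + b = the two numerical columns to add together
measurements <- measurements %>%
  mutate(total = a + b)
```

The new column is computed inside R, so there is no need to go back to Excel and reimport.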

SLIDE 6

This year I… Saved too many intermediate objects

  • The pipe operator is your friend
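As an illustration (using the built-in `mtcars` data rather than anything from the talk), the same summary can be written with intermediate objects or as one piped chain:

```r
library(dplyr)

# With intermediate objects: three names to keep track of
cars_filtered <- filter(mtcars, cyl == 4)
cars_grouped  <- group_by(cars_filtered, gear)
cars_summary  <- summarise(cars_grouped, mean_mpg = mean(mpg))

# With the pipe: each step feeds the next, nothing extra is saved
piped_summary <- mtcars %>%
  filter(cyl == 4) %>%
  group_by(gear) %>%
  summarise(mean_mpg = mean(mpg))
```

Both versions produce the same table; the piped one reads top to bottom in the order the steps happen.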

VS

SLIDE 7

This year I… Read in a bunch of similar datasets one at a time

…for 12 files, which I then concatenated…
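One possible alternative, sketched with purrr and readr (an assumption, since the slide's code is not in the transcript). The folder and file names are hypothetical, and three small CSVs are written to a temporary directory so the example runs on its own:

```r
library(purrr)
library(dplyr)
library(readr)

# Hypothetical folder of similar files, created here only for the demo
data_dir <- file.path(tempdir(), "plates_demo")
dir.create(data_dir, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(
    data.frame(plate = i, value = c(i, i * 10)),
    file.path(data_dir, sprintf("plate_%02d.csv", i)),
    row.names = FALSE
  )
}

# Find every matching file, read each one, and stack the results
files <- list.files(data_dir, pattern = "^plate_.*\\.csv$", full.names = TRUE)
all_plates <- files %>%
  map(read_csv) %>%
  bind_rows()
```

The same three lines at the end work whether the folder holds 3 files or 12.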

VS

SLIDE 8

On that note - why care about reducing duplication?

  • “It’s easier to see the intent of your code, because your eyes are drawn to what’s different, not what stays the same.

  • It’s easier to respond to changes in requirements. As your needs change, you only need to make changes in one place, rather than remembering to change every place that you copied-and-pasted the code.

  • You’re likely to have fewer bugs because each line of code is used in more places.”

-- R for Data Science, Grolemund and Wickham
SLIDE 9

This year I… Did a lot of plotting using default color schemes

ggplot color options: why go past default?

1. Colorblind-friendly graphs
2. Demonstrate a point
3. Just stand out

geom_bar() geom_freqpoly()

SLIDE 10

Make your own colorblind friendly palette for ggplot
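The palette from the slide is not preserved. A sketch using a widely circulated set of eight colorblind-friendly hex codes (the Okabe-Ito colors plus grey, chosen here as an assumption) and ggplot2's built-in `mpg` data:

```r
library(ggplot2)

# Eight colorblind-friendly colors; cb_palette is an illustrative name
cb_palette <- c("#999999", "#E69F00", "#56B4E9", "#009E73",
                "#F0E442", "#0072B2", "#D55E00", "#CC79A7")

# Use scale_fill_manual() for fills, scale_colour_manual() for lines and points
p <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar() +
  scale_fill_manual(values = cb_palette)
```

Storing the palette in one vector means every plot in a script can reuse it instead of repeating the hex codes.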

SLIDE 11

More palettes

RColorBrewer
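A short sketch of how RColorBrewer palettes might be used with ggplot2; the choice of the "Dark2" palette and the `mpg` data are illustrative assumptions, not taken from the slide:

```r
library(ggplot2)
library(RColorBrewer)

# Names of the palettes RColorBrewer flags as colorblind-friendly
cb_safe <- rownames(brewer.pal.info)[brewer.pal.info$colorblind]

# Pull three colors from one palette as hex codes
dark2 <- brewer.pal(n = 3, name = "Dark2")

# Or hand the palette name straight to ggplot2
p <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar() +
  scale_fill_brewer(palette = "Dark2")
```

`display.brewer.all(colorblindFriendly = TRUE)` draws all of the flagged palettes for browsing.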