hello SWC bootcamp, Dr. Jennifer (Jenny) Bryan




slide-1
SLIDE 1

hello SWC bootcamp

  • Dr. Jennifer (Jenny) Bryan

Department of Statistics and Michael Smith Laboratories University of British Columbia

slide-2
SLIDE 2

write code for humans, write data for computers

slide-3
SLIDE 3
slide-4
SLIDE 4

A place for everything and everything in its place

slide-5
SLIDE 5

source is real

slide-6
SLIDE 6

“The source code is real. The objects are realizations of the source code. Source for EVERY user modified object is placed in a particular directory or directories, for later editing and retrieval.”

  • from the Emacs Speaks Statistics (ESS) manual

source is real

slide-7
SLIDE 7

Names matter

slide-8
SLIDE 8

minimize the creation of excerpts and copies of your data ... it will just confuse you later

ain’t nothing like the real thing

slide-9
SLIDE 9

reshape your data

as in real life, it has a tendency to get short and fat, when you’d really prefer tall and skinny
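
A minimal sketch of going from short and fat to tall and skinny, using base R's reshape(). The data frame and its column names (subject, week1 to week3, score) are made up for illustration and do not come from the talk.

```r
# Hypothetical "short and fat" data: one row per subject, one column per week.
wide <- data.frame(
  subject = c("s1", "s2", "s3"),
  week1 = c(5.1, 4.8, 6.0),
  week2 = c(5.4, 4.9, 6.3),
  week3 = c(5.9, 5.2, 6.1)
)

# Reshape to "tall and skinny": one row per subject-week combination.
tall <- reshape(
  wide,
  direction = "long",
  varying = c("week1", "week2", "week3"),  # columns to stack
  v.names = "score",                       # name for the stacked values
  timevar = "week",                        # new column recording the week
  times = 1:3,
  idvar = "subject"
)
rownames(tall) <- NULL

dim(wide)  # 3 rows, 4 columns
dim(tall)  # 9 rows, 3 columns
```

The tall version has one observation per row, which is the shape most modelling and plotting functions expect.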

slide-10
SLIDE 10

you won’t believe how important this is:

  • what is your working directory?
  • where is the file or executable you need?

you need to be fluent with file paths
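
A short sketch of the base R functions that answer those two questions. The path results/avgX.txt is only an example here; nothing in this snippet assumes the file exists.

```r
# Where is R looking right now?
wd <- getwd()

# Build paths from pieces instead of pasting strings together by hand.
# This is a relative path: R resolves it against the working directory.
results_file <- file.path("results", "avgX.txt")

# Take a path apart again.
dirname(results_file)
basename(results_file)

# Check before you read: is the file where you think it is?
file.exists(results_file)

# The same location, spelled out from the working directory.
file.path(wd, results_file)
```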

slide-11
SLIDE 11

tough love:

get better at typing

  • typos matter
  • case matters
  • th_is is different from th-is
  • spaces in filenames are EVIL

you want the computer to do tedious work for you, right? then you must give precise instructions
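
A small sketch of why these details bite in R. The object values and the helper fix_filename() are invented for illustration.

```r
# Case matters: avgX and avgx are two different names.
avgX <- 0.5
exists("avgX")
exists("avgx")

# An underscore is part of a name; a hyphen is the subtraction operator.
th_is <- 1
th <- 10
is <- 3
th_is    # the object named th_is
th - is  # what R sees when you type th-is: th minus is

# Hypothetical helper: replace runs of whitespace in a filename with "_".
fix_filename <- function(x) gsub("[[:space:]]+", "_", x)
fix_filename("my final data.csv")
```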

slide-12
SLIDE 12

a few remarks borrowed from Jonah Duckles’ intro (zero-entry R room)

slide-13
SLIDE 13
slide-14
SLIDE 14
slide-15
SLIDE 15
slide-16
SLIDE 16

end sermon

slide-17
SLIDE 17

any questions from first tutorial re: basic use of R via RStudio, RStudio projects?

you do know that R ≠ RStudio, right?

we use RStudio because it makes us happier in our work, but notice that nothing we produce -- no code, no figures, nothing -- requires RStudio to be created, appreciated or reused

you should master various ways of sending code from editor to Console so that saving scripts becomes your R Way of Life
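
To make the R ≠ RStudio point concrete, here is a sketch of running a saved script from any plain R session with source(). The script name demo_script.R and its two lines are made up for this demo.

```r
# Hypothetical script, written to a temporary directory for this demo.
script <- file.path(tempdir(), "demo_script.R")
writeLines(c(
  "x <- c(1, 2, 3)",
  "avgX <- mean(x)"
), script)

# Run the whole script; a fresh environment keeps its objects separate
# from whatever is already in the workspace.
env <- new.env()
source(script, local = env)
env$avgX
```

From a shell, running Rscript on the same file executes it without opening any editor at all.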

slide-18
SLIDE 18

weak links in the chain: process, packaging and presentation

slide-19
SLIDE 19

a <- 2
b <- 7
sigSq <- 0.5
n <- 400
set.seed(1234)
x <- runif(n)
y <- a + b * x + rnorm(n, sd = sqrt(sigSq))
(avgX <- mean(x))
write(avgX, "results/avgX.txt")
pdf("figs/niftyPlot.pdf")
plot(x, y)
abline(a, b, col = "blue", lwd = 2)
dev.off()

I assume your toyline.R script looks similar to this.
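
One thing the script takes for granted: write() and pdf() do not create directories, so results/ and figs/ must exist before it runs. A small sketch of guarding against that follows. ensure_dirs() is a hypothetical helper, and the demo uses a temporary directory as a stand-in for the project root.

```r
# Hypothetical helper: create each directory under `base` if it is missing.
ensure_dirs <- function(dirs, base = ".") {
  paths <- file.path(base, dirs)
  for (p in paths) {
    if (!dir.exists(p)) dir.create(p, recursive = TRUE)
  }
  invisible(paths)
}

# In a real project, base = "." (the working directory) is the likely choice.
project_root <- tempdir()
paths <- ensure_dirs(c("results", "figs"), base = project_root)
dir.exists(paths)
```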

slide-20
SLIDE 20

a <- 2
b <- 7
sigSq <- 0.5
n <- 400
set.seed(1234)
x <- runif(n)
y <- a + b * x + rnorm(n, sd = sqrt(sigSq))
(avgX <- mean(x))
write(avgX, "results/avgX.txt")
pdf("figs/niftyPlot.pdf")
plot(x, y)
abline(a, b, col = "blue", lwd = 2)
dev.off()

It’s great we are saving important things to file with code -- versus letting them cruise by in the Console and/or saving via mouse clicks -- but we can do better.

slide-21
SLIDE 21

You did install knitr and its dependencies, as instructed in the set-up tutorial, right?

install.packages("knitr", dependencies = TRUE)

slide-22
SLIDE 22

You did get an RPubs account, as requested, right?

slide-23
SLIDE 23

a <- 2
b <- 7
sigSq <- 0.5
n <- 400
set.seed(1234)
x <- runif(n)
y <- a + b * x + rnorm(n, sd = sqrt(sigSq))
(avgX <- mean(x))
plot(x, y)
abline(a, b, col = "blue", lwd = 2)
sessionInfo()

Edit the script so it looks more like it did during development, when we were watching results and figures appear on the screen.

slide-24
SLIDE 24

Compile an HTML notebook.

Yes, this can be accomplished outside of RStudio, using knitr functions at the command line, so we are not creating an unhealthy dependency on RStudio.
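
One possible way to do this from a plain R session is knitr::spin(), which turns an R script into a report. The talk does not say which knitr function RStudio calls, so treat this as a sketch rather than a description of what the button does. compile_notebook() is a hypothetical wrapper, and toyline.R is assumed to sit in the working directory.

```r
# Hypothetical wrapper: compile a script with knitr, or explain why not.
compile_notebook <- function(script) {
  if (!requireNamespace("knitr", quietly = TRUE)) {
    message("knitr is not installed; see install.packages(\"knitr\")")
    return(invisible(NULL))
  }
  if (!file.exists(script)) {
    message("cannot find ", script, " from ", getwd())
    return(invisible(NULL))
  }
  # spin() converts the R script to a report and knits it.
  knitr::spin(script)
}

compile_notebook("toyline.R")
```

The same call can be made without opening R interactively by passing it to Rscript with the -e option.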

slide-25
SLIDE 25

I just accept all these defaults.

slide-26
SLIDE 26

This is where you’ll need that RPubs account.

slide-27
SLIDE 27
slide-28
SLIDE 28

Seems like a good idea to keep the script name and slug the same, at least as a default. Expect me to give you a naming convention for future STAT 545A coursework.

slide-29
SLIDE 29

http://rpubs.com/jennybc/toyline

voilà!