A battle plan: Defensive R Programming, Dr. Colin Gillespie


slide-1
SLIDE 1

A battle plan

DEFENSIVE R PROGRAMMING

  • Dr. Colin Gillespie

Jumping Rivers

slide-2
SLIDE 2

DEFENSIVE R PROGRAMMING

Starting small

"Great results can be achieved with small forces."
Sun Tzu, The Art of War

slide-3
SLIDE 3

DEFENSIVE R PROGRAMMING

What's in a (file)name?

  • All R scripts are stored in files
  • So filenames are important
  • Consistency in filenames is very important
  • What sort of rules could we have?

slide-4
SLIDE 4

DEFENSIVE R PROGRAMMING

Multiple words

Filenames often contain multiple words. For example:

  • cluster-analysis.R
  • load-survival-data.R
  • plot-residuals.R

slide-5
SLIDE 5

DEFENSIVE R PROGRAMMING

Spaces, dashes or underscores?

Simple question. How should words be separated?

  • Space: analysis clustering.R
  • Underscores: analysis_clustering.R
  • Dashes: analysis-clustering.R

Take a second and answer these two questions:

  • Which do you use?
  • What should you use?

slide-6
SLIDE 6

DEFENSIVE R PROGRAMMING

Spaces in filenames

Don't use them. Really, just don't. Spaces in filenames and directories are a bad idea:

  • If you put the file on the web, file name.R becomes file%20name.R
  • The command line is harder: filenames with spaces need to be surrounded by quotes
  • Regular expressions are also more painful
  • Spotting the difference between the following two names is hard

file name.R - one space
file  name.R - two spaces
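The problems listed above can be reproduced in a few lines. This is a minimal sketch using base R only; the filenames are illustrative, and `type = "sh"` is set explicitly because the default quoting style of `shQuote()` differs between platforms.

```r
# Illustrative filenames: one with a space, one with a dash.
spaced <- "file name.R"
dashed <- "file-name.R"

# On the web, the space is percent-encoded; the dash is left alone.
utils::URLencode(spaced)   # "file%20name.R"
utils::URLencode(dashed)   # "file-name.R"

# On a POSIX shell, the spaced name must be quoted to stay one argument.
shQuote(spaced, type = "sh")   # "'file name.R'"

# One space versus two: hard to see, but these are different files.
"file name.R" == "file  name.R"   # FALSE
```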

slide-7
SLIDE 7

DEFENSIVE R PROGRAMMING

Dashes or underscores

There are a few minor problems with underscores:

  • Google treats file_name as a single word, so searching for just file won't work
  • The regular expression character class \w treats _ as a word character

The same problems don't apply to dashes.
Confession time: I usually use underscores, but I'm trying to change.
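The regular expression point can be checked directly. The sketch below assumes `perl = TRUE` so that `\b` and `\w` follow PCRE rules; the filenames are illustrative.

```r
# Searching for the whole word "file":
grepl("\\bfile\\b", "file_name.R", perl = TRUE)  # FALSE: "_" is a word character
grepl("\\bfile\\b", "file-name.R", perl = TRUE)  # TRUE: "-" ends the word

# Splitting on runs of non-word characters:
strsplit("file_name", "\\W+", perl = TRUE)[[1]]  # "file_name" (one piece)
strsplit("file-name", "\\W+", perl = TRUE)[[1]]  # "file" "name"
```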

slide-8
SLIDE 8

Let's have some practice

DEFENSIVE R PROGRAMMING

slide-9
SLIDE 9

Human Readable Filenames

DEFENSIVE R PROGRAMMING

  • Dr. Colin Gillespie

Jumping Rivers

slide-10
SLIDE 10

DEFENSIVE R PROGRAMMING

The humble slug

  • URL slugs are the end part of a web address
  • Which URL do you prefer?

www.datacamp.com/courses/course1963.htm

slide-11
SLIDE 11

DEFENSIVE R PROGRAMMING

We can learn from slugs

Use sensible names

  • ac.R or analysis-clustering.R
  • 1.R or loading.R

Be consistent

  • Use the same file extension - .R
  • Always lower case
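The consistency rules above (lower case, a single .R extension, dashes between words) are easy to check mechanically. The helper `is_tidy_filename()` below is hypothetical, written for illustration and not part of the slides; it cannot judge whether a name is sensible, only whether it follows the format.

```r
# Hypothetical helper: TRUE when a filename is lower case, separates
# words with dashes, and ends in the .R extension.
# perl = TRUE keeps the [a-z] range independent of the locale.
is_tidy_filename <- function(x) {
  grepl("^[a-z0-9]+(-[a-z0-9]+)*\\.R$", x, perl = TRUE)
}

candidates <- c("analysis-clustering.R", "loading.R",
                "Analysis Clustering.r", "analysis_clustering.R")
is_tidy_filename(candidates)  # TRUE TRUE FALSE FALSE
```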

slide-12
SLIDE 12

DEFENSIVE R PROGRAMMING

Dates - what do we want?

  • Unambiguous, so not 01/02/03
  • Sortable in a file system

slide-13
SLIDE 13

DEFENSIVE R PROGRAMMING

Dates - ISO8601

  • Dates should be YYYY-MM-DD
  • All dates are now in an obvious and natural order
  • Sorting just works!

2017-01-02
2018-01-01
2018-01-02
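A short sketch of why the ISO 8601 format sorts correctly as plain text, which is how a file system orders names. The day-first strings are an assumed alternative format added for contrast, and `method = "radix"` is used so the ordering does not depend on the locale.

```r
# ISO 8601 strings: alphabetical order is chronological order.
iso <- c("2018-01-02", "2017-01-02", "2018-01-01")
sort(iso, method = "radix")
# "2017-01-02" "2018-01-01" "2018-01-02"

# The same three dates written day-first (assumed DD/MM/YYYY) sort wrongly.
dmy <- c("02/01/2018", "02/01/2017", "01/01/2018")
sort(dmy, method = "radix")
# "01/01/2018" "02/01/2017" "02/01/2018"

# Converting a day-first date into ISO 8601.
format(as.Date("02/01/2017", format = "%d/%m/%Y"), "%Y-%m-%d")
# "2017-01-02"
```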

slide-14
SLIDE 14

DEFENSIVE R PROGRAMMING

Numbers are good

For this course, I created directories called

  • chapter01
  • chapter02

Simple, yet effective
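The leading zero is what makes numbered names sort properly once there are ten or more. A minimal sketch, assuming a hypothetical tenth chapter for the sake of the comparison:

```r
chapters <- c(1L, 2L, 10L)  # chapter 10 is hypothetical, added for contrast

padded   <- sprintf("chapter%02d", chapters)
unpadded <- paste0("chapter", chapters)

sort(padded, method = "radix")    # "chapter01" "chapter02" "chapter10"
sort(unpadded, method = "radix")  # "chapter1"  "chapter10" "chapter2"
```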

slide-15
SLIDE 15

Let's practice!

DEFENSIVE R PROGRAMMING

slide-16
SLIDE 16

Organizing a project

DEFENSIVE R PROGRAMMING

  • Dr. Colin Gillespie

Jumping Rivers

slide-17
SLIDE 17

DEFENSIVE R PROGRAMMING

It starts with something small

All R analyses start with a little code, but then

  • 1 line becomes 10
  • 1 imported package becomes 5
  • 1 file becomes a mess

slide-18
SLIDE 18

DEFENSIVE R PROGRAMMING

Project Set-up

Every project I work on

  • Has its own directory
  • Has a sensible name

The directory name gives the context of the scripts

slide-19
SLIDE 19

DEFENSIVE R PROGRAMMING

Directory: input/

This directory contains data, typically CSV & Excel files

  • No R code
  • Data is only edited in R

slide-20
SLIDE 20

DEFENSIVE R PROGRAMMING

Directory: R/

All R code lives in this directory. Notice that the directory isn't

  • R_analysis
  • R_code
  • R_survival

just plain R/

slide-21
SLIDE 21

DEFENSIVE R PROGRAMMING

Directory: R/

In this directory, I always have a file called

load.R

  • This file loads the data from input/
  • Every project I've worked on has a similar structure
  • I can give you any project and you can load the data

slide-22
SLIDE 22

DEFENSIVE R PROGRAMMING

The load.R file

All paths are relative

library(readr)   # provides read_csv()
library(readxl)  # provides read_xlsx()
battles <- read_csv("input/battles.csv")
foes <- read_xlsx("input/foes.xlsx")

My code is portable
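A self-contained sketch of why relative paths keep a project portable. It swaps in base R's `read.csv()` for `read_csv()` so it runs without extra packages; the project name, the temporary location and the data values are all made up for illustration.

```r
# A throwaway project directory, so the sketch can run anywhere.
project <- file.path(tempdir(), "battle-plan")   # hypothetical project name
dir.create(file.path(project, "input"), recursive = TRUE, showWarnings = FALSE)

# Made-up data standing in for input/battles.csv.
write.csv(data.frame(battle = c("first", "second"), foes = c(100, 250)),
          file.path(project, "input", "battles.csv"), row.names = FALSE)

# Work from the project root; the path below never mentions where
# the project lives on disk, so it works on any machine.
old_wd <- setwd(project)
battles <- read.csv(file.path("input", "battles.csv"))
setwd(old_wd)

nrow(battles)  # 2
```

An absolute path such as one starting with a home directory would break as soon as the project is moved or shared.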

slide-23
SLIDE 23

DEFENSIVE R PROGRAMMING

Other R files

Remember, all R files live in the R directory!

  • clean.R - for cleaning your data
  • function.R - any helper functions
  • analysis.R - the actual analysis

Standard names used in every project

slide-24
SLIDE 24

Your turn

DEFENSIVE R PROGRAMMING

slide-25
SLIDE 25

Graphics and Output

DEFENSIVE R PROGRAMMING

  • Dr. Colin Gillespie

Jumping Rivers

slide-26
SLIDE 26

DEFENSIVE R PROGRAMMING

Project overview 1

So far we have encountered the base project directory

slide-27
SLIDE 27

DEFENSIVE R PROGRAMMING

Project overview 2

So far we have encountered the base project directory

  • input/ for data files

slide-28
SLIDE 28

DEFENSIVE R PROGRAMMING

Project overview 3

So far we have encountered the base project directory

  • input/ for data files
  • R/ for R scripts

slide-29
SLIDE 29

DEFENSIVE R PROGRAMMING

Project overview 4

So far we have encountered the base project directory

  • R/ for R scripts
  • input/ for data sets

In this last video, we'll look at

  • output/ for generated data files
slide-30
SLIDE 30

DEFENSIVE R PROGRAMMING

Project overview 5

So far we have encountered the base project directory

  • R/ for R scripts
  • input/ for data sets

In this last video, we'll look at

  • output/ for generated data files
  • graphics/ for generated plots
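The whole layout can be created in a few lines. This is a sketch under stated assumptions: the project name is hypothetical, the project is placed in a temporary directory, and the script files are created empty using the standard names from the slides.

```r
# Hypothetical project name, created under tempdir() for safety.
project <- file.path(tempdir(), "survival-analysis")

# The four standard directories.
dirs <- c("input", "R", "output", "graphics")
for (d in dirs) {
  dir.create(file.path(project, d), recursive = TRUE, showWarnings = FALSE)
}

# Empty placeholders for the standard scripts, all inside R/.
scripts <- c("load.R", "clean.R", "function.R", "analysis.R", "graphics.R")
created <- file.create(file.path(project, "R", scripts))

list.files(project, recursive = TRUE)
```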

slide-31
SLIDE 31

DEFENSIVE R PROGRAMMING

The difference

The scripts in the R/ directory create the contents of output/ & graphics/

So in theory, we can delete output/ & graphics/ and not cry

slide-32
SLIDE 32

DEFENSIVE R PROGRAMMING

The graphics/ directory

  • This directory just contains graphics!
  • In my R/ directory I have an imaginatively named script, graphics.R, that generates all graphics
  • Make sure to use relative paths!
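A sketch of what a graphics.R script might contain. The `pdf()` device, the built-in `mtcars` data and the plot filename are stand-ins chosen so the example runs anywhere; `tempdir()` plays the role of the project root.

```r
old_wd <- setwd(tempdir())          # stand-in for the project root
dir.create("graphics", showWarnings = FALSE)

# Relative path: the plot always lands in graphics/ inside the project.
plot_path <- file.path("graphics", "mpg-histogram.pdf")
pdf(plot_path)
hist(mtcars$mpg, main = "Miles per gallon", xlab = "mpg")
dev.off()

plot_created <- file.exists(plot_path)

# The directory holds generated files only, so it can be deleted
# and rebuilt by running the script again.
unlink("graphics", recursive = TRUE)
setwd(old_wd)
```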

slide-33
SLIDE 33

DEFENSIVE R PROGRAMMING

The output/ directory

This directory contains output. For example:

  • List of significant variables, perhaps with p-values
  • Data for the next analysis

Personally, I typically don't use this directory
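One way such a list of significant variables could be written to output/. This is a sketch only: the linear model on the built-in `mtcars` data, the 0.05 threshold and the filename are assumptions made for illustration, and `tempdir()` stands in for the project root.

```r
old_wd <- setwd(tempdir())          # stand-in for the project root
dir.create("output", showWarnings = FALSE)

# An illustrative model; any analysis with p-values would do.
fit <- lm(mpg ~ wt + hp, data = mtcars)
coefs <- summary(fit)$coefficients
p_values <- coefs[-1, "Pr(>|t|)"]   # drop the intercept row

# Keep the variables below the assumed 0.05 threshold.
significant <- data.frame(variable = names(p_values),
                          p_value = unname(p_values))
significant <- significant[significant$p_value < 0.05, ]

# Relative path, so the file lands in output/ inside the project.
out_path <- file.path("output", "significant-variables.csv")
write.csv(significant, out_path, row.names = FALSE)

saved <- read.csv(out_path)
setwd(old_wd)
```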

slide-34
SLIDE 34

Let's try it

DEFENSIVE R PROGRAMMING