CME/STATS 195 Lecture 2: Programming and Communicating in R
Evan Rosenman
April 4, 2019

Announcements

- There will be no lecture on Thursday, April 25th. We will meet for the final time instead on Tuesday, April 30th.
- Please save debugging questions for Piazza or Office Hours.
- Auditors: please see me after class.

Contents

- Exercise with Data Frames
- Programming Style
- Control flow statements
- Functions
- Communicating with R Markdown

Exercise with Data Frames

Data frames

A data frame is a table or a 2D array-like structure, in which:

- Columns can store data of different types, e.g. numeric, character, etc.
- Each column must contain the same number of data items.
- The column names should be non-empty.
- The row names should be unique.

    # Create the data frame.
    employees <- data.frame(
      row.names = c("E1", "E2", "E3", "E4", "E5"),
      name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
      salary = c(623.3, 515.2, 611.0, 729.0, 843.25),
      start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                             "2014-05-11", "2015-03-27")),
      stringsAsFactors = FALSE
    )

    # Print the data frame.
    employees
    ##        name salary start_date
    ## E1     Rick 623.30 2012-01-01
    ## E2      Dan 515.20 2013-09-23
    ## E3 Michelle 611.00 2014-11-15
    ## E4     Ryan 729.00 2014-05-11
    ## E5     Gary 843.25 2015-03-27
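The properties listed above can be checked directly. The following is a minimal sketch using a small hypothetical data frame, `pets` (the name and values are made up for illustration and are not part of the lecture data). It shows that columns may differ in type while sharing one length, and that columns of mismatched length are rejected.

```r
# Hypothetical example data; names and values are made up for illustration.
pets <- data.frame(
  name = c("Rex", "Tom", "Nemo"),
  weight_kg = c(30.5, 4.2, 0.1),
  stringsAsFactors = FALSE
)

# Each column has its own type.
col_types <- sapply(pets, class)
print(col_types)

# Every column holds the same number of items (one per row).
print(dim(pets))
## [1] 3 2

# Columns of lengths 3 and 2 cannot be matched up, so data.frame() fails.
# tryCatch() turns the error into a plain string so the script keeps running.
bad <- tryCatch(
  data.frame(a = 1:3, b = 1:2),
  error = function(e) "error"
)
print(bad)
## [1] "error"
```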

Useful functions for dataframes

    # Get the structure of the data frame.
    str(employees)
    ## 'data.frame':    5 obs. of  3 variables:
    ##  $ name      : chr  "Rick" "Dan" "Michelle" "Ryan" ...
    ##  $ salary    : num  623 515 611 729 843
    ##  $ start_date: Date, format: "2012-01-01" "2013-09-23" "2014-11-15" "2014-05-11" ...

    # Print first few rows of the data frame.
    head(employees)
    ##        name salary start_date
    ## E1     Rick 623.30 2012-01-01
    ## E2      Dan 515.20 2013-09-23
    ## E3 Michelle 611.00 2014-11-15
    ## E4     Ryan 729.00 2014-05-11
    ## E5     Gary 843.25 2015-03-27

    # Print statistical summary of the data frame.
    summary(employees)
    ##      name               salary        start_date
    ##  Length:5           Min.   :515.2   Min.   :2012-01-01
    ##  Class :character   1st Qu.:611.0   1st Qu.:2013-09-23
    ##  Mode  :character   Median :623.3   Median :2014-05-11
    ##                     Mean   :664.4   Mean   :2014-01-14
    ##                     3rd Qu.:729.0   3rd Qu.:2014-11-15
    ##                     Max.   :843.2   Max.   :2015-03-27

Subsetting dataframes

We can extract specific rows:

    # using row names.
    employees["E1", ]
    ##    name salary start_date
    ## E1 Rick  623.3 2012-01-01
    employees[c("E2", "E3"), ]
    ##        name salary start_date
    ## E2      Dan  515.2 2013-09-23
    ## E3 Michelle  611.0 2014-11-15

    # using integer indexing (selects the same rows as above)
    employees[1, ]
    employees[c(2, 3), ]

We can extract specific columns:

    # using column names.
    employees$name[1:3]
    ## [1] "Rick"     "Dan"      "Michelle"
    employees[, c("name", "salary")]
    ##        name salary
    ## E1     Rick 623.30
    ## E2      Dan 515.20
    ## E3 Michelle 611.00
    ## E4     Ryan 729.00
    ## E5     Gary 843.25

    # or using integer indexing
    employees[1:3, 1]
    ## [1] "Rick"     "Dan"      "Michelle"
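Besides names and integers, rows can also be selected with a logical vector. The sketch below rebuilds part of the lecture's `employees` data so that it runs on its own; the 700 threshold and the names `is_high`, `high_earners`, and `high_df` are assumptions chosen for illustration.

```r
# Part of the lecture's employee data, rebuilt so the example is self-contained.
employees <- data.frame(
  row.names = c("E1", "E2", "E3", "E4", "E5"),
  name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
  salary = c(623.3, 515.2, 611.0, 729.0, 843.25),
  stringsAsFactors = FALSE
)

# A logical vector with one TRUE/FALSE per row.
is_high <- employees$salary > 700

# Keep the rows where the condition is TRUE, and only the name column.
high_earners <- employees[is_high, "name"]
print(high_earners)
## [1] "Ryan" "Gary"

# drop = FALSE keeps the result as a data frame instead of a vector.
high_df <- employees[is_high, "name", drop = FALSE]
print(high_df)
##    name
## E4 Ryan
## E5 Gary
```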

Practice with data frames

R comes with several built-in datasets. We will use mtcars, from the 1974 Motor Trend US magazine, which comprises information on 32 selected car models.

- Call str(), head(), and summary() on mtcars.
- Use the $ syntax to extract the mpg column from mtcars.
- Run the hist() function on the mpg column to see the distribution of mpg values.
- Run the plot() function on the mpg and cyl columns to see how they compare.
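One possible way to work through these steps is sketched below. It is not the official solution; the variable name `mpg` and the choice to put `cyl` on the horizontal axis are assumptions.

```r
# Inspect the built-in dataset.
str(mtcars)
head(mtcars)
summary(mtcars)

# Extract the mpg column as a numeric vector.
mpg <- mtcars$mpg

# Distribution of mpg values.
hist(mpg)

# mpg plotted against the number of cylinders.
plot(mtcars$cyl, mtcars$mpg)
```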

Programming: style guide

A general note

R is a specialized programming language, and this often encourages bad stylistic choices:

- Poor variable naming
- Uncommented code
- Code not optimized for readability
- Repeated code and failure to abstract it into functions

These bad practices make it harder to reuse code in the future and to share it with others!

Naming conventions

The first step of programming is naming things. In the "Hadley Wickham" R style convention:

File names are meaningful. Script files end with ".R", and R Markdown files with ".Rmd".

    # Good                  # Bad (works but violates convention)
    fit-models.R            foo.r
    utility-functions.R     stuff.r

Variable and function names are lowercase or camelCase.

    # Good                  # Bad (works but violates convention)
    day_one                 first_day_of_the_month
    dayOne                  DayOne
    day_1

Spacing

Put spaces around all infix operators (=, +, -, <-, etc.):

    average <- mean(feet / 12 + inches, na.rm = TRUE)   # Good
    average<-mean(feet/12+inches,na.rm=TRUE)            # Bad

Put a space before a left parenthesis, except in a function call:

    # Good
    if (debug) do(x)
    plot(x, y)

    # Bad
    if(debug) do(x)
    plot (x, y)

For assignment, use '<-', not '=':

    # Good          # Bad (works but violates convention)
    x <- 1 + 2      x = 1 + 2

Comments and documentation (I)

Comment your code!

    # 'get_answer' returns the answer to life, the universe and everything else.
    get_answer <- function() {
      return(42)
    }
    # This is a comment

Comments are not subtitles, i.e. do not repeat the code nearly verbatim in the comments.

    # Bad comments:

    # Loop through all bananas in the bunch
    for (banana in bunch) {
      # make the monkey eat one banana
      MonkeyEat(banana)
    }
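As a contrast to the "subtitle" comments above, the sketch below shows a comment that explains why the code is written the way it is. The data in `heights_in` and the variable names are hypothetical.

```r
# Hypothetical measurements, in inches; one value was never recorded.
heights_in <- c(70, 65, NA, 72)

# Bad: the comment only restates the code.
# Compute the mean of heights_in with na.rm = TRUE.
mean_bad <- mean(heights_in, na.rm = TRUE)

# Better: the comment gives the reason behind the code.
# Some heights are missing, so drop them before averaging.
mean_good <- mean(heights_in, na.rm = TRUE)

print(mean_good)
## [1] 69
```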

Comments and documentation (II)

Section headers can help separate big chunks of code handling different tasks.

    #######################################
    ##          data generation          ##
    #######################################
    x <- rnorm(100)
    y <- 12 * x + 5

    #######################################
    ##          make the plots           ##
    #######################################
    plot(x, y)

Programming: control flow

Booleans/logicals

Booleans are logical data types (TRUE/FALSE) associated with conditional statements. They allow us to modify the "control flow".

    # equal: "=="
    5 == 5
    ## [1] TRUE

    # not equal: "!="
    5 != 5
    ## [1] FALSE

    # greater than/geq: ">" or ">="
    c(5 > 4, 5 >= 5)
    ## [1] TRUE TRUE

    # You can combine multiple booleans
    TRUE & TRUE   # AND
    ## [1] TRUE
    TRUE & FALSE  # AND
    ## [1] FALSE
    TRUE | FALSE  # OR
    ## [1] TRUE
    !(TRUE)       # NOT
    ## [1] FALSE

Booleans/logicals

When dealing with vectors of booleans, we can use & and | to evaluate elementwise. Remember the recycling property for vectors.

    c(TRUE, TRUE) & c(FALSE, TRUE)
    ## [1] FALSE  TRUE

    c(5 < 4, 7 == 0, 1 < 2) | c(5 == 5, 6 > 2, !FALSE)
    ## [1] TRUE TRUE TRUE

    c(TRUE, TRUE) & c(TRUE, FALSE, TRUE, FALSE)  # recycling
    ## [1]  TRUE FALSE  TRUE FALSE
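A common use of elementwise logical operators is to build a condition for selecting elements of a vector. The following is a minimal sketch; the vector `x` and the bounds 2 and 8 are assumptions chosen for illustration.

```r
x <- c(1, 4, 6, 9)

# Elementwise AND: TRUE where an element is strictly between 2 and 8.
in_range <- x > 2 & x < 8
print(in_range)
## [1] FALSE  TRUE  TRUE FALSE

# The logical vector can be used to select elements.
selected <- x[in_range]
print(selected)
## [1] 4 6

# Recycling: the single value TRUE is reused for every element.
print(in_range | TRUE)
## [1] TRUE TRUE TRUE TRUE
```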

Booleans/logicals

If we use the double operators && or ||, only the first elements are compared. This describes R versions before 4.3.0; from R 4.3.0 on, calling && or || with a vector of length greater than one is an error.

    c(TRUE, TRUE) && c(FALSE, TRUE)
    ## [1] FALSE

    c(5 < 4, 7 == 0, 1 < 2) || c(5 == 5, 6 > 2, !FALSE)
    ## [1] TRUE

    c(TRUE, TRUE) && c(TRUE, FALSE, TRUE, FALSE)
    ## [1] TRUE
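In practice, && and || are used with single TRUE/FALSE values, and all() or any() reduce a logical vector to one value. The sketch below illustrates this with a made-up vector `x`; it runs the same way on old and recent R versions.

```r
x <- c(5, 7, 1)

# && and || are meant for single TRUE/FALSE values, e.g. in if() conditions.
ok <- length(x) > 0 && x[1] > 3
print(ok)
## [1] TRUE

# They short-circuit: the right side is skipped when the left side decides.
skipped <- FALSE && stop("never evaluated")
print(skipped)
## [1] FALSE

# To reduce a whole logical vector to one value, use all() or any().
print(all(x > 3))
## [1] FALSE
print(any(x > 3))
## [1] TRUE
```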

Control statements

Control flow is the order in which individual statements, instructions or function calls of a program are evaluated.

Control statements allow you to do more complicated tasks. Their execution results in a choice between which of two or more paths should be followed.

- If / else
- For
- While

If statements

An if statement decides whether a block of code should be executed, based on the associated boolean expression.

Syntax: the if keyword is followed by a boolean expression wrapped in parentheses. The conditional block of code is inside curly braces {}.

    if (traffic_light == "green") {
      print("Go.")
    }

'if-else' statements let you introduce more options:

    if (traffic_light == "green") {
      print("Go.")
    } else {
      print("Stay.")
    }

You can also use else if():

    if (traffic_light == "green") {
      print("Go.")
    } else if (traffic_light == "yellow") {
      print("Get ready.")
    } else {
      print("Stay.")
    }
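The snippets above assume that `traffic_light` already exists. A self-contained sketch follows, with an assumed starting value of "yellow" and a hypothetical variable `action` that stores the chosen message.

```r
traffic_light <- "yellow"  # assumed value; try "green" or "red" as well

# Exactly one branch runs: the first one whose condition is TRUE,
# or the final else branch if none is.
# Note: at the top level of a script, 'else' must be on the same line
# as the closing brace of the preceding block.
if (traffic_light == "green") {
  action <- "Go."
} else if (traffic_light == "yellow") {
  action <- "Get ready."
} else {
  action <- "Stay."
}

print(action)
## [1] "Get ready."
```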

For loops

A for loop is a statement which repeats the execution of a block of code for a given number of iterations.

    for (i in 1:5) {
      print(i^2)
    }
    ## [1] 1
    ## [1] 4
    ## [1] 9
    ## [1] 16
    ## [1] 25
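A for loop often iterates over the positions of an existing vector and stores its results. The following is a minimal sketch; `values` and `squares` are hypothetical names.

```r
values <- c(2, 4, 6)

# Preallocate the result vector instead of growing it inside the loop.
squares <- numeric(length(values))

# seq_along(values) gives 1, 2, 3; it is also safe when values is empty.
for (i in seq_along(values)) {
  squares[i] <- values[i]^2
}

print(squares)
## [1]  4 16 36
```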

While loops

Similar to for loops, but repeat the execution as long as the boolean condition supplied is TRUE.

    i <- 1
    while (i <= 5) {
      cat("i =", i, "\n")
      i <- i + 1
    }
    ## i = 1
    ## i = 2
    ## i = 3
    ## i = 4
    ## i = 5
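A while loop is most useful when the number of iterations is not known in advance. The sketch below halves a number until it drops below 1; the starting value 100 and the names `x` and `steps` are assumptions chosen for illustration.

```r
x <- 100
steps <- 0

# Keep halving x until it drops below 1.
# The condition is re-checked before every iteration.
while (x >= 1) {
  x <- x / 2
  steps <- steps + 1
}

print(steps)
## [1] 7
print(x)
## [1] 0.78125
```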
