R Basics / Course Business
- We’ll be using a sample dataset in class today:
  - CourseWeb: Course Documents -> Sample Data -> Week 2
  - Can download to your computer before class
- Thanks for answering the CourseWeb background survey!
- If you are sitting in on the course, e-mail me so I can add you to CourseWeb

R Basics

R Basics
- R commands & functions
- Reading in data
- Saving R scripts
- Descriptive statistics
- Subsetting data
- Assigning new values
- Referring to specific cells
- Types & type conversion
- NA values
- Getting help

R Commands
- Simplest way to interact with R is by typing in commands at the > prompt
- [Screenshots: the > prompt in RStudio and in R]

R as a Calculator
- Typing in a simple calculation shows us the result:
  - 608 + 28
- What’s 11527 minus 283?
- Some more examples:
  - 400 / 65 (division)
  - 2 * 4 (multiplication)
  - 5 ^ 2 (exponentiation)

Functions
- More complex calculations can be done with functions:
  - sqrt(64)
    - sqrt: what the function is (square root)
    - In parentheses: what we want to perform the function on
- Can often read these left to right (“square root of 64”)
- What do you think this means?
  - abs(-7)

Arguments
- Some functions have settings (“arguments”) that we can adjust:
  - round(3.14)
    - Rounds off to the nearest integer (zero decimal places)
  - round(3.14, digits=1)
    - One decimal place
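A minimal sketch of how arguments behave, using only the round() calls from the slide. The positional call and the args() lookup are additions for illustration, not part of the original slide.

```r
# Default: digits = 0, so the result is a whole number
round(3.14)              # 3

# Named argument: keep one decimal place
round(3.14, digits = 1)  # 3.1

# The same argument supplied by position (digits is round()'s second argument)
round(3.14, 1)           # 3.1

# args() shows which arguments a function accepts and their defaults
args(round)
```

Naming the argument (digits = 1) is usually easier to read later than relying on position.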

Nested Functions

Nested Functions
- We can use multiple functions in a row, one inside another
  - sqrt(abs(-16))
  - “Square root of the absolute value of -16”
- Don't get scared when you see multiple parentheses!
  - Can often just read left to right
  - R first figures out the thing nested in the middle
    - Can you round off the square root of 7?
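One way to see the inside-out evaluation is to do the nesting in two separate steps first. The names step1 and step2 are made up for this sketch.

```r
# Step by step: the innermost call is evaluated first
step1 <- abs(-16)   # 16
step2 <- sqrt(step1)
step2               # 4

# The nested version gives the same answer in one line
sqrt(abs(-16))      # 4

# The exercise from the slide: round off the square root of 7
round(sqrt(7))              # 3
round(sqrt(7), digits = 2)  # 2.65
```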

Using Multiple Numbers at Once
- When we want to use multiple numbers, we concatenate them
  - c(2,6,16)
    - A list of the numbers 2, 6, and 16 (R calls this a vector)
- Sometimes a computation requires multiple numbers
  - mean(c(2,6,16))
- Also a quick way to do the same thing to multiple different numbers:
  - sqrt(c(16,100,144))
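A short sketch of why the c() matters. The variable name numbers is made up here, and the last call shows a common slip rather than anything recommended.

```r
# Concatenate three numbers into one vector
numbers <- c(2, 6, 16)

# mean() averages everything in the vector
mean(numbers)            # 8

# sqrt() is applied to each number separately
sqrt(c(16, 100, 144))    # 4 10 12

# Pitfall: without c(), only the first number is treated as data;
# the others are matched to mean()'s other arguments
mean(2, 6, 16)           # 2
```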

Course Documents: Sample Data: Week 2
- Reading plausible versus implausible sentences
  - “Scott chopped the carrots with a knife.”
  - “Scott chopped the carrots with a spoon.”
  - Measure reading time on final word
- Note: Simulated data; not a real experiment.

Course Documents: Sample Data: Week 2
- Reading plausible versus implausible sentences
- Reading time on critical word
- 36 subjects
- Each subject sees 30 items (sentences): half plausible, half implausible
- Interested in changes over time, so we’ll track number of trials remaining (29 vs 28 vs 27 vs 26…)

Reading in Data
- Make sure you have the dataset at this point if you want to follow along:
  - Course Documents -> Sample Data -> Week 2

Reading in Data – RStudio
- Navigate to the folder in the lower-right pane
- More -> Set as Working Directory
- Open a “comma-separated value” file:
  - experiment <- read.csv('week2.csv')
    - experiment: name of the “dataframe” we’re creating (whatever we want to call this dataset)
    - read.csv: the function name
    - 'week2.csv': file name

Reading in Data – Regular R
- Read in a “comma-separated value” file:
  - experiment <- read.csv('/Users/scottfraundorf/Desktop/week2.csv')
    - experiment: name of the “dataframe” we’re creating (whatever we want to call this dataset)
    - read.csv: the function name
    - '/Users/scottfraundorf/Desktop/week2.csv': folder & file name
      - Drag & drop the file into R to get the full folder & filename
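If the course file is not at hand, the same read.csv() call can be tried on a small stand-in file. Everything in this sketch is hypothetical: the rows are invented and the file is written to a temporary location. Only the column names are borrowed from later slides.

```r
# Hypothetical stand-in for week2.csv: a few made-up rows
csv_lines <- c(
  "Subject,ItemName,Condition,TrialsRemaining,RT",
  "S01,Carrots,Plausible,29,350",
  "S01,Soup,Implausible,28,520",
  "S02,Carrots,Implausible,29,610",
  "S02,Soup,Plausible,28,410"
)
tmp <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp)

# Same call as on the slide, pointed at the temporary file
experiment <- read.csv(tmp)

# Quick checks that the file was read as expected
nrow(experiment)   # 4 rows
names(experiment)  # column names come from the first line of the file
str(experiment)    # type of each column

# getwd() shows the folder R searches when given a bare file name
getwd()
```

A bare file name such as 'week2.csv' only works when the working directory is the folder that contains the file; a full path works from anywhere.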

Looking at the Data: Summary
- A “big picture” of the dataset:
  - summary(experiment)
- summary() is a very important function!
  - Basic info & descriptive statistics
  - Check to make sure the data are correct

Looking at the Data: Summary
- A “big picture” of the dataset:
  - summary(experiment)
- We can use $ to refer to a specific column/variable in our dataset:
  - summary(experiment$ItemName)

Looking at the Data: Raw Data
- Let’s look at the data!
  - experiment
- Ack! That’s too much! How about just a few rows?
  - head(experiment)
  - head(experiment, n=10)
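The calls above can be tried on a small made-up dataframe. The values below are invented for illustration and are not the course data.

```r
# Hypothetical miniature dataset (12 invented rows)
experiment <- data.frame(
  Subject = rep(c("S01", "S02", "S03"), each = 4),
  Condition = rep(c("Plausible", "Implausible"), times = 6),
  RT = c(350, 520, 410, 610, 300, 480, 390, 700, 420, 560, 365, 640),
  stringsAsFactors = FALSE
)

nrow(experiment)          # 12 rows in total
head(experiment)          # first 6 rows (the default)
head(experiment, n = 3)   # first 3 rows only

# $ picks out one column; summary() then describes just that column
summary(experiment$RT)
```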

Reading in Data: Other Formats
- Excel:
  - library(gdata)
  - experiment <- read.xls('/Users/scottfraundorf/Desktop/week2.xls')
- SPSS:
  - library(foreign)
  - experiment <- read.spss('/Users/scottfraundorf/Desktop/week2.spss', to.data.frame=TRUE)

R Scripts
- Save & reuse commands with a script
  - R: File -> New Document
  - RStudio: [screenshot]

R Scripts
- Run commands without typing them all again
- RStudio:
  - Code -> Run Region -> Run All: Run entire script
  - Code -> Run Line(s): Run just what you’ve highlighted/selected
- R:
  - Highlight the section of script you want to run
  - Edit -> Execute
- Keyboard shortcut for this: Ctrl+Enter (PC), ⌘+Enter (Mac)

R Scripts
- Saves time when re-running analyses
- Other advantages?
- Some:
  - Documentation for yourself
  - Documentation for others
  - Reuse with new analyses/experiments
  - Quicker to run: can automatically perform one analysis after another

R Scripts: Comments
- Add # before a line to make it a comment
  - Not commands to R, just notes to self (or other readers)
    - Can also add a # to make the rest of a line a comment
    - summary(experiment$Subject) #awesome

Descriptive Statistics
- Remember how we referred to a particular variable in a dataframe?
  - $
- Combine that with functions:
  - mean(experiment$RT)
  - median(experiment$RT)
  - sd(experiment$RT)
- Or, for a categorical variable:
  - levels(experiment$ItemName)
  - summary(experiment$Subject)
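A sketch of these calls on invented values. One assumption worth stating loudly: from R 4.0 onward, read.csv() and data.frame() keep text columns as plain character strings by default, so levels() returns NULL until the column is converted with factor(). The conversion step is an addition here, not part of the original slide.

```r
# Hypothetical miniature dataset (values invented)
experiment <- data.frame(
  Subject = c("S01", "S01", "S02", "S02"),
  ItemName = c("Carrots", "Soup", "Carrots", "Soup"),
  RT = c(350, 520, 610, 410),
  stringsAsFactors = FALSE   # mimic the R >= 4.0 default explicitly
)

mean(experiment$RT)     # 472.5
median(experiment$RT)   # 465
sd(experiment$RT)

# On a character column, levels() has nothing to report
levels_before <- levels(experiment$ItemName)
levels_before           # NULL

# Converting to a factor makes the categories explicit
experiment$ItemName <- factor(experiment$ItemName)
levels(experiment$ItemName)    # "Carrots" "Soup"
summary(experiment$ItemName)   # number of rows per item
```

Passing stringsAsFactors = TRUE to read.csv() is another way to get factors at read-in time.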

Descriptive Statistics
- We often want to look at a dependent variable as a function of some independent variable(s)
  - tapply(experiment$RT, experiment$Condition, mean)
  - “Split up the RTs by Condition, then get the mean”
- Try getting the mean RT for each item
- How about the median RT for each subject?
- To combine multiple results into one table, “column bind” them with cbind():
  - cbind( tapply(experiment$RT, experiment$Condition, mean), tapply(experiment$RT, experiment$Condition, sd) )
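A minimal sketch of the split-then-summarize pattern on invented values. Naming the columns inside cbind() (Mean, SD) is an addition for readability; the slide's unnamed version works the same way.

```r
# Hypothetical miniature dataset (values invented)
experiment <- data.frame(
  Condition = c("Plausible", "Implausible", "Implausible", "Plausible"),
  RT = c(350, 520, 610, 410),
  stringsAsFactors = FALSE
)

# Split RT by Condition, then apply mean() / sd() to each group
means <- tapply(experiment$RT, experiment$Condition, mean)
sds   <- tapply(experiment$RT, experiment$Condition, sd)

means["Implausible"]   # 565
means["Plausible"]     # 380

# Bind the two results side by side; one row per condition
descriptives <- cbind(Mean = means, SD = sds)
descriptives
```

Swapping mean for median, or Condition for another grouping column, covers the two exercises on the slide.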

Descriptive Statistics
- Can have 2-way tables...
  - tapply(experiment$RT, list(experiment$Subject, experiment$Condition), mean)
  - 1st variable is rows, 2nd is columns
- ...or more!
  - tapply(experiment$RT, list(experiment$ItemName, experiment$Condition, experiment$TestingRoom), mean)
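A sketch of the two-way case on invented values, showing how one cell of the result can be looked up by its row and column names.

```r
# Hypothetical miniature dataset: 2 subjects x 2 conditions x 2 trials
experiment <- data.frame(
  Subject = rep(c("S01", "S02"), each = 4),
  Condition = rep(c("Plausible", "Plausible", "Implausible", "Implausible"), times = 2),
  RT = c(300, 400, 500, 600, 350, 450, 700, 800),
  stringsAsFactors = FALSE
)

# Rows = Subject (1st grouping variable), columns = Condition (2nd)
cell_means <- tapply(experiment$RT,
                     list(experiment$Subject, experiment$Condition),
                     mean)
cell_means

# Look up one cell by row name and column name
cell_means["S01", "Plausible"]     # 350
cell_means["S02", "Implausible"]   # 750
```

If some combination never occurs in the data, tapply() fills that cell with NA.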

Descriptive Statistics
- Contingency tables for categorical variables:
  - xtabs(~ Subject + Condition, data=experiment)
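A contingency table is a quick check that the design is balanced. In this invented example it is deliberately not balanced, so the missing cell shows up as a zero.

```r
# Hypothetical, deliberately unbalanced data (values invented)
experiment <- data.frame(
  Subject = c("S01", "S01", "S01", "S02", "S02"),
  Condition = c("Plausible", "Implausible", "Plausible",
                "Implausible", "Implausible"),
  stringsAsFactors = FALSE
)

# Count the rows for every Subject x Condition combination
counts <- xtabs(~ Subject + Condition, data = experiment)
counts

counts["S01", "Plausible"]   # 2
counts["S02", "Plausible"]   # 0: S02 has no plausible trials here
```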

Subsetting Data
- Often, we want to examine or use just part of a dataframe
- Remember how we read our dataframe?
  - experiment <- read.csv(...)
- Create a new dataframe that's just a subset of experiment:
  - experiment.LongRTsRemoved <- subset(experiment, RT < 2000)
    - experiment.LongRTsRemoved: new dataframe name
    - experiment: original dataframe
    - RT < 2000: inclusion criterion (RT less than 2000 ms)

Subsetting Data: Logical Operators
- Try getting just the observations with RTs 200 ms or more:
  - experiment.ShortRTsRemoved <- subset(experiment, RT >= 200)
- Why not just delete the bad RTs from the spreadsheet?
  - Easy to make a mistake / miss some of them
  - Faster to have the computer do it
  - We’d lose the original data
  - No documentation of how we subsetted the data
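Because subset() creates a new dataframe and leaves the original alone, the script itself can record how many observations were excluded. A sketch with invented RTs:

```r
# Hypothetical miniature dataset (RTs invented, in ms)
experiment <- data.frame(
  Subject = c("S01", "S01", "S01", "S02", "S02", "S02"),
  RT = c(150, 350, 520, 2400, 410, 180)
)

# Keep only the rows where the criterion is TRUE
experiment.ShortRTsRemoved <- subset(experiment, RT >= 200)

nrow(experiment)                    # 6: the original is untouched
nrow(experiment.ShortRTsRemoved)    # 4

# Number of observations excluded by the criterion
nrow(experiment) - nrow(experiment.ShortRTsRemoved)   # 2
```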

Subsetting Data: AND and OR
- What if we wanted only RTs between 200 and 2000 ms?
  - Could do two steps:
    - experiment.Temp <- subset(experiment, RT >= 200)
    - experiment.BadRTsRemoved <- subset(experiment.Temp, RT <= 2000)
- One step with & for AND:
  - experiment2 <- subset(experiment, RT >= 200 & RT <= 2000)

Subsetting Data: AND and OR
- What if we wanted only RTs between 200 and 2000 ms?
- One step with & for AND:
  - experiment2 <- subset(experiment, RT >= 200 & RT <= 2000)
- | means OR:
  - experiment.BadRTs <- subset(experiment, RT < 200 | RT > 2000)
  - Logical OR (“either or both”)
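The AND and OR criteria on this slide are exact opposites, so together they should account for every row. A sketch with invented RTs that checks this:

```r
# Hypothetical miniature dataset (RTs invented, in ms)
experiment <- data.frame(
  RT = c(150, 350, 520, 2400, 410, 180)
)

# AND: both conditions must hold
experiment2 <- subset(experiment, RT >= 200 & RT <= 2000)
experiment2$RT             # 350 520 410

# OR: at least one condition holds
experiment.BadRTs <- subset(experiment, RT < 200 | RT > 2000)
experiment.BadRTs$RT       # 150 2400 180

# Kept rows plus excluded rows equal the original
nrow(experiment2) + nrow(experiment.BadRTs) == nrow(experiment)   # TRUE
```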

Subsetting Data: == and !=
- Get a match / equals:
  - experiment.LastTrials <- subset(experiment, TrialsRemaining == 0)
  - Note DOUBLE equals sign
- Words/categorical variables need quotes:
  - experiment.ImplausibleSentences <- subset(experiment, Condition=='Implausible')
- != means “not equal to”:
  - experiment.BadSubjectRemoved <- subset(experiment, Subject != 'S23')
  - Drops subject “S23”
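The three subsets above, sketched on invented rows. The last call is an addition: it shows that text matching is case-sensitive, which is an easy way to end up with an empty dataframe by accident.

```r
# Hypothetical miniature dataset (values invented)
experiment <- data.frame(
  Subject = c("S22", "S22", "S23", "S23"),
  Condition = c("Plausible", "Implausible", "Plausible", "Implausible"),
  TrialsRemaining = c(1, 0, 1, 0),
  RT = c(350, 520, 410, 610),
  stringsAsFactors = FALSE
)

# == tests for equality (a single = would not do a comparison)
experiment.LastTrials <- subset(experiment, TrialsRemaining == 0)
nrow(experiment.LastTrials)              # 2

# Text values go in quotes
experiment.ImplausibleSentences <- subset(experiment, Condition == 'Implausible')
nrow(experiment.ImplausibleSentences)    # 2

# != keeps everything except the match
experiment.BadSubjectRemoved <- subset(experiment, Subject != 'S23')
unique(experiment.BadSubjectRemoved$Subject)   # "S22"

# Matching is case-sensitive: lowercase 'implausible' matches nothing
nrow(subset(experiment, Condition == 'implausible'))   # 0
```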
