
R Basics / Course Business - PowerPoint PPT Presentation



  1. R Basics / Course Business
     - We’ll be using a sample dataset in class today:
     - CourseWeb: Course Documents -> Sample Data -> Week 2
     - Can download to your computer before class
     - Thanks for answering the CourseWeb background survey!
     - If sitting in on the course, e-mail me so I can add you to CourseWeb

  2. R Basics

  3. R Basics

  4. R Basics
     - R commands & functions
     - Reading in data
     - Saving R scripts
     - Descriptive statistics
     - Subsetting data
     - Assigning new values
     - Referring to specific cells
     - Types & type conversion
     - NA values
     - Getting help

  5. R Commands
     - The simplest way to interact with R is by typing in commands at the > prompt
     - [Screenshots: RStudio; R]

  6. R as a Calculator
     - Typing in a simple calculation shows us the result:
     - 608 + 28
     - What’s 11527 minus 283?
     - Some more examples:
     - 400 / 65 (division)
     - 2 * 4 (multiplication)
     - 5 ^ 2 (exponentiation)
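The calculator examples above can be typed exactly as shown. A minimal sketch, using only the numbers from the slide; the comments give the value R prints for each line:

```r
# Each line is a command typed at the > prompt; R prints the result.
608 + 28      # addition: 636
11527 - 283   # subtraction (the question on the slide): 11244
400 / 65      # division: 6.153846
2 * 4         # multiplication: 8
5 ^ 2         # exponentiation: 25
```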

  7. Functions
     - More complex calculations can be done with functions:
     - sqrt(64)
       - sqrt: what the function is (square root)
       - In parentheses: what we want to perform the function on
     - Can often read these left to right (“square root of 64”)
     - What do you think this means?
     - abs(-7)

  8. Arguments
     - Some functions have settings (“arguments”) that we can adjust:
     - round(3.14)
       - Rounds off to the nearest integer (zero decimal places)
     - round(3.14, digits=1)
       - One decimal place
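A short sketch of the function calls from the last two slides, showing how an argument changes the result. Nothing here goes beyond base R:

```r
sqrt(64)                 # square root of 64: 8
abs(-7)                  # absolute value of -7: 7
round(3.14)              # digits defaults to 0, so this gives 3
round(3.14, digits = 1)  # keep one decimal place: 3.1
```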

  9. Nested Functions

  10. Nested Functions
      - We can use multiple functions in a row, one inside another
        - sqrt(abs(-16))
        - “Square root of the absolute value of -16”
      - Don't get scared when you see multiple parentheses!
        - Can often just read left to right
        - R first figures out the thing nested in the middle
        - Can you round off the square root of 7?
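One way to work through the nested example and the exercise, as a sketch. R evaluates the innermost call first and passes its result outward:

```r
# Inner call first: abs(-16) is 16, then sqrt(16) is 4
sqrt(abs(-16))

# Exercise from the slide: sqrt(7) is about 2.65, which rounds to 3
round(sqrt(7))
```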

  11. Using Multiple Numbers at Once
      - When we want to use multiple numbers, we concatenate them
      - c(2,6,16)
        - A vector of the numbers 2, 6, and 16
      - Sometimes a computation requires multiple numbers
        - mean(c(2,6,16))
      - Also a quick way to do the same thing to multiple different numbers:
        - sqrt(c(16,100,144))
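A brief sketch of both uses of c() from this slide: feeding several numbers to one computation, and applying one function to each number in turn.

```r
c(2, 6, 16)             # concatenate three numbers into one vector
mean(c(2, 6, 16))       # one result from all three numbers: 8
sqrt(c(16, 100, 144))   # one result per number: 4 10 12
```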

  12. R Basics
      - R commands & functions
      - Reading in data
      - Saving R scripts
      - Descriptive statistics
      - Subsetting data
      - Assigning new values
      - Referring to specific cells
      - Types & type conversion
      - NA values
      - Getting help

  13. Course Documents: Sample Data: Week 2
      - Reading plausible versus implausible sentences
      - “Scott chopped the carrots with a knife.”
      - “Scott chopped the carrots with a spoon.”
      - Measure reading time on the final word
      - Note: Simulated data; not a real experiment.

  14. Course Documents: Sample Data: Week 2
      - Reading plausible versus implausible sentences
      - Reading time on critical word
      - 36 subjects
      - Each subject sees 30 items (sentences): half plausible, half implausible
      - Interested in changes over time, so we’ll track number of trials remaining (29 vs 28 vs 27 vs 26…)

  15. Reading in Data
      - Make sure you have the dataset at this point if you want to follow along: Course Documents -> Sample Data -> Week 2

  16. Reading in Data – RStudio
      - Navigate to the folder in the lower-right pane
      - More -> Set as Working Directory
      - Open a “comma-separated value” file:
        - experiment <- read.csv('week2.csv')
        - experiment: name of the “dataframe” we’re creating (whatever we want to call this dataset)
        - read.csv is the function name
        - 'week2.csv': file name

  17. Reading in Data – Regular R
      - Read in a “comma-separated value” file:
        - experiment <- read.csv('/Users/scottfraundorf/Desktop/week2.csv')
        - experiment: name of the “dataframe” we’re creating (whatever we want to call this dataset)
        - read.csv is the function name
        - '/Users/scottfraundorf/Desktop/week2.csv': folder & file name
        - Drag & drop the file into R to get the full folder & filename
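To try read.csv() without the course file, here is a self-contained sketch. It assumes nothing about the real week2.csv: the file name week2_toy.csv and every value in it are made up for illustration, and the file is written to a temporary folder first so there is something to read back.

```r
# Hypothetical stand-in for week2.csv (made-up values, not the course data)
toy <- data.frame(
  Subject   = c("S01", "S01", "S02", "S02"),
  ItemName  = c("Carrots", "Soup", "Carrots", "Soup"),
  Condition = c("Plausible", "Implausible", "Implausible", "Plausible"),
  RT        = c(512, 640, 498, 1203)
)

# Write the toy data to a temporary CSV file
csv_path <- file.path(tempdir(), "week2_toy.csv")
write.csv(toy, csv_path, row.names = FALSE)

# Read it back in, just as the slide does with the real file
experiment <- read.csv(csv_path)
nrow(experiment)    # number of rows read
names(experiment)   # column names taken from the file's header row
```

Giving read.csv() the full path, as here, works regardless of the working directory; a bare file name such as 'week2.csv' is looked up in the working directory.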

  18. Looking at the Data: Summary
      - A “big picture” of the dataset:
      - summary(experiment)
      - summary() is a very important function!
      - Basic info & descriptive statistics
      - Check to make sure the data are correct

  19. Looking at the Data: Summary
      - A “big picture” of the dataset:
      - summary(experiment)
      - We can use $ to refer to a specific column/variable in our dataset:
      - summary(experiment$ItemName)

  20. Looking at the Data: Raw Data
      - Let’s look at the data!
      - experiment
      - Ack! That’s too much! How about just a few rows?
      - head(experiment)
      - head(experiment, n=10)
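A sketch of summary(), $, and head() on a small made-up dataframe (the values are hypothetical, not the course data). One assumption worth flagging: since R 4.0, read.csv() and data.frame() keep text columns as plain character strings by default, so the per-item counts that summary(experiment$ItemName) shows here depend on the column being a factor, which stringsAsFactors = TRUE requests.

```r
# Toy stand-in for the class dataset (hypothetical values)
experiment <- data.frame(
  Subject   = rep(c("S01", "S02", "S03"), each = 4),
  ItemName  = rep(c("Carrots", "Soup", "Bread", "Steak"), times = 3),
  Condition = rep(c("Plausible", "Implausible"), times = 6),
  RT        = c(512, 640, 498, 1203, 455, 710, 530, 880, 601, 695, 470, 940),
  stringsAsFactors = TRUE  # store text columns as factors
)

summary(experiment)            # overview of every column
summary(experiment$ItemName)   # $ picks out one column: counts per item
head(experiment)               # first 6 rows by default
head(experiment, n = 3)        # first 3 rows only
```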

  21. Reading in Data: Other Formats
      - Excel:
        - library(gdata)
        - experiment <- read.xls('/Users/scottfraundorf/Desktop/week2.xls')
      - SPSS:
        - library(foreign)
        - experiment <- read.spss('/Users/scottfraundorf/Desktop/week2.spss', to.data.frame=TRUE)

  22. R Basics
      - R commands & functions
      - Reading in data
      - Saving R scripts
      - Descriptive statistics
      - Subsetting data
      - Assigning new values
      - Referring to specific cells
      - Types & type conversion
      - NA values
      - Getting help

  23. R Scripts
      - Save & reuse commands with a script
      - R: File -> New Document
      - [Screenshot: RStudio]

  24. R Scripts
      - Run commands without typing them all again
      - RStudio:
        - Code -> Run Region -> Run All: Run entire script
        - Code -> Run Line(s): Run just what you’ve highlighted/selected
      - R:
        - Highlight the section of script you want to run
        - Edit -> Execute
      - Keyboard shortcut for this: Ctrl+Enter (PC), ⌘+Enter (Mac)

  25. R Scripts
      - Saves time when re-running analyses
      - Other advantages?
      - Some:
        - Documentation for yourself
        - Documentation for others
        - Reuse with new analyses/experiments
        - Quicker to run: can automatically perform one analysis after another

  26. R Scripts: Comments
      - Add # before a line to make it a comment
        - Not commands to R, just notes to self (or other readers)
        - Can also add a # to make the rest of a line a comment
        - summary(experiment$Subject) #awesome
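Both comment styles in a tiny script, as a sketch. The one-column dataframe is a made-up stand-in so the script runs on its own:

```r
# Toy stand-in for the class dataset (hypothetical values)
experiment <- data.frame(Subject = factor(c("S01", "S01", "S02")))

# This whole line is a comment: R skips it entirely
summary(experiment$Subject)  # R runs the command and skips this note
```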

  27. R Basics
      - R commands & functions
      - Reading in data
      - Saving R scripts
      - Descriptive statistics
      - Subsetting data
      - Assigning new values
      - Referring to specific cells
      - Types & type conversion
      - NA values
      - Getting help

  28. Descriptive Statistics
      - Remember how we referred to a particular variable in a dataframe?
        - $
      - Combine that with functions:
        - mean(experiment$RT)
        - median(experiment$RT)
        - sd(experiment$RT)
      - Or, for a categorical variable:
        - levels(experiment$ItemName)
        - summary(experiment$Subject)
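A sketch of these calls on a four-row made-up dataframe (hypothetical values, not the course data). It assumes the categorical columns are factors, requested here with stringsAsFactors = TRUE; on a plain character column, levels() returns NULL.

```r
# Toy stand-in for the class dataset (hypothetical values)
experiment <- data.frame(
  Subject  = c("S01", "S01", "S02", "S02"),
  ItemName = c("Carrots", "Soup", "Carrots", "Soup"),
  RT       = c(500, 700, 600, 800),
  stringsAsFactors = TRUE  # store text columns as factors
)

mean(experiment$RT)           # 650
median(experiment$RT)         # 650
sd(experiment$RT)             # standard deviation of the four RTs
levels(experiment$ItemName)   # the distinct item names
summary(experiment$Subject)   # number of rows per subject
```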

  29. Descriptive Statistics
      - We often want to look at a dependent variable as a function of some independent variable(s)
        - tapply(experiment$RT, experiment$Condition, mean)
        - “Split up the RTs by Condition, then get the mean”
      - Try getting the mean RT for each item
      - How about the median RT for each subject?
      - To combine multiple results into one table, “column bind” them with cbind():
      - cbind(tapply(experiment$RT, experiment$Condition, mean), tapply(experiment$RT, experiment$Condition, sd))
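One possible answer to the two exercises, plus the cbind() table, sketched on made-up data (hypothetical values; the column labels Mean and SD are added here for readability and are not on the slide):

```r
# Toy stand-in for the class dataset (hypothetical values)
experiment <- data.frame(
  Subject   = rep(c("S01", "S02"), each = 4),
  ItemName  = rep(c("Carrots", "Soup"), times = 4),
  Condition = rep(c("Plausible", "Plausible", "Implausible", "Implausible"),
                  times = 2),
  RT        = c(500, 600, 700, 800, 520, 580, 760, 900)
)

# Split RTs by Condition, then take the mean of each group
condition_means <- tapply(experiment$RT, experiment$Condition, mean)

# Exercises: mean RT per item, median RT per subject
item_means      <- tapply(experiment$RT, experiment$ItemName, mean)
subject_medians <- tapply(experiment$RT, experiment$Subject, median)

# Column-bind two summaries into one table (one row per condition)
rt_table <- cbind(
  Mean = tapply(experiment$RT, experiment$Condition, mean),
  SD   = tapply(experiment$RT, experiment$Condition, sd)
)
rt_table
```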

  30. Descriptive Statistics
      - Can have 2-way tables...
        - tapply(experiment$RT, list(experiment$Subject, experiment$Condition), mean)
        - 1st variable is rows, 2nd is columns
      - ...or more!
        - tapply(experiment$RT, list(experiment$ItemName, experiment$Condition, experiment$TestingRoom), mean)
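A sketch of the two-way table on made-up data (hypothetical values). Passing list() with two grouping variables gives one row per subject and one column per condition; a third variable in the list would add a third dimension in the same way.

```r
# Toy stand-in for the class dataset (hypothetical values)
experiment <- data.frame(
  Subject   = rep(c("S01", "S02"), each = 4),
  Condition = rep(c("Plausible", "Plausible", "Implausible", "Implausible"),
                  times = 2),
  RT        = c(500, 600, 700, 800, 520, 580, 760, 900)
)

# Rows are subjects (1st variable), columns are conditions (2nd variable)
rt_by_subject_condition <- tapply(
  experiment$RT,
  list(experiment$Subject, experiment$Condition),
  mean
)
rt_by_subject_condition
```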

  31. Descriptive Statistics
      - Contingency tables for categorical variables:
        - xtabs(~ Subject + Condition, data=experiment)
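A sketch of xtabs() on made-up data (hypothetical values). The formula has nothing to the left of ~, so the table simply counts how many rows fall into each Subject-by-Condition cell:

```r
# Toy stand-in for the class dataset (hypothetical values)
experiment <- data.frame(
  Subject   = rep(c("S01", "S02"), each = 4),
  Condition = rep(c("Plausible", "Plausible", "Implausible", "Implausible"),
                  times = 2)
)

# Count observations in each Subject x Condition cell
counts <- xtabs(~ Subject + Condition, data = experiment)
counts
```

A table like this is a quick check that each subject saw the expected number of trials in each condition.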

  32. R Basics
      - R commands & functions
      - Reading in data
      - Saving R scripts
      - Descriptive statistics
      - Subsetting data
      - Assigning new values
      - Referring to specific cells
      - Types & type conversion
      - NA values
      - Getting help

  33. Subsetting Data
      - Often, we want to examine or use just part of a dataframe
      - Remember how we read our dataframe?
        - experiment <- read.csv(...)
      - Create a new dataframe that's just a subset of experiment:
        - experiment.LongRTsRemoved <- subset(experiment, RT < 2000)
        - experiment.LongRTsRemoved: new dataframe name
        - experiment: original dataframe
        - RT < 2000: inclusion criterion (RT less than 2000 ms)

  34. Subsetting Data: Logical Operators
      - Try getting just the observations with RTs 200 ms or more:
        - experiment.ShortRTsRemoved <- subset(experiment, RT >= 200)
      - Why not just delete the bad RTs from the spreadsheet?
      - Easy to make a mistake / miss some of them
      - Faster to have the computer do it
      - We’d lose the original data
      - No documentation of how we subsetted the data
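A sketch of both subsets on made-up reaction times (hypothetical values). Note that subset() returns a new dataframe and leaves the original untouched, which is the point of the "we'd lose the original data" argument above.

```r
# Toy stand-in for the class dataset (hypothetical values)
experiment <- data.frame(
  Subject = c("S01", "S01", "S01", "S02", "S02", "S02"),
  RT      = c(150, 450, 2500, 800, 199, 2000)
)

# Keep only rows with RT below 2000 ms
experiment.LongRTsRemoved <- subset(experiment, RT < 2000)

# Keep only rows with RT of 200 ms or more
experiment.ShortRTsRemoved <- subset(experiment, RT >= 200)

nrow(experiment)                   # original is unchanged: 6 rows
nrow(experiment.LongRTsRemoved)    # 4 rows (2500 and 2000 dropped)
nrow(experiment.ShortRTsRemoved)   # 4 rows (150 and 199 dropped)
```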

  35. Subsetting Data: AND and OR
      - What if we wanted only RTs between 200 and 2000 ms?
        - Could do two steps:
        - experiment.Temp <- subset(experiment, RT >= 200)
        - experiment.BadRTsRemoved <- subset(experiment.Temp, RT <= 2000)
      - One step with & for AND:
        - experiment2 <- subset(experiment, RT >= 200 & RT <= 2000)

  36. Subsetting Data: AND and OR
      - What if we wanted only RTs between 200 and 2000 ms?
      - One step with & for AND:
        - experiment2 <- subset(experiment, RT >= 200 & RT <= 2000)
      - | means OR:
        - experiment.BadRTs <- subset(experiment, RT < 200 | RT > 2000)
        - Logical OR (“either or both”)
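A sketch comparing the two-step and one-step versions on made-up reaction times (hypothetical values). The & subset and the | subset are complements here, so together they account for every row.

```r
# Toy stand-in for the class dataset (hypothetical values)
experiment <- data.frame(
  Subject = c("S01", "S01", "S01", "S02", "S02", "S02"),
  RT      = c(150, 450, 2500, 800, 199, 2000)
)

# Two steps: drop short RTs, then drop long RTs
experiment.Temp          <- subset(experiment, RT >= 200)
experiment.BadRTsRemoved <- subset(experiment.Temp, RT <= 2000)

# One step: both conditions must hold (AND)
experiment2 <- subset(experiment, RT >= 200 & RT <= 2000)

# Either condition is enough (OR): the rows excluded above
experiment.BadRTs <- subset(experiment, RT < 200 | RT > 2000)

experiment2$RT         # 450 800 2000
experiment.BadRTs$RT   # 150 2500 199
```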

  37. Subsetting Data: == and !=
      - Get a match / equals:
        - experiment.LastTrials <- subset(experiment, TrialsRemaining == 0)
        - Note DOUBLE equals sign
      - Words/categorical variables need quotes:
        - experiment.ImplausibleSentences <- subset(experiment, Condition=='Implausible')
      - != means “not equal to”:
        - experiment.BadSubjectRemoved <- subset(experiment, Subject != 'S23')
        - Drops subject “S23”
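A sketch of the three comparisons on made-up data (hypothetical values). A single = would be read as an argument assignment rather than a comparison, which is why the match needs ==.

```r
# Toy stand-in for the class dataset (hypothetical values)
experiment <- data.frame(
  Subject         = c("S01", "S01", "S23", "S23"),
  Condition       = c("Plausible", "Implausible", "Plausible", "Implausible"),
  TrialsRemaining = c(1, 0, 1, 0),
  RT              = c(500, 700, 650, 900)
)

# Numeric match: rows where no trials remain
experiment.LastTrials <- subset(experiment, TrialsRemaining == 0)

# Text match: the value goes in quotes
experiment.ImplausibleSentences <- subset(experiment,
                                          Condition == 'Implausible')

# Not equal: keep every subject except S23
experiment.BadSubjectRemoved <- subset(experiment, Subject != 'S23')

unique(experiment.BadSubjectRemoved$Subject)   # only "S01" is left
```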
