the r language

The R Language: A Hands-on Introduction, by Venkatesh-Prasad Ranganath



  1. The R Language: A Hands-on Introduction
     Venkatesh-Prasad Ranganath
     http://about.me/rvprasad

  2. What is R?
     • A dynamically typed programming language
       • http://cran.r-project.org/
     • Open source and free
     • Provides common programming language constructs/features
       • Multiple programming paradigms
     • Numerous libraries focused on various data-rich topics
       • http://cran.r-project.org/web/views/
     • Ideal for statistical calculation; lately, the go-to tool for data analysis
     • Accompanied by RStudio, a simple and powerful IDE
       • http://rstudio.org

  3. Data Types (Modes)
     • Numeric
     • Character
     • Logical (TRUE / FALSE)
     • Complex
     • Raw (bytes)
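The modes listed above can be inspected with the base function mode(). A minimal sketch; the sample values are arbitrary illustrations, not taken from the slides:

```r
# mode() reports the data type (mode) of a value.
mode(3.14)         # "numeric"
mode(7L)           # "numeric" (integers also report the numeric mode)
mode("text")       # "character"
mode(TRUE)         # "logical"
mode(2 + 3i)       # "complex"
mode(as.raw(255))  # "raw"
```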

  4. Data Structures
     • Vectors
     • Matrices
     • Arrays
     • Lists
     • Data Frames
     • Factors
     • Tables

  5. Data Structures: Vectors
     • A sequence of objects of the same (atomic) data type
     • Creation
       • x <- … [ <- is the assignment operator ]
       • y <- seq(5, 9, 2) = c(5, 7, 9)
       • y <- 5:7 = c(5, 6, 7) [ m:n is equivalent to seq(m, n, 1) ]
       • y <- c(1, 4:6) = c(1, 4, 5, 6) [ no nesting / always flattened ]
       • z <- rep(1, 3) = c(1, 1, 1)
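The creation forms on this slide can be tried directly at the console. A small sketch; the variable names a, b, d and e are placeholders chosen here to keep each result separate:

```r
a <- seq(5, 9, 2)   # from 5 to 9 in steps of 2
b <- 5:7            # shorthand for a step-1 sequence
d <- c(1, 4:6)      # c() flattens its arguments into one vector
e <- rep(1, 3)      # repeat the value 1 three times

print(a)  # 5 7 9
print(b)  # 5 6 7
print(d)  # 1 4 5 6
print(e)  # 1 1 1

# The values of 5:7 equal those of c(5, 6, 7), but 5:7 is stored as integers.
typeof(b)            # "integer"
typeof(c(5, 6, 7))   # "double"
```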

  6. Data Structures: Vectors
     • Accessing
       • x[1] [ 1-based indexing ]
       • x[2:3]
       • x[c(2, 3)] = x[2:3]
       • x[-1] [ Negative subscripts imply exclusion ]
     • Naming
       • names(x) <- … [ Makes … equivalent to x[1] ]
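A sketch of the accessing and naming forms above. The vector v, its values and its names are hypothetical, chosen only for illustration, since the slide's own values did not survive extraction:

```r
v <- c(10, 20, 30)             # hypothetical values
names(v) <- c("a", "b", "c")   # hypothetical names

v[1]         # first element (indexing starts at 1)
v[2:3]       # second and third elements
v[c(2, 3)]   # same selection as v[2:3]
v[-1]        # everything except the first element
v["a"]       # access by name; same element as v[1]
```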

  7. Data Structures: Vectors
     • Operations
       • x <- c(5, 6, 7)
       • x + 2 = c(7, 8, 9) [ Vectorized operations ]
       • x > 5 = c(FALSE, TRUE, TRUE)
       • subset(x, x > 5) = c(6, 7)
       • which(x > 5) = c(2, 3)
       • ifelse(x > 5, NaN, x) = c(5, NaN, NaN)
       • sqr <- function(n) { n * n }
       • sapply(x, sqr) = c(25, 36, 49)
       • sqr(x) = c(25, 36, 49)
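The operations on this slide, collected into one runnable sketch. It uses only the names already on the slide (x and sqr):

```r
x <- c(5, 6, 7)
sqr <- function(n) { n * n }

x + 2                   # 7 8 9 (the scalar is applied to every element)
x > 5                   # FALSE TRUE TRUE
subset(x, x > 5)        # 6 7
which(x > 5)            # 2 3 (positions, not values)
ifelse(x > 5, NaN, x)   # 5 NaN NaN

# sapply() calls sqr once per element; sqr(x) relies on * being vectorized.
# Both give 25 36 49.
sapply(x, sqr)
sqr(x)
```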

  8. Data Structures: Vectors
     • Operations
       • x <- c(5, 6, 7)
       • any(x > 5) = TRUE [ How about all(x > 5)? ]
       • sum(c(1, 2, 3, NA), na.rm = TRUE) = 6 [ Why is na.rm required? ]
       • sort(c(7, 6, 5)) = c(5, 6, 7)
       • order(c(7, 6, 5)) = ???
       • subset(x, x > 5) = c(6, 7)
       • head(1:100) = ???
       • tail(1:100) = ???
       • How is x == c(5, 6, 7) different from identical(x, c(5, 6, 7))?
       • Try str(x)
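One way to explore the questions on this slide is simply to run them. A sketch using only base R:

```r
x <- c(5, 6, 7)

any(x > 5)    # TRUE: at least one element exceeds 5
all(x > 5)    # FALSE: 5 itself does not

sum(c(1, 2, 3, NA))                 # NA: a missing value makes the sum unknown
sum(c(1, 2, 3, NA), na.rm = TRUE)   # 6: NA is dropped first

sort(c(7, 6, 5))    # 5 6 7 (the sorted values)
order(c(7, 6, 5))   # 3 2 1 (the positions that would sort the vector)

head(1:100)   # first six elements by default
tail(1:100)   # last six elements by default

x == c(5, 6, 7)            # element-wise: TRUE TRUE TRUE
identical(x, c(5, 6, 7))   # one answer for the whole object: TRUE

str(x)   # compact description of the object's structure
```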

  9. Data Structures: Matrices
     • A two-dimensional matrix of objects of the same (atomic) data type
     • Creation
       • y <- matrix(nrow=2, ncol=3) [ empty matrix ]
       • y <- matrix(c(1, 2, 3, 4, 5, 6), nrow=2) =
             1 3 5
             2 4 6
       • y <- matrix(c(1, 2, 3, 4, 5, 6), nrow=2, byrow=T) =
             1 2 3
             4 5 6
     • Accessing
       • y[1, 2] = 2
       • y[, 2:3] = [ How about y[1, ]? ]
             2 3
             5 6
       • What's the difference between y[2, ] and y[2, , drop=FALSE]?
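A sketch of matrix creation and access, including the drop question. The names by_col and by_row are placeholders introduced here to keep the two fill orders apart:

```r
by_col <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)                 # filled column by column
by_row <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE)   # filled row by row

by_col[1, 2]   # 3
by_row[1, 2]   # 2

by_row[, 2:3]  # columns 2 and 3, still a matrix
by_row[1, ]    # first row, returned as a plain vector

# Selecting a single row drops the dimensions unless drop = FALSE is given.
dim(by_row[2, ])                 # NULL (a vector)
dim(by_row[2, , drop = FALSE])   # 1 3 (a one-row matrix)
```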

  10. Data Structures: Matrices
      • Naming
        • rownames() and colnames()
      • Operations
        • nrow(y) = 2 [ number of rows ]
        • ncol(y) = 3 [ number of columns ]
        • apply(y, 1, sum) = c(6, 15) [ apply sum to each row ]
        • apply(y, 2, sum) = c(5, 7, 9) [ apply sum to each column ]
        • t(y) = [ transpose a matrix ]
              1 4
              2 5
              3 6
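A sketch of the naming and apply() operations above. The row and column names are hypothetical; rowSums() and colSums() are base R shortcuts that the slide does not mention:

```r
y <- matrix(1:6, nrow = 2, byrow = TRUE)
rownames(y) <- c("r1", "r2")       # hypothetical row names
colnames(y) <- c("a", "b", "c")    # hypothetical column names

nrow(y)            # 2
ncol(y)            # 3
apply(y, 1, sum)   # margin 1 = rows: 6 15
apply(y, 2, sum)   # margin 2 = columns: 5 7 9
t(y)               # 3 x 2 transpose; row and column names swap too

# Base R shortcuts for the two sums above.
rowSums(y)
colSums(y)
```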

  11. Data Structures: Matrices
      • Operations
        • rbind(y, c(7, 8, 9)) =
              1 2 3
              4 5 6
              7 8 9
        • cbind(y, c(7, 8)) =
              1 2 3 7
              4 5 6 8
        • Try str(y)
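The binding operations can be checked with a short sketch. The names with_row and with_col are placeholders introduced here:

```r
y <- matrix(1:6, nrow = 2, byrow = TRUE)

with_row <- rbind(y, c(7, 8, 9))   # append a row: 3 x 3
with_col <- cbind(y, c(7, 8))      # append a column: 2 x 4

dim(with_row)
dim(with_col)
str(y)   # shows the type and dimensions of y
```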

  12. Data Structures: Matrices
      • What will this yield?
        m <- matrix(nrow=4, ncol=4)
        m <- ifelse(row(m) == col(m), 1, 0.3)
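Running the snippet is one way to answer the question. The sketch below adds a few checks on the result; diag() is a base R helper the slide does not use:

```r
m <- matrix(nrow = 4, ncol = 4)           # 4 x 4 matrix filled with NA
m <- ifelse(row(m) == col(m), 1, 0.3)     # row() and col() give each cell's indices

print(m)   # 1 on the diagonal, 0.3 everywhere else
dim(m)     # still 4 x 4: ifelse() keeps the shape of its test argument
diag(m)    # 1 1 1 1
```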

  13. Data Structures: Lists
      • A sequence of objects (of possibly different data types)
      • Creation
        • k <- list(c(1, 2, 3), …)
        • l <- … [ f1 and f2 are tags ]
      • Accessing
        • k[2:3]
        • k[[2]] [ How about k[2]? ]
        • l$f1 = c(1, 2, 3) [ Is it same as l[1] or l[[1]]? ]
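A sketch of single versus double brackets on lists. The contents of k and l beyond c(1, 2, 3) and the tags f1 and f2 are hypothetical, because the slide's own values were lost:

```r
k <- list(c(1, 2, 3), "two", TRUE)        # hypothetical contents
l <- list(f1 = c(1, 2, 3), f2 = "text")   # f1 and f2 are tags; f2's value is hypothetical

k[2:3]    # a list holding elements 2 and 3
k[2]      # still a list, of length 1
k[[2]]    # the element itself: "two"

l$f1      # 1 2 3
identical(l$f1, l[[1]])   # TRUE: both extract the element
identical(l$f1, l[1])     # FALSE: l[1] is a one-element list
```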

  14. Data Structures: Lists
      • Naming
        • names(k) <- …
      • Operations
        • lapply(list(1:2, 9:10), sum) = list(3, 19)
        • sapply(list(1:2, 9:10), sum) = c(3, 19)
        • l$f1 <- NULL = ???
        • str(l) = ???
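A sketch contrasting lapply() and sapply(), and showing the effect of assigning NULL to a list element. The list l is a hypothetical stand-in with tags f1 and f2; the names as_list and as_vec are placeholders:

```r
as_list <- lapply(list(1:2, 9:10), sum)   # always returns a list
as_vec  <- sapply(list(1:2, 9:10), sum)   # simplifies to a vector when it can

str(as_list)
str(as_vec)

l <- list(f1 = c(1, 2, 3), f2 = "text")   # hypothetical list with tags f1 and f2
l$f1 <- NULL                              # assigning NULL removes the element
str(l)                                    # only f2 remains
```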

  15. Data Structures: Data Frames
      • A two-dimensional matrix of objects where different columns can be of different types.
      • Creation
        • x <- data.frame(… jill …)
      • Accessing
        • x$names = … jill … [ How about x[[1]]? ]
        • x[1] = ???
        • x[c(1, 2)] = ???
        • x[1, ] = ???
        • x[, 1] = ???
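A sketch of the access forms above. The data frame is hypothetical: the column names names and age and the value "jill" appear on the slides, but "jack" and the ages are invented here for illustration:

```r
# Hypothetical data; only the column names and "jill" come from the slides.
x <- data.frame(names = c("jack", "jill"),
                age = c(4, 7),
                stringsAsFactors = FALSE)

x$names       # the column as a vector
x[[1]]        # same column, selected by position
x[1]          # a data frame with one column
x[c(1, 2)]    # a data frame with columns 1 and 2
x[1, ]        # first row, still a data frame
x[, 1]        # first column, dropped to a vector
```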

  16. Data Structures: Data Frames
      • Naming
        • rownames() and colnames()
      • Operations
        • x[x$age > 5, ] = data.frame(… jill …)
        • subset(x, age > 5) = ???
        • apply(x, 1, sum) = ???
        • y <- data.frame(1:3, 5:7)
        • apply(y, 1, mean) = ???
        • lapply(y, mean) = ???
        • sapply(y, mean) = ???
        • Try str(y)
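A sketch for exploring these operations. The data frame x is hypothetical ("jack" and the ages are invented; the column names and "jill" come from the slides), and the names older and res are placeholders:

```r
x <- data.frame(names = c("jack", "jill"),   # hypothetical data
                age = c(4, 7),
                stringsAsFactors = FALSE)

older <- x[x$age > 5, ]   # rows where age exceeds 5
subset(x, age > 5)        # same rows; columns can be named without x$

# apply() first converts x to a matrix; with a character column present
# every cell becomes character, so sum fails.
res <- try(apply(x, 1, sum), silent = TRUE)
inherits(res, "try-error")   # TRUE

y <- data.frame(1:3, 5:7)
apply(y, 1, mean)    # one mean per row: 3 4 5
lapply(y, mean)      # one mean per column, as a list
sapply(y, mean)      # one mean per column, as a vector: 2 6
str(y)
```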

  17. Factors (and Tables)
      • Type for categorical/nominal values.
      • Example
        • xf <- factor(c(1:3, 2, 4:5))
        • Try xf and str(xf)
      • Operations
        • table(xf) = ???
        • with(mtcars, split(mpg, cyl)) = ???
        • with(mtcars, tapply(mpg, cyl, mean)) = ???
        • by(mtcars, mtcars$cyl, function(m) { median(m$mpg) }) = ???
        • aggregate(mtcars, list(mtcars$cyl), median) = ???
      • You can use cut to bin values and create factors. Try it.
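A sketch of factors, tables and cut(), using the built-in mtcars dataset. The break points passed to cut() and the names cyl_means and bins are arbitrary choices made here, not taken from the slides:

```r
xf <- factor(c(1:3, 2, 4:5))
levels(xf)   # the distinct categories: "1" "2" "3" "4" "5"
table(xf)    # count per level; level 2 occurs twice

# Mean mpg for each cylinder count in mtcars.
cyl_means <- with(mtcars, tapply(mpg, cyl, mean))
cyl_means

# cut() bins a numeric vector into a factor of intervals.
bins <- cut(mtcars$mpg, breaks = c(10, 20, 30, 40))   # arbitrary break points
levels(bins)
table(bins)
```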

  18. Basic Graphs
      • with(mtcars, boxplot(mpg))
      • hist(mtcars$mpg)
      • with(mtcars, plot(hp, mpg))
      • dotchart(VADeaths)
      • Try plot(aggregate(mtcars, list(mtcars$cyl), median))
      You can get the list of datasets via ls("package:datasets").
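The datasets used in these plots ship with R's datasets package, which is attached by default. A small sketch for listing what is available; the variable name available is a placeholder:

```r
available <- ls("package:datasets")   # names of the objects in the datasets package
head(available)

"mtcars" %in% available     # TRUE
"VADeaths" %in% available   # TRUE
```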

  19. Stats 101 using R
      • mean
      • median
      • What about mode?
      • fivenum
      • quantile
      • sd
      • var
      • cov
      • cor
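A sketch that applies these functions to mtcars. On the mode question: base R's mode() reports an object's storage mode, not the most frequent value, so the helper stat_mode below is a hypothetical function written here for illustration:

```r
mpg <- mtcars$mpg

mean(mpg)
median(mpg)
fivenum(mpg)    # minimum, lower hinge, median, upper hinge, maximum
quantile(mpg)   # 0%, 25%, 50%, 75%, 100% by default
sd(mpg)
var(mpg)        # equals sd(mpg)^2
cov(mtcars$hp, mpg)
cor(mtcars$hp, mpg)   # negative: more horsepower goes with lower mpg

mode(mpg)   # "numeric": the storage mode, not a statistic

# Hypothetical helper: the most frequent value(s), returned as character.
stat_mode <- function(v) {
  counts <- table(v)
  names(counts)[counts == max(counts)]
}
stat_mode(c(1, 2, 2, 3))   # "2"
```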

  20. Data Exploration using R
      Let's get our hands dirty!
