SLIDE 1

Short Introduction to R

Paulino Pérez (1), José Crossa (2)

(1) ColPos-México, (2) CIMMyT-México

September, 2014.

SLU,Sweden Short Introduction to R 1/51

SLIDE 2

Contents

1. Introduction
2. Simple objects
3. User defined functions
4. Graphs
5. Importing data
6. Installing packages
7. Questions

SLIDE 3

Contents (detailed)

1. Introduction
     What is R?
     R and other software
     Manuals and help
     Sample R session
2. Simple objects
     Vectors
     Matrices
     data.frame
3. User defined functions
4. Graphs
     Plotting user defined functions
     Plot function
     More functions for graphs
5. Importing data
6. Installing packages
7. Questions

SLIDE 4

Introduction What is R?

What is R?

  • R is a software environment for statistical analysis and graphics, originally developed by Ross Ihaka and Robert Gentleman.
  • R was designed specifically for data analysis, but it is also a complete programming language.
  • R is distributed freely under the GNU General Public License (GPL).
  • Development is carried out by the R Core Team.
  • R is available as source code and as precompiled binaries for Windows and Mac.
  • The R language supports loops and other control structures, so complex analyses can be programmed.

SLIDE 6

Introduction R and other software

R and other software

Why R?

  • R is free software that runs on Windows, Linux and Mac.
  • It has excellent documentation and graphical capabilities.
  • Most programs written in S-PLUS can be run in R.
  • It is powerful and easy to learn.
  • It can be extended through packages.

SLIDE 7

Introduction R and other software

Disadvantages

  • The graphical user interface is not as polished as that of some other software.
  • Lack of commercial support (only partially true).

SLIDE 8

Introduction R and other software

Installation in a Windows environment

Go to http://www.r-project.org and download the Windows binary.

Figure 1: R web site.

SLIDE 9

Introduction R and other software

Continued...

Go to the Downloads section and select CRAN.

Figure 2: CRAN mirrors

SLIDE 10

Introduction R and other software

Continued...

Select the software for your OS.

Figure 3: Executables for various platforms.

SLIDE 11

Introduction R and other software

Continued...

Download the base software.

Figure 4: Download R-base.

SLIDE 12

Introduction R and other software

Continued...

Double-click the installer.

Figure 5: Installing R.

SLIDE 14

Introduction Manuals and help

Manuals

Once R is installed, the following manuals are available in PDF format:

  • An Introduction to R
  • R Reference Manual
  • R Data Import/Export
  • R Language Definition
  • Writing R Extensions
  • R Internals
  • R Installation and Administration

SLIDE 15

Introduction Manuals and help

Continued...

Further resources:

  • Contributed Docs (http://cran.r-project.org/other-docs.html).
  • R-help mailing list archives (http://cran.r-project.org/search.html).
  • Mailing lists.
  • Reference card: a summary of the most useful R commands (http://www.rpad.org/Rpad/Rpad-refcard.pdf).
  • S Programming, W. Venables and B. Ripley. See http://www.stats.ox.ac.uk/pub/MASS3/Sprog.

SLIDE 17

Introduction Sample R session

Sample R session

Go to Start->Programs->R->R-3.x.y. The working environment looks like the one shown in the next figure.

Figure 6: R ready to process commands.

SLIDE 18

Introduction Sample R session

The symbol > is the command prompt, where commands are typed. For example, to show the help page for matrices, type ?matrix and press Enter.

The help system can be accessed from the command line with the following functions:

?text
help.start()
help.search("text to search")
apropos("search for something similar to...")
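As a small illustration of the help functions listed on this slide, the sketch below searches for objects whose names contain "matrix". The search term is only an example; the calls that open a help viewer are wrapped in interactive() so that the script can also be run in batch mode.

```r
# apropos() returns the names of all visible objects that match a pattern
hits <- apropos("matrix")
head(hits)

# The remaining calls open help pages, so run them only in an interactive session
if (interactive()) {
  help("matrix")           # same as typing ?matrix
  help.search("matrix")    # full-text search of the installed documentation
  example(matrix)          # runs the examples from the help page of matrix()
}
```

apropos() is convenient when only part of a function name is remembered, while help.search() also looks inside titles and keywords of the help pages.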

SLIDE 19

Introduction Sample R session

Code editors

A set of R commands saved in a file is usually called a "script". Scripts can be written in several text editors. The one included by default in Windows installations is not as fancy as others that offer syntax highlighting, for example Tinn-R or RStudio. The standard text editor in R can be accessed from the File menu: File->New Script

Figure 7: Code editor in R

SLIDE 20

Introduction Sample R session

Example (performing basic calculations):

7+4
2*3*(1+2)

The commands are written in the text editor. To run them, select the commands and open the pop-up menu; one of its entries has an option to execute the code. The result will appear in the R console.

Figure 8: Executing R commands from the text editor.

SLIDE 21

Introduction Sample R session

Commands written in the text editor can be saved and reopened later for editing. There are many text editors for writing R scripts, for example:

  • WinEdit (shareware)
  • SciViews (freeware)
  • Tinn-R (freeware)
  • Emacs (free)
  • RStudio (free)

SLIDE 22

Introduction Sample R session

Continued...

Figure 9: RStudio.

SLIDE 23

Simple objects

R works by manipulating "objects", using functions and operators. The most basic objects are:

  • Vectors (numeric or character)
  • Matrices
  • data.frame
  • Lists
  • Functions

Some useful functions:

  • General purpose: sqrt(), log(), exp(), sin(), cos(), etc.
  • Related to statistics: mean(), sd(), var(), quantile(), etc.

The assignment operator is <- in any R version; = can also be used in R >= 1.4.0.
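A minimal sketch of the two assignment operators and a few of the functions named on this slide. The vector x and its values are made up for illustration only.

```r
# Assign a numeric vector with <- (works in every R version)
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# = also assigns at the top level (R >= 1.4.0)
y = sqrt(x)

# General-purpose functions are applied element by element
log(exp(x))

# Statistical summaries of x
mean(x)       # arithmetic mean
sd(x)         # sample standard deviation (divides by n - 1)
var(x)        # sample variance, equal to sd(x)^2
quantile(x)   # minimum, quartiles and maximum
```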

SLIDE 24

Simple objects

Notes:

  • R distinguishes between uppercase and lowercase letters.
  • The symbol "#" starts a comment.
  • Object names can contain letters, digits, "." and "_", but not spaces or special symbols such as "$", "%", "#", etc.
  • Missing data are represented by the special symbol "NA" (Not Available). Undefined results of computations, for example 0/0, are represented by "NaN" (Not a Number), and dividing a nonzero number by 0 gives "Inf" or "-Inf".
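The notes above can be checked directly at the prompt. The following sketch uses made-up object names (total, Total, v) to show case sensitivity and the special values NA, NaN and Inf.

```r
# Names are case sensitive: total and Total are different objects
total <- 10
Total <- 20
total == Total            # FALSE

# NA marks a missing value; most summaries return NA unless told to skip it
v <- c(3, NA, 5)
is.na(v)                  # FALSE TRUE FALSE
mean(v)                   # NA
mean(v, na.rm = TRUE)     # 4

# Division by zero does not stop with an error
1 / 0                     # Inf
-1 / 0                    # -Inf
0 / 0                     # NaN
is.nan(0 / 0)             # TRUE
```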

SLIDE 26

Simple objects Vectors

Vectors

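Vectors are created with the functions c(), seq() and rep(), and with the : operator. Examples:

a=c(1,2,3,4,5)               #numeric vector
a
b=c("a","b","c")             #character vector
b
d=1:10                       #integers from 1 to 10
d
e=seq(1,10,by=0.5)           #from 1 to 10 in steps of 0.5
e
f=seq(1,10,length.out=20)    #20 equally spaced values from 1 to 10
f
g=rep(10,3)                  #the value 10 repeated 3 times
g
h=c(e,f)                     #concatenates e and f
h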

SLIDE 27

Simple objects Vectors

Vector operations

The most common arithmetic operations can be applied to vectors of the same length. Operations are performed element-wise. Examples:

a=c(1,2,3)
b=c(2,3,5)
a+b        #element-wise sum
a-b        #element-wise difference
a*b        #element-wise product
a^b        #each element of a raised to the matching element of b
3*a+2*b    #scalar products and sum
a/b        #element-wise quotient
a^2        #square of each element

It is also possible to apply a function to a vector, for example:

exp(a)              #exponential function
log(a)              #natural logarithm
d=sqrt(a)+log(b)    #square root and logarithm
d                   #shows d

SLIDE 28

Simple objects Vectors

To extract elements from a vector we can use the [] operator, for example:

a=c(1,2,5,7,9)
b=a[1:3]     #first 3 elements of a, assigned to b
b            #shows b
c=a[-1]      #all elements of a except the first one,
             #stored in a new object c
a[c(1,5)]    #first and last elements of a

Many other operations can be performed on vectors, for example:

w=c(1,2,3,NA,-1,2)
which(is.na(w))           #positions of the missing values
which.max(w)              #position of the maximum
which.min(w)              #position of the minimum
w>2                       #is each element bigger than 2?
which(w>2)                #positions of the elements bigger than 2
sort(w)                   #sorts in ascending order (NA is dropped)
sort(w,decreasing=TRUE)   #sorts in decreasing order
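The [] operator also accepts a logical condition, which combines the extraction and comparison steps shown above. This is a sketch using the same vector w; the object name w_filled and the choice of 0 as a replacement value are only illustrative.

```r
w <- c(1, 2, 3, NA, -1, 2)

# A logical condition inside [] keeps the elements where it is TRUE;
# comparing NA gives NA, so the missing value is carried along
w[w > 1]                    # 2 3 NA 2

# Combine with !is.na() to drop the missing value
w[!is.na(w) & w > 1]        # 2 3 2

# Replace the missing value in a copy of w
w_filled <- w
w_filled[is.na(w_filled)] <- 0
w_filled                    # 1 2 3 0 -1 2
```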

SLIDE 30

Simple objects Matrices

Creating matrices

Matrices can contain numbers or characters. The creation of matrices is shown in the examples below.

#a) Identity matrix, n x n
#Identity matrix of order 4
Identity=diag(c(1,1,1,1))
Identity
#Alternatively...
Identity=diag(rep(1,4))
Identity

#b) J matrix (all entries equal to 1)
#J matrix of order 3
J=matrix(1,nrow=3,ncol=3)
J

#c) In general
#matrix(data = NA, nrow = 1, ncol = 1, byrow = FALSE, dimnames = NULL)
A=matrix(nrow=3,ncol=3)    #3x3 matrix filled with NA
A[1,]=c(1,2,3)             #fill row by row
A[2,]=c(4,5,6)
A[3,]=c(7,8,9)

SLIDE 31

Simple objects Matrices

#Alternatively, fill row by row
A=matrix(c(1:9),nrow=3,ncol=3,byrow=TRUE)
A
#Alternatively, fill column by column
A=matrix(c(1,4,7,2,5,8,3,6,9),nrow=3,ncol=3,byrow=FALSE)
A

SLIDE 32

Simple objects Matrices

Matrix operations

#Sum and subtraction
C=A+J
C
D=A-J
D
#Matrix product (%*%)
Dsq=D%*%D
Dsq
DsqA=D%*%D%*%A
DsqA
#Transpose, use t()
t(D)
#Determinant, use det()
det(D)
#Inverse, use the function solve()
InvI=solve(Identity)
InvI

SLIDE 33

Simple objects Matrices

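#Rank, use the QR decomposition
qr(Identity)
qr(Identity)$rank
#A matrix that is not invertible: the third row is twice the first one
Noninvertible=matrix(c(1,2,3,1,2,1,2,4,6),nrow=3,ncol=3,byrow=TRUE)
det(Noninvertible)        #zero, up to rounding error
qr(Noninvertible)$rank    #rank 2, less than the order 3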

SLIDE 34

Simple objects Matrices

We can also extract some elements of the matrix, for example:

A             #shows A
A[1,1]        #element in row 1, column 1 of A
A[1,]         #first row of A
A[c(1,2),]    #first and second rows of A
A[-3,]        #all rows except the third one
A[,1]         #first column of A
A[,c(1,3)]    #first and third columns of A
A[,-3]        #all columns except the third one

Note: When a single row or column is extracted, it is automatically converted to a vector.
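The conversion mentioned in the note can be switched off with the drop argument of the [] operator. A short sketch, rebuilding a 3x3 matrix A so that the example runs on its own:

```r
A <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)

# Default behaviour: a single row comes back as a plain vector
r1 <- A[1, ]
is.matrix(r1)   # FALSE
dim(r1)         # NULL

# drop = FALSE keeps the result as a 1 x 3 matrix
r1m <- A[1, , drop = FALSE]
is.matrix(r1m)  # TRUE
dim(r1m)        # 1 3
```

Keeping the matrix shape matters when the result is used in a matrix product, because %*% then treats it as a row rather than guessing the orientation.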

SLIDE 36

Simple objects data.frame

data.frame

Tables are created with the function data.frame(v1,...,vn), where v1 is the first vector and vn is the n-th vector. The rows usually represent individuals and the columns covariates. Examples:

ID=c("genO","genB","genZ")
subj1=c(10,25,33)
subj2=c(NA,34,15)
oncogen=c(TRUE,TRUE,FALSE)
loc=c(1,30,125)
data1=data.frame(ID,subj1,subj2,oncogen,loc)
data1
#To display the column names of a
#data.frame, use the function names()
names(data1)
#To show or extract a column use the operator $
data1$subj1
data1$subj2
data1$oncogen
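Rows and columns of a data.frame can also be selected with [], in the same way as for matrices. The sketch below rebuilds data1 with the values from the slide so that it runs on its own; the particular conditions (oncogen is TRUE, subj1 > 20) are chosen only for illustration.

```r
data1 <- data.frame(
  ID      = c("genO", "genB", "genZ"),
  subj1   = c(10, 25, 33),
  subj2   = c(NA, 34, 15),
  oncogen = c(TRUE, TRUE, FALSE),
  loc     = c(1, 30, 125)
)

str(data1)                                 # type of each column
data1[data1$oncogen, ]                     # rows where oncogen is TRUE
data1[data1$subj1 > 20, c("ID", "loc")]    # selected rows and columns
mean(data1$subj2, na.rm = TRUE)            # 24.5, ignoring the missing value
```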

SLIDE 37

User defined functions

User defined functions

R implements many statistical methods as functions. The functions are organized in libraries (packages). The packages loaded by default, such as base and stats, contain all the functions that we have used so far. Additional libraries can be downloaded freely from the internet. We can also create our own functions for data analysis. The syntax for creating a new function is as follows:

function_name=function(arg1,...,argn)
{
    function body;
    return the value;
}

Examples:

SLIDE 38

User defined functions

A function to compute f(x) = x^2:

f=function(x)
{
    x^2
}
f(2)
f(c(1,2,3))

A function to compute the sum of the first n integers, $\sum_{i=1}^{n} i$:

my_sum=function(n)
{
    tmp=c(1:n)
    sum(tmp)
}
#The result should be identical to n(n+1)/2
n=100
my_sum(n)
n*(n+1)/2
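Since R supports loops, the same sum can also be written with an explicit for loop and an explicit return(). This is only a sketch for comparison with my_sum; the name my_sum_loop is hypothetical, and the vectorized sum() version is normally the preferred style in R.

```r
# Loop-based version of the sum 1 + 2 + ... + n
my_sum_loop <- function(n) {
  total <- 0
  for (i in seq_len(n)) {   # seq_len(0) is empty, so n = 0 returns 0
    total <- total + i
  }
  return(total)
}

my_sum_loop(100)        # 5050
100 * (100 + 1) / 2     # 5050, the closed form
```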

SLIDE 39

Graphs

R includes many functions to produce high-quality graphics ready for publication. We can explore the graphical capabilities of the software through the included demos. Type demo() at the command prompt and the software will display a list of demos that we can run, for example: graphics, image, persp and plotmath.

demo()
demo(graphics)    #some graphical capabilities
demo(image)       #working with images
demo(persp)       #perspective plots of surfaces
demo(plotmath)    #mathematical symbols in graphs

SLIDE 41

Graphs Plotting user defined functions

Plotting user defined functions

To plot user defined functions we can use the functions curve() or plot(). We will give more details about the latter in the next slides. Examples:

#1: Plotting f(x)=x^2, -4<=x<=4
curve(x^2,-4,4)
#2: Plotting f(x)=-x^3, -4<=x<=4
curve(-x^3,-4,4)
#3: Several plots in a 2x2 layout
op=par(mfrow=c(2,2))    #saves the previous graphical settings in op
curve(x^3-3*x, -2, 2)
curve(x^2-2, add = TRUE, col = "violet")
plot(cos, xlim =c(-pi,3*pi), n = 1001, col = "blue")
chippy=function(x) sin(cos(x)*exp(-x/2))
curve(chippy, -8, 7, n=2001)
curve(chippy,-8, -5)
par(op)                 #restores the previous graphical settings
#4: Standard normal density
curve(1/sqrt(2*pi)*exp(-1/2*x^2),-3,3)

SLIDE 43

Graphs Plot function

Plot function

One of the most useful functions for plotting is plot(). With this function we can plot points (scatterplots), lines (time series), or functions. Examples:

y<-c(1,2,3,4,5)
x<-c(1,4,9,16,25)
plot(x,y,main="",ylab="", xlab="")    #scatterplot without title or axis labels
plot(x,y,type="l")                    #points joined by lines
plot(dnorm, -3,3,col = "blue")        #standard normal density on [-3,3]

SLIDE 45

Graphs More functions for graphs

More functions for graphs

  • barplot()
  • pie()
  • hist() (histogram)
  • boxplot()
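A minimal sketch of the four functions on one page. All data here are simulated or made up for illustration (counts, x and groups are hypothetical names); note that the base R function for histograms is hist().

```r
set.seed(1)                          # reproducible simulated data
counts <- c(A = 12, B = 7, C = 21)   # made-up category counts
x <- rnorm(200)                      # made-up sample from a standard normal
groups <- list(g1 = rnorm(50), g2 = rnorm(50, mean = 1))

op <- par(mfrow = c(2, 2))           # four panels on one page
barplot(counts, main = "barplot()")
pie(counts, main = "pie()")
h <- hist(x, main = "hist()")        # hist() also returns the bin counts
boxplot(groups, main = "boxplot()")
par(op)                              # restore the previous settings
```

The object h returned by hist() stores the break points and counts, which can be inspected with h$breaks and h$counts.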

SLIDE 46

Importing data

Importing data

There are several routines to import data into the R environment. For ASCII (plain text) files we can use:

1. read.table
2. read.csv

The function setwd is useful for setting the working directory, so that we do not have to write the entire path of a file each time we want to read it. R can also save and load objects in a native binary format, using the functions save and load.
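The functions mentioned on this slide can be tried without any external file. The sketch below writes a small made-up table (toy, with hypothetical columns line and yield) to a temporary directory and reads it back; the file names are assumptions chosen for the example.

```r
# Work in a temporary directory so that no real files are touched
old_wd <- setwd(tempdir())

# Write a small made-up table to disk, then read it back
toy <- data.frame(line = c("L1", "L2", "L3"), yield = c(4.2, 5.1, 3.8))
write.csv(toy, "toy.csv", row.names = FALSE)
from_csv <- read.csv("toy.csv")                      # comma separated, with header
from_tab <- read.table("toy.csv", header = TRUE, sep = ",")

# Save an object in R's binary format and load it back
save(toy, file = "toy.RData")
rm(toy)
load("toy.RData")                                    # recreates the object toy

setwd(old_wd)                                        # return to the previous directory
```

read.csv() is read.table() with defaults suited to comma-separated files, so both calls give the same data.frame here.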

SLIDE 47

Importing data

Examples

This data set comes from the CIMMYT global wheat breeding program and comprises phenotypic, genotypic and pedigree information for n = 599 wheat lines. The data set was made publicly available by Crossa et al. (2010). Lines were evaluated for grain yield in four different environments. Each line was genotyped for p = 1,279 Diversity Array Technology (DArT) markers. At each marker two homozygous genotypes were possible, and these were coded as 0/1. Marker genotypes are given in the object X. Finally, a matrix A provides the relationships between lines computed from the pedigree.

SLIDE 48

Importing data

Continued...

rm(list=ls())              #removes all objects from the workspace
library(doBy)              #provides summaryBy()
setwd("~/0. R-Intro/examples")
#Load genotypic data
load("pedigree_markers.RData")
#Load phenotypic data
pheno=read.table(file="599_yield_raw-1.prn",header=TRUE)
colnames(pheno)
pheno=pheno[,c(2,5,6)]     #keeps columns 2, 5 and 6
#Mean grain yield (GY) for each environment and line
out=summaryBy(GY~env+gen1,data=pheno,FUN=mean)
Y=data.frame(yield=out$GY.mean,VAR=out$gen1,ENV=out$env)
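For readers without the data files, the averaging step can be sketched on made-up numbers. The column names env, gen1 and GY follow the code above, but the values are hypothetical, and base R aggregate() is used here as a stand-in for summaryBy(); with aggregate() the averaged column keeps the name GY instead of GY.mean.

```r
# Made-up phenotypes: two lines, two environments, two plots each
pheno <- data.frame(
  env  = c(1, 1, 1, 1, 2, 2, 2, 2),
  gen1 = c("g1", "g1", "g2", "g2", "g1", "g1", "g2", "g2"),
  GY   = c(4.0, 5.0, 3.0, 3.5, 6.0, 7.0, 5.0, 5.5)
)

# Mean GY for every env x gen1 combination (base R, no extra package)
out <- aggregate(GY ~ env + gen1, data = pheno, FUN = mean)
out <- out[order(out$env, out$gen1), ]   # sort by environment, then line

# Same layout as the object Y built above
Y <- data.frame(yield = out$GY, VAR = out$gen1, ENV = out$env)
Y
```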

SLIDE 49

Installing packages

Installing packages

Many users all over the world create software packages for R.

Figure 10: Packages.

SLIDE 50

Installing packages

Continued...

A package is a collection of routines for performing some calculations. These routines are usually made available to the user through well-documented functions. The function install.packages() installs packages from the CRAN website, for example:

install.packages("BLR")
install.packages("BGLR")
install.packages("doBy")

Once a package is installed, it must be loaded with the function library():

library(doBy)
library(BGLR)
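Because install.packages() downloads the package every time it is called, scripts often check first whether the package is already present. The helper below is a hypothetical sketch of that pattern (ensure_package is not part of R); it is demonstrated with stats, which ships with R, so that nothing is downloaded.

```r
# Hypothetical helper: install a package only if it is missing
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)                 # downloads from the configured CRAN mirror
  }
  requireNamespace(pkg, quietly = TRUE)   # TRUE if the package is now usable
}

ensure_package("stats")     # ships with R, so nothing is downloaded
# ensure_package("doBy")    # would install doBy on first use
```

After the check succeeds, the package is loaded as usual with library().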

SLIDE 51

Questions

Questions?
