
Intro to R - 4. Base R Plots, OIT/SMU Libraries Data Science Workshop



  1. Intro to R - 4. Base R Plots. OIT/SMU Libraries Data Science Workshop Series. Michael Hahsler (OIT, SMU).

  2. Outline: (1) Simple Plots, (2) High-level Graphics Functions, (3) Low-level Graphics Functions, (4) Exercises.

  3. Section 1: Simple Plots

  4. Introduction
     Plotting is an integral part of R. R plots on devices (e.g., X11(), quartz(), windows(), pdf(), png()).
     Plotting commands are divided into three basic groups:
     1. High-level plotting functions create a new plot on the graphics device, possibly with axes, labels, titles and so on.
     2. Low-level plotting functions add more information to an existing plot, such as extra points, lines and labels.
     3. Interactive graphics functions allow you to interactively add information to, or extract information from, an existing plot, using a pointing device such as a mouse.
     We will only discuss “base” graphics here. A very popular alternative is ggplot2.

  5. Section 2: High-level Graphics Functions

  6. plot
         x <- 1:100
         y <- x^2
         plot(x, y)
     [Figure: scatter plot of y against x]

  7. plot
         # use type "l" = line, line width 3 and red
         plot(x, y, type = "l", lwd = 3, col = "red")
         # add a vertical dashed (line type 2) line at x = 50
         abline(v = 50, lty = 2)
     [Figure: red line plot of y against x with a dashed vertical line at x = 50]

  8. Getting help for plot
         ? plot
     Shows that plot is a so-called generic function. Generic functions have implementations for different data types which get “dispatched” at call time.
         ? plot.default
     This is the default function for plot.
         ? par
     Graphical parameters which typically can be passed on as ... to plot.
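
To make the idea of dispatch concrete, here is a minimal sketch that defines a plot method for a small S3 class. The class name "temps", its fields and the temperature values are all invented for this illustration and do not come from the workshop material.

```r
# A hypothetical S3 class "temps", invented for this illustration
temps <- structure(
  list(day = 1:7, celsius = c(21, 23, 22, 25, 27, 26, 24)),
  class = "temps"
)

# plot() dispatches to plot.temps() for objects of class "temps"
plot.temps <- function(x, ...) {
  # Delegate the drawing to the default method; extra arguments
  # such as main or col are passed on via ...
  plot(x$day, x$celsius, type = "b",
       xlab = "Day", ylab = "Temperature (C)", ...)
  invisible(x)
}

plot(temps, main = "Dispatched to plot.temps()")

# List the plot methods known in the current session
print(methods(plot))
```

Calling plot(temps) never mentions plot.temps by name; R selects the method from the class attribute of the first argument.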

  9. Scatterplot
         mlb <- read.csv(paste0("https://michael.hahsler.net/SMU/",
                                "DS_Workshop_Intro_R/examples/MLB_cleaned.csv"))
         plot(x = mlb$Height.inches., y = mlb$Weight.pounds.,
              xlab = "Height", ylab = "Weight", main = "Weight by Height")
     [Figure: scatter plot "Weight by Height", Weight against Height]

  10. Scatterplot matrix (pairs plot)
     plot() is generic, which means that it produces different results depending on the type of data. For example, data frames result in a pairwise scatterplot matrix.
         data(iris)
         head(iris, n = 1)  # iris is a data.frame
         ##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
         ## 1          5.1         3.5          1.4         0.2  setosa
         plot(iris[, -5], col = iris[, 5])  # use Species (col 5) for color
     [Figure: scatterplot matrix of Sepal.Length, Sepal.Width, Petal.Length and Petal.Width, colored by Species]

  11. barplot - Bar Charts
         gpa <- c(Sandra = 3.8, Michael = 4.0, Peter = NA)
         gpa.sort <- sort(gpa, na.last = TRUE)
         barplot(gpa.sort, ylab = "GPA", xlab = "Student")
     [Figure: bar chart of GPA by Student (Sandra, Michael, Peter)]

  12. barplot - Bar Charts
     How many players do the Texas Rangers have for each position?
         mlb_tex <- mlb[mlb$Team == "TEX", ]
         # use par to make space for labels; see ? par and look for mar (plot margin)
         oldpar <- par(mar = c(10, 4, 4, 1) + .1)
         barplot(sort(table(mlb_tex$Position), decreasing = TRUE), las = 2,
                 ylab = "Count", main = "Texas Rangers: Players per position")
         par(oldpar)
     [Figure: bar chart "Texas Rangers: Players per position", Count by position in decreasing order]
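
The pattern of counting with table(), ordering with sort() and plotting with barplot() can be tried without downloading the MLB file. The sketch below uses a short made-up vector of positions, which is an assumption for illustration and not taken from the data set.

```r
# Made-up positions, for illustration only
positions <- c("Catcher", "Outfielder", "Catcher",
               "Pitcher", "Pitcher", "Pitcher")

# table() counts each distinct value; sort() orders the counts
counts <- sort(table(positions), decreasing = TRUE)
print(counts)

# las = 2 turns the axis labels perpendicular to the axis,
# which helps with long category names
barplot(counts, las = 2, ylab = "Count")
```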

  13. hist - Histogram
         hist(mlb$Age, breaks = 20)
     [Figure: "Histogram of mlb$Age", Frequency against mlb$Age]

  14. hist - Histogram with estimated density
         hist(mlb$Age, breaks = 20, prob = TRUE)
         lines(density(mlb$Age), col = "red")
     [Figure: "Histogram of mlb$Age", Density against mlb$Age, with a red density curve]

  15. image
     volcano is an R data set with elevation measurements of Maunga Whau on a 10 m by 10 m grid.
         data(volcano)
         dim(volcano)
         ## [1] 87 61
         image(volcano)
     [Figure: image plot of the volcano matrix, both axes scaled from 0 to 1]

  16. Common arguments for plot functions
     add=TRUE : add to an existing plot?
     axes=FALSE : plot axes?
     log="x", log="y" or log="xy" : use a logarithmic scale?
     type="l" : plot lines instead of points
     xlab, ylab : axis labels
     main : figure title
     sub : sub-title
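
A short sketch of several of these arguments in use, with made-up data. One caveat worth knowing: add = TRUE is understood by functions such as curve(), hist(), barplot() and image(), but not by the default plot() method, so it is shown here with curve().

```r
# Made-up data: exponential growth
x <- 1:50
y <- exp(x / 5)

# Logarithmic y axis, lines instead of points, title and sub-title
plot(x, y, log = "y", type = "l",
     xlab = "x", ylab = "y (log scale)",
     main = "Exponential growth", sub = "log scale on the y axis")

# Suppress the default axes, then draw custom ones
plot(x, y, type = "l", axes = FALSE, xlab = "x", ylab = "y")
axis(1)                            # bottom axis with default ticks
axis(2, at = c(0, 10000, 20000))   # left axis with chosen tick positions

# add = TRUE draws on top of the existing plot instead of starting a new one
curve(exp(x / 5) / 2, from = 1, to = 50, add = TRUE, col = "red", lty = 2)
```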

  17. Section 3: Low-level Graphics Functions

  18. Some low-level functions
     These functions can be used to add elements to a plot.
         points(x, y)
         lines(x, y)
         text(x, y, labels, ...)
         abline(a, b) or abline(h=y) or abline(v=x)
         polygon(x, y, ...)
         legend(x, y, legend, ...)
         title(main, sub)
         axis(side, ...)
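
As a rough illustration of how these functions combine, the following sketch starts from an empty plot region and adds each element with a low-level call. The data values are made up for this example.

```r
# Made-up data for illustration
x <- 1:10
y <- c(2, 4, 3, 6, 7, 6, 9, 8, 11, 10)

# type = "n" sets up the coordinate system without drawing the data
plot(x, y, type = "n", xlab = "", ylab = "")

polygon(c(2, 5, 5, 2), c(2, 2, 6, 6), col = "grey90", border = NA)  # shaded box
points(x, y, pch = 16)                      # the data points
lines(x, y, col = "blue")                   # connect the points

fit <- lm(y ~ x)
abline(fit, lty = 2, col = "red")           # fitted regression line
abline(h = mean(y), lty = 3)                # horizontal line at the mean of y

text(x[9], y[9], labels = "max", pos = 1)   # label below the highest point
legend("topleft", legend = c("data", "linear fit"),
       col = c("blue", "red"), lty = c(1, 2))
title(main = "Built from low-level calls", xlab = "x", ylab = "y")
```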

  19. Graphical parameter list: par
     R maintains a list of graphics parameters to control line style, colors, figure arrangement and text justification. A separate list of graphics parameters is maintained for each active device.
         oldpar <- par(col = "blue", pch = 4, cex = 2)
         plot(x, y, axes = FALSE)
         axis(1); axis(2)
         par(oldpar)  # restore the original parameters
     [Figure: plot of y against x drawn with large blue crosses]
     Many parameters from par() can also be passed to plot(). Try par() and ?par.
     Danger: Do not forget to reset par to the original settings! You can also reset par by closing the plotting device or with the little broom symbol in the Plots tab of RStudio.
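
One way to make the reset hard to forget is to change par inside a function and register the restore with on.exit(). The helper name plot_big_blue below is hypothetical and exists only for this sketch.

```r
# Hypothetical helper, written for this illustration
plot_big_blue <- function(x, y) {
  oldpar <- par(col = "blue", pch = 4, cex = 2)
  on.exit(par(oldpar))  # runs when the function exits, even after an error
  plot(x, y)
  invisible(NULL)
}

before <- par("pch")
plot_big_blue(1:10, (1:10)^2)
after <- par("pch")

# The plotting symbol set inside the function does not leak out
print(identical(before, after))
```

The same save-and-restore pattern from the slide is used; on.exit() only moves the restore to a place where it cannot be skipped.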

  20. Important parameters in par
     pch=4 : Plotting symbol (0-25)
     lty=2 : Line type
     lwd=2 : Line width
     col=2 : Color for points, lines, etc.
     cex=1.5 : Character expansion (e.g., 50% larger than default text size)
     mai=c(1, 0.5, 0.5, 0) : Widths of the bottom, left, top and right margins, respectively, measured in inches. mar is the same, just measured in lines of text.
     mfcol, mfrow : Put multiple plots next to each other.
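
A small sketch of mfrow together with a few of the other parameters. The two curves are arbitrary choices made for this example.

```r
# Two plots side by side: 1 row, 2 columns, with tighter margins
oldpar <- par(mfrow = c(1, 2), mar = c(4, 4, 2, 1))

x <- 1:100
plot(x, x^2, type = "l", lwd = 2, col = 2, main = "lwd = 2, col = 2")
plot(x, sqrt(x), pch = 4, cex = 0.5, main = "pch = 4, cex = 0.5")

par(oldpar)  # restore the previous layout and margins
```

With mfrow the panels are filled row by row; mfcol takes the same vector but fills column by column.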

  21. Saving a plot as an image
         png(file = "plot.png")  # open device
         plot(x, y)
         dev.off()               # close device
         ## pdf
         ##   2
     Other devices are jpeg(), tiff(), pdf(), postscript(), win.metafile() (Windows). Use ?Devices for a complete list.
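
The same open, plot, close sequence works for the other devices. The sketch below writes a PDF with an explicit page size; the use of a temporary file is an assumption made here to avoid cluttering the working directory.

```r
# Write to a temporary file (an assumption for this example)
out <- tempfile(fileext = ".pdf")

pdf(file = out, width = 6, height = 4)  # page size in inches
x <- 1:100
plot(x, x^2, type = "l")
dev.off()  # the file is complete only after the device is closed

print(file.exists(out))
```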

  22. Advanced and Interactive Graphics
     The most popular advanced graphics packages are:
     ggplot2 : Grammar of graphics. Produces elegant visualizations (see http://ggplot2.org/).
     grid : Advanced graphics can be programmed using flexible low-level plotting functions (viewports, different coordinate systems and units, lines, points, text, etc.). See also package lattice.
     Interactive graphics are available via several extension packages. Here are some examples: https://www.r-graph-gallery.com/interactive-charts.html

  23. Section 4: Exercises

  24. Exercises
     1. Plot sin(x)/x. Hint: Trigonometric functions in R use angles in radians (see ? sin).
     2. The “cars” data set gives the speed of cars and the distances taken to stop. Note that the data were recorded in the 1920s. Plot the “cars” data set as a scatter plot. Plot all data points with distances taken to stop greater than 80 in red.
     3. Plot histograms for speed and dist in “cars”.
