BootcampR AN INTRODUCTION TO R Jason A. Heppler, PhD University of - - PowerPoint PPT Presentation



slide-1
SLIDE 1

BootcampR

AN INTRODUCTION TO R

Jason A. Heppler, PhD University of Nebraska at Omaha March 17, 2020 @jaheppler

slide-2
SLIDE 2
  • Hi. I'm Jason.

I like to gesture at screens.

Digital Engagement Librarian, University of Nebraska at Omaha
Mentor, Mozilla Open Leaders
Researcher, Humanities+Design, Stanford University

slide-3
SLIDE 3

Today's plan

  • Introduction to spatial data analysis
  • Intro to sf, ggmap, and leaflet
  • Hands-on!

Due to COVID-19 closures, this will be the final workshop. Open up RStudio. We'll start doing a few things together soon.

slide-4
SLIDE 4

The sf package

  • Makes spatial data easy to work with and minimizes the distinction between spatial data and other data types
  • Spatial objects are stored as data frames, with geometries stored in list columns
  • All functions begin with st_ for easy RStudio autocompletion
  • Functions are pipe friendly (%>%)
  • dplyr and tidyr verbs have methods defined for sf objects
  • ggplot2 can plot sf objects directly
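
A minimal sketch of how these features fit together. The three rows are hypothetical stand-ins (the coordinates are borrowed from the first rows of the churches data used later in the workshop), and the coordinate reference system, EPSG:4326 (WGS 84 longitude/latitude), is an assumption rather than something the slides state.

```r
library(sf)
library(dplyr)
library(ggplot2)

# Hypothetical example rows: a plain data frame with coordinate columns.
places <- data.frame(
  label     = c("Church A", "Church B", "Church C"),
  Longitude = c(-90.06210, -90.09289, -90.09643),
  Latitude  = c(29.96598, 29.92736, 29.93523)
)

# st_as_sf() turns the coordinate columns into a geometry list column.
# crs = 4326 (WGS 84 lon/lat) is assumed here.
places_sf <- places %>%
  st_as_sf(coords = c("Longitude", "Latitude"), crs = 4326)

class(places_sf)           # an sf object is still a data frame
class(places_sf$geometry)  # the geometry lives in a list column (class sfc)

# dplyr verbs work on sf objects, and the geometry column comes along.
subset_sf <- places_sf %>%
  filter(label != "Church C")

# ggplot2 draws sf objects directly with geom_sf().
p <- ggplot(places_sf) +
  geom_sf()
```

Printing `p` in RStudio should draw the three points on a longitude/latitude grid.
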
slide-5
SLIDE 5

The sf package

We have to know about a few different data types in GIS before we go much further.

Vector data: represents real-world features like hills, roads, houses, and rivers. Vectors have attributes, consisting of text or numeric information, that describe the features. Vectors are often shapes created through geometry.

Raster data: a matrix of pixels containing values that represent conditions for an area. This could be background images of the Earth's surface or continuous data represented on the Earth's surface.
slide-6
SLIDE 6

Vectors in sf

sf objects have a hierarchical structure composed of three classes:

  1. sf (vector layer): a data frame with attribute columns plus one geometry column
  2. sfc (geometry column): the list column holding the geometries of all features in the layer
  3. sfg (geometry): the geometry of an individual simple feature
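
The hierarchy can be seen by building an object from the bottom up. This is only a sketch: the point coordinates, the attribute value, and the CRS (EPSG:4326) are illustrative assumptions.

```r
library(sf)

# sfg: the geometry of one simple feature (a single point, as lon/lat).
pt <- st_point(c(-90.06210, 29.96598))

# sfc: a list column of geometries sharing one coordinate reference system.
# crs = 4326 (WGS 84) is assumed for illustration.
geom <- st_sfc(pt, crs = 4326)

# sf: a data frame with attribute columns plus the geometry column.
layer <- st_sf(label = "Example church", geometry = geom)

class(pt)     # includes "sfg"
class(geom)   # includes "sfc"
class(layer)  # includes "sf" and "data.frame"
```
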
slide-7
SLIDE 7

Vectors in sf

slide-8
SLIDE 8

Let's look at a case study to see this in action.

slide-9
SLIDE 9
slide-10
SLIDE 10
slide-11
SLIDE 11
slide-12
SLIDE 12

Let's get started.

library(tidyverse)
library(sf)

churches <- read_csv("https://raw.githubusercontent.com/unolibraries/workshops/master/bootcampr/data/churches.csv")

slide-13
SLIDE 13

Let's get started.

# Let's look at the data structure.
glimpse(churches)

Observations: 786
Variables: 13
$ Congregation         <chr> "Adventist", "Adventist", "Adventist", "Assemblies of God, Ge…
$ National_Affiliation <chr> "Seventh-Day Adventist Denomination", "Seventh-Day Adventist …
$ Sub_Affiliations     <chr> "General Conference of Seventh-day Adventists; Southwestern U…
$ Name                 <chr> "First Italian Seventh-day Adventist Church", "New Orleans Se…
$ Building_Type        <chr> "Church", "Church", "Church", "Church", "Church", "Church", "…
$ Congregation_Race    <chr> "Italian", "White", "Black", "White", "White", "White", "Whit…
$ Cross_Street         <chr> "full address provided", "full address provided", "full addre…
$ Street_Address       <chr> "1004 Kerlerec St.", "3500 St. Charles Ave.", "2412 Delachais…
$ City_State           <chr> "New Orleans, Louisiana", "New Orleans, Louisiana", "New Orle…
$ Zip_Code             <dbl> 70116, 70115, 70115, 70032, 70114, 70124, 70118, 70119, 70130…
$ `Full Address`       <chr> "1004 Kerlerec St. New Orleans, Louisiana 70116", "3500 St. C…
$ Latitude             <dbl> 29.96598, 29.92736, 29.93523, 29.95605, 29.93741, 29.98737, 2…
$ Longitude            <dbl> -90.06210, -90.09289, -90.09643, -90.00180, -90.04852, -90.11…

slide-14
SLIDE 14

Let's make our first map.

Head over to RStudio and give this a shot.

churches %>%
  ggplot() +
  geom_point(aes(x = Longitude, y = Latitude)) +
  labs(title = "Churches in New Orleans")

slide-15
SLIDE 15

Let's make our first map.

Head over to RStudio and give this a shot.

churches %>%
  ggplot() +
  geom_point(aes(x = Longitude, y = Latitude)) +
  labs(title = "Churches in New Orleans")

Your turn:

  1. Try to add a subtitle to the map. (Hint: ??labs)
  2. Change the color of the points on the map to indicate the type of congregation. (Bonus: get rid of the legend for now.)
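
One possible way to approach both exercises, sketched on a tiny hypothetical data frame so it runs without downloading anything. The rows, the congregation labels, and the subtitle text are all made up for illustration; with the real data you would start from churches instead.

```r
library(ggplot2)

# Hypothetical stand-in for the churches data (labels are illustrative).
churches_demo <- data.frame(
  Congregation = c("Adventist", "Baptist", "Catholic"),
  Longitude    = c(-90.06210, -90.09289, -90.09643),
  Latitude     = c(29.96598, 29.92736, 29.93523)
)

p <- ggplot(churches_demo) +
  # Mapping color inside aes() colors each point by its congregation.
  geom_point(aes(x = Longitude, y = Latitude, color = Congregation)) +
  # labs() accepts a subtitle alongside the title.
  labs(title = "Churches in New Orleans",
       subtitle = "Example subtitle (placeholder text)") +
  # Hide the legend for now.
  theme(legend.position = "none")
```
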
slide-16
SLIDE 16

Let's make our first map.

To make our map a bit better, we'll use the ggmap package to find a basemap to put underneath the vector layer. The package makes this super easy for us.

library(ggmap)

# Recent ggmap releases serve these tiles through get_stadiamap(),
# which needs a Stadia Maps API key.
basemap <- get_stamenmap(
  bbox = c(left   = min(churches$Longitude),
           right  = max(churches$Longitude),
           bottom = min(churches$Latitude),
           top    = max(churches$Latitude)),
  zoom = 12)

slide-17
SLIDE 17

Let's make our first map.

To make our map a bit better, we'll use the ggmap package to find a basemap to put underneath the vector layer. The package makes this super easy for us.

basemap <- get_stamenmap(
  bbox = c(left   = min(churches$Longitude),
           right  = max(churches$Longitude),
           bottom = min(churches$Latitude),
           top    = max(churches$Latitude)),
  zoom = 12)

get_stamenmap returns a raster that we can now plot. To do this, however, we need to use the ggmap function, which can also accept ggplot's geoms.

Homework for you: what is bbox? Can you figure out what's happening in the code above?

slide-18
SLIDE 18

Let's make our first map.

Your turn!

  1. We can see the raster this way:

     ggmap(basemap)

     How would you add the church data to this?
  2. Adjust the size, transparency, and color of the points. Try coloring the points by congregation again.
  3. Add back in the previous ggplot arguments we included (title, subtitle, and remove legend).

slide-19
SLIDE 19

What can the religious landscape of New Orleans tell us?

Let's look at our data another way. A key part of the history of American cities in the 1930s and 1940s is racial segregation. We can plot the race of the congregations in our dataset to start exploring this.

Try this out: building on the previous code we wrote, change the aesthetics so the points are colored by the Congregation_Race column.

slide-20
SLIDE 20

What can the religious landscape of New Orleans tell us?

Leaflet is a popular open-source JavaScript library for making interactive maps. The leaflet package allows us to easily integrate and control Leaflet maps in R.

install.packages("leaflet")
library(leaflet)

slide-21
SLIDE 21

What can the religious landscape of New Orleans tell us?

Let's start with a simple Leaflet map and break down what's happening.

leaflet() %>%  # we call the leaflet function
  addTiles()   # add map tiles, by default OSM

Let's jump over to RStudio and try it out.

slide-22
SLIDE 22

What can the religious landscape of New Orleans tell us?

Leaflet works somewhat similarly to ggplot. We can pass in data to the leaflet() function, we can pipe to other functions, and we use Leaflet's grammar of maps to build our visualization. Jump over to RStudio and give this a try.

# Skeleton: each function still needs its arguments.
leaflet() %>%
  addProviderTiles() %>%
  addCircleMarkers()
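
A filled-in version of the skeleton might look like the sketch below. The data frame is a hypothetical stand-in for churches, and the tile provider (CartoDB.Positron) is just one choice from the providers list that ships with the leaflet package, not necessarily the one used in the workshop.

```r
library(leaflet)

# Hypothetical stand-in rows for the churches data.
churches_demo <- data.frame(
  Name      = c("Church A", "Church B", "Church C"),
  Longitude = c(-90.06210, -90.09289, -90.09643),
  Latitude  = c(29.96598, 29.92736, 29.93523)
)

m <- leaflet(data = churches_demo) %>%
  # The provider is an assumption; see names(providers) for alternatives.
  addProviderTiles(providers$CartoDB.Positron) %>%
  # The ~ formulas refer to columns of the data passed to leaflet().
  addCircleMarkers(lng = ~Longitude, lat = ~Latitude, label = ~Name)

# Print `m` at the RStudio console to render the interactive map.
```
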

slide-23
SLIDE 23

What can the religious landscape of New Orleans tell us?

Your turn! Start with the code below.

  1. How would you change the size of the circles on the map? (Hint: ?addCircleMarkers)
  2. Adjust the opacity of the circles.
  3. Remove the stroke around the circles.

leaflet() %>%
  addProviderTiles() %>%
  addCircleMarkers()
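
One way the three adjustments could look, as a sketch on hypothetical stand-in data. The radius, fillOpacity, and stroke arguments belong to addCircleMarkers(); the specific values and the tile provider are arbitrary assumptions.

```r
library(leaflet)

# Hypothetical stand-in rows for the churches data.
churches_demo <- data.frame(
  Longitude = c(-90.06210, -90.09289, -90.09643),
  Latitude  = c(29.96598, 29.92736, 29.93523)
)

m_styled <- leaflet(data = churches_demo) %>%
  addProviderTiles(providers$CartoDB.Positron) %>%  # provider is an assumption
  addCircleMarkers(
    lng = ~Longitude,
    lat = ~Latitude,
    radius = 4,         # 1. circle size, in pixels
    fillOpacity = 0.6,  # 2. opacity of the circle fill
    stroke = FALSE      # 3. no outline around the circles
  )
```
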

slide-24
SLIDE 24

What can the religious landscape of New Orleans tell us?

Adding a palette to Leaflet maps isn't quite as simple as ggplot's methods of doing so. However, Leaflet comes with wrappers to make the process a bit easier.

  • colorNumeric(): continuous values, continuous color shading
  • colorBin(): continuous values, discrete colors assigned by value bins
  • colorQuantile(): continuous values, discrete colors assigned by quantiles
  • colorFactor(): categorical data

To use these, we'll create a palette function to apply to the data.

slide-25
SLIDE 25

What can the religious landscape of New Orleans tell us?

Since we're looking at the race of the congregations, we'll use the colorFactor() function to generate a palette based off our data. Each of the palette functions accepts two common parameters:

  1. palette: a preset palette from RColorBrewer or viridis, or a character vector of RGB values or named colors.
  2. domain: tells the color function the range of input values. You can pass NULL to have the range inferred from the data.

slide-26
SLIDE 26

What can the religious landscape of New Orleans tell us?

Since we're looking at the race of the congregations, we'll use the colorFactor() function to generate a palette based off our data.

pal <- colorFactor(palette, domain)

Let's head over to RStudio and walk through what's happening.
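
As a rough sketch of what that walkthrough might cover: colorFactor() returns a function, and that function maps each category to a color. The data frame is a hypothetical stand-in (the race values come from the first rows of the churches data), and the "Set1" RColorBrewer palette is an arbitrary choice.

```r
library(leaflet)

# Hypothetical stand-in rows for the churches data.
churches_demo <- data.frame(
  Congregation_Race = c("Italian", "White", "Black"),
  Longitude = c(-90.06210, -90.09289, -90.09643),
  Latitude  = c(29.96598, 29.92736, 29.93523)
)

# Build the palette function; "Set1" is an assumed palette choice.
pal <- colorFactor(palette = "Set1",
                   domain = churches_demo$Congregation_Race)

# Calling the palette function returns one color per input value.
pal(c("Black", "White"))

m_race <- leaflet(data = churches_demo) %>%
  addTiles() %>%
  addCircleMarkers(lng = ~Longitude, lat = ~Latitude,
                   color = ~pal(Congregation_Race),
                   stroke = FALSE, fillOpacity = 0.8) %>%
  addLegend(pal = pal, values = ~Congregation_Race,
            title = "Congregation race")
```

Because the palette is a function, the same pal can be reused for the markers and for the legend, which keeps the two in sync.
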

slide-27
SLIDE 27

What can the religious landscape of New Orleans tell us?

In addition to our point data of church locations, we can also read in and map shapefiles. The shapefile is a popular GIS vector data format, regulated by Esri, that can contain polygons, points, and lines.

To do this next part, visit the workshop GitHub page below and download the holc.zip file. Unzip it and put all of the extracted files into a single folder.

https://github.com/unolibraries/workshops/tree/master/bootcampr/data

slide-28
SLIDE 28

What can the religious landscape of New Orleans tell us?

To get this data in, we'll use the sf function st_read().

holc <- st_read("holc.shp")

Let's jump over to RStudio.
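
To see what st_read() returns without the workshop download, the sketch below reads a shapefile that ships with the sf package (North Carolina counties) as a stand-in for holc.shp. A shapefile is really several files (.shp, .shx, .dbf, and usually .prj) that must sit in the same folder, which is why the extracted files need to stay together.

```r
library(sf)

# Stand-in path: a shapefile bundled with sf, used here instead of holc.shp.
shp_path <- system.file("shape/nc.shp", package = "sf")

# quiet = TRUE suppresses the summary that st_read() normally prints.
nc <- st_read(shp_path, quiet = TRUE)

class(nc)                                  # an sf data frame
st_geometry_type(nc, by_geometry = FALSE)  # the geometry type of the layer
st_crs(nc)                                 # the coordinate reference system
```

With the workshop data, the same inspection calls would apply to holc after st_read("holc.shp"), provided the working directory is the folder holding the extracted files.
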

slide-29
SLIDE 29

Questions? Troubleshooting?

Next workshop: March 31, 1:30p-3p: Clustering and Classifying (CL 112)