
US Census data: an overview. Kyle Walker, Instructor, DataCamp



  1. DataCamp: Analyzing US Census Data in R
     US Census data: an overview
     Kyle Walker, Instructor

  2. Course overview
     What you'll learn:
     - How to acquire US Census data with the tidycensus R package
     - How to wrangle US Census data with tidyverse tools
     - How to use the R tigris package to acquire US Census Bureau boundary data
     - How to visualize and map US Census Bureau data in R with ggplot2

  3. About your instructor
     - Fields: spatial demography & spatial data science
     - R developer: tidycensus, tigris, & idbr packages

  4. US Census Bureau Data

  5. The US Census Bureau API
     To get started using US Census data in R, sign up for a Census API key:

         library(tidycensus)
         census_api_key("YOUR KEY GOES HERE", install = TRUE)

     Example key: "rw6pozt48ur2ugc8kg69x5phdrtnuhb2cb1subd6"
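
A minimal sketch of checking whether a key is already available before requesting data. It assumes that tidycensus keeps the key in the `CENSUS_API_KEY` environment variable (with `install = TRUE` it is also written to `.Renviron`, so a restart of R or `readRenviron("~/.Renviron")` may be needed before it is visible). The helper name `has_census_key()` is hypothetical and not part of tidycensus.

```r
# Hypothetical helper: TRUE when a non-empty key is present in the environment.
# Assumes the key is stored under the CENSUS_API_KEY variable.
has_census_key <- function() {
  nzchar(Sys.getenv("CENSUS_API_KEY"))
}

if (!has_census_key()) {
  message("No Census API key found; run census_api_key() first.")
}
```

Reading the key from the environment keeps it out of scripts that may be shared or committed to version control.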

  6. Using decennial Census data with tidycensus

         # year = 2010 matches the output below; without it, recent
         # tidycensus releases default to a later census that does not
         # use the variable ID P001001
         state_pop <- get_decennial(geography = "state",
                                    variables = "P001001",
                                    year = 2010)
         head(state_pop)

         # A tibble: 6 x 4
           GEOID NAME       variable    value
           <chr> <chr>      <chr>       <dbl>
         1 01    Alabama    P001001   4779736
         2 02    Alaska     P001001    710231
         3 04    Arizona    P001001   6392017
         4 05    Arkansas   P001001   2915918
         5 06    California P001001  37253956
         6 08    Colorado   P001001   5029196

  7. Using ACS data with tidycensus

         state_income <- get_acs(geography = "state",
                                 variables = "B19013_001")
         head(state_income)

         # A tibble: 6 x 5
           GEOID NAME       variable   estimate   moe
           <chr> <chr>      <chr>         <dbl> <dbl>
         1 01    Alabama    B19013_001    44758   314
         2 02    Alaska     B19013_001    74444   809
         3 04    Arizona    B19013_001    51340   231
         4 05    Arkansas   B19013_001    42336   234
         5 06    California B19013_001    63783   188
         6 08    Colorado   B19013_001    62520   287
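
The `moe` column is the margin of error around each ACS estimate; the Census Bureau publishes ACS margins of error at a 90 percent confidence level, which is also the tidycensus default. As a small worked sketch, the bounds of that interval can be computed directly from the two columns. The rows below are copied from the output on this slide; the column names `lower`, `upper`, and `moe_pct` are illustrative.

```r
# First two rows of the state income output shown on the slide.
state_income <- data.frame(
  NAME     = c("Alabama", "Alaska"),
  estimate = c(44758, 74444),
  moe      = c(314, 809)
)

# Interval bounds and the margin of error as a share of the estimate.
state_income <- transform(
  state_income,
  lower   = estimate - moe,
  upper   = estimate + moe,
  moe_pct = round(100 * moe / estimate, 2)
)

state_income
```

A larger `moe_pct` signals a less reliable estimate, which matters more at small geographies such as counties and tracts.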

  8. Let's get started!

  9. Basic tidycensus functionality
     Kyle Walker, Instructor

  10. Geography in tidycensus
      - Legal entities: geography = "county"
      - Statistical entities: geography = "tract"
      - Available geographies

  11. Geography and variables in tidycensus

          county_income <- get_acs(geography = "county",
                                   variables = "B19013_001")
          county_income

          # A tibble: 3,220 x 5
             GEOID NAME                     variable   estimate   moe
             <chr> <chr>                    <chr>         <dbl> <dbl>
           1 01001 Autauga County, Alabama  B19013_001    53099  2631
           2 01003 Baldwin County, Alabama  B19013_001    51365   991
           3 01005 Barbour County, Alabama  B19013_001    33956  2655
           4 01007 Bibb County, Alabama     B19013_001    39776  3306
           5 01009 Blount County, Alabama   B19013_001    46212  2443
           6 01011 Bullock County, Alabama  B19013_001    29335  5435
           7 01013 Butler County, Alabama   B19013_001    34315  2904
           8 01015 Calhoun County, Alabama  B19013_001    41954  1381
           9 01017 Chambers County, Alabama B19013_001    36027  1870
          10 01019 Cherokee County, Alabama B19013_001    38925  2598
          # ... with 3,210 more rows

  12. Geographic subsets in tidycensus

          texas_income <- get_acs(geography = "county",
                                  variables = c(hhincome = "B19013_001"),
                                  state = "TX")
          texas_income

          # A tibble: 254 x 5
             GEOID NAME                    variable estimate   moe
             <chr> <chr>                   <chr>       <dbl> <dbl>
           1 48001 Anderson County, Texas  hhincome    42146  2539
           2 48003 Andrews County, Texas   hhincome    70121  7053
           3 48005 Angelina County, Texas  hhincome    44185  2107
           4 48007 Aransas County, Texas   hhincome    44851  4261
           5 48009 Archer County, Texas    hhincome    62407  5368
           6 48011 Armstrong County, Texas hhincome    65000  9415
           7 48013 Atascosa County, Texas  hhincome    53181  4114
           8 48015 Austin County, Texas    hhincome    56681  4903
           9 48017 Bailey County, Texas    hhincome    40589  8438
          10 48019 Bandera County, Texas   hhincome    55434  4503
          # ... with 244 more rows

  13. Wide data with tidycensus

          get_acs(geography = "county",
                  variables = c(hhincome = "B19013_001",
                                medage = "B01002_001"),
                  state = "TX",
                  output = "wide")

          # A tibble: 254 x 6
             GEOID NAME                    hhincomeE hhincomeM medageE medageM
             <chr> <chr>                       <dbl>     <dbl>   <dbl>   <dbl>
           1 48001 Anderson County, Texas      42146      2539    38.9     0.5
           2 48003 Andrews County, Texas       70121      7053    31.2     0.8
           3 48005 Angelina County, Texas      44185      2107    36.7     0.3
           4 48007 Aransas County, Texas       44851      4261    50.7     1.1
           5 48009 Archer County, Texas        62407      5368    44.1     0.7
           6 48011 Armstrong County, Texas     65000      9415    45.9     2.8
           7 48013 Atascosa County, Texas      53181      4114    35.4     0.2
           8 48015 Austin County, Texas        56681      4903    40.8     0.4
           9 48017 Bailey County, Texas        40589      8438    34.4     1.1
          10 48019 Bandera County, Texas       55434      4503    51.3     0.9
          # ... with 244 more rows
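
To make the relationship between the default long output and `output = "wide"` concrete, here is a rough sketch that rebuilds the wide layout by hand with `tidyr::pivot_wider()`. It is an illustration only, not how tidycensus produces the table internally. The long rows are reconstructed from the first two Texas counties on the slides, and the `names_glue` pattern is an assumption chosen to mimic the `E`/`M` column suffixes.

```r
library(tidyr)

# Long layout: one row per county and variable (values taken from the slides).
long <- data.frame(
  GEOID    = c("48001", "48001", "48003", "48003"),
  NAME     = c("Anderson County, Texas", "Anderson County, Texas",
               "Andrews County, Texas", "Andrews County, Texas"),
  variable = c("hhincome", "medage", "hhincome", "medage"),
  estimate = c(42146, 38.9, 70121, 31.2),
  moe      = c(2539, 0.5, 7053, 0.8)
)

# Wide layout: one row per county, with an E (estimate) and M (margin of
# error) column for each variable.
wide <- pivot_wider(
  long,
  names_from  = variable,
  values_from = c(estimate, moe),
  names_glue  = "{variable}{ifelse(.value == 'estimate', 'E', 'M')}"
)

wide
```

The wide form is convenient when variables need to be compared or combined row by row, while the long form suits grouped summaries and faceted plots.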

  14. Let's practice!

  15. Searching for data with tidycensus
      Kyle Walker, Instructor

  16. Searching for Census variables
      To find Census variable IDs, use:
      - Online resources like Census Reporter
      - Built-in variable searching in tidycensus

  17. Choosing a dataset to search

          v16 <- load_variables(year = 2016, dataset = "acs5", cache = TRUE)
          v16

          # A tibble: 22,815 x 3
             name       label                                  concept
             <chr>      <chr>                                  <chr>
           1 B00001_001 Estimate!!Total                        UNWEIGHTED...
           2 B00002_001 Estimate!!Total                        UNWEIGHTED...
           3 B01001_001 Estimate!!Total                        SEX BY AGE
           4 B01001_002 Estimate!!Total!!Male                  SEX BY AGE
           5 B01001_003 Estimate!!Total!!Male!!Under 5 years   SEX BY AGE
           6 B01001_004 Estimate!!Total!!Male!!5 to 9 years    SEX BY AGE
           7 B01001_005 Estimate!!Total!!Male!!10 to 14 years  SEX BY AGE
           8 B01001_006 Estimate!!Total!!Male!!15 to 17 years  SEX BY AGE
           9 B01001_007 Estimate!!Total!!Male!!18 and 19 years SEX BY AGE
          10 B01001_008 Estimate!!Total!!Male!!20 years        SEX BY AGE
          # ... with 22,805 more rows

  18. Filtering a variables dataset

          library(tidyverse)
          B19001 <- filter(v16, str_detect(name, "B19001"))
          B19001

          # A tibble: 170 x 3
             name        label                concept
             <chr>       <chr>                <chr>
           1 B19001_001E Estimate!!Total      HOUSEHOLD INCOME…
           2 B19001_002E ...Less than $10,000 HOUSEHOLD INCOME…
           3 B19001_003E ...$10,000 to $14,999 HOUSEHOLD INCOME…
           4 B19001_004E ...$15,000 to $19,999 HOUSEHOLD INCOME…
           5 B19001_005E ...$20,000 to $24,999 HOUSEHOLD INCOME…
           6 B19001_006E ...$25,000 to $29,999 HOUSEHOLD INCOME…
           7 B19001_007E ...$30,000 to $34,999 HOUSEHOLD INCOME…
           8 B19001_008E ...$35,000 to $39,999 HOUSEHOLD INCOME…
           9 B19001_009E ...$40,000 to $44,999 HOUSEHOLD INCOME…
          10 B19001_010E ...$45,000 to $49,999 HOUSEHOLD INCOME…
          # ... with 160 more rows
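
The same `filter()` and `str_detect()` pattern can search the `label` column instead of `name`, which helps when the table ID is not yet known. The sketch below runs on a small hand-built data frame so that it needs no API call; its rows are copied from the `load_variables()` output on the earlier slide, and the object names `vars` and `under5` are illustrative.

```r
library(dplyr)
library(stringr)

# A few rows copied from the load_variables() output shown on the slides.
vars <- data.frame(
  name    = c("B01001_001", "B01001_002", "B01001_003", "B01001_004"),
  label   = c("Estimate!!Total",
              "Estimate!!Total!!Male",
              "Estimate!!Total!!Male!!Under 5 years",
              "Estimate!!Total!!Male!!5 to 9 years"),
  concept = "SEX BY AGE"
)

# fixed() matches the phrase literally rather than as a regular expression.
under5 <- filter(vars, str_detect(label, fixed("Under 5 years")))
under5
```

On the full variables dataset the same call would be applied to `v16` in place of `vars`.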

  19. ACS variable structure
      Anatomy of an ACS variable, B19001_002E:
      - B: refers to a base table. Other prefixes: C, DP, S.
      - 19001: the table ID
      - 002: the variable code within the table
      - E: refers to estimate. Optional in tidycensus functions, which return both E and M for each variable.
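
The anatomy above can be sketched as a small base R parser. The function name `parse_acs_variable()` is hypothetical and not part of tidycensus, and the regular expression is an assumption that covers only IDs shaped like the one on the slide (letter prefix, table ID, underscore, numeric code, optional `E` or `M`).

```r
# Hypothetical helper: split an ACS variable ID into its parts.
parse_acs_variable <- function(id) {
  pattern <- "^([A-Z]+)([0-9]+[A-Z]*)_([0-9]+)([EM]?)$"
  m <- regmatches(id, regexec(pattern, id))[[1]]
  if (length(m) == 0) {
    stop("Unrecognized variable ID: ", id)
  }
  list(
    prefix   = m[2],  # table type, e.g. "B" for a base table
    table    = m[3],  # table ID
    variable = m[4],  # variable code within the table
    suffix   = m[5]   # "E" (estimate), "M" (margin of error), or ""
  )
}

parts <- parse_acs_variable("B19001_002E")
str(parts)
```

Because the suffix is optional in the pattern, an ID written without it, such as `B19013_001`, parses with an empty `suffix`.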

  20. Let's practice!

  21. Visualizing Census data with ggplot2
      Kyle Walker, Instructor

  22. ggplot2: a layered grammar of graphics in R

  23. Example: plotting income by state

          library(tidycensus)
          library(tidyverse)

          ne_income <- get_acs(geography = "state",
                               variables = "B19013_001",
                               survey = "acs1",
                               state = c("ME", "NH", "VT", "MA",
                                         "RI", "CT", "NY"))

          ggplot(ne_income, aes(x = estimate, y = NAME)) +
            geom_point()

  24. [Figure: plot produced by the code on the previous slide; image not included in the transcript]

  25. Customizing ggplot2 graphics of ACS data

          ggplot(ne_income, aes(x = estimate, y = reorder(NAME, estimate))) +
            geom_point(color = "navy", size = 4) +
            scale_x_continuous(labels = scales::dollar) +
            theme_minimal(base_size = 14) +
            labs(x = "2016 ACS estimate",
                 y = "",
                 title = "Median household income by state")

  26. [Figure: plot produced by the code on the previous slide; image not included in the transcript]

  27. Let's practice!
