Spatial Data Science Methods for Improving Models - Andy Eschbacher


slide-1
SLIDE 1

2019

Spatial Data Science Methods for Improving Models

Andy Eschbacher Data Scientist @MrEPhysics

slide-2
SLIDE 2

2019 CARTO

Overview of Spatial Data

slide-3
SLIDE 3

Points on a map

The most common way we see spatial data: Lat/Lng/Attributes

Map by Mamata Akella

slide-4
SLIDE 4

Zip Codes

Some tips:

  • Try not to use zip codes.
  • Zip codes are not polygons. They are more akin to postal routes (lines).
  • Zip code != ZCTA from the census.
  • Zip codes change frequently.

slide-5
SLIDE 5

Real world problems

Check your unit of analysis. Zip codes don't obey city boundaries, or really any boundaries at all.

Figure from Grubesic et al.

slide-6
SLIDE 6

Real world problems

Another tip: Don't use zip codes when you are an insurance company whose plans span a different type of geography (counties).

slide-7
SLIDE 7

Boundaries are manipulated

Gerrymandering shapes polygons to favor one group over another

New York Times, Nov 2018

slide-8
SLIDE 8

Census

Freely accessible demographic data and more at multiple geographical scales for many countries across the world

slide-9
SLIDE 9

ODSC East Andy Eschbacher 2018

slide-10
SLIDE 10

LODES

Origin-Destination Data

slide-11
SLIDE 11

Modern Spatial Data Sources

Newer data sources have come about because of changes in technology

slide-12
SLIDE 12

Tweets

Lat/Lng/Time/Tweet/etc.

slide-13
SLIDE 13

GPS Data

Spencer the Cat: Lat/Lng/Time

slide-14
SLIDE 14

Open Taxi Data

Taxi trips to/from major airports around NYC

slide-15
SLIDE 15

Mobility Data

Lat/Lng/Time/Id

slide-16
SLIDE 16

Maps by Wenfei Xu

slide-17
SLIDE 17

Data in the Spatial Structure

The geometries and their positions relative to one another provide additional data

Figure from PySAL

slide-18
SLIDE 18

Working with missing data

slide-19
SLIDE 19

Missing data is a common problem

  • Missing because of geographic anonymization
  • Lack of measurements at locations
  • Data is messy
slide-20
SLIDE 20

Given WeWork locations in NYC, show me potentially successful locations in LA

slide-21
SLIDE 21

What data do we have?

  • WeWork locations
      ○ 58 spaces in NYC
      ○ 21 in LA
  • Demographic data from the census
  • Financial data from Mastercard's Retail Location Index
  • Points of Interest (POI) for venues with similar characteristics (accommodation, education, food, entertainment, etc.)

slide-22
SLIDE 22

Given WeWork locations in NYC, show me potentially successful locations in LA

Compute distances in parameter space, then rank potential sites by similarity
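
One way to read "distances in parameter space" is sketched below: standardize each feature, then rank candidate cells by their Euclidean distance to the closest known site. The talk does not name a distance metric, so Euclidean distance on z-scores is an assumption, and the feature columns and numbers are invented purely for illustration.

```python
import numpy as np

def rank_by_similarity(reference, candidates):
    """Rank candidate rows by distance to their closest reference row.

    reference:  (n_ref, n_features) features of known sites
    candidates: (n_cand, n_features) features of potential sites
    Returns candidate indices (most similar first) and their distances.
    """
    reference = np.asarray(reference, dtype=float)
    candidates = np.asarray(candidates, dtype=float)

    # Standardize with the reference statistics so that no feature
    # dominates the distance just because of its units.
    mean = reference.mean(axis=0)
    std = reference.std(axis=0)
    std[std == 0] = 1.0
    ref_z = (reference - mean) / std
    cand_z = (candidates - mean) / std

    # Pairwise Euclidean distances, shape (n_cand, n_ref).
    diff = cand_z[:, None, :] - ref_z[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    nearest = dist.min(axis=1)
    order = np.argsort(nearest)
    return order, nearest[order]

# Hypothetical features: [median income, POI count, retail index]
nyc_sites = np.array([[90_000, 120, 0.80],
                      [85_000, 100, 0.70],
                      [95_000, 140, 0.90]])
la_cells = np.array([[40_000, 10, 0.20],
                     [88_000, 110, 0.75],
                     [70_000, 60, 0.50]])

order, distances = rank_by_similarity(nyc_sites, la_cells)
print(order)  # candidate indices, most similar first
```

Standardizing first matters here because raw income would otherwise swamp the other two columns.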

slide-23
SLIDE 23

But...

  • My data has different variances and scales
    It comes from many sources, has different scales, etc.
  • I have missing values
    Not all geographies have values, so we need to fill them in or remove those locations (not ideal)
  • Some of my data is correlated
    We should remove the redundancy due to correlation

slide-24
SLIDE 24

Common Grid

Use a quadtree to hierarchically divide space, then choose the zoom level appropriate for aggregation
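
As a rough illustration of a quadtree grid, the sketch below assigns each point a quadkey in the Web Mercator tile scheme and counts points per cell. The talk does not say which quadtree indexing it uses, so the quadkey scheme is an assumption; `latlng_to_quadkey` and `aggregate` are hypothetical helper names, and the sample points are made up.

```python
import math
from collections import Counter

MAX_LAT = 85.05112878  # latitude limit of the Web Mercator projection

def latlng_to_quadkey(lat, lng, zoom):
    """Return the quadkey of the tile containing (lat, lng) at `zoom`."""
    lat = max(min(lat, MAX_LAT), -MAX_LAT)
    n = 2 ** zoom

    # Fractional tile coordinates in Web Mercator, then clamp to the grid.
    x = int((lng + 180.0) / 360.0 * n)
    sin_lat = math.sin(math.radians(lat))
    y = int((0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)) * n)
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)

    # Interleave the bits of x and y, most significant level first.
    digits = []
    for level in range(zoom, 0, -1):
        mask = 1 << (level - 1)
        digit = (1 if x & mask else 0) + (2 if y & mask else 0)
        digits.append(str(digit))
    return "".join(digits)

def aggregate(points, zoom):
    """Count points per quadtree cell at the chosen zoom level."""
    return Counter(latlng_to_quadkey(lat, lng, zoom) for lat, lng in points)

# Made-up sample points as (lat, lng): two in NYC, one in LA.
points = [(40.7128, -74.0060), (40.7580, -73.9855), (34.0522, -118.2437)]
counts = aggregate(points, zoom=6)
print(counts)
```

Because a cell's quadkey is a prefix of its children's quadkeys, moving to a coarser zoom level is just a matter of truncating the key.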

slide-25
SLIDE 25

Principal Component Analysis (PCA)

Transform data to a set of orthogonal axes (eigendecomposition)

  ★ Transformed features, including correlated ones, are linearly independent
  ★ Drop axes that explain the least variance in the data, up to a threshold
  ✘ Doesn't work if data is missing
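
A minimal sketch of PCA via eigendecomposition of the covariance matrix, keeping just enough axes to reach a variance threshold. The function name `pca`, the 95% threshold, and the synthetic data are assumptions for illustration; in practice a library implementation would normally be used.

```python
import numpy as np

def pca(X, variance_to_keep=0.95):
    """Project X onto the leading principal axes.

    Returns the transformed data, the kept axes (as columns), and the
    fraction of variance explained by each kept axis.
    """
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)

    # Eigendecomposition of the (symmetric) covariance matrix.
    cov = np.cov(X_centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order

    # Sort axes by explained variance, largest first.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()

    # Keep the smallest number of axes whose cumulative share reaches
    # the threshold; the rest are dropped.
    n_keep = int(np.searchsorted(np.cumsum(explained), variance_to_keep)) + 1
    n_keep = min(n_keep, len(eigvals))

    components = eigvecs[:, :n_keep]
    return X_centered @ components, components, explained[:n_keep]

# Synthetic data: column b is almost a multiple of column a.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = 2 * a + 0.1 * rng.normal(size=500)
c = rng.normal(size=500)
X = np.column_stack([a, b, c])

scores, components, explained = pca(X)
print(scores.shape, explained)
```

The two correlated columns collapse onto a single axis, so three input features reduce to two transformed ones.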

slide-26
SLIDE 26

Probabilistic PCA

PCA doesn't work if we have missing data. Common imputation falls short for more sizeable amounts of missing data. PPCA reconstructs the distribution of the data using the known data as a sample.

Ilin & Raiko, 2010
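
The sketch below is not the PPCA algorithm of Ilin & Raiko; it is a simpler stand-in (iterative low-rank imputation) that shows the same idea of using the structure of the observed data to fill gaps. Missing cells start at the column mean and are repeatedly replaced by their low-rank PCA reconstruction. The function name `impute_low_rank`, the iteration count, and the synthetic data are all assumptions.

```python
import numpy as np

def impute_low_rank(X, n_components, n_iter=100):
    """Fill NaNs in X by iterating a rank-limited PCA reconstruction."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)

    # Start from plain column-mean imputation.
    filled = np.where(missing, np.nanmean(X, axis=0), X)

    for _ in range(n_iter):
        mean = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mean, full_matrices=False)
        k = n_components
        low_rank = (U[:, :k] * s[:k]) @ Vt[:k] + mean
        # Observed cells are never changed; only missing cells are updated.
        filled = np.where(missing, low_rank, X)
    return filled

# Synthetic data with one underlying factor, then knock out some cells.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X_true = np.outer(t, [1.0, 2.0, -1.0]) + np.array([10.0, 20.0, 30.0])
X_obs = X_true.copy()
X_obs[::7, 0] = np.nan
X_obs[3::11, 1] = np.nan

X_filled = impute_low_rank(X_obs, n_components=1)
```

On strongly correlated data like this, the reconstruction can recover missing cells far better than the column mean, which ignores the other columns entirely.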

slide-27
SLIDE 27

Results

Analysis by Giulia Carella

slide-28
SLIDE 28

Structure of Spatial Data

slide-29
SLIDE 29

slide-30
SLIDE 30

slide-31
SLIDE 31

Spatial Weights

  • Contiguity
  • Distance
  • kNN

slide-32
SLIDE 32

Spatial Weights

Weights are built from 'neighbors'; how neighbors are defined is problem-dependent
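
To make one neighbor definition concrete, here is a small sketch of a row-standardized k-nearest-neighbor weights matrix built with NumPy. It assumes planar (projected) coordinates so that Euclidean distance is meaningful; `knn_weights` is a hypothetical helper, not a library function, and the coordinates are made up.

```python
import numpy as np

def knn_weights(coords, k):
    """Row-standardized kNN spatial weights matrix (requires k < n).

    W[i, j] = 1/k if j is one of the k nearest neighbors of i, else 0.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)

    # Pairwise Euclidean distances between all locations.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a location is not its own neighbor

    neighbors = np.argsort(dist, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), neighbors.ravel()] = 1.0
    return W / W.sum(axis=1, keepdims=True)

# Made-up points along a line.
coords = [[0, 0], [1, 0], [3, 0], [10, 0]]
W = knn_weights(coords, k=1)
print(W)
```

Note that kNN weights are generally not symmetric: the isolated point at x=10 has the point at x=3 as its neighbor, but not the other way around.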

slide-33
SLIDE 33

Spatial Autocorrelation

Moran's I statistic

Basic statistic for calculating the amount of:

  • Clustering
  • Outliers
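
A minimal sketch of global Moran's I, using the usual definition I = (n / S0) * (z' W z) / (z' z), where z holds deviations from the mean and S0 is the sum of all weights. The `chain_weights` helper (neighbors along a line) and the toy values are assumptions for illustration only.

```python
import numpy as np

def morans_i(x, W):
    """Global Moran's I for values x and spatial weights matrix W."""
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    z = x - x.mean()
    return (len(x) / W.sum()) * (z @ W @ z) / (z @ z)

def chain_weights(n):
    """Hypothetical helper: binary weights linking each site to the
    sites immediately before and after it on a line."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return W

W = chain_weights(10)
clustered = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
alternating = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

i_clustered = morans_i(clustered, W)      # positive: similar values sit together
i_alternating = morans_i(alternating, W)  # negative: neighbors are dissimilar
print(i_clustered, i_alternating)
```

Values near zero would suggest no spatial pattern; in practice the statistic is usually compared against random permutations of the values to judge significance.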
slide-34
SLIDE 34

Spatial Autocorrelation

slide-35
SLIDE 35

Spatial Autocorrelation (Local)

How a geometry compares to its neighbors
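
A sketch of a local Moran statistic in Anselin's form, I_i = (z_i / m2) * sum_j w_ij z_j with m2 = sum(z^2) / n, plus the usual quadrant labels (HH, LL, HL, LH). Libraries differ on the normalizing constant, so the exact scaling here is an assumption, as are the toy values and the line-shaped neighbor structure.

```python
import numpy as np

def local_morans_i(x, W):
    """Local Moran's I and quadrant label for every location.

    Assumes every row of W has at least one neighbor.
    """
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    W = W / W.sum(axis=1, keepdims=True)  # row-standardize

    z = x - x.mean()
    lag = W @ z                      # average deviation of the neighbors
    m2 = (z ** 2).sum() / len(x)
    local_i = z * lag / m2

    # First letter: the location itself; second letter: its neighbors.
    labels = np.where(z >= 0,
                      np.where(lag >= 0, "HH", "HL"),
                      np.where(lag >= 0, "LH", "LL"))
    return local_i, labels

# Toy example: seven sites on a line, with one high value in the middle.
n = 7
W = np.zeros((n, n))
idx = np.arange(n - 1)
W[idx, idx + 1] = 1.0
W[idx + 1, idx] = 1.0

values = [1, 1, 1, 10, 1, 1, 1]
local_i, labels = local_morans_i(values, W)
print(labels)
```

The middle site is high while its neighbors are low, so it is flagged as a spatial outlier (HL) with a negative local statistic, while sites far from it fall in a low-low cluster.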

slide-36
SLIDE 36

Measuring spatial residuals

slide-37
SLIDE 37

2019

Thanks!

Andy Eschbacher Data Scientist @MrEPhysics
