Geospatial and MongoDB: PowerPoint PPT Presentation


SLIDE 1

Geospatial and MongoDB

SLIDE 2

Agenda

  • MongoDB Geospatial Features
  • Query Examples
  • Optimizations

SLIDE 3

Norberto Leite

Developer Advocate, Curriculum Engineer
Twitter: @nleite
norberto@mongodb.com

SLIDE 4

The Basics

SLIDE 5

[Longitude, Latitude]
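Because the order is longitude first, then latitude, a quick range check can catch some swapped pairs before they are stored. The following is a minimal sketch in plain JavaScript; `isValidLonLat` is a hypothetical helper name, not a MongoDB function, and it can only detect a swap when the value in the latitude slot falls outside -90..90.

```javascript
// Hypothetical helper: checks that a pair is [longitude, latitude]
// and that both values are inside the valid WGS84 ranges.
function isValidLonLat(pair) {
  if (!Array.isArray(pair) || pair.length !== 2) return false;
  const [lon, lat] = pair;
  if (typeof lon !== "number" || typeof lat !== "number") return false;
  return lon >= -180 && lon <= 180 && lat >= -90 && lat <= 90;
}

console.log(isValidLonLat([4.380717873573303, 50.81219570880462])); // true
console.log(isValidLonLat([-122.42, 37.77])); // true  (lon, lat)
console.log(isValidLonLat([37.77, -122.42])); // false (lat, lon swapped)
```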

SLIDE 6

Quiz Time!

Which of these shapes is the most similar to Planet Earth?

SLIDE 7

Stripped version of Earth: Geoid

http://www.ngs.noaa.gov/GEOID/

SLIDE 8

Surface Types

  • Flat: 2d Indexes
  • Spherical: 2dsphere Indexes

SLIDE 9

2D Indexes

// Defined by [lon,lat] arrays
var place = {
  type: "building",
  name: "AW1 - Building",
  location: [4.380717873573303, 50.81219570880462]
}

// Or use an embedded document
var checkin = {
  type: "checkin",
  message: "this place is awesome!",
  location: {
    lng: 4.348099529743194,
    lat: 50.850980167854615
  }
}

SLIDE 10

2D Indexes

// index creation
db.col.createIndex({ 'location': '2d' })
db.col.createIndex({ 'location': '2d' }, { 'sparse': true })

SLIDE 11

Spherical Surface

  • Defined by a subdocument (GeoJSON)
  • Requires a type
  • Coordinates array

var place = {
  type: "building",
  name: "AW1",
  location: {
    // other types: LineString, MultiLineString, Polygon, MultiPolygon, GeometryCollection
    type: "Point",
    coordinates: [4.380717873573303, 50.81219570880462]
  }
}

SLIDE 12

Spherical Surface

var place = {
  type: "building",
  name: "AW1",
  location: {
    type: "Polygon",
    coordinates: [
      [
        [ 4.380406737327576, 50.812253331704625 ],
        [ 4.380889534950256, 50.81239569385869 ],
        [ 4.381093382835388, 50.812134696244804 ],
        [ 4.380605220794678, 50.81198894369594 ],
        [ 4.380406737327576, 50.812253331704625 ]
      ]
    ]
  }
}

http://geojson.org/
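GeoJSON requires every linear ring of a Polygon to be closed: the ring has at least four positions and its first and last positions are identical, as in the building polygon on this slide. A minimal sketch of that check in plain JavaScript follows; `isClosedRing` is a hypothetical helper name, not part of MongoDB.

```javascript
// Hypothetical helper: true when a GeoJSON linear ring is closed,
// i.e. it has at least 4 positions and the first equals the last.
function isClosedRing(ring) {
  if (!Array.isArray(ring) || ring.length < 4) return false;
  const first = ring[0];
  const last = ring[ring.length - 1];
  return first[0] === last[0] && first[1] === last[1];
}

// Exterior ring of the building polygon from the slide.
const building = [
  [4.380406737327576, 50.812253331704625],
  [4.380889534950256, 50.81239569385869],
  [4.381093382835388, 50.812134696244804],
  [4.380605220794678, 50.81198894369594],
  [4.380406737327576, 50.812253331704625],
];

console.log(isClosedRing(building));             // true
console.log(isClosedRing(building.slice(0, 4))); // false (last position dropped)
```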

SLIDE 13

2dsphere Indexes

// index creation
db.collection.createIndex({ location : "2dsphere" })

// compound with more than 2 members
db.collection.createIndex({ location : "2dsphere", name : 1, type : 1 })

SLIDE 14

2d vs 2dsphere

2d index
  • Legacy spatial support
  • Coordinates pair
  • Manual Earth-like geometry calculations
  • Single extra field for compound indexes

2dsphere
  • GeoJSON format
  • WGS84 datum
  • Multiple fields

SLIDE 15

Indexation

SLIDE 16

Points to Index Keys

How does MongoDB generate index keys?

db.places.insert({
  name: "Starbucks",
  loc: {
    type: "Point",
    coordinates: [1.3, 45]
  }
})
db.places.createIndex({ loc: "2dsphere" })

SLIDE 17

Points to Index Keys

  • Project spherical point to bounding cube
  • Each face is a Quadtree

SLIDE 18

Points to Index Keys

Every cm² can be represented with 30 levels

[Figure: face 5 (key 5F) subdivided as a quadtree into child cells 0, 1, 2, 3, giving successively finer keys 5F1, 5F12, 5F120, ...]
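The way a key grows by one digit per quadtree level can be sketched in a few lines of JavaScript. This is a simplified illustration, not the S2 implementation: the real library orders child cells along a Hilbert curve, while the sketch below assumes a fixed child layout read off the figure (0 top-left, 1 top-right, 2 bottom-right, 3 bottom-left). The function name `cellKey` and the face-local coordinates `u`, `v` in [0, 1) are assumptions made for the example.

```javascript
// Simplified quadtree keying for ONE cube face (illustration only).
// u, v: position of the projected point inside the face, each in [0, 1).
// Assumed child layout: 0 top-left, 1 top-right, 2 bottom-right, 3 bottom-left.
function cellKey(facePrefix, u, v, levels) {
  let key = facePrefix;
  let minU = 0, maxU = 1, minV = 0, maxV = 1;
  for (let level = 0; level < levels; level++) {
    const midU = (minU + maxU) / 2;
    const midV = (minV + maxV) / 2;
    const right = u >= midU;
    const top = v >= midV;
    const digit = top ? (right ? 1 : 0) : (right ? 2 : 3);
    key += digit;
    // Descend into the chosen child cell.
    if (right) minU = midU; else maxU = midU;
    if (top) minV = midV; else maxV = midV;
  }
  return key;
}

console.log(cellKey("5F", 0.8, 0.7, 1)); // 5F1
console.log(cellKey("5F", 0.8, 0.7, 2)); // 5F12
console.log(cellKey("5F", 0.8, 0.7, 3)); // 5F120
```

Each extra level appends one digit, so a deeper key always starts with the key of the cell that contains it.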

SLIDE 19

S2 Library

SLIDE 20

Points to Index Keys

A key is a prefix of another iff it is a parent cell

[Figure: cell with key 5F1 containing the cell with key 5F120]
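With string keys, the prefix property turns "is this cell inside that cell?" into a string comparison. A minimal sketch, treating keys as plain strings as in the slide's example; `isAncestorCell` and the key list are hypothetical and used only for illustration.

```javascript
// True when `ancestorKey` identifies a cell that strictly contains the cell `key`.
function isAncestorCell(ancestorKey, key) {
  return key.length > ancestorKey.length && key.startsWith(ancestorKey);
}

// Hypothetical list of indexed keys; keep those that fall inside cell 5F1.
const indexedKeys = ["5F0", "5F1", "5F12", "5F120", "5F3"];
const inside5F1 = indexedKeys.filter(
  (k) => k === "5F1" || isAncestorCell("5F1", k)
);

console.log(isAncestorCell("5F1", "5F120")); // true
console.log(isAncestorCell("5F120", "5F1")); // false
console.log(inside5F1);                      // [ '5F1', '5F12', '5F120' ]
```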

SLIDE 21

Query Examples

SLIDE 22

Geospatial operators

  • $geoWithin
  • $geoIntersects
  • $near/$nearSphere

SLIDE 23

$geoWithin

SLIDE 24

$geoWithin

{ location: {
    $geoWithin: {
      $geometry: {
        'type': "Polygon",
        'coordinates': [
          [
            [ -73.975181, 40.758494 ],
            [ -73.973336, 40.760965 ],
            [ -73.974924, 40.761663 ],
            [ -73.976748, 40.759160 ],
            [ -73.975181, 40.758494 ]
          ]
        ]
}}}}
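To build intuition for what "within" means for a point and this polygon, the test can be approximated in the plane with ray casting. This is only a sketch under that planar assumption: MongoDB evaluates $geoWithin on 2dsphere data with spherical geometry, so results can differ for large polygons. `pointInRing` is a hypothetical helper name.

```javascript
// Planar ray-casting test: counts how many ring edges a horizontal ray
// starting at `point` crosses; an odd count means the point is inside.
function pointInRing(point, ring) {
  const [x, y] = point;
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses =
      (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Exterior ring of the polygon used in the $geoWithin query above.
const ring = [
  [-73.975181, 40.758494],
  [-73.973336, 40.760965],
  [-73.974924, 40.761663],
  [-73.976748, 40.759160],
  [-73.975181, 40.758494],
];

console.log(pointInRing([-73.975047, 40.76007], ring)); // true  (near the centre)
console.log(pointInRing([-73.97, 40.75], ring));        // false (south of the ring)
```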

SLIDE 25

$geoIntersects

SLIDE 26

$geoIntersects

db.coll.find({ location: {
    $geoIntersects: {
      $geometry: {
        "type": "LineString",
        "coordinates": [
          [-73.979543, 40.761132],
          [-73.974715, 40.759127],
          [-73.973363, 40.760969],
          [-73.970059, 40.759600]
        ]
      }
    }
} })

SLIDE 27

$near

[Figure: Stage 1, Stage 2, Stage 3]

SLIDE 28

Polygons

SLIDE 29

Coordinates System

[Figure: longitude/latitude coordinate grid; axis tick labels not recoverable]

SLIDE 30

Define a polygon by specifying 4 points

[Figure: polygon with vertices 1, 2, 3, 4]

But what’s the inside of the polygon?

SLIDE 31

[Figure: polygon with vertices 1, 2, 3, 4]

Convention for deciding “inside”: Winding Order + Right Hand Rule

SLIDE 32

Positive = Inside of Polygon
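In a flat plane, the winding order of a ring can be read from the sign of its shoelace area: positive means counter-clockwise. The sketch below uses that planar formula as an approximation only; it is not how MongoDB decides the inside of a polygon on the sphere, and it is unsuitable for rings that cross the antimeridian or a pole. `signedArea` and `isCounterClockwise` are hypothetical helper names.

```javascript
// Planar shoelace formula over [x, y] positions of a closed ring
// (first position repeated at the end).
function signedArea(ring) {
  let sum = 0;
  for (let i = 0; i < ring.length - 1; i++) {
    const [x1, y1] = ring[i];
    const [x2, y2] = ring[i + 1];
    sum += x1 * y2 - x2 * y1;
  }
  return sum / 2;
}

// Positive signed area means the ring winds counter-clockwise.
function isCounterClockwise(ring) {
  return signedArea(ring) > 0;
}

const square = [[-10, -10], [10, -10], [10, 10], [-10, 10], [-10, -10]];

console.log(signedArea(square));                        // 400
console.log(isCounterClockwise(square));                // true
console.log(isCounterClockwise([...square].reverse())); // false
```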

SLIDE 33

[Figure: polygon with vertices 1, 2, 3, 4]

But how does MongoDB pick the inside of the Polygon?

SLIDE 34

MongoDB 2.6 behavior

[Figure: polygon with vertices 1, 2, 3, 4]

SLIDE 35

Small Areas Polygon

SLIDE 36

Polygon

Defines a business service area

var polygon = {
  "type": "Polygon",
  "coordinates" : [
    [
      [ -73.969581, 40.760331 ],
      [ -73.974487, 40.762245 ],
      [ -73.977692, 40.763598 ],
      [ -73.979508, 40.761269 ],
      [ -73.982364, 40.762358 ],
      [ -73.983692, 40.760497 ],
      [ -73.972821, 40.755861 ],
      [ -73.969581, 40.760331 ]
    ]
  ]
}

SLIDE 37

Big Polygon

SLIDE 38

I am an airplane at [0, 0]. What airports are within my flight range?

[Figure: polygon with vertices 1, 2, 3, 4]

Start with a plane with a medium-sized flight range polygon

SLIDE 39

I am an airplane at [0, 0]. What airports are within my flight range?

[Figure: polygon with vertices 1, 2, 3, 4]

If it’s a longer range plane, that polygon gets bigger

SLIDE 40

I am an airplane at [0, 0]. What airports are within my flight range?

[Figure: polygon with vertices 1, 2, 3, 4]

Eventually the polygon gets so big it covers more than 50% of the planet

SLIDE 41

urn:x-mongodb:crs:strictwinding:EPSG:4326

  • urn
    – Uniform resource name
  • x-mongodb
    – MongoDB extension
  • strictwinding
    – Enforces explicit “counter-clockwise” winding, a.k.a.
      • anticlockwise,
      • right hand rule
      • the correct way
  • EPSG:4326
    – Another name for WGS84 (the standard web geo coordinate system)

SLIDE 42

How do I do that?

// build some geo JSON
var crs = "urn:x-mongodb:crs:strictwinding:EPSG:4326"
var bigCRS = { type : "name", properties : { name : crs } };
var bigPoly = { type : "Polygon",
                coordinates : [[[-10.0, -10.0], [10.0, -10.0], [10.0, 10.0], [-10.0, 10.0], [-10.0, -10.0]]],
                crs : bigCRS };

var cursor = db.<collection>.find({ loc : { $geoWithin : { $geometry : bigPoly } } });
var cursor = db.<collection>.find({ loc : { $geoIntersects : { $geometry : bigPoly } } });

SLIDE 43

Complex Polygons

SLIDE 44

Complex Polygons

{
  "type": "Polygon",
  "coordinates" : [
    [ [ -73.969581, 40.760331 ], [ -73.974487, 40.762245 ], [ -73.977692, 40.763598 ], … ],
    [ [ -73.975181, 40.758494 ], [ -73.973336, 40.760965 ], [ -73.974924, 40.761663 ], .. ],
    [ [ -73.979437, 40.755390 ], [ -73.976953, 40.754362 ], [ -73.978364, 40.752448 ], … ]
  ]
}

SLIDE 45

Complex Polygons

$err : Can't canonicalize query BadValue Secondary loops not contained by first exterior loop - secondary loops must be holes

{
  "type": "Polygon",
  "coordinates" : [
    [ [ -73.969581, 40.760331 ], [ -73.974487, 40.762245 ], [ -73.977692, 40.763598 ], … ],
    [ [ -73.975181, 40.758494 ], [ -73.973336, 40.760965 ], [ -73.974924, 40.761663 ], .. ],
    [ [ -73.979437, 40.755390 ], [ -73.976953, 40.754362 ], [ -73.978364, 40.752448 ], … ]
  ]
}

SLIDE 46

Complex Polygons

$or : [
  { geometry : { $geoWithin : { $geometry : { "type" : "Polygon", "coordinates" : [ ... ] }…,
  { geometry : { $geoWithin : { $geometry : { "type" : "Polygon", "coordinates" : [ ... ] }…
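The $or workaround can be generated rather than written by hand when there are several separate polygons. A minimal sketch in plain JavaScript: `geoWithinAny` is a hypothetical helper, and the two square rings are placeholder coordinates, not the elided coordinates from the slide.

```javascript
// Hypothetical helper: builds a filter matching documents whose `field`
// lies within ANY of the given polygons. Each entry of `polygons` is a
// GeoJSON Polygon "coordinates" value (an array of rings).
function geoWithinAny(field, polygons) {
  return {
    $or: polygons.map((coordinates) => ({
      [field]: {
        $geoWithin: { $geometry: { type: "Polygon", coordinates } },
      },
    })),
  };
}

// Placeholder polygons for illustration only.
const first = [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]];
const second = [[[5, 5], [6, 5], [6, 6], [5, 6], [5, 5]]];

const filter = geoWithinAny("geometry", [first, second]);
console.log(JSON.stringify(filter, null, 2));
```

In the mongo shell the resulting object could then be passed to `find()` on the collection being queried.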

SLIDE 47

Optimizations

SLIDE 48

$geoNear Algorithm

Series of $geoWithin + sort

[Figure: search stages 1, 2, 3]
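The "series of $geoWithin + sort" idea can be sketched on a flat plane: search one ring-shaped stage at a time, sort the hits of that stage by distance, emit them, then move outward. This is an illustration under stated assumptions, not MongoDB's implementation: distances are Euclidean, the radius doubles at each stage (an arbitrary choice for the example), and `nearSearch` is a hypothetical name.

```javascript
// Yields points in order of increasing distance from `center`,
// one annulus (stage) at a time.
function* nearSearch(points, center, initialRadius) {
  if (!(initialRadius > 0)) throw new RangeError("initialRadius must be > 0");
  const dist = (p) => Math.hypot(p[0] - center[0], p[1] - center[1]);
  let inner = 0;
  let outer = initialRadius;
  let remaining = points.length;
  while (remaining > 0 && Number.isFinite(outer)) {
    // "$geoWithin" step: everything with inner <= distance < outer ...
    const stage = points
      .filter((p) => dist(p) >= inner && dist(p) < outer)
      // ... followed by the sort step within the stage.
      .sort((a, b) => dist(a) - dist(b));
    yield* stage;
    remaining -= stage.length;
    inner = outer;
    outer *= 2; // assumption: grow the search radius geometrically
  }
}

const points = [[5, 0], [1, 0], [0, 3], [10, 10]];
console.log(JSON.stringify([...nearSearch(points, [0, 0], 2)]));
// [[1,0],[0,3],[5,0],[10,10]]
```

Note that every stage of this naive version filters the whole input again, which is the kind of repeated work the following slides set out to avoid.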

SLIDE 49

Problem 1: Repeated Scans

[Figure: Stage 2 and Stage 3]

SLIDE 50

Buffer every document in covering

[Figure: Original Algorithm vs. New Algorithm]

SLIDE 51

Avoid repeated index scans

[Figure: Covering, Visited Cells, Difference]

SLIDE 52

Avoid repeated index scans

[Figure: Stage 2 and Stage 3]

SLIDE 53

Problem 1.1: Unnecessary fetches

[Figure: Last Interval, Index Scan, Filter Out Disjoint Keys, Fetch]

SLIDE 54

Original $geoNear Algorithm

Filter out disjoint keys then filter out disjoint docs

SLIDE 55

Problem 2: Large Initial Radius

Initial radius is over 1km

Finding one document 1cm away

SLIDE 56

Determining the Radius

The minimum distance to find documents

SLIDE 57

Indexing Points to Finest Level

Bounded by cell size

[Figure: Original indexed level vs. New indexed level]

SLIDE 58

Why were Points Indexed Coarsely?

Polygons have a tradeoff in storage size

SLIDE 59

Problem 2.1: New Index Version

  • Finer index level means different index keys
  • 2dSphere index version 3 introduced

SLIDE 60

Problem 2.2: Larger Index Size

  • String (v2): 1F12031
  • NumberLong (v3): 00101100011011
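The two values on this slide appear to describe the same cell. Assuming the string key is the face number, an "F" separator, and one digit per quadtree level (inferred from the keys shown in this deck), and assuming the publicly documented S2 cell id layout of 3 face bits, 2 bits per level, a single 1 marker bit and zero padding up to 64 bits, the conversion can be sketched with BigInt. `cellIdFromKey` is a hypothetical name and this is not MongoDB source code.

```javascript
// Converts a string key such as "1F12031" into a 64-bit cell id (as BigInt).
// Layout assumed: [3 face bits][2 bits per level][1 marker bit][zero padding].
function cellIdFromKey(key) {
  const [face, digits = ""] = key.split("F");
  if (digits.length > 30) throw new RangeError("at most 30 levels");
  let id = BigInt(face);
  for (const d of digits) id = (id << 2n) | BigInt(d);
  id = (id << 1n) | 1n;                           // marker bit ends the path
  return id << BigInt(2 * (30 - digits.length));  // pad to 64 bits
}

const bits = cellIdFromKey("1F12031").toString(2).padStart(64, "0");
console.log(bits.slice(0, 14));                   // 00101100011011
console.log(cellIdFromKey("0F").toString(16));    // 1000000000000000
```

With 3 + 2 × 30 + 1 = 64 bits, all 30 levels fit in a single 64-bit integer.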

SLIDE 61

Problem 2.3: More intervals

Because we no longer repeat index scans, there is little to no performance hit

SLIDE 62

Problem 3: Query Level still Coarse

Covering of radius is constrained by index covering levels

SLIDE 63

Split Index and Query Constraints

Set query maximum to finest level

SLIDE 64

Before and After

Finding one document 1cm away

SLIDE 65

Results

SLIDE 66

Results

SLIDE 67

More info on Optimization

https://www.mongodb.com/blog/post/geospatial-performance-improvements-in-mongodb-3-2

SLIDE 68

http://cl.jroo.me/z3/v/D/C/e/a.baa-Too-many-bicycles-on-the-van.jpg

Norberto Leite, Engineer
norberto@mongodb.com
@nleite

Thank you!