Building Spatial Search Algorithms for Neo4j Craig Taverner Neo4j - - PowerPoint PPT Presentation


slide-1
SLIDE 1

Building Spatial Search Algorithms for Neo4j

@craigtaverner Craig Taverner Neo4j Cypher and Spatial

slide-2
SLIDE 2

Agenda

  • Two Minute History
  • Neo4j 3.4
  • Which Spatial Algorithms?
  • Route Finding
  • Point in Polygon
  • Polygon Area
  • Polygon Intersection
  • Polygon Distance
  • Convex Hull, oh my!

slide-4
SLIDE 4

2010 - “Neo4j Spatial”

  • Core
  • GeometryEncoder
  • SimplePointEncoder
  • WKTGeometryEncoder
  • RTree
  • addNodes bulk inserter
  • OpenStreetMap
  • OSMImporter
  • OSMGeometryEncoder
  • OSMLayer

A data modelling library, not a spatial database!

  • Libraries
  • JTS
  • GeoTools
  • CoordinateReferenceSystems
  • thousands (geotools-crs)
  • Dimensions
  • 2D only!
  • Integrations
  • uDIG
  • GeoServer
slide-5
SLIDE 5

OpenStreetMap

slide-6
SLIDE 6

2011 - GeoProcessing

slide-7
SLIDE 7

2016 - Procedures

  • addLayer (OSM, Point, WKT)
  • Import (ESRI Shapefile, OSM)
  • addNode, addGeometry
  • addNodes (bulk import to RTree)
  • Find within bounding box
  • Find within polygon
  • etc.

slide-9
SLIDE 9

Neo4j 3.4

Space

  • Point Type
  • Distance
  • Spatial index
  • 2D and 3D
  • Geographic and Cartesian

Time

  • Types (Date, DateTime, LocalDateTime)
  • Duration
  • Temporal index
  • Cypher functions
slide-10
SLIDE 10

Spatial: Location Based Search

Neo4j 3.4 focuses on a very specific use case:

  • Find points within region
  • Special case of region => circle (distance from point)
slide-11
SLIDE 11

Spatial: Location Based Search

WITH point({latitude: 55.612149, longitude: 12.995090}) AS poi
MATCH (l:Location)<-[:AT]-(b:Business)-[:OF]->(c:Category)
WHERE c.name = "coffee"
  AND distance(l.location, poi) < 10000
RETURN distance(l.location, poi) as distance, b.name as coffee_shop
ORDER BY distance DESC

9 index 5 hits

https://neo4j.com/docs/cypher-manual/current/functions/spatial/

slide-12
SLIDE 12

Coordinate Reference Systems

  • Geographic (2D and 3D)
      • Points on an ellipsoid model of the earth
      • Units in decimal degrees
      • Latitude, longitude (y, x)
      • Height above ellipsoid (z) - not altitude
      • Distance function returns value in meters
  • Cartesian (2D and 3D)
      • Points on cartesian axes in euclidean space
      • Units undefined (whatever the user means them to be)
      • Distance function returns same units using Pythagoras
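
The difference between the two distance calculations can be sketched in Python. This is only an illustration: the haversine formula on a sphere and the mean earth radius used here are assumptions, not details taken from the slides, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean earth radius in meters


def geographic_distance(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_M):
    """Great-circle distance in meters between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    # Clamp to guard against rounding pushing the value just above 1.
    return 2 * radius * math.asin(math.sqrt(min(1.0, a)))


def cartesian_distance(p, q):
    """Euclidean (Pythagoras) distance; works for 2D and 3D tuples, units are the caller's."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# A quarter of the equator versus a 3-4-5 triangle.
quarter_equator = geographic_distance(0.0, 0.0, 0.0, 90.0)
hypotenuse = cartesian_distance((0, 0), (3, 4))
```

The geographic result is always in meters, while the Cartesian result is in whatever units the coordinates were given in.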

https://neo4j.com/docs/cypher-manual/current/syntax/spatial/

slide-14
SLIDE 14

Internship on Spatial Algorithms

slide-15
SLIDE 15

Complex Polygons

slide-16
SLIDE 16

Cartesian versus Geographic

slide-18
SLIDE 18

Routing With Cypher’s shortestPath Function

MATCH (a:PointOfInterest) WHERE a.name = 'The View Restaurant & Lounge'
MATCH (b:PointOfInterest) WHERE b.name = 'Gregory Coffee'
MATCH p=shortestPath((a)-[:ROUTE*..100]-(b))
RETURN p;

https://neo4j.com/docs/developer-manual/3.4/cypher/execution-plans/shortestpath-planning/

slide-20
SLIDE 20

Weighted Paths

Total weight: 1258.48m

slide-21
SLIDE 21

Shortest Weighted Path Algorithms

  • Dijkstra
  • A*
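
To show why a weighted search can give a different answer from a fewest-hops search, here is a minimal Python sketch of Dijkstra's algorithm on a toy route graph. The graph, node names and distances are invented for illustration, and this is not the implementation used in the demo.

```python
import heapq


def dijkstra(graph, source, target):
    """Return (path, total_weight) for the lowest-weight path, or (None, inf) if unreachable.

    graph maps each node to a list of (neighbor, weight) pairs with non-negative weights.
    """
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbor, weight in graph.get(node, []):
            candidate = dist + weight
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))
    if target not in best:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return path, best[target]


def undirected(edges):
    """Build an adjacency map from (a, b, distance) triples, usable in both directions."""
    graph = {}
    for a, b, distance in edges:
        graph.setdefault(a, []).append((b, distance))
        graph.setdefault(b, []).append((a, distance))
    return graph


# Hypothetical intersections: the direct edge A-D has the fewest hops but is longer in meters.
routes = undirected([("A", "D", 250.0), ("A", "B", 100.0), ("B", "D", 100.0), ("A", "C", 50.0), ("C", "D", 300.0)])
path, total = dijkstra(routes, "A", "D")
```

A* follows the same structure but orders the queue by the distance so far plus a heuristic estimate of the remaining distance, for which the straight-line distance to the target is a natural choice in a spatial graph.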

slide-22
SLIDE 22

GraphConnect 2018 NY

slide-25
SLIDE 25

Find Points of Interest in Manhattan

MATCH (m:OSMRelation) WHERE m.name = 'Manhattan'
MATCH (p:PointOfInterest)
WHERE distance(p.location,$mapCenter) < $circleRadius
  AND amanzi.withinPolygon(m.polygon,p.location)
RETURN p

slide-26
SLIDE 26

Find Points of Interest in Manhattan

MATCH (m:OSMRelation) WHERE m.name = 'Manhattan'
WITH m.polygon as manhattan, amanzi.boundingBoxFor(m.polygon) as bbox
MATCH (p:PointOfInterest)
WHERE bbox.min < p.location < bbox.max
  AND amanzi.withinPolygon(manhattan,p.location)
RETURN p

slide-27
SLIDE 27

Point in Polygon
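
One common way to test whether a point lies inside a polygon is ray casting: count how many polygon edges a horizontal ray from the point crosses, and treat an odd count as inside. The Python sketch below assumes Cartesian coordinates and a simple polygon given as a list of (x, y) vertices. The function names are hypothetical, and it is not claimed to match the functions called in the queries. A cheap bounding-box check runs first, mirroring the pre-filter idea in the second Manhattan query.

```python
def bounding_box(polygon):
    """Return ((min_x, min_y), (max_x, max_y)) for a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys)), (max(xs), max(ys))


def within_polygon(polygon, point):
    """Ray casting: True if the point is inside the polygon (boundary cases are not specified)."""
    x, y = point
    (min_x, min_y), (max_x, max_y) = bounding_box(polygon)
    if not (min_x <= x <= max_x and min_y <= y <= max_y):
        return False  # cheap rejection before walking the edges
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside


# An L-shaped (concave) polygon: (3, 3) is inside its bounding box but outside the shape.
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]
```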

slide-28
SLIDE 28

Live Demo

slide-30
SLIDE 30

Shoelace Formula
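
The shoelace formula gives the area of a simple polygon in Cartesian coordinates as half the absolute value of the sum of x_i * y_(i+1) - x_(i+1) * y_i over consecutive vertex pairs. A minimal Python sketch, with a hypothetical function name:

```python
def shoelace_area(polygon):
    """Area of a simple polygon given as a list of (x, y) vertices in order around the boundary."""
    twice_area = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    # The sign depends on vertex order (clockwise or counter-clockwise), so take the magnitude.
    return abs(twice_area) / 2.0
```

The result is in squared coordinate units, so it is only meaningful as-is for Cartesian points. Geographic polygons need a spherical or ellipsoidal method instead.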

slide-31
SLIDE 31

Girard’s Theorem

https://vanderbei.princeton.edu/WebGL/GirardThmProof.html
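
Girard's theorem states that the area of a spherical triangle equals R² times its spherical excess, the amount by which the sum of its three interior angles exceeds π. The Python sketch below applies it to a triangle given as (latitude, longitude) vertices in degrees, on a perfect sphere. The helper names are hypothetical, and this illustrates the theorem rather than the library's own code.

```python
import math


def to_unit_vector(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to a 3D unit vector."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon), math.cos(lat) * math.sin(lon), math.sin(lat))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def corner_angle(prev_pt, corner, next_pt):
    """Interior angle at `corner` between the great-circle arcs to its two neighbors."""
    n1 = cross(corner, prev_pt)  # normal of the plane holding one arc
    n2 = cross(corner, next_pt)  # normal of the plane holding the other arc
    cos_angle = dot(n1, n2) / math.sqrt(dot(n1, n1) * dot(n2, n2))
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def spherical_triangle_area(a, b, c, radius=1.0):
    """Girard's theorem: area = radius^2 * (sum of interior angles - pi)."""
    va, vb, vc = (to_unit_vector(*p) for p in (a, b, c))
    angle_sum = corner_angle(vc, va, vb) + corner_angle(va, vb, vc) + corner_angle(vb, vc, va)
    return (angle_sum - math.pi) * radius ** 2


# One octant of the unit sphere: two points on the equator 90 degrees apart, plus the north pole.
octant_area = spherical_triangle_area((0, 0), (0, 90), (90, 0))
```

The octant has three right angles, so its excess is π/2, one eighth of the unit sphere's total area of 4π.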

slide-32
SLIDE 32

Querying for Polygon Area

MATCH (r:OSMRelation)-[:POLYGON_STRUCTURE*]->(p)
USING INDEX r:OSMRelation(relation_osm_id)
WHERE r.relation_osm_id IN $regionIds AND exists(p.polygon)
RETURN r.relation_osm_id AS region_id, spatial.algo.area(p.polygon) AS area;

slide-33
SLIDE 33

Algorithms UDF Code

slide-35
SLIDE 35

Intersection: monotone chain sweep line
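
A sweep-line algorithm reduces the number of edge pairs that must be compared, but it still relies on a pairwise segment-intersection test. The Python sketch below shows only that primitive, using orientation tests in Cartesian coordinates, plus a brute-force check over all edge pairs in place of the sweep line. It is not the monotone chain sweep line itself, and all names are hypothetical.

```python
def orientation(p, q, r):
    """Return 1 for a counter-clockwise turn p->q->r, -1 for clockwise, 0 for collinear."""
    value = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (value > 0) - (value < 0)


def on_segment(p, q, r):
    """Given r collinear with segment p-q, check that r lies within the segment's extent."""
    return min(p[0], q[0]) <= r[0] <= max(p[0], q[0]) and min(p[1], q[1]) <= r[1] <= max(p[1], q[1])


def segments_intersect(a, b, c, d):
    """True if segment a-b shares at least one point with segment c-d."""
    o1, o2 = orientation(a, b, c), orientation(a, b, d)
    o3, o4 = orientation(c, d, a), orientation(c, d, b)
    if o1 != o2 and o3 != o4:
        return True  # endpoints of each segment lie on opposite sides of the other
    # Collinear special cases: an endpoint lying on the other segment.
    return ((o1 == 0 and on_segment(a, b, c)) or (o2 == 0 and on_segment(a, b, d))
            or (o3 == 0 and on_segment(c, d, a)) or (o4 == 0 and on_segment(c, d, b)))


def boundaries_cross(poly1, poly2):
    """Brute-force O(n*m) check whether any edge of poly1 touches any edge of poly2.

    A polygon lying entirely inside the other is NOT detected here.
    """
    edges1 = [(poly1[i], poly1[(i + 1) % len(poly1)]) for i in range(len(poly1))]
    edges2 = [(poly2[i], poly2[(i + 1) % len(poly2)]) for i in range(len(poly2))]
    return any(segments_intersect(a, b, c, d) for a, b in edges1 for c, d in edges2)
```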

slide-36
SLIDE 36

Intersection in Geographic Coordinates

slide-39
SLIDE 39

Polygon Distance

slide-40
SLIDE 40

Querying for Polygon Distances

UNWIND $pairs AS pair
MATCH (r1:OSMRelation)-[:POLYGON_STRUCTURE*]->(p1)
USING INDEX r1:OSMRelation(relation_osm_id)
WHERE r1.relation_osm_id=pair[0] AND exists(p1.polygon)
OPTIONAL MATCH (r2:OSMRelation)-[:POLYGON_STRUCTURE*]->(p2)
USING INDEX r2:OSMRelation(relation_osm_id)
WHERE r2.relation_osm_id=pair[1] AND exists(p2.polygon)
RETURN pair, spatial.algo.distance.ends(p1.polygon, p2.polygon) AS distance;

slide-42
SLIDE 42

Convex Hull, Oh My!

slide-43
SLIDE 43

Convex Hull: Graham Scan
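
The Graham scan picks the lowest point as a pivot, sorts the remaining points by angle around it, and then walks through them, discarding any point that would cause a clockwise (or straight) turn. A minimal Python sketch for Cartesian points follows. The function names are hypothetical, and collinear boundary points are deliberately dropped from the hull.

```python
from functools import cmp_to_key


def cross(o, a, b):
    """Z component of (a - o) x (b - o): positive for a counter-clockwise turn o->a->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def graham_scan(points):
    """Return the convex hull as a counter-clockwise list of (x, y) points, starting at the pivot."""
    unique = sorted(set(points))
    if len(unique) < 3:
        return unique
    # Pivot: lowest point, leftmost on ties, so every other point has a polar angle in [0, pi).
    pivot = min(unique, key=lambda p: (p[1], p[0]))

    def by_angle_then_distance(a, b):
        turn = cross(pivot, a, b)
        if turn != 0:
            return -1 if turn > 0 else 1  # smaller polar angle comes first
        dist_a = (a[0] - pivot[0]) ** 2 + (a[1] - pivot[1]) ** 2
        dist_b = (b[0] - pivot[0]) ** 2 + (b[1] - pivot[1]) ** 2
        return (dist_a > dist_b) - (dist_a < dist_b)  # nearer point first on ties

    rest = sorted((p for p in unique if p != pivot), key=cmp_to_key(by_angle_then_distance))
    hull = [pivot]
    for p in rest:
        # Pop while the last two hull points and p do not make a strict left turn.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


# Square corners plus interior and edge points: only the four corners should survive.
sample = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3), (0, 2), (2, 0)]
hull = graham_scan(sample)
```

Sorting with a cross-product comparator instead of `atan2` keeps the arithmetic exact for integer coordinates.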

slide-44
SLIDE 44

Convex Hull in Geographic Coordinates

slide-45
SLIDE 45

Querying for Polygons and Convex Hull

MATCH (r:OSMRelation)-[:POLYGON_STRUCTURE*]->(p)
USING INDEX r:OSMRelation(relation_osm_id)
WHERE r.relation_osm_id IN $regionIds AND exists(p.polygon)
RETURN p.polygon as region;

MATCH (r:OSMRelation)-[:POLYGON_STRUCTURE*]->(p)
USING INDEX r:OSMRelation(relation_osm_id)
WHERE r.relation_osm_id IN $regionIds AND exists(p.polygon)
RETURN spatial.algo.convexHull(p.polygon) as region;

slide-46
SLIDE 46

The end...

slide-47
SLIDE 47

Hunger Games Questions

1. Easy: What is the difference between Cartesian and Geographic Coords?

   a. Cartesian uses latitude and longitude while geographic uses x and y
   b. Cartesian is euclidean while geographic uses polar coordinates
   c. They are actually the same

2. Medium: Which library has the most spatial algorithms?

   a. Old “Neo4j Spatial” from 2010
   b. OpenStreetMap importing library
   c. New “Spatial Algorithms” library from 2019

3. Hard: What is the name of the function to calculate the distance between polygons and return the result together with the two closest points?

Answer here: r.neo4j.com/hunger-games

slide-48
SLIDE 48

References

Links to the demo application and to the libraries used for data modeling and building the app:

  • Demo “OSM Routing App”: https://github.com/johnymontana/osm-routing-app
  • OSM importer and route graph generation procedures: https://github.com/neo4j-contrib/osm
  • The 'Spatial Algorithms' library and functions: https://github.com/neo4j-contrib/spatial-algorithms
  • Dijkstra, A-star and batching large data imports: https://github.com/neo4j-contrib/neo4j-apoc-procedures

slide-49
SLIDE 49

Reference Slides from GraphConnect

Importing OSM & Building a Route Graph

slide-50
SLIDE 50

OSMImportTool

https://github.com/neo4j-contrib/osm

java -Xms1280m -Xmx1280m \
  -cp "target/osm-1.0-SNAPSHOT.jar:target/dependency/*" org.neo4j.gis.osm.OSMImportTool \
  --skip-duplicate-nodes --delete --into target/databases/NewYork NewYork.osm.gz

IMPORT DONE in 2m 31s 430ms.
Imported:
  21401587 nodes
  22647165 relationships
  53754931 properties
Peak memory usage: 1.24 G

55M properties in 2min!

slide-51
SLIDE 51

The bad news ...

  • OSM Import is a raw graph
      • No distances on relationships
      • Optimized for editing the graph, not querying or routing
  • We need to post-process
      • Add distances to relationships, ways and relations
      • Add indexes for names, categories, locations, dates
      • Build routing graph
      • Build PointOfInterest graph

This takes hours!

slide-52
SLIDE 52

Create Indexes

CREATE INDEX ON :OSMTags(amenity);            // collect relevant points of interest
CREATE INDEX ON :OSMTags(description);
CREATE INDEX ON :OSMTags(food);
CREATE INDEX ON :OSMTags(highway);            // route graph limited to streets
CREATE INDEX ON :OSMTags(restaurant);
CREATE INDEX ON :Intersection(location);      // help find routes
CREATE INDEX ON :Routable(location);          // help find routes
CREATE INDEX ON :PointOfInterest(location);   // help find routes
CREATE INDEX ON :OSMNode(location);
CREATE INDEX ON :PointOfInterest(name);       // search for points of interest

slide-53
SLIDE 53

Distances

MATCH (awn:OSMWayNode)-[r:NEXT]-(bwn:OSMWayNode)
WHERE NOT exists(r.distance)
WITH awn,bwn,r LIMIT 10000
MATCH (awn)-[:NODE]->(a:OSMNode), (bwn)-[:NODE]->(b:OSMNode)
SET r.distance=distance(a.location,b.location)
RETURN COUNT(*);

CALL apoc.periodic.iterate(
  'MATCH (awn:OSMWayNode)-[r:NEXT]-(bwn:OSMWayNode)
   WHERE NOT exists(r.distance)
   RETURN awn,bwn,r',
  'MATCH (awn)-[:NODE]->(a:OSMNode), (bwn)-[:NODE]->(b:OSMNode)
   SET r.distance=distance(a.location,b.location)',
  {batchSize:10000, parallel:false});

10min for 19 million!

slide-55
SLIDE 55

Routing Graph

MATCH (n:OSMNode)
WHERE size((n)<-[:NODE]-()) > 2 AND NOT (n:Intersection)
WITH n LIMIT 100
MATCH (n)<-[:NODE]-(wn:OSMWayNode), (wn)<-[:NEXT*0..100]-(wx),
      (wx)<-[:FIRST_NODE]-(w:OSMWay)-[:TAGS]->(wt:OSMTags)
WHERE exists(wt.highway)
SET n:Intersection
RETURN count(*);

3 min for 50 thousand!

slide-56
SLIDE 56

Routing Graph

MATCH (x:Intersection) WITH x LIMIT 100
CALL spatial.osm.routeIntersection(x,true,false,false)
YIELD fromNode, toNode, distance, fromRel, toRel
WITH fromNode, toNode, distance, fromRel, toRel
MERGE (fromNode)-[r:ROUTE {fromRel:id(fromRel),toRel:id(toRel)}]->(toNode)
ON CREATE SET r.distance = distance
RETURN count(*);

2 min for 50 thousand!

slide-57
SLIDE 57

PointOfInterest Graph

UNWIND ["restaurant","fast_food","cafe","bar","pub","ice_cream","cinema"] AS amenity
MATCH (x:OSMNode)-[:TAGS]->(t:OSMTags)
WHERE t.amenity = amenity AND NOT (x)-[:ROUTE]->()
WITH x, x.location as poi LIMIT 100
MATCH (n:OSMNode)
WHERE distance(poi, n.location) < 100
WITH x, n
MATCH (n)<-[:NODE]-(wn:OSMWayNode), (wn)<-[:NEXT*0..10]-(wx),
      (wx)<-[:FIRST_NODE]-(w:OSMWay)-[:TAGS]->(wt:OSMTags)
WITH x, w, wt
WHERE exists(wt.highway)
WITH x, collect(w) as ways
CALL spatial.osm.routePointOfInterest(x,ways) YIELD node
SET x:PointOfInterest
RETURN count(node);

2min for 8 thousand!

slide-58
SLIDE 58

PointOfInterest Graph

MATCH (x:Routable:OSMNode)
WHERE NOT (x)-[:ROUTE]->(:Intersection)
WITH x LIMIT 100
CALL spatial.osm.routeIntersection(x,true,false,false)
YIELD fromNode, toNode, fromRel, toRel, distance, length, count
WITH fromNode, toNode, fromRel, toRel, distance, length, count
MERGE (fromNode)-[r:ROUTE {fromRel:id(fromRel),toRel:id(toRel)}]->(toNode)
ON CREATE SET r.distance = distance, r.length = length, r.count = count
RETURN count(*);

1min for 5 thousand!

slide-59
SLIDE 59

Searching for routes

MATCH (a:PointOfInterest) WHERE a.name = 'The View Restaurant & Lounge'
MATCH (b:PointOfInterest) WHERE b.name = 'Gregory Coffee'
MATCH p=shortestPath((a)-[:ROUTE*..100]-(b))
RETURN p;

slide-60
SLIDE 60

The end...
