Building Spatial Search Algorithms for Neo4j
Craig Taverner, Neo4j Cypher and Spatial, @craigtaverner
Agenda
- Two Minute History
- Neo4j 3.4
- Which Spatial Algorithms?
- Route Finding
- Point in Polygon
- Polygon Area
- Polygon Intersection
- Polygon Distance
- Convex Hull, oh my!
WITH point({latitude: 55.612149, longitude: 12.995090}) AS poi
MATCH (l:Location)<-[:AT]-(b:Business)-[:OF]->(c:Category)
WHERE c.name = "coffee"
  AND distance(l.location, poi) < 10000
RETURN distance(l.location, poi) as distance, b.name as coffee_shop
ORDER BY distance DESC
2010 - “Neo4j Spatial”
- Core
  - GeometryEncoder
  - SimplePointEncoder
  - WKTGeometryEncoder
  - RTree
  - addNodes bulk inserter
- OpenStreetMap
  - OSMImporter
  - OSMGeometryEncoder
  - OSMLayer
A data modelling library, not a spatial database!
- Libraries
  - JTS
  - GeoTools
- CoordinateReferenceSystems
  - thousands (geotools-crs)
- Dimensions
  - 2D only!
- Integrations
  - uDIG
  - GeoServer
OpenStreetMap
2011 - GeoProcessing
2016 - Procedures
- addLayer (OSM, Point, WKT)
- Import (ESRI Shapefile, OSM)
- addNode, addGeometry
- addNodes (bulk import to RTree)
- Find within bounding box
- Find within polygon
- etc.
Neo4j 3.4
Space
- Point Type
- Distance
- Spatial index
- 2D and 3D
- Geographic and Cartesian
Time
- Types (Date, DateTime, LocalDateTime)
- Duration
- Temporal index
- Cypher functions
Spatial: Location Based Search
Neo4j 3.4 focuses on a very specific use case:
- Find points within region
- Special case of region => circle (distance from point)
Spatial: Location Based Search
WITH point({latitude: 55.612149, longitude: 12.995090}) AS poi
MATCH (l:Location)<-[:AT]-(b:Business)-[:OF]->(c:Category)
WHERE c.name = "coffee"
  AND distance(l.location, poi) < 10000
RETURN distance(l.location, poi) as distance, b.name as coffee_shop
ORDER BY distance DESC
9 index 5 hits
https://neo4j.com/docs/cypher-manual/current/functions/spatial/
Coordinate Reference Systems
- Geographic (2D and 3D)
- Points on an ellipsoid model of the earth
- Units in decimal degrees
- Latitude, longitude (y, x)
- Height above the ellipsoid (z), not altitude
- Distance function returns value in meters
- Cartesian (2D and 3D)
- Points on cartesian axes in euclidean space
- Units undefined (whatever the user means them to be)
- Distance function returns same units using Pythagoras
https://neo4j.com/docs/cypher-manual/current/syntax/spatial/
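The difference between the two distance calculations can be sketched in Python. This is a minimal illustration, not Neo4j's implementation: it assumes a spherical Earth with a mean radius of 6,371 km and uses the haversine formula for the geographic case, while the Cartesian case is plain Pythagoras. The function names and the radius constant are assumptions made for this example.

```python
import math

# Assumption: mean Earth radius of a spherical model, in meters.
EARTH_RADIUS_M = 6_371_000.0


def cartesian_distance(p, q):
    """Pythagorean distance between two points given as (x, y) or (x, y, z).

    The result is in whatever units the coordinates use.
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def geographic_distance(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_M):
    """Haversine great-circle distance in meters.

    Inputs are latitude and longitude in decimal degrees.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))
```

One degree of longitude along the equator comes out at roughly 111 km with this model, whereas the same pair of coordinates treated as Cartesian points would be exactly 1 unit apart.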
Internship on Spatial Algorithms
Complex Polygons
Cartesian versus Geographic
Routing With Cypher’s shortestPath Function
https://neo4j.com/docs/developer-manual/3.4/cypher/execution-plans/shortestpath-planning/
MATCH (a:PointOfInterest)
WHERE a.name = 'The View Restaurant & Lounge'
MATCH (b:PointOfInterest)
WHERE b.name = 'Gregory Coffee'
MATCH p=shortestPath((a)-[:ROUTE*..100]-(b))
RETURN p;
Weighted Paths
Total weight: 1258.48m
Shortest Weighted Path Algorithms: Dijkstra, A*
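Unlike shortestPath, which minimizes the number of hops, a weighted search minimizes the sum of a property such as distance. The following is a minimal Python sketch of Dijkstra's algorithm over an in-memory adjacency list. The graph layout, node names, and function name are assumptions for illustration and do not reflect how any Neo4j library stores or traverses the route graph. A* follows the same loop but orders the queue by cost plus a heuristic, for example the straight-line distance to the target.

```python
import heapq


def dijkstra(graph, source, target):
    """Return (total_weight, path) for the cheapest path, or (inf, []) if none.

    graph: dict mapping node -> list of (neighbor, weight) pairs,
    with non-negative weights.
    """
    best = {source: 0.0}      # cheapest known cost to reach each node
    previous = {}             # back-pointers used to rebuild the path
    queue = [(0.0, source)]   # min-heap ordered by cost so far
    visited = set()
    while queue:
        cost, node = heapq.heappop(queue)
        if node in visited:
            continue          # stale queue entry, already settled
        visited.add(node)
        if node == target:
            path = [node]
            while node in previous:
                node = previous[node]
                path.append(node)
            return cost, path[::-1]
        for neighbor, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    return float("inf"), []


# Hypothetical route graph: the direct edge A-C is one hop but longer.
routes = {
    "A": [("B", 100.0), ("C", 400.0)],
    "B": [("C", 50.0)],
    "C": [],
}
total, path = dijkstra(routes, "A", "C")
```

In this example the one-hop route costs 400 while the two-hop route costs 150, so the weighted search prefers the path through B.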
GraphConnect 2018 NY
Find Points of Interest in Manhattan
MATCH (m:OSMRelation)
WHERE m.name = 'Manhattan'
MATCH (p:PointOfInterest)
WHERE distance(p.location,$mapCenter) < $circleRadius
  AND amanzi.withinPolygon(m.polygon,p.location)
RETURN p
Find Points of Interest in Manhattan
MATCH (m:OSMRelation)
WHERE m.name = 'Manhattan'
WITH m.polygon as manhattan, amanzi.boundingBoxFor(m.polygon) as bbox
MATCH (p:PointOfInterest)
WHERE bbox.min < p.location < bbox.max
  AND amanzi.withinPolygon(manhattan,p.location)
RETURN p
Point in Polygon
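A common way to test whether a point lies inside a polygon is ray casting: count how many polygon edges a ray from the point crosses, and treat an odd count as inside. The Python sketch below shows that technique for Cartesian coordinates only. Whether the withinPolygon function used in the queries works this way is not stated in the slides, so the function name and approach here are assumptions for illustration.

```python
def within_polygon(polygon, point):
    """Ray casting test for a simple polygon in Cartesian coordinates.

    polygon: list of (x, y) vertices in order, without repeating the first.
    Points exactly on the boundary are not handled specially.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point
        # can be crossed by a ray going right from the point.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside  # each crossing flips inside/outside
    return inside


# An L-shaped (concave) polygon as a small check.
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]
```

With this polygon, the point (0.5, 3) is inside the vertical arm of the L, while (2, 2) falls in the notch and is outside.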
Live Demo
Shoelace Formula
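The shoelace formula gives the area of a simple polygon in Cartesian coordinates as half the absolute value of the summed cross products of consecutive vertices. A minimal Python sketch follows; the function name is an assumption for this example, and the formula does not apply directly to geographic coordinates.

```python
def shoelace_area(polygon):
    """Area of a simple polygon given as ordered (x, y) vertices.

    The first vertex is not repeated at the end. The result is in squared
    coordinate units and is the same for either winding direction.
    """
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        total += x1 * y2 - x2 * y1  # cross product of consecutive vertices
    return abs(total) / 2.0
```

For the right triangle (0, 0), (4, 0), (0, 3) the sum is 12, giving an area of 6.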
Girard’s Theorem
https://vanderbei.princeton.edu/WebGL/GirardThmProof.html
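For reference, Girard's theorem relates the area of a spherical triangle to its angular excess. Stated for a sphere of radius R with interior angles alpha, beta, gamma, and extended to a spherical polygon with n vertices and interior angles theta_i (the symbols are chosen here for illustration):

```latex
A_{\text{triangle}} = R^2 \left( \alpha + \beta + \gamma - \pi \right)
\qquad
A_{\text{polygon}} = R^2 \left( \sum_{i=1}^{n} \theta_i - (n - 2)\,\pi \right)
```

As a check, a triangle with three right angles (one octant of the sphere) has area R^2 (3π/2 − π) = πR^2/2, which is one eighth of the full sphere's 4πR^2.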
Querying for Polygon Area
MATCH (r:OSMRelation)-[:POLYGON_STRUCTURE*]->(p)
USING INDEX r:OSMRelation(relation_osm_id)
WHERE r.relation_osm_id IN $regionIds
  AND exists(p.polygon)
RETURN r.relation_osm_id AS region_id, spatial.algo.area(p.polygon) AS area;
Algorithms UDF Code
Intersection: monotone chain sweep line
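A sweep line algorithm speeds up the search for intersecting edges, but the underlying primitive is a test for whether two segments cross. The Python sketch below shows only that primitive, using orientation tests, together with a naive all-pairs check of polygon boundaries. It is not the monotone chain sweep line itself, it is Cartesian only, and all function names are assumptions for illustration.

```python
def orientation(a, b, c):
    """Sign of the cross product (b - a) x (c - a).

    Returns 1 for a left turn, -1 for a right turn, 0 for collinear points.
    """
    value = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (value > 0) - (value < 0)


def on_segment(a, b, c):
    """True if c, already known to be collinear with a-b, lies between them."""
    return (min(a[0], b[0]) <= c[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))


def segments_intersect(p1, p2, q1, q2):
    """True if segment p1-p2 and segment q1-q2 share at least one point."""
    o1, o2 = orientation(p1, p2, q1), orientation(p1, p2, q2)
    o3, o4 = orientation(q1, q2, p1), orientation(q1, q2, p2)
    if o1 != o2 and o3 != o4:
        return True  # endpoints of each segment lie on both sides of the other
    # Collinear cases: an endpoint lies on the other segment.
    return ((o1 == 0 and on_segment(p1, p2, q1))
            or (o2 == 0 and on_segment(p1, p2, q2))
            or (o3 == 0 and on_segment(q1, q2, p1))
            or (o4 == 0 and on_segment(q1, q2, p2)))


def edges(polygon):
    """Consecutive vertex pairs, including the closing edge."""
    n = len(polygon)
    return [(polygon[i], polygon[(i + 1) % n]) for i in range(n)]


def boundaries_intersect(poly_a, poly_b):
    """Naive O(n*m) check of every edge pair; a sweep line avoids most pairs."""
    return any(segments_intersect(a1, a2, b1, b2)
               for a1, a2 in edges(poly_a)
               for b1, b2 in edges(poly_b))
```

Note that this only detects crossing boundaries: a polygon lying entirely inside another has no boundary intersections, so a full intersection test would also need a point-in-polygon check.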
Intersection in Geographic Coordinates
Polygon Distance
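For two polygons that do not overlap, the shortest distance between their boundaries is always reached at a vertex of one polygon and the closest point on some edge of the other. The Python sketch below uses that fact in a brute-force search and returns the distance together with the two closest points. It is Cartesian only, assumes the polygons do not intersect, and is not taken from the Spatial Algorithms library; the function names are hypothetical.

```python
import math


def closest_point_on_segment(p, a, b):
    """Closest point to p on the segment from a to b."""
    ax, ay = a
    dx, dy = b[0] - ax, b[1] - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return a  # degenerate segment
    # Projection parameter of p onto the line, clamped to the segment.
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def polygon_distance(poly_a, poly_b):
    """Return (distance, point_on_a, point_on_b) for non-overlapping polygons."""
    best = (math.inf, None, None)
    # Check vertices of A against edges of B, then vertices of B against A.
    for source, other, swapped in ((poly_a, poly_b, False),
                                   (poly_b, poly_a, True)):
        n = len(other)
        for vertex in source:
            for i in range(n):
                candidate = closest_point_on_segment(
                    vertex, other[i], other[(i + 1) % n])
                d = math.hypot(vertex[0] - candidate[0],
                               vertex[1] - candidate[1])
                if d < best[0]:
                    best = ((d, candidate, vertex) if swapped
                            else (d, vertex, candidate))
    return best
```

The search is O(n*m) in the number of vertices, which is acceptable for a sketch but would be slow for large OSM regions.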
Querying for Polygon Distances
UNWIND $pairs AS pair
MATCH (r1:OSMRelation)-[:POLYGON_STRUCTURE*]->(p1)
USING INDEX r1:OSMRelation(relation_osm_id)
WHERE r1.relation_osm_id=pair[0] AND exists(p1.polygon)
OPTIONAL MATCH (r2:OSMRelation)-[:POLYGON_STRUCTURE*]->(p2)
USING INDEX r2:OSMRelation(relation_osm_id)
WHERE r2.relation_osm_id=pair[1] AND exists(p2.polygon)
RETURN pair, spatial.algo.distance.ends(p1.polygon, p2.polygon) AS distance;
Convex Hull, Oh My!
Convex Hull: Graham Scan
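The Graham scan picks the lowest point as a pivot, sorts the remaining points by polar angle around it, and then walks the sorted list while discarding any point that would create a right turn. A minimal Python sketch for Cartesian coordinates follows; the function names are assumptions for illustration, and geographic coordinates need separate handling.

```python
import math


def cross(o, a, b):
    """Cross product (a - o) x (b - o); positive means a left turn at a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def graham_scan(points):
    """Convex hull as a counterclockwise list of (x, y), starting at the pivot."""
    points = sorted(set(points))
    if len(points) < 3:
        return points
    # Pivot: lowest y, then lowest x, so all other angles lie in [0, pi].
    pivot = min(points, key=lambda p: (p[1], p[0]))
    rest = [p for p in points if p != pivot]
    # Sort by polar angle around the pivot, nearer points first on ties.
    rest.sort(key=lambda p: (
        math.atan2(p[1] - pivot[1], p[0] - pivot[0]),
        (p[0] - pivot[0]) ** 2 + (p[1] - pivot[1]) ** 2,
    ))
    hull = [pivot]
    for p in rest:
        # Pop while the last two hull points and p turn right or are collinear.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull
```

For the points of a 2 by 2 square plus an interior point (1, 1) and an edge midpoint (1, 0), only the four corners remain in the hull.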
Convex Hull in Geographic Coordinates
Querying for Polygons and Convex Hull
MATCH (r:OSMRelation)-[:POLYGON_STRUCTURE*]->(p)
USING INDEX r:OSMRelation(relation_osm_id)
WHERE r.relation_osm_id IN $regionIds
  AND exists(p.polygon)
RETURN p.polygon as region;
MATCH (r:OSMRelation)-[:POLYGON_STRUCTURE*]->(p)
USING INDEX r:OSMRelation(relation_osm_id)
WHERE r.relation_osm_id IN $regionIds
  AND exists(p.polygon)
RETURN spatial.algo.convexHull(p.polygon) as region;
The end...
Hunger Games Questions
1. Easy: What is the difference between Cartesian and Geographic Coords?
   a. Cartesian uses latitude and longitude while geographic uses x and y
   b. Cartesian is euclidean while geographic uses polar coordinates
   c. They are actually the same
2. Medium: Which library has the most spatial algorithms?
   a. Old “Neo4j Spatial” from 2010
   b. OpenStreetMap importing library
   c. New “Spatial Algorithms” library from 2019
3. Hard: What is the name of the function to calculate the distance between polygons and return the result together with the two closest points?
Answer here: r.neo4j.com/hunger-games
References
Links to the demo application and the libraries used for data modeling and building the app:
- Demo “OSM Routing App”:
  - https://github.com/johnymontana/osm-routing-app
- OSM importer and route graph generation procedures:
  - https://github.com/neo4j-contrib/osm
- The 'Spatial Algorithms' library and functions:
  - https://github.com/neo4j-contrib/spatial-algorithms
- Dijkstra, A-star and batching large data imports:
  - https://github.com/neo4j-contrib/neo4j-apoc-procedures
Reference Slides from Graph Connect
Importing OSM & Building a Route Graph
https://github.com/neo4j-contrib/osm
java -Xms1280m -Xmx1280m \
  -cp "target/osm-1.0-SNAPSHOT.jar:target/dependency/*" org.neo4j.gis.osm.OSMImportTool \
  --skip-duplicate-nodes --delete --into target/databases/NewYork NewYork.osm.gz
IMPORT DONE in 2m 31s 430ms.
Imported:
  21401587 nodes
  22647165 relationships
  53754931 properties
Peak memory usage: 1.24 G
OSMImportTool 55M properties in 2min!
- OSM Import is a raw graph
  - No distances on relationships
  - Optimized for editing the graph, not querying or routing
- We need to post-process
  - Add distances to relationships, ways and relations
  - Add indexes for names, categories, locations, dates
  - Build routing graph
  - Build PointOfInterest graph
The bad news ... This takes hours!
Create Indexes
CREATE INDEX ON :OSMTags(amenity);            // collect relevant points of interest
CREATE INDEX ON :OSMTags(description);
CREATE INDEX ON :OSMTags(food);
CREATE INDEX ON :OSMTags(highway);            // route graph limited to streets
CREATE INDEX ON :OSMTags(restaurant);
CREATE INDEX ON :Intersection(location);      // help find routes
CREATE INDEX ON :Routable(location);          // help find routes
CREATE INDEX ON :PointOfInterest(location);   // help find routes
CREATE INDEX ON :OSMNode(location);
CREATE INDEX ON :PointOfInterest(name);       // search for points of interest
Distances
MATCH (awn:OSMWayNode)-[r:NEXT]-(bwn:OSMWayNode)
WHERE NOT exists(r.distance)
WITH awn,bwn,r LIMIT 10000
MATCH (awn)-[:NODE]->(a:OSMNode), (bwn)-[:NODE]->(b:OSMNode)
SET r.distance=distance(a.location,b.location)
RETURN COUNT(*);
Distances
CALL apoc.periodic.iterate(
  'MATCH (awn:OSMWayNode)-[r:NEXT]-(bwn:OSMWayNode)
   WHERE NOT exists(r.distance)
   RETURN awn,bwn,r',
  'MATCH (awn)-[:NODE]->(a:OSMNode), (bwn)-[:NODE]->(b:OSMNode)
   SET r.distance=distance(a.location,b.location)',
  {batchSize:10000, parallel:false});
10min for 19 million!
Routing Graph
MATCH (n:OSMNode)
WHERE size((n)<-[:NODE]-()) > 2
  AND NOT (n:Intersection)
WITH n LIMIT 100
MATCH (n)<-[:NODE]-(wn:OSMWayNode), (wn)<-[:NEXT*0..100]-(wx),
      (wx)<-[:FIRST_NODE]-(w:OSMWay)-[:TAGS]->(wt:OSMTags)
WHERE exists(wt.highway)
SET n:Intersection
RETURN count(*);
3 min for 50 thousand!
Routing Graph
MATCH (x:Intersection)
WITH x LIMIT 100
CALL spatial.osm.routeIntersection(x,true,false,false)
YIELD fromNode, toNode, distance, fromRel, toRel
WITH fromNode, toNode, distance, fromRel, toRel
MERGE (fromNode)-[r:ROUTE {fromRel:id(fromRel),toRel:id(toRel)}]->(toNode)
ON CREATE SET r.distance = distance
RETURN count(*);
2 min for 50 thousand!
PointOfInterest Graph
UNWIND ["restaurant","fast_food","cafe","bar","pub","ice_cream","cinema"] AS amenity
MATCH (x:OSMNode)-[:TAGS]->(t:OSMTags)
WHERE t.amenity = amenity
  AND NOT (x)-[:ROUTE]->()
WITH x, x.location as poi LIMIT 100
MATCH (n:OSMNode)
WHERE distance(poi, n.location) < 100
WITH x, n
MATCH (n)<-[:NODE]-(wn:OSMWayNode), (wn)<-[:NEXT*0..10]-(wx),
      (wx)<-[:FIRST_NODE]-(w:OSMWay)-[:TAGS]->(wt:OSMTags)
WITH x, w, wt
WHERE exists(wt.highway)
WITH x, collect(w) as ways
CALL spatial.osm.routePointOfInterest(x,ways)
YIELD node
SET x:PointOfInterest
RETURN count(node);
2 min for 8 thousand!
PointOfInterest Graph
MATCH (x:Routable:OSMNode)
WHERE NOT (x)-[:ROUTE]->(:Intersection)
WITH x LIMIT 100
CALL spatial.osm.routeIntersection(x,true,false,false)
YIELD fromNode, toNode, fromRel, toRel, distance, length, count
WITH fromNode, toNode, fromRel, toRel, distance, length, count
MERGE (fromNode)-[r:ROUTE {fromRel:id(fromRel),toRel:id(toRel)}]->(toNode)
ON CREATE SET r.distance = distance, r.length = length, r.count = count
RETURN count(*);
1 min for 5 thousand!
Searching for routes
MATCH (a:PointOfInterest)
WHERE a.name = 'The View Restaurant & Lounge'
MATCH (b:PointOfInterest)
WHERE b.name = 'Gregory Coffee'
MATCH p=shortestPath((a)-[:ROUTE*..100]-(b))
RETURN p;
The end...