Hierarchical Graph Traversal for Aggregate k Nearest Neighbors - PowerPoint PPT Presentation


SLIDE 1

Hierarchical Graph Traversal for Aggregate k Nearest Neighbors Search in Road Networks

ICAPS 2020 SEMINAR

PRESENTER: TENINDRA ABEYWICKRAMA CO-AUTHORS: MUHAMMAD AAMIR CHEEMA, SABINE STORANDT

SLIDE 2

Background: Road Network Graph

  • Input: Road network graph G = (V, E)
  • Vertex set V: Road intersections
  • Edge set E: Road segments
  • Each edge has a weight: e.g. travel time


Source: https://magazine.impactscool.com/en/speciali/google-maps-e-la-teoria-dei-grafi/

SLIDE 3

Background: k Nearest Neighbour (kNN) Queries

  • Input: Object set O βŠ† V (e.g. all restaurants)
  • Input: Agent location q ∈ V (e.g. a diner)
  • kNN Query: What is the nearest object to q?
  • By Euclidean Distance: o2
  • By Network Distance: o1
  • More accurate + versatile


[Figure: agent q with objects o1 (nearest by network distance) and o2 (nearest by Euclidean distance)]
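The network distance used above is simply the shortest-path travel time in the road graph. A minimal sketch, assuming an adjacency-dict graph representation with hypothetical vertex names (`q`, `o1`, `o2` are illustrative, not from the dataset):

```python
import heapq

def network_distance(graph, source, target):
    """Shortest-path (network) distance via Dijkstra's algorithm.

    `graph` maps each vertex to a list of (neighbour, edge weight) pairs,
    where the weight is e.g. travel time.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if v == target:
            return d
        if d > dist.get(v, float("inf")):
            continue  # stale queue entry
        for u, w in graph[v]:
            nd = d + w
            if nd < dist.get(u, float("inf")):
                dist[u] = nd
                heapq.heappush(heap, (nd, u))
    return float("inf")  # target not reachable

# Toy road network (hypothetical): agent q, objects o1 and o2.
G = {"q": [("a", 1), ("b", 4)], "a": [("o1", 1), ("q", 1)],
     "b": [("o2", 1), ("q", 4)], "o1": [("a", 1)], "o2": [("b", 1)]}
```

Ranking objects by this distance rather than by straight-line distance is what makes the results road-network accurate.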
SLIDE 4

Our Problem: Aggregate k Nearest Neighbours (AkNN)

  • AkNN: Find the nearest object to multiple agents
  • Example: Three friends (agents) want to meet at a McDonalds (objects). Which object to meet at?


Sources: Google Maps, McDonalds, Flaticon.com

[Figure: map with agents q1, q2, q3 and objects o1, o2, o3]

SLIDE 5

Our Problem: AkNN

  • Input: Aggregate Function (e.g. SUM), Agent Set Q βŠ† V
  • Aggregate individual distances from each agent
  • Rank objects by their aggregate score


[Figure: distances d(q1, o2), d(q2, o2), d(q3, o2) shown on the map]

Agg_Score(o2) = Agg_Function(d(q1, o2), d(q2, o2), d(q3, o2))
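The aggregate scoring described on this slide can be sketched as a brute-force baseline (the `dist` callback and toy values are illustrative; the whole point of the talk is avoiding exactly this exhaustive scan):

```python
def agg_score(agents, obj, dist, agg=sum):
    """Agg_Score(o) = Agg_Function(d(q1, o), ..., d(qn, o))."""
    return agg(dist(q, obj) for q in agents)

def aknn_bruteforce(objects, agents, dist, k=1, agg=sum):
    """Baseline AkNN: score every object against every agent, keep the k best.

    `dist(q, o)` is the network distance; here it is assumed to be given.
    """
    return sorted(objects, key=lambda o: agg_score(agents, o, dist, agg))[:k]

# Toy 1-D example: distance is just |q - o|.
line_dist = lambda q, o: abs(q - o)
```

Swapping `agg=sum` for `agg=max` changes the meeting criterion from total to worst-case travel time without touching the search itself.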

Sources: Google Maps, McDonalds, Flaticon.com

SLIDE 6

Our Problem: AkNN

  • Still using network distance for accuracy/versatility
  • Example: Which McDonalds minimises the SUM of travel times over all diners?


𝑝1 π‘Ÿ1 π‘Ÿ2 𝑝3 π‘Ÿ3 𝑝2

Sources: Google Maps, McDonalds, Flaticon.com

SLIDE 7

Motivation

  • Inefficient to compute distance to every object
  • Typical Solution: heuristically retrieve likely candidates until all results are found

  • But existing heuristics are either:
  • (a) borrowed from kNN => not suitable for AkNN
  • (b) not accurate enough for network distance


SLIDE 8

Expansion Heuristics

  • Borrowed from kNN search heuristics: expand from each query vertex
  • But the best AkNN candidates are unlikely to be near any one query vertex


[Figure: agents q1, q2, q3 and objects o1, o2, o3, o4]

SLIDE 9

Hierarchical Search Heuristic

  • Divide space recursively to group objects
  • Search β€œpromising” regions top-down (recursively)
  • Pinpoint best candidate anywhere in space

𝑝3 π‘Ÿ2 𝑝2 π‘Ÿ1 π‘Ÿ3 𝑝4 𝑝1 𝑝5 𝑝6

𝑝2 𝑝5 𝑅1 𝑅2 𝑅3 𝑅4 𝑅4𝑏 𝑅4𝑐 𝑅4𝑑 𝑅4𝑒

Root Level Children

  • f Q4
SLIDE 10

Hierarchical Search

  • How do we decide which regions are β€œpromising”?
  • Use lower-bound score for all objects in a region
  • Past Work: R-tree + Euclidean distance lower-bound
  • Not accurate for road network distance


Data structure needed for accurate hierarchical lower-bound search in graphs

SLIDE 11

Landmark Lower-Bounds

  • Pre-compute distances from landmark vertices
  • Use triangle inequality to compute lower-bound
  • Only allows a small number of landmarks (space cost)
  • Not suitable for hierarchical search


[Figure: landmark l with known distances d(l, q) and d(l, o), unknown d(q, o)]

d(q, o) β‰₯ |d(l, o) - d(l, q)|

Choose the tightest LLB over a set of multiple landmarks
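A minimal sketch of the landmark lower-bound idea on this slide (the toy path graph and `d_lm` table are illustrative stand-ins for the pre-computed landmark distances):

```python
def landmark_lower_bound(d_landmark, q, o, landmarks):
    """Tightest landmark lower bound (LLB) on the unknown d(q, o).

    d_landmark[l][v] holds the pre-computed distance from landmark l to
    vertex v. The triangle inequality gives, for every landmark l,
        d(q, o) >= |d(l, o) - d(l, q)|
    and taking the max over landmarks yields the tightest such bound.
    """
    return max(abs(d_landmark[l][o] - d_landmark[l][q]) for l in landmarks)

# Toy path graph 0-1-2-3-4-5 with landmarks at both endpoints.
d_lm = {0: {v: v for v in range(6)},
        5: {v: 5 - v for v in range(6)}}
```

Each landmark costs one full distance table over the graph, which is why only a handful of landmarks are affordable.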
SLIDE 12

Compacted-Object Landmark Tree (COLT) Index

  • Partition graph recursively => subgraph tree
  • Choose localised landmarks in every subgraph
  • Compact based on object set 𝑃

12

[Figure: subgraph tree with root S0, children S1 and S2, and leaves S2A, S2B; landmarks l1, l2 annotated with (min, max) object-distance pairs]
SLIDE 13

COLT

  • Both non-leaf and leaf nodes store:
  • N⁻: min distance from the landmark to any object in the subgraph
  • N⁺: max distance from the landmark to any object in the subgraph
  • Enables an accurate lower-bound for any tree node
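As a sketch of how the stored N⁻/N⁺ values yield a bound (the function name and argument layout are assumptions, not the paper's API), two applications of the triangle inequality bound the distance from agent q to every object in a node at once:

```python
def node_lower_bound(d_l_q, n_minus, n_plus):
    """Lower bound on d(q, o) for EVERY object o in a COLT tree node.

    Given d(l, q) and the node's stored N- (min) and N+ (max) distances
    from landmark l to its objects:
        d(q, o) >= d(l, q) - d(l, o) >= d(l, q) - N+
        d(q, o) >= d(l, o) - d(l, q) >= N-      - d(l, q)
    so the bound is the larger of the two (never below zero).
    """
    return max(d_l_q - n_plus, n_minus - d_l_q, 0)
```

With several landmarks per node, taking the max of the per-landmark bounds tightens it further, exactly as with plain LLBs.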
SLIDE 14

Hierarchical Traversal in COLT

  • Top-down search from the root node
  • Compute a lower-bound for each child using the N⁻/N⁺ equation
  • Recursively evaluate the child with the best score

[Figure: the same subgraph tree S0/S1/S2/S2A/S2B with landmarks l1, l2 and their (min, max) entries]
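The top-down traversal can be sketched as a best-first search with pruning. The `Node` class and its fields here are hypothetical stand-ins for the COLT index, assuming each node exposes an aggregate lower bound and leaves can report an exact best candidate:

```python
import heapq

class Node:
    """Hypothetical COLT-style tree node (illustrative, not the paper's API)."""
    def __init__(self, lb, children=(), best=None):
        self.lb = lb                   # aggregate lower bound for this subtree
        self.children = list(children)
        self.best = best               # (score, object) for leaf nodes

def colt_traverse(root):
    """Best-first top-down traversal: expand nodes in lower-bound order.

    Once a leaf yields an exact score, any node whose bound is no better
    can be pruned, along with everything still queued behind it.
    """
    best = (float("inf"), None)
    heap = [(root.lb, id(root), root)]  # id() breaks ties without comparing nodes
    while heap:
        bound, _, node = heapq.heappop(heap)
        if bound >= best[0]:
            break                      # nothing left can improve the current best
        if not node.children:          # leaf: evaluate exactly
            best = min(best, node.best)
        else:                          # inner node: enqueue children by bound
            for child in node.children:
                heapq.heappush(heap, (child.lb, id(child), child))
    return best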
SLIDE 15

Hierarchical Traversal in COLT

  • Leaf nodes store Object Distance List
  • Find object with minimum aggregate lower-bound
  • Interestingly, common aggregate functions preserve convexity!
  • Easily found using modified binary search

Object Distance List (sorted by distance):

Object:    o4   o2   o5   o1   o3
Distance:   2    5    7    8   12

[Plot: convex aggregate score f(x) over positions x in the list]
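Given convexity, the minimising entry of the sorted Object Distance List can be located with a modified binary search rather than a full scan. A sketch, assuming the aggregate scores over the list form a convex (decrease-then-increase) sequence as the slide states:

```python
def argmin_convex(values):
    """Index of the minimum of a convex sequence via modified binary search.

    Convexity means the scores decrease and then increase, so we binary
    search for the point where the slope turns non-negative: O(log n)
    evaluations instead of scoring every object.
    """
    lo, hi = 0, len(values) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if values[mid] <= values[mid + 1]:
            hi = mid          # minimum is at mid or to its left
        else:
            lo = mid + 1      # still descending: minimum is to the right
    return lo
```

Each probed index costs one aggregate-score evaluation, so the logarithmic probe count matters when leaves hold many objects.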

SLIDE 16

Experimental Setup

  • Dataset: US Road Network Graph from DIMACS
  • π‘Š = 23,947,347 vertices, 𝐹 = 57,708,624 edges
  • Real-World POIs from OSM for US
  • Comparison against IER and NVD
  • IER: hierarchical search using Euclidean heuristic
  • NVD: state-of-the-art expansion heuristic


SLIDE 17

Query Time: Real-World POIs

  • COLT up to an order of magnitude faster!
  • COLT performs better on dense POI sets
  • The heuristic matters less on sparse POI sets


SLIDE 18

Sensitivity Analysis

  • COLT maintains improvement for
  • Varying parameters (k, number of agents)
  • Varying aggregate functions (MAX, SUM)
  • Heuristic efficiency metrics
  • Comes at a lightweight pre-processing cost


SLIDE 19

Thank You! Questions?
