The V*-Diagram: A Query-Dependent Approach to Moving KNN Queries


  1. The V*-Diagram: A Query-Dependent Approach to Moving KNN Queries. Sarana Nutanong, Rui Zhang, Egemen Tanin, Lars Kulik. Dept. of Computer Science and Software Engineering, University of Melbourne.

  2. Motivation. Consider two scenarios:
     • a driver in a GPS-equipped car finding the nearest gas station along the route of a trip;
     • a tourist walking in the city looking for the nearest ATM.
     These scenarios are examples of moving k nearest neighbor (MkNN) queries.

  3. Simple Approach: The Voronoi Diagram. Figure 1: Voronoi diagrams. Drawbacks:
     1. Expensive precomputations
     2. Inefficient update operations
     3. No support for dynamically changing k values

  4. Best Existing Approach: Influence-set Retrieval [Zhang et al., 2003]. Figure 2: Computing a Voronoi cell locally. (a) Bisector $B_{ad}$ is discovered as a boundary. (b) All boundaries are discovered.

  5. Our Approach: V*-Diagram. Objectives:
     1. Requires no precomputation
     2. Supports dynamic insertions / deletions of objects
     3. Handles dynamically changing k

  6. Our Approach: V*-Diagram. Objectives:
     1. Requires no precomputation
     2. Supports dynamic insertions / deletions of objects
     3. Handles dynamically changing k
     Result: Outperforms the best existing approach [Zhang et al., 2003] by two orders of magnitude.

  7. The V*-Diagram: Known Region. If the known NNs to $q$ are $\{d, f, j\}$, the known region $W(q, j)$ is $\{ v : \mathrm{dist}(q, v) \le \mathrm{dist}(q, j) \}$.
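
The known region is a disc centred on the query point whose radius is the distance to the furthest known neighbour. A minimal Python sketch of that membership test follows; the function names and the sample coordinates are illustrative assumptions, not taken from the slides.

```python
import math


def dist(a, b):
    """Euclidean distance between two 2-D points given as (x, y) tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def known_region_radius(q, known_nns):
    """Radius of W(q, j), where j is the furthest of the known NNs of q."""
    return max(dist(q, p) for p in known_nns)


def in_known_region(v, q, known_nns):
    """True if v satisfies dist(q, v) <= dist(q, j)."""
    return dist(q, v) <= known_region_radius(q, known_nns)


# Hypothetical coordinates for the known NNs d, f and j of q.
q = (0.0, 0.0)
known = [(1.0, 0.0), (0.0, 2.0), (3.0, 0.0)]  # d, f, j
print(in_known_region((2.0, 2.0), q, known))  # inside the disc of radius 3
print(in_known_region((3.0, 1.0), q, known))  # outside the disc
```

Any data point that has not been retrieved must lie outside this disc, which is what the safe-region argument relies on.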

  8. The V*-Diagram: Safe region wrt a data point. We retrieve $(k + x)$ objects. In this example, $k$ and $x$ are both 1, so we retrieve $p$ and $z$. If $q' \in S(q_b, z, p)$, then $\forall p' \notin W(q_b, z)$, $\mathrm{dist}(q', p) < \mathrm{dist}(q', p')$, where
     $S(q_b, z, p) = \{ q' : \mathrm{dist}(p, q') \le \mathrm{dist}(q_b, z) - \mathrm{dist}(q_b, q') \}$.
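
The safe-region definition can be checked directly from three distances. Rearranged, the condition says that the distances from q' to p and to q_b sum to at most dist(q_b, z), so the region is an ellipse with foci q_b and p. The sketch below is illustrative only: the function names and coordinates are assumptions, not part of the presentation.

```python
import math


def dist(a, b):
    """Euclidean distance between two 2-D points given as (x, y) tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def in_safe_region(q_new, q_b, z, p):
    """True if q_new lies in S(q_b, z, p).

    S(q_b, z, p) = { q' : dist(p, q') <= dist(q_b, z) - dist(q_b, q') }
    """
    return dist(p, q_new) <= dist(q_b, z) - dist(q_b, q_new)


# Hypothetical layout: q_b is where the (k + x) objects were retrieved,
# p is the nearest neighbour and z is the furthest retrieved object.
q_b, p, z = (0.0, 0.0), (1.0, 0.0), (4.0, 0.0)
print(in_safe_region((1.0, 1.0), q_b, z, p))  # True: p is still guaranteed nearest
print(in_safe_region((3.0, 0.0), q_b, z, p))  # False: an unseen point could be closer
```

The guarantee follows from the triangle inequality: for any p' outside W(q_b, z), dist(q', p') >= dist(q_b, p') - dist(q_b, q') > dist(q_b, z) - dist(q_b, q') >= dist(q', p).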

  9. The V*-Diagram: The Fixed-rank Region (FRR) [Kulik and Tanin, 2006]. Figure 3: Incremental rank update. (a) ⟨a, c, b, f, e, d⟩ (b) ⟨a, c, b, e, f, d⟩
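
A fixed-rank region is the set of query positions for which a given list of points keeps the same order by distance. One simple way to test membership, sketched here under assumed names and coordinates (this is not the incremental method of Kulik and Tanin), is to check that the distances to the points, taken in the stored rank order, are non-decreasing. That is equivalent to staying on the correct side of the bisector of every consecutive pair.

```python
import math


def dist(a, b):
    """Euclidean distance between two 2-D points given as (x, y) tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def rank_order(q, points):
    """Names of the points sorted by increasing distance from q."""
    return sorted(points, key=lambda name: dist(q, points[name]))


def in_fixed_rank_region(q_new, ranked_names, points):
    """True if the stored ranking is still valid at q_new."""
    d = [dist(q_new, points[name]) for name in ranked_names]
    return all(d[i] <= d[i + 1] for i in range(len(d) - 1))


# Hypothetical data points.
points = {"a": (1.0, 0.0), "b": (0.0, 3.0), "c": (5.0, 0.0)}
ranking = rank_order((0.0, 0.0), points)
print(ranking)                                            # ['a', 'b', 'c']
print(in_fixed_rank_region((0.5, 0.0), ranking, points))  # True
print(in_fixed_rank_region((3.0, 0.0), ranking, points))  # False: c is now nearer than b
```

When the check fails for exactly one consecutive pair, swapping that pair yields the new ranking, which mirrors the single swap of e and f between panels (a) and (b) of Figure 3.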

  10. The V*-Diagram: Integrated Safe Region (ISR) and V*-kNN. The ISR is the intersection of:
      1. the safe region wrt the kth NN, $S(q_b, z, p_k)$;
      2. the FRR of the $(k + x)$ NNs of $q_b$.
      Figure 4: V*-kNN example ($k = 2$, $x = 2$).
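
The two conditions that define the ISR can be combined into one membership test. The sketch below assumes that z is the furthest of the (k + x) retrieved objects, as the k = x = 1 example on the safe-region slide suggests, and it uses a brute-force sort as a stand-in for the R*-tree search. All names and coordinates are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two 2-D points given as (x, y) tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def retrieve(q_b, data, k, x):
    """Brute-force stand-in for an index search: the (k + x) NNs of q_b, nearest first."""
    return sorted(data, key=lambda p: dist(q_b, p))[:k + x]


def in_isr(q_new, q_b, ranked, k):
    """True if q_new is in both S(q_b, z, p_k) and the FRR of the retrieved objects."""
    z = ranked[-1]       # assumed: furthest retrieved object bounds the known region
    p_k = ranked[k - 1]  # kth nearest neighbour at q_b
    in_safe = dist(p_k, q_new) <= dist(q_b, z) - dist(q_b, q_new)
    d = [dist(q_new, p) for p in ranked]
    in_frr = all(d[i] <= d[i + 1] for i in range(len(d) - 1))
    return in_safe and in_frr


data = [(1, 0), (0, 2), (-3, 0), (0, -4), (6, 6), (-7, 1), (8, -2)]
q_b, k, x = (0, 0), 2, 2
ranked = retrieve(q_b, data, k, x)
q_new = (0.2, 0.1)
if in_isr(q_new, q_b, ranked, k):
    # Inside the ISR the first k retrieved objects are still the kNN set.
    print(ranked[:k])
```

While the query stays inside the ISR, the answer can be read from the stored ranking without touching the index. Leaving the FRR alone only requires a rank swap, whereas leaving the safe region calls for a new retrieval.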

  11. V*-kNN Algorithm: http://www.csse.unimelb.edu.au/~sarana/demo.html

  12. Experiments.
      • Data structure: R*-trees (1-kB block size).
      • Comparative method: RIS-kNN [Zhang et al., 2003].
      • Datasets:
        • (U) 25,000 data points in a uniform distribution
        • (Z) 25,000 data points in a Zipfian distribution
        • (C) 65,743 postal addresses from California
        • (N) 119,897 postal addresses from North-Eastern USA

  13. Experiments: Trajectories. Figure 5: Trajectory types: (a) Directional (D), (b) Random (R).

  14. Experiments: total cost wrt x. Figure 6: Effect of x: (a) Total cost (D), (b) Page access (D). Horizontal axis: x (3 to 24); vertical axis as labeled in both panels: time (sec), log scale; series: U, Z, C, N.

  15. Experiments: total cost wrt k. Figure 7: Effect of k: (a) Total Cost (California), (b) Total Cost (North-Eastern USA). Horizontal axis: k (10 to 40); vertical axis: time (sec), log scale; series: V* (D), V* (R), RIS (D), RIS (R).

  16. Experiments: total cost wrt n. Figure 8: Effect of dataset size: (a) Total Cost (Uniform), (b) Total Cost (Zipfian). Horizontal axis: n (x1000), 25 to 100; vertical axis: time (sec), log scale; series: V* (D), V* (R), RIS (D), RIS (R).

  17. Cost model: RIS-kNN. The number of kVD cells in 2D space is approximated as $2kn$ [Okabe et al., 1992]. For a given trajectory length $l$, the number $n_v$ of kVD cells crossed by the trajectory is given by $n_v = l\sqrt{2kn}$.
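
The RIS-kNN estimate is easy to evaluate numerically. The sketch below reads the slide's formula as n_v = l * sqrt(2kn) and assumes, as an illustration only, that the trajectory length l is measured in a data space of unit area; the function name is hypothetical.

```python
import math


def ris_cells_crossed(l, k, n):
    """Estimated number of kVD cells crossed: n_v = l * sqrt(2 * k * n).

    Assumption (not stated on the slide): l is expressed in units of a
    data space with unit area, so about 2kn cells tile the whole space.
    """
    return l * math.sqrt(2 * k * n)


# With k = 2 and n = 25 there are about 100 cells, so a trajectory that
# spans the full side length (l = 1) crosses about 10 of them.
print(ris_cells_crossed(1.0, 2, 25))
```

The estimate grows with the square root of both k and n, so denser datasets and larger k mean more cell crossings along the same trajectory.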

  18. Cost model: V*-kNN. Directional: $n_b = l / d_e$. Random: $n_b = l s / d_e^2$, where $s$ is the step size.
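
The two V*-kNN estimates can be compared with a few lines of Python. The slide does not define d_e, so the sketch simply treats it as a given length parameter, and it reads the random-trajectory formula as n_b = l * s / d_e^2. The function names and sample values are assumptions.

```python
def vstar_updates_directional(l, d_e):
    """Directional trajectory: n_b = l / d_e."""
    return l / d_e


def vstar_updates_random(l, s, d_e):
    """Random trajectory with step size s: n_b = l * s / d_e**2."""
    return l * s / d_e ** 2


# Hypothetical values: trajectory length 100, step size 2, d_e = 4.
print(vstar_updates_directional(100.0, 4.0))  # 25.0
print(vstar_updates_random(100.0, 2.0, 4.0))  # 12.5
```

Under this reading the random estimate is the directional one scaled by s / d_e, so it is lower whenever the step size is smaller than d_e.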

  19. Experiments: Cost Model. Figure 9: Cost model validation: (a) Effect of n, (b) Effect of k. Horizontal axes: n (x1000), 25 to 100, and k, 10 to 40; vertical axis: #accesses, log scale; series: V* (D), V* (R), RIS (D), RIS (R), Est.

  20. The V*-Diagram in a spatial network. Figure 10: Safe region. Figure 11: Fixed-rank region. Figure 12: ISR is $S(q_1, u, s) \cap F\langle s, t, u \rangle$.

  21. Experiments: The V*-Diagram in a spatial network. Figure 13: Road network in North America (175,813 nodes and 179,179 edges).

  22. Experiments: The V*-Diagram in a spatial network. Figure 14: Spatial network: effect of x: (a) Total Response Time, in time (sec), (b) Access Cost, in #accesses. Horizontal axis: x (2 to 10); series: k = 2, 4, 6, 8, 10.

  23. Experiments: The V*-Diagram in a spatial network. Figure 15: Spatial network: effect of l: (a) Total Response Time, in time (sec), (b) Access Cost, in #accesses. Horizontal axis: l (250 to 1000); series: k = 2, 4, 6, 8, 10.

  24. Conclusions.
      • The V*-Diagram constructs a safe region using:
        1. the location of the query point,
        2. kNN-search coverage (known region),
        3. known data points.
      • V*-kNN is local, incremental and dynamic.
      • V*-kNN outperforms the best existing technique by two orders of magnitude.
      • The V*-Diagram is a general philosophy, which can be applied to most safe-region-based techniques.

  25. Related Publications.
      • S. Nutanong, R. Zhang, E. Tanin, L. Kulik: Analysis and Evaluation of V*-kNN: An Efficient Algorithm for Moving k Nearest Neighbor Queries. To appear in VLDB Journal.
      • S. Nutanong, R. Zhang, E. Tanin, L. Kulik: V*-kNN: An Efficient Algorithm for Moving k Nearest Neighbor Queries (Demo). ICDE 2009: 1519-1522.
      • S. Nutanong, R. Zhang, E. Tanin, L. Kulik: The V*-Diagram: A Query-Dependent Approach to Moving KNN Queries. PVLDB 1(1): 1095-1106 (2008).

  26. Key References.
      • Lars Kulik, Egemen Tanin: Incremental Rank Updates for Moving Query Points. GIScience 2006: 251-268.
      • Atsuyuki Okabe, Barry Boots, Kokichi Sugihara, Sung Nok Chiu: Spatial Tessellations: Concepts and Applications of Voronoi Diagrams. John Wiley & Sons, Inc., 1992.
      • Jun Zhang, Manli Zhu, Dimitris Papadias, Yufei Tao, Dik Lun Lee: Location-based Spatial Queries. SIGMOD 2003: 443-454.
