
R-Trees. Albert-Jan Yzelman, October 22, 2007

R-Trees > Introduction: Outline. 1. Introduction; 2. Basics; 3. Tree Construction; 4. Conclusions

R-Trees > Introduction: Background


  1. R-Trees > Basics: Demonstration of a line query (figure sequence, slides 1 to 4)

  5. R-Trees > Basics: Asymptotic query time. $t_{\text{query}} \le c \cdot \log_m n = O(\log n)$ as $n \to \infty$, because the tree is tallest when each internal node has precisely $m$ children, and the tree is balanced.
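
To make the height argument concrete, here is a minimal sketch that computes the largest height a balanced tree over n objects can reach when every internal node has at least m children. The function name `max_height` is an assumption made for this illustration and does not come from the slides.

```python
def max_height(n: int, m: int) -> int:
    """Largest h with m**h <= n, i.e. floor(log_m n), using integer arithmetic only."""
    if n < 1 or m < 2:
        raise ValueError("need n >= 1 and m >= 2")
    height, capacity = 0, 1
    # Each extra level multiplies the minimum number of leaves by m.
    while capacity * m <= n:
        capacity *= m
        height += 1
    return height


if __name__ == "__main__":
    for n in (10**3, 10**6, 10**7):
        print(n, max_height(n, 2), max_height(n, 4))
```

Integer arithmetic is used instead of `math.log` so that exact powers of m are not affected by floating-point rounding.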

  6. R-Trees > Tree Construction: Outline. 1. Introduction; 2. Basics; 3. Tree Construction; 4. Conclusions

  7. R-Trees > Tree Construction: Actively researched R-tree variations: the original R-tree, introduced in 1984; Top-down Greedy Split (TGS); the Hilbert R-tree; Hilbert TGS.

  8. R-Trees > Tree Construction: Grouping criteria. Figure: Four to-be-grouped MBRs

  9. R-Trees > Tree Construction: Grouping criteria. Figure: Minimum overlap criterion

  10. R-Trees > Tree Construction: Grouping criteria. Figure: Minimum total volume criterion

  11. R-Trees > Tree Construction: Original dynamic R-tree. We start out with an empty R-tree and insert new objects one by one. For this we need an insertion algorithm which works on arbitrary R-trees. Consider the following example with m = 2 and M = 3.
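
The slides show the insertion only as figures. As a hedged illustration of one descent step, the sketch below assumes the least-enlargement rule of the original 1984 R-tree: among the children of a node, pick the one whose MBR grows least when the new object is added, breaking ties by smaller area. MBRs are assumed to be `(xmin, ymin, xmax, ymax)` tuples, and all function names are hypothetical.

```python
def area(r):
    """Area of an MBR given as (xmin, ymin, xmax, ymax)."""
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a, b):
    """Smallest MBR enclosing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def enlargement(mbr, new):
    """Extra area needed for mbr to also cover new."""
    return area(union(mbr, new)) - area(mbr)


def choose_child(child_mbrs, new):
    """Index of the child whose MBR needs the least enlargement (ties: smaller area)."""
    return min(
        range(len(child_mbrs)),
        key=lambda i: (enlargement(child_mbrs[i], new), area(child_mbrs[i])),
    )


if __name__ == "__main__":
    children = [(0, 0, 2, 2), (5, 5, 6, 6)]
    print(choose_child(children, (1, 1, 1.5, 1.5)))
    print(choose_child(children, (5.5, 5.5, 7, 7)))
```

Repeating this step from the root down to a leaf gives the node that receives the new object; overflow handling is a separate step.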

  12. R-Trees > Tree Construction: Original dynamic R-tree (insertion example, figures only)

  16. R-Trees > Tree Construction: Original dynamic R-tree. Differences in overflow handling yield the linear, quadratic and polynomial R-tree variants. Consider the following example with m = 2 and M = 3.

  17. R-Trees > Tree Construction: Overflow handling: linear splitting (example, figures only)

  25. R-Trees > Tree Construction: Top-down Greedy Split (TGS). Observation: query efficiency is determined top-down by the shape of the bounding boxes. Why not build the R-tree top-down?

  26. R-Trees > Tree Construction: Top-down Greedy Split (TGS). Uses a collection of orderings S. Subdivide the input set into a maximum of M subsets, each containing no more than ñ elements: (1) with respect to each ordering in S, subdivide the input set into groups of ñ elements; (2) find the best binary split with respect to all s ∈ S; (3) recursively split both groups until all groups contain fewer than m elements. So either we sort |S| times to find this best binary split, or we duplicate the input set |S| times.
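
The search for the best binary split can be sketched as follows. This is an illustration under stated assumptions, not the presenter's implementation: MBRs are `(xmin, ymin, xmax, ymax)` tuples, the orderings in S are given as sort-key functions, cuts are only tried at multiples of the group size ñ, and "best" is taken to mean the smallest summed area of the two resulting bounding boxes (the slides do not define the cost function). All names are hypothetical.

```python
def bbox(rects):
    """Bounding box of a non-empty list of MBRs."""
    x0, y0, x1, y1 = zip(*rects)
    return (min(x0), min(y0), max(x1), max(y1))


def area(r):
    """Area of an MBR given as (xmin, ymin, xmax, ymax)."""
    return (r[2] - r[0]) * (r[3] - r[1])


def best_binary_split(rects, orderings, group_size):
    """Try every ordering and every cut at a multiple of group_size.

    Returns (cost, left, right) for the cheapest cut, or None when the
    input is too small to be cut at all.
    """
    best = None
    for key in orderings:
        ordered = sorted(rects, key=key)  # one sort per ordering in S
        for cut in range(group_size, len(ordered), group_size):
            left, right = ordered[:cut], ordered[cut:]
            cost = area(bbox(left)) + area(bbox(right))
            if best is None or cost < best[0]:
                best = (cost, left, right)
    return best


def by_x_centre(r):
    return (r[0] + r[2]) / 2


def by_y_centre(r):
    return (r[1] + r[3]) / 2


if __name__ == "__main__":
    rects = [(0, 0, 1, 1), (10, 2, 11, 3), (0, 2, 1, 3), (10, 0, 11, 1)]
    print(best_binary_split(rects, [by_x_centre, by_y_centre], group_size=2))
```

The inner `sorted` call is where the cost noted on the slide shows up: each candidate ordering requires its own sort of the input (or its own pre-sorted copy).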

  27. R-Trees > Tree Construction: Top-down Greedy Split (TGS). n=28, M=3, and so: h=4

  28. R-Trees > Tree Construction: Top-down Greedy Split (TGS). n=26, M=3, and so: h=3. At the first tree level, each subtree of height h−1=2 may contain 3^2=9 elements (figure: splits along the x and y orderings).

  30. R-Trees > Tree Construction: Top-down Greedy Split (TGS). n=26, M=3, and so: h=3. At the first tree level, each subtree of height h−1=2 may contain 3^2=9 elements. At the second tree level, each subtree of height h−2=1 may contain 3 elements.
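
The numbers in these examples can be checked by hand. The worked arithmetic below assumes that the height is chosen as the smallest h with M^h ≥ n, which is an inference that matches both examples on the slides rather than a formula stated there; ñ denotes the subtree capacity at a given level.

```latex
\begin{align*}
  h &= \lceil \log_M n \rceil \\
  n = 28,\ M = 3:&\quad 3^3 = 27 < 28 \le 3^4 = 81 \;\Rightarrow\; h = 4 \\
  n = 26,\ M = 3:&\quad 3^2 = 9 < 26 \le 3^3 = 27 \;\Rightarrow\; h = 3 \\
  \text{first level } (n = 26):&\quad \tilde{n} = M^{h-1} = 3^2 = 9,
      \qquad \lceil 26 / 9 \rceil = 3 \le M \text{ subtrees} \\
  \text{second level } (n = 26):&\quad \tilde{n} = M^{h-2} = 3^1 = 3
\end{align*}
```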

  31. R-Trees > Tree Construction: Random TGS. For each of the K times we divide a set in two, we sort |S| times:
  $$\begin{pmatrix} c_{11} & c_{21} & \cdots & c_{|S|1} \\ c_{12} & c_{22} & \cdots & c_{|S|2} \\ \vdots & \vdots & \ddots & \vdots \\ c_{1M} & c_{2M} & \cdots & c_{|S|M} \end{pmatrix}$$

  32. R-Trees > Tree Construction: Ordering on MBRs. It would be useful to have an ordering on MBRs in which X < Y < Z (1) implies that X is closer to Y than to Z.

  33. R-Trees > Tree Construction: Ordering on MBRs. Figure: a tree with leaves A < B < C < D, internal nodes E = max(A,B) < F = max(C,D), and root G = max(E,F).

  34. R-Trees > Tree Construction: Ordering on MBRs. Let us use the centre coordinate of MBRs for ordering. This is trivial in one dimension; $x < y$, with $x, y \in \mathbb{R}$, is well-defined.

  35. R-Trees > Tree Construction: Ordering on MBRs. Let us use the centre coordinate of MBRs for ordering. But for a higher number of dimensions $d \in \mathbb{N}$, $d > 1$: $\vec{x} < \vec{y}$, with $\vec{x}, \vec{y} \in \mathbb{R}^d$, is not well-defined.

  36. R-Trees > Tree Construction: Ordering on MBRs. To solve this, we find a mapping $h : \mathbb{R}^d \to \mathbb{R}$ by using the Hilbert curve.

  37. R-Trees > Tree Construction: Ordering on MBRs. Figure: First-order Hilbert curve

  38. R-Trees > Tree Construction: Ordering on MBRs. Figure: Recursion of the Hilbert curve

  39. R-Trees > Tree Construction: Ordering on MBRs. Figure: Second-order Hilbert curve

  40. R-Trees > Tree Construction: Ordering on MBRs. Figure: Third-order Hilbert curve

  41. R-Trees > Tree Construction: Ordering on MBRs. Map points in $\mathbb{R}^d$ to a point $y \in \mathbb{R}^d$ on the $n$th-order Hilbert curve. Calculate the distance $d = i / 2^{dn}$ of $y$ on the $n$th-order Hilbert curve. Write $h_n(x) = d = i / 2^{dn}$ for the $n$th-order Hilbert coordinate transform of $x$.

  42. R-Trees > Tree Construction: Ordering on MBRs. Map points in $\mathbb{R}^d$ to a point $y \in \mathbb{R}^d$ on the $n$th-order Hilbert curve. Calculate the distance $d = i / 2^{dn}$ of $y$ on the $n$th-order Hilbert curve. Write $h_n(x) = d = i / 2^{dn}$ for the $n$th-order Hilbert coordinate transform of $x$. The true Hilbert coordinate transform $h$ is then defined by: $h(x) = \lim_{n \to \infty} h_n(x)$, $x \in \mathbb{R}^d$.

  43. R-Trees > Tree Construction: Ordering on MBRs. In software calculation we use instead: $\tilde{h}(x) = c / 2^{dn}$, where $c$ is the number of the cell containing $x$.
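
A minimal sketch of this cell-number computation for d = 2 follows. It uses the widely known iterative bit-manipulation scheme for converting grid coordinates to a position along the Hilbert curve; this is a swapped-in standard technique, not necessarily what the presenter's library does. The function names, the restriction to two dimensions, and the assumption that points lie in the unit square are all choices made for this illustration.

```python
def hilbert_index(order: int, x: int, y: int) -> int:
    """Cell number c of grid cell (x, y) on the 2-D Hilbert curve of the given order.

    The grid has 2**order cells per side; x and y must lie in [0, 2**order).
    """
    side = 1 << order
    index = 0
    s = side >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        index += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x, y = side - 1 - x, side - 1 - y
            x, y = y, x
        s >>= 1
    return index


def hilbert_coordinate(point, order: int) -> float:
    """Approximate Hilbert coordinate c / 2**(d*n) for a point in the unit square (d = 2)."""
    side = 1 << order
    # Clamp so that a coordinate of exactly 1.0 falls into the last cell.
    cx = min(int(point[0] * side), side - 1)
    cy = min(int(point[1] * side), side - 1)
    return hilbert_index(order, cx, cy) / (side * side)


if __name__ == "__main__":
    print([hilbert_index(1, x, y) for x, y in [(0, 0), (0, 1), (1, 1), (1, 0)]])
    print(hilbert_coordinate((0.99, 0.0), order=2))
```

Increasing `order` refines the grid, so nearby points are less likely to share a cell number; the orientation of the curve (which corner it starts in) is a convention of this particular scheme.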

  44. R-Trees > Tree Construction: Hilbert R-tree. Invariants: (1) each leaf node stores the Hilbert coordinate of the centre coordinate of the MBR of the object stored there; (2) each internal node stores the maximum Hilbert coordinate value $h_{\max}$ found at its children.

  45. R-Trees > Tree Construction: Hilbert R-tree. Invariants: (1) each leaf node stores the Hilbert coordinate of the centre coordinate of the MBR of the object stored there; (2) each internal node stores the maximum Hilbert coordinate value $h_{\max}$ found at its children. Insertion: (1) get the Hilbert coordinate $h$ of the MBR of the new object to be inserted; (2) find the deepest internal node $v$ with the smallest $h_{\max}$ larger than $h$ and insert the new object there; (3) update the $h_{\max}$ value at $v$ and all its parent nodes; (4) check if $v$ overflows.
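
The insertion steps above can be sketched on a toy structure. In this illustration each object is represented only by its Hilbert coordinate, the `Node` class and `insert` function are hypothetical names, and the overflow check (step 4) is left out; when no child has an h_max larger than h, the sketch falls back to the last child, which is an assumption the slides do not spell out.

```python
from bisect import bisect_right


class Node:
    """Toy Hilbert R-tree node: either child nodes or sorted Hilbert coordinates."""

    def __init__(self, children=None, entries=None):
        self.children = children or []  # internal node: children sorted by h_max
        self.entries = entries or []    # lowest level: sorted Hilbert coordinates
        self.h_max = max([c.h_max for c in self.children] + self.entries, default=0.0)


def insert(root, h):
    """Insert Hilbert coordinate h and return the node v that received it."""
    path, node = [root], root
    while node.children:
        keys = [c.h_max for c in node.children]
        # First child whose h_max is larger than h; fall back to the last child.
        i = min(bisect_right(keys, h), len(keys) - 1)
        node = node.children[i]
        path.append(node)
    node.entries.insert(bisect_right(node.entries, h), h)
    # Restore the h_max invariant at v and all its ancestors.
    for v in path:
        v.h_max = max(v.h_max, h)
    return node


if __name__ == "__main__":
    left, right = Node(entries=[0.1, 0.3]), Node(entries=[0.5, 0.8])
    root = Node(children=[left, right])
    insert(root, 0.2)
    insert(root, 0.9)
    print(left.entries, right.entries, root.h_max)
```

Because children are kept sorted by h_max, the descent is a binary search at each level; a full implementation would follow the insert with the overflow handling described on the next slides.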

  46. R-Trees > Tree Construction: Hilbert R-tree: overflow handling. When a single internal node v overflows. Figure: Overflow handling when there are no neighbour nodes

  47. R-Trees > Tree Construction: Hilbert R-tree: overflow handling. When a single internal node v overflows. Figure: Overflow handling when there is a non-full neighbour

  48. R-Trees > Tree Construction: Hilbert Top-down Greedy Split. Like normal TGS, but with S containing only the Hilbert-coordinate-based ordering.

  49. R-Trees > Conclusions: Outline. 1. Introduction; 2. Basics; 3. Tree Construction; 4. Conclusions

  50. R-Trees > Conclusions: Experiments. Some variations have been implemented in C++. The resulting library recently went public as an open-source project: http://www.sourceforge.net/projects/rtree-lib . Applied to datasets supplied by Shell, we obtained the following experimental results.

  51. R-Trees > Conclusions: Experiments: construction time. Figure: Grid size vs. building time (log-log plot). x-axis: number of grid elements; y-axis: building time (in processor ticks). Series: Bisection tree; 2/4 RandomTGS bulk-loaded basic R-tree; 2/4 HilbertTGS bulk-loaded basic R-tree.

  52. R-Trees > Conclusions: Experiments: point query time. Figure: Grid size vs. query time, point query (log-log plot). x-axis: number of grid elements; y-axis: average required time per point query (in seconds). Series: Bisection tree; 2/4 RandomTGS bulk-loaded basic R-tree; 2/4 HilbertTGS bulk-loaded basic R-tree.

  53. R-Trees > Conclusions: Experiments: line query time. Figure: Grid size vs. query time, line query (log-log plot). x-axis: number of grid elements; y-axis: average required time per line query (in seconds). Series: Bisection tree; 2/4 RandomTGS bulk-loaded basic R-tree; 2/4 HilbertTGS bulk-loaded basic R-tree.

  54. R-Trees > Conclusions: Experiments: box query time. Figure: Grid size vs. query time, box query (log-log plot). x-axis: number of grid elements; y-axis: average required time per box query (in seconds). Series: Bisection tree; 2/4 RandomTGS bulk-loaded basic R-tree; 2/4 HilbertTGS bulk-loaded basic R-tree.

  55. R-Trees > Conclusions: Other query types: k-nn query (implemented); hyperplane query (not implemented). Figure: k-nn query time, grid size vs. query time (log-log plot). x-axis: number of grid elements; y-axis: average required time per k-nn query (in seconds). Series: Bisection tree; 2/4 RandomTGS bulk-loaded basic R-tree; 2/4 HilbertTGS bulk-loaded basic R-tree.
